Thursday, February 27, 2014

EMC ScaleIO





Today, I was fortunate enough to attend a briefing by Boaz Palgi. He was the founder of ScaleIO, which was acquired by EMC in July 2013.



ScaleIO is software-defined, distributed shared storage. It is a software-only solution that enables you to create a SAN from direct-attached storage (DAS) located in your servers. ScaleIO creates a large pool of storage that can be shared among all servers. This storage pool can be tiered to supply differing performance needs. ScaleIO is infrastructure-agnostic.  It can run on any server, whether physical or virtual, and leverage any storage media, including disk drives, flash drives, or PCIe flash cards.



To start off the briefing, Boaz went into the market dynamics that helped shape ScaleIO.



The first change in the IT market was the dramatic increase in server resource capacity that began in 2006 with the dawn of multi-core processor architectures from AMD; today's compute and storage systems are massively scalable and provide a tremendous amount of performance. This was also a driving factor for the adoption of virtualization: it increased host density ratios to the point that the infrastructure savings from server consolidation were too advantageous to ignore. Another driving factor was the push toward centralized datacenter models. Distributed datacenters are a thing of the past; companies are building immense enterprise datacenters with thousands of servers running tens of thousands of applications. The last big change in the marketplace was the commoditization of infrastructure components, led by companies like Google and Amazon.


As a result, today's IT leaders want to create an agile, cost-effective datacenter built on converged commodity infrastructure.


ScaleIO is all about convergence, scalability, elasticity, and performance. The software converges storage and compute resources into a single architectural layer that resides on the application server. The architecture allows for scaling out from as few as three servers to thousands by simply adding nodes to the environment. This is done elastically; capacity and compute resources can be increased or decreased “on the fly” without impact to users or applications. ScaleIO also has self-healing capabilities, which enable it to recover easily from server or disk failures. ScaleIO aggregates all the IOPS of the various servers into one high-performing virtual SAN, and all servers participate in servicing I/O requests using massively parallel processing.



There are several configuration models you can design with ScaleIO. You can use a fully converged model, where the applications and storage are all on the same nodes and ScaleIO runs alongside the applications. The nodes can be asymmetric, meaning they can have different numbers of spindles. You can also combine application-only servers with converged servers; the application servers would not provide storage resources, but would utilize the converged servers' storage capacity. The last model not only takes advantage of converged servers, but also has ScaleIO using dedicated storage-only servers.

In the diagram below, you can see the Application servers (A), the Converged servers (C), and the Storage-only servers (S) in a three-layer configuration. 



ScaleIO runs on the hardware of your choice. It is designed to scale massively from three nodes to thousands of nodes. Unlike most traditional storage systems, as the number of servers grows, so do throughput and IOPS; the scalability of performance is linear with regard to the growth of the deployment. Let's say a single node delivers 1,000 IOPS: 10 nodes would deliver 10,000 IOPS, 100 nodes 100,000 IOPS, and 1,000 nodes 1,000,000 IOPS. Whenever the need arises, additional storage and compute resources (i.e., additional servers or drives) may be added. This enables the storage and compute resources to grow together so the alignment and balance between them is preserved.
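Just to make that arithmetic concrete, here is a trivial Python sketch of the linear scaling math, using the 1,000 IOPS-per-node figure from the example above (an illustrative number, not a benchmark):

```python
# Back-of-the-envelope linear scaling: aggregate IOPS is simply the per-node
# figure multiplied by the node count. 1,000 IOPS is the example value from
# the text, not a measured result.
PER_NODE_IOPS = 1_000

for nodes in (1, 10, 100, 1_000):
    print(f"{nodes:>5} nodes -> {nodes * PER_NODE_IOPS:>9,} aggregate IOPS")
```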

 



The system virtually manages and reconfigures itself as the underlying resources change. As you add capacity, ScaleIO goes to work in the background and rearranges the data on the servers to optimize performance and enhance resilience. All of this happens automatically in the background without operator intervention and with minimal impact to applications and users. At the end of a rebalance operation, the system is fully optimized for both performance and data protection. No explicit reconfiguration is needed.





A similar process happens when nodes are removed. In this illustration, three servers are removed from an eight-server cluster. After the rebalance, the data is rearranged across the remaining five servers, spread or striped evenly, and remains fully redundant.
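ScaleIO's actual placement logic wasn't spelled out in the briefing, but the general idea can be pictured with a small Python sketch: chunks are spread evenly across whatever nodes are currently in the pool, and shrinking (or growing) the node set simply means recomputing that layout. The chunk and node names below are made up for illustration.

```python
from collections import defaultdict

def rebalance(chunks, nodes):
    """Spread chunk IDs evenly (round-robin) across the current node set.

    Illustrative toy only; not ScaleIO's real placement algorithm."""
    layout = defaultdict(list)
    for i, chunk in enumerate(chunks):
        layout[nodes[i % len(nodes)]].append(chunk)
    return dict(layout)

chunks = [f"chunk-{i:03d}" for i in range(40)]

# Eight-node cluster, then the same data after three nodes are removed.
before = rebalance(chunks, [f"node-{n}" for n in range(1, 9)])
after = rebalance(chunks, [f"node-{n}" for n in range(1, 6)])

for name, layout in (("before", before), ("after", after)):
    # Each node ends up with an equal share: 5 chunks before, 8 after.
    print(name, {node: len(owned) for node, owned in layout.items()})
```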






ScaleIO uses massive striping; it seems very similar to the wide-striping chunklet approach that HP 3PAR storage uses. The volume chunks are spread across the cluster in a balanced manner. Because an application can utilize all the disks in the entire cluster, it can scale to a substantial number of IOPS. The layout scheme also helps eliminate I/O splits, which provides more IOPS with less CPU usage.
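To picture why a chunk-aligned I/O never gets split, here is a hypothetical offset-to-node mapping in Python. The 1 MiB chunk size and the round-robin placement are assumptions made for the example, not ScaleIO's actual layout scheme.

```python
CHUNK_SIZE = 1 << 20          # assume 1 MiB chunks purely for illustration
NODES = ["node-A", "node-B", "node-C", "node-D"]

def locate(volume_offset: int) -> tuple[str, int]:
    """Map a byte offset within a volume to (node, offset within its chunk)."""
    chunk_index = volume_offset // CHUNK_SIZE
    node = NODES[chunk_index % len(NODES)]       # round-robin chunk placement
    return node, volume_offset % CHUNK_SIZE

# A 4 KiB read at offset 5 MiB falls entirely inside one chunk on one node,
# so the request is serviced without being split across devices.
print(locate(5 * 1024 * 1024))
```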

This data striping methodology also helps with data protection. When a disk or node fails, the rebuild load is balanced across all the disks or nodes in the protection domain, which is much faster than traditional RAID-based rebuilds. If the failed disk or node comes back online during the rebuild, a smart, selective transition to a "backwards" rebuild is performed.
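That many-to-many rebuild can be sketched as follows: every surviving node takes on a share of the chunks that lived on the failed node, instead of a single spare disk absorbing the whole rebuild as in classic RAID. The data structures and function below are invented for illustration, not ScaleIO internals.

```python
from collections import Counter

def plan_rebuild(layout, failed_node):
    """Assign each chunk that lived on the failed node to a surviving node,
    spreading the rebuild work across the whole protection domain."""
    survivors = [n for n in layout if n != failed_node]
    plan = {}
    for i, chunk in enumerate(layout[failed_node]):
        plan[chunk] = survivors[i % len(survivors)]   # spread targets evenly
    return plan

layout = {
    "node-1": ["c00", "c04", "c08"],
    "node-2": ["c01", "c05", "c09"],
    "node-3": ["c02", "c06", "c10"],
    "node-4": ["c03", "c07", "c11"],
}

plan = plan_rebuild(layout, failed_node="node-2")
print(Counter(plan.values()))   # rebuild writes land evenly on node-1, node-3, node-4
```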
One aspect I found fascinating in the briefing is the node data migration capability. Say you had three legacy application servers, for example HP ProLiant DL580 G5 servers, and you were moving to the newly released HP ProLiant DL580 G8 servers. You would add the new nodes to the storage pool, the data would migrate to the new nodes and rebalance, and then the legacy servers would be removed from the storage pool so they could be decommissioned.
In enterprise datacenters, there is a broad range of requirements for the various applications deployed across the organization. There is no one-size-fits-all solution. ScaleIO offers a set of features that gives IT departments complete control over performance, capacity, and data location. Protection domains allow you to isolate specific servers and data sets. This can be done at the granularity of a single customer, so that each customer can be under a different SLA. Storage pools can be used for further data segregation and tiering. For example, data that is accessed very frequently can be stored in a flash-only storage pool for the lowest latency, while less frequently accessed data can be stored in a low-cost, high-capacity pool of spinning disk.
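As a rough mental model of protection domains, storage pools, and tiering, here is a toy Python sketch. The class names, pool names, and placement policy are assumptions for the example, not ScaleIO objects or configuration syntax.

```python
from dataclasses import dataclass

@dataclass
class StoragePool:
    name: str
    media: str            # e.g. "SSD" or "HDD"

@dataclass
class ProtectionDomain:
    name: str             # e.g. one domain per customer / SLA
    pools: list

domain = ProtectionDomain(
    name="customer-acme",
    pools=[
        StoragePool("flash-pool", media="SSD"),     # hot, latency-sensitive data
        StoragePool("capacity-pool", media="HDD"),  # colder, bulk data
    ],
)

def place_volume(domain: ProtectionDomain, hot: bool) -> StoragePool:
    """Pick a pool inside the customer's protection domain based on access pattern."""
    wanted = "SSD" if hot else "HDD"
    return next(p for p in domain.pools if p.media == wanted)

print(place_volume(domain, hot=True).name)    # -> flash-pool
print(place_volume(domain, hot=False).name)   # -> capacity-pool
```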
With ScaleIO, you can limit the amount of performance—IOPS or bandwidth—that selected customers can consume. The limiter allows for resource distribution to be imposed and regulated to prevent application “hogging” scenarios. Light data encryption at rest can be used to provide added security for sensitive customer data. Explicit mapping from volumes to servers determines what data can be accessed from the server and provides extra isolation and granularity with regard to location management.
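The briefing didn't go into how the limiter is implemented, but a per-tenant IOPS cap is commonly built as something like a token bucket; the sketch below is a generic illustration of that idea, not ScaleIO code.

```python
import time

class IopsLimiter:
    """Generic token-bucket sketch of a per-tenant IOPS cap.

    Illustrates the concept of reining in a "hogging" application; it is not
    ScaleIO's implementation."""

    def __init__(self, iops_limit: int):
        self.rate = iops_limit          # tokens (I/Os) refilled per second
        self.tokens = float(iops_limit)
        self.last = time.monotonic()

    def allow_io(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                    # over the cap: queue or reject the I/O

limiter = IopsLimiter(iops_limit=500)   # cap this tenant at 500 IOPS
print(limiter.allow_io())               # True until the bucket runs dry
```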

This gives IT leaders and architects the building blocks to create service offerings to support their business partners. IT operations and application owners select a service profile that meets the application's requirements. No more black-and-white choices; instead, the business gets a broad range of colors in the IT spectrum to fulfill its needs, which should be dovetailed with a chargeback policy to provide cost transparency. That is Infrastructure as a Service!
ScaleIO makes much of the traditional storage infrastructure unnecessary. You can create a large-scale SAN without arrays, a dedicated fabric, or HBAs. With ScaleIO, you can leverage the local storage in your existing servers that often goes unused, ensuring that IT resources aren’t wasted, and you simply add servers to the environment as needed. This gives you great flexibility in deploying SANs of various sizes and modifying them as needed. It also significantly reduces the cost of initial deployment.



The Server SAN market (as Wikibon defines it) is starting to get interesting; it shows the momentum behind hyper-converged architecture and the market shift toward simplicity. ScaleIO appears to be a very mature offering with a lot of enterprise-class features.