Friday, October 25, 2013
Why do de-dup and thin provisioning matter in VDI?
VMware released a new Horizon View Large-Scale Reference Architecture document. The document is based on real-world test scenarios, workloads, and infrastructure system configurations using the VCE Vblock Specialized System for Extreme Applications. This Vblock is composed of Cisco UCS server blades and an EMC XtremIO flash-based storage array.
This VCE Vblock is specifically intended for solutions like VDI because the flash-based array delivers the performance and responsiveness that virtual desktops require to provide a respectable user experience.
The VCE Vblock Specialized System for Extreme Applications uses EMC XtremIO storage arrays as primary storage for Horizon View virtual desktops and EMC Isilon NAS to store user data and Horizon View persona data.
The environment was scaled to support 7,000 users. VMware used a View POD design with a Management block and a Desktop block. The Desktop block consisted of 8 pools: six pools of 1,000 desktops and two pools of 500 desktops.
Here is the desktop specification used in the testing:
Now, the reference architecture has some spectacular performance numbers for the workload with the all-flash array, but what I found incredibly impressive was the amount of capacity required for the entire infrastructure using de-duplication, thin provisioning, and linked clones.
Let's consider the size per virtual desktop if these were full clones on a standard storage array that didn't provide de-duplication. Each virtual desktop would require roughly 44.2 GB of storage capacity. With 7,000 virtual desktops, we are talking about 309 TB of storage, and that doesn't include the management components.
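As a quick sanity check, here's that capacity arithmetic in Python (a sketch; the 44.2 GB per-desktop figure comes straight from the numbers above):

```python
# Full-clone capacity estimate: per-desktop size times desktop count.
GB_PER_DESKTOP = 44.2   # approximate full-clone size from the text
DESKTOPS = 7000

total_gb = GB_PER_DESKTOP * DESKTOPS   # 309,400 GB
total_tb = total_gb / 1000             # ~309 TB, management components excluded
print(f"{total_tb:.0f} TB")
```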
For those virtual desktops, we would typically place 64 virtual desktops per datastore, which would require 110 datastores.
7,000 / 64 = 109.4, rounded up to 110 datastores
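The division doesn't come out even, so the count has to round up; a one-liner makes that explicit:

```python
import math

DESKTOPS = 7000
DESKTOPS_PER_DATASTORE = 64

# 7000 / 64 = 109.375, so we need a 110th datastore for the remainder.
datastores = math.ceil(DESKTOPS / DESKTOPS_PER_DATASTORE)
print(datastores)  # 110
```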
Now let's estimate that the average IOPS per user is 18. At 18 IOPS for each of the 64 virtual desktops, we need 1,152 front-end IOPS per datastore. With a 50/50 read/write ratio on RAID 5, the write penalty of 4 means we actually need 2,880 back-end IOPS per datastore.
read IO + (write IO × 4) = 576 + 2,304 = 2,880 IOPS
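The RAID 5 write-penalty math above can be sketched the same way (all inputs are the values from the text):

```python
IOPS_PER_USER = 18
DESKTOPS_PER_DATASTORE = 64

frontend = IOPS_PER_USER * DESKTOPS_PER_DATASTORE  # 1,152 front-end IOPS

# 50/50 read/write split
reads = frontend * 0.5    # 576
writes = frontend * 0.5   # 576

# RAID 5 turns each host write into 4 disk operations (read data, read
# parity, write data, write parity), so only writes are multiplied.
RAID5_WRITE_PENALTY = 4
backend = reads + writes * RAID5_WRITE_PENALTY  # 576 + 2,304 = 2,880
print(int(backend))  # 2880
```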
Using 15K Fibre Channel disks at roughly 150 IOPS each would require 20 drives per datastore and 2,200 drives across all 110 datastores. If we are using 300 GB 15K Fibre Channel drives, we are talking about 660 TB of raw storage!
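Putting the last two steps together, the spindle count and raw capacity work out like this (again a sketch using the figures from the text):

```python
import math

BACKEND_IOPS_PER_DATASTORE = 2880
IOPS_PER_15K_DRIVE = 150
DATASTORES = 110
DRIVE_SIZE_GB = 300

# 2,880 / 150 = 19.2, so each datastore needs 20 spindles to meet the IO load.
drives_per_datastore = math.ceil(BACKEND_IOPS_PER_DATASTORE / IOPS_PER_15K_DRIVE)
total_drives = drives_per_datastore * DATASTORES        # 2,200 drives
raw_tb = total_drives * DRIVE_SIZE_GB / 1000            # 660 TB raw
print(total_drives, raw_tb)  # 2200 660.0
```

Note that the IO workload, not the capacity requirement, drives the spindle count here: 660 TB of disk to satisfy a 309 TB capacity need.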
Let's look at the total amount of storage actually consumed in the Horizon View Large-Scale Reference Architecture using the EMC XtremIO storage arrays with de-duplication, thin provisioning, and linked clones.
3.51 TB of used storage! Are you kidding me? Full-clone desktops without these technologies would require 309 TB of overall capacity and 660 TB of disk to support the IO workload. Leveraging these key technologies puts a flash-based storage array within price range for most large and enterprise organizations and can actually lower the overall cost of a VDI deployment.
Here are some of the performance numbers from the documentation:
I definitely recommend taking a look at the document.