Today, I thought I would write about VMware Virtual Volumes (VVols) and the impact I think they are going to have on the way we manage storage in the future. Since VVols are still under development by VMware and storage providers, the final product is not yet available. However, the topic was discussed at EMC World 2014, and I am sure it will get much more coverage at VMworld 2014. VMware is trying to address several key pain points with storage, including cost, storage management complexity, and the difficulty of ensuring predictable performance. VMware believes it is strategically positioned to solve these problems at the virtual layer.
How do existing environments look today? Almost 98% of organizations have a 1-to-1 mapping between a datastore and a LUN. That is how we have been working with storage and virtualization for a long time. All your data services, including snapshots, cloning, replication, and recovery, are done at the datastore level. This is the storage paradigm that vSphere introduced.
The new method being developed by VMware is a "per VM" storage approach. Data operations will take place at the VM/VMDK level rather than at the level of the entire LUN/datastore. This will give the vSphere Administrator the ability to provision storage that complies with policies based on business requirements, on a "per VM" basis. This is a completely new concept. We are now moving away from thinking in terms of LUNs and datastores.
You will be able to provision data services at a much finer level of granularity. For example, think of the possibility of replicating one or two applications instead of an entire datastore. That has a dramatic impact on storage efficiency, which in turn helps the business by reducing infrastructure costs.
Today, vSphere is aware of the VMDK IDs associated with the virtual machine. That awareness cannot stop at the vSphere layer; it also needs to go down a layer, so that the storage array is aware of the VMDK and its association with the virtual machine.
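To make the granularity argument a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the VM names, the sizes, and the helper functions are invented, and it models nothing more than "how many GB end up being replicated" under the two approaches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    name: str
    size_gb: int
    needs_replication: bool


def replicated_gb_per_datastore(vms):
    """Datastore-level replication: if any VM on the datastore needs
    protection, the whole datastore (every VM on it) gets replicated."""
    if any(vm.needs_replication for vm in vms):
        return sum(vm.size_gb for vm in vms)
    return 0


def replicated_gb_per_vm(vms):
    """Per-VM replication: only the VMs that need protection are replicated."""
    return sum(vm.size_gb for vm in vms if vm.needs_replication)


# Made-up contents of a single datastore, for illustration only.
datastore = [
    VirtualMachine("erp-db", 500, True),
    VirtualMachine("erp-app", 100, True),
    VirtualMachine("test-01", 400, False),
    VirtualMachine("test-02", 400, False),
    VirtualMachine("file-archive", 600, False),
]

print("Replicated at datastore level:", replicated_gb_per_datastore(datastore), "GB")
print("Replicated at per-VM level:   ", replicated_gb_per_vm(datastore), "GB")
```

With these invented numbers, protecting the two ERP virtual machines at the datastore level drags the test and archive machines along with them, while the per-VM approach replicates only the two machines that actually need it.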
Once the storage array becomes aware of the VMDK, the vSphere Administrator will be able to select the storage policy necessary to meet the business requirements. This storage paradigm removes the LUNs, and exposes the VMDK to the array. It offloads the data operations directly to the array.
This is a familiar concept if you are using VAAI, which offloads data operations such as snapshots, cloning, and backups down to the array; however, now this will be done on a "per VM" basis. Firmware in the storage array communicates with vSphere through an established set of APIs, VASA, and it takes on the operations for the Virtual Volumes. In other words, when you create VVols, the command is passed through the API to the storage array's firmware, and the array interprets the command, which includes the VMDK IDs associated with the virtual machine. The array then creates the VMDK files, which are now called Virtual Volumes. This is a significant change; currently VMDK creation is done by vSphere, which makes vSphere the only layer aware of the connection between the VMDK and the virtual machine.
Now, because the storage array has the same awareness of the connection between the VMDK and the virtual machine, you can apply data services at the VVol level. That is pretty powerful. Again, it gives you the capability to do service level management that meets business requirements at a fine level of granularity.
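The creation flow described above can be sketched as a toy model in Python. To be clear, this is not the VASA API and not any vendor's firmware; `CreateRequest`, `ToyArray`, and `provision_vm` are hypothetical names I made up purely to show the idea that the array, not vSphere, creates the volume and remembers which virtual machine it belongs to.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class CreateRequest:
    """What the vSphere side sends down: the VM and VMDK identifiers plus a size."""
    vm_id: str
    vmdk_id: str
    size_gb: int


@dataclass
class ToyArray:
    """Hypothetical stand-in for the array firmware behind the API."""
    volumes: dict = field(default_factory=dict)          # vvol_id -> CreateRequest
    _ids: count = field(default_factory=lambda: count(1))

    def create_virtual_volume(self, request):
        # The array creates the volume itself and records the VM association.
        vvol_id = f"vvol-{next(self._ids)}"
        self.volumes[vvol_id] = request
        return vvol_id

    def volumes_for_vm(self, vm_id):
        # Because the array knows the association, it can answer per-VM questions.
        return [v for v, req in self.volumes.items() if req.vm_id == vm_id]


def provision_vm(array, vm_id, disks):
    """vSphere side of the sketch: one create request per virtual disk."""
    return [
        array.create_virtual_volume(CreateRequest(vm_id, vmdk_id, size_gb))
        for vmdk_id, size_gb in disks
    ]


array = ToyArray()
created = provision_vm(array, "vm-42", [("disk-0", 40), ("disk-1", 200)])
print("Created:", created)
print("Array's view of vm-42:", array.volumes_for_vm("vm-42"))
```

The point of the sketch is the last line: once the association lives on the array, per-VM data services have something to hang on to.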
From a storage administrator's perspective, there is no need to configure LUNs or NFS shares. This makes VVols protocol agnostic. LUNs today are specific to a protocol, such as Fibre Channel, FCoE, or iSCSI, but when you move to VVols that is no longer the case; VVols can work across all of these protocols. You create a single I/O access point, called a Protocol Endpoint (PE), to set up a data path from VMs to VVols.
The PE is part of the physical storage fabric, so it is treated much like a LUN. The ESXi host will discover the PE during a rescan, and the PE then establishes the data path between the virtual machines and the virtual volumes. VVols are bound to and unbound from a PE through vCenter.
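Binding and unbinding is easier to picture with a small model. The Python sketch below is only an illustration under my own assumptions: `ProtocolEndpoint`, `bind`, `unbind`, and `route_io` are invented names, not real vCenter or array calls. It shows one idea only, that I/O to a VVol has a data path through the PE while the VVol is bound and has none once it is unbound.

```python
class ProtocolEndpoint:
    """Hypothetical model of a PE as the single I/O access point."""

    def __init__(self, name):
        self.name = name
        self._bound = set()

    def bind(self, vvol_id):
        # In the real product this would be driven through vCenter.
        self._bound.add(vvol_id)

    def unbind(self, vvol_id):
        self._bound.discard(vvol_id)

    def route_io(self, vvol_id, operation):
        # I/O only has a data path while the VVol is bound to this PE.
        if vvol_id not in self._bound:
            raise LookupError(f"{vvol_id} is not bound to {self.name}")
        return f"{operation} -> {vvol_id} via {self.name}"


pe = ProtocolEndpoint("pe-1")
pe.bind("vvol-1")
print(pe.route_io("vvol-1", "read"))

pe.unbind("vvol-1")
try:
    pe.route_io("vvol-1", "read")
except LookupError as err:
    print("No data path:", err)
```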
The last item I want to touch upon is Storage Containers. A Storage Container is part of the storage fabric and is a logical unit of the underlying hardware. It is a logical entity similar to a LUN; however, the size of a Storage Container is limited only by the underlying physical storage being provided. There is no artificial limitation, such as a maximum LUN size or a maximum number of LUNs. If you have an 8 TB array, you can carve out a single 8 TB Storage Container.
Storage Containers help you create logical storage groupings based on service level expectations. So when managing a very large environment, you now have the option to create service tiers at the Storage Container level, as well as finer-grained policies at the individual VVol level.
It really will be a dramatic change in how we manage storage in the future: we will have storage policy based management. This is going to give us controls to match policies to the capabilities of the underlying storage platform. It will help improve clarity on service based storage options, make underlying costs more transparent, and help ensure that each application gets the storage it needs to meet business requirements.
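As a rough illustration of matching policies to capabilities, here is a minimal Python sketch. The container names, capability labels, and the `compliant_containers` helper are all assumptions invented for this example; the real product will expose its own capability model. The sketch treats a Storage Container as compliant with a policy when it offers every capability the policy requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageContainer:
    name: str
    capabilities: frozenset


@dataclass(frozen=True)
class StoragePolicy:
    name: str
    required: frozenset


def compliant_containers(policy, containers):
    """Return the names of containers offering every capability the policy requires."""
    return [c.name for c in containers if policy.required <= c.capabilities]


# Hypothetical service tiers expressed as Storage Containers.
containers = [
    StorageContainer("gold", frozenset({"replication", "snapshots", "flash"})),
    StorageContainer("silver", frozenset({"snapshots"})),
]

mission_critical = StoragePolicy("mission-critical", frozenset({"replication", "snapshots"}))
dev_test = StoragePolicy("dev-test", frozenset({"snapshots"}))

print(mission_critical.name, "->", compliant_containers(mission_critical, containers))
print(dev_test.name, "->", compliant_containers(dev_test, containers))
```

In this made-up setup, the mission-critical policy can only land on the gold tier, while the dev-test policy is satisfied by either tier, which is the kind of choice policy based management is meant to make visible.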