Sunday, March 8, 2015

vRealize Operations 6.0 Analysis - Capacity Remaining

Like its predecessor, vRealize Operations 6.0 is a much better tool than vCenter Server for looking at the health and capacity of your infrastructure. It keeps all the metrics at five-minute intervals and retains them for six months.

You can customize the retention length of the data by going to the Administration link, clicking on Global Settings on the Navigation Panel, and then adjusting the Time Series Data.

Time Remaining Setting

This is a significant improvement over the performance statistics available in VMware vCenter Server, which shows you metrics for the past hour in 20-second increments. As you research further back in time, it reveals fewer metrics and the data points become more averaged out. For example, in vCenter Server the past day shows five-minute intervals, the past week shows 30-minute intervals, and the past month shows two-hour intervals. A two-hour average can hide a lot of peaks and valleys. It might be good enough for some general troubleshooting, but it isn't going to help you find the root cause of an application performance issue or the capacity remaining in your environment. The interval is simply too large; you need much finer data sampling.
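To see why a two-hour rollup hides peaks, here is a minimal sketch. The sample values are invented for illustration (a steady 30% load with a 15-minute spike to 95%) and do not come from vCenter Server or vRealize Operations.

```python
from statistics import mean

# Hypothetical CPU usage (%) sampled every 5 minutes over two hours (24 samples).
# Invented data: steady 30% load with three consecutive samples at 95%.
samples = [30.0] * 24
samples[10:13] = [95.0, 95.0, 95.0]

# A single two-hour data point is just the average of everything in the window.
two_hour_average = mean(samples)

# Five-minute data still contains the spike.
five_minute_peak = max(samples)

print(f"two-hour average: {two_hour_average:.3f}%")
print(f"five-minute peak: {five_minute_peak:.1f}%")
```

The averaged value lands well below 40%, so the 95% spike that an application owner would have felt is invisible at that resolution.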

With vRealize Operations 6.0, you can go back and tell an application owner whether they were having a performance problem at a certain time. That gives you a lot more confidence when providing relevant information to your IT business partners.

Another thing you will notice when you start using the new merged UI is that the items that made up the Health, Risk, and Efficiency badges are now under the Analysis tab. Several of these widgets, like Capacity Remaining, have been significantly revamped.

vRealize 6.0 Analysis Tab

In the vRealize Operations 6.0 Product UI, you can find the badge scores on the Home screen under the Recommendations dashboard or on the Summary page for an object.

vRealize Operations 6.0 Recommendations Dashboard

Today we are going to take a deeper look at the Capacity Remaining tab. Capacity remaining is the percentage of usable capacity not consumed. Usable capacity is the white part of the box, and the grey part of the box is the HA + buffer, which is reserved. Capacity remaining is calculated using both peak and average demand. The example shows 19% capacity remaining at peak, because a spike used 81% of available capacity, while the average consumption is 52%.

Capacity Remaining Description
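The peak and average figures from the example can be reproduced with simple arithmetic. This is only a sketch of the percentages quoted above; the function name is made up for illustration, and how the product combines the two values internally is not covered here.

```python
def capacity_remaining_pct(used_pct: float) -> float:
    """Percentage of usable capacity not consumed (hypothetical helper)."""
    if not 0.0 <= used_pct <= 100.0:
        raise ValueError("used_pct must be between 0 and 100")
    return 100.0 - used_pct

# Figures from the example: an 81% spike and 52% average consumption.
remaining_at_peak = capacity_remaining_pct(81.0)
remaining_at_average = capacity_remaining_pct(52.0)

print(f"remaining at peak:    {remaining_at_peak:.0f}%")
print(f"remaining at average: {remaining_at_average:.0f}%")
```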

As you can see in the image below, the capacity remaining is 5%, which is indicated in the octagon badge. Just above the Capacity Remaining Breakdown, we can see the actual capacity remaining is 5.4%; the badge score has been rounded down to 5%. This gives us an orange badge color. The badge turns red for 0% remaining, orange for less than 5% capacity remaining, and yellow for less than 10% capacity remaining.

In the upper left corner, we get an indication of the 30 day trend for capacity remaining. This will increase as you start to add additional load on the host.

Capacity Remaining Badge Score

In the Capacity Remaining Breakdown bar, the orange bar indicates the amount of consumed resources, the grey bar is the 5% that is available, and the black bar is the 10% reserved for HA and buffers.

Capacity remaining takes into account a 10% HA buffer. This buffer can be adjusted up or down to suit your business requirements. For instance, if you have a 7-node cluster, you may want to adjust the capacity buffer to 15% for N+1 or 30% for N+2. You also have the option to disable the buffer for certain resource components, such as Network I/O and Disk Space.

Capacity Buffer
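One way to arrive at buffer values like these is to reserve the share of cluster capacity that would be lost if one or two hosts failed. The sketch below assumes equally sized hosts; the function name is hypothetical, and the 15% and 30% figures above are the rounded-up results.

```python
def ha_buffer_pct(nodes: int, failures_to_tolerate: int) -> float:
    """Share of cluster capacity (%) to reserve so the remaining hosts
    can absorb the load of the failed ones. Assumes equally sized hosts."""
    if nodes <= 0 or not 0 <= failures_to_tolerate < nodes:
        raise ValueError("need nodes > 0 and 0 <= failures_to_tolerate < nodes")
    return 100.0 * failures_to_tolerate / nodes

n_plus_1 = ha_buffer_pct(7, 1)  # one host out of seven
n_plus_2 = ha_buffer_pct(7, 2)  # two hosts out of seven

print(f"7-node N+1 buffer: {n_plus_1:.1f}%")
print(f"7-node N+2 buffer: {n_plus_2:.1f}%")
```

For a 7-node cluster this comes out slightly under 15% and 30%, so rounding up leaves a little extra headroom.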

The What Will Fit section indicates the number of virtual machines we can place on the host or cluster. Since my capacity remaining is only 5% due to memory constraints, I am unable to place any virtual machines on my host. If I had 80% capacity remaining, it would show me how many small, medium, large, and average profile machines I could fit on the host. If you click on any of the profiles, it will provide the memory, CPU, and disk space allocations for that build.

What Will Fit Profile

By default, there are four containers available for further analysis on capacity remaining; they are CPU, Memory, Disk Space, and vSphere Configuration Limit. Each one of the containers displays Total Capacity, Buffers, Usable Capacity, Peak Value, Stress Free Value, and what is Remaining.

Capacity Remaining Containers

Let's drill down on Memory. When we expand the container, we see the sub-containers of Demand and Allocation. Demand is calculated by taking the total configured memory of 11.39 GB and removing the 10% buffer, giving us a usable capacity of 10.25 GB. The vRealize Operations analytics engine has determined that the stress free value for demand is 63.5%, or 6.51 GB. This value comes from an analysis of the last 30 days and indicates the amount of resources required to run the current virtual machine demand without performance degradation or impact to the applications. This gives us 36.5% remaining for demand. On the chart, the blue line is the actual demand, which has been going up and down over the past few weeks, and the red line is the stress free zone.

Memory Demand and Allocation
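The memory demand numbers above can be checked with a few lines of arithmetic. This is a sketch of the calculation as described in this post, not the product's internal algorithm; the variable names are made up for illustration.

```python
# Figures quoted in the memory demand walkthrough.
total_gb = 11.39         # total configured memory
buffer_pct = 10.0        # HA/capacity buffer
stress_free_pct = 63.5   # stress free value reported by the analytics engine

# Usable capacity is what is left after the buffer is set aside.
usable_gb = total_gb * (1 - buffer_pct / 100)

# The stress free value is expressed as a share of usable capacity.
stress_free_gb = usable_gb * stress_free_pct / 100

# Whatever is not needed for stress free demand is what remains.
remaining_pct = 100 - stress_free_pct
remaining_gb = usable_gb - stress_free_gb

print(f"usable capacity:   {usable_gb:.2f} GB")
print(f"stress free value: {stress_free_gb:.2f} GB")
print(f"remaining:         {remaining_pct:.1f}% ({remaining_gb:.2f} GB)")
```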

If we look at Demand and Allocation for CPU, the Total Capacity is expressed differently: demand is measured in total GHz, and allocation is measured in vCPUs, which includes the overcommit ratio. In my case it is 8 vCPUs per processor core. The remaining capacity for CPU demand is 15.65 GHz, or 81.72%, and we have 50.2 vCPUs that we can add to the host.

The Capacity Remaining Analysis has been significantly redesigned to give you better visibility and more information about the total amount of resources remaining in your environment.