virtualirfan

vscsiStats

Motivation

To tune disk performance properly, or simply to troubleshoot, you must collect detailed characteristics of the disk I/O that your workloads generate. The beauty of the disk subsystem is that, unlike CPU and memory, which are largely not configurable, users can actually tune the underlying ‘machine’ parameters for disk: RAID level, stripe size, number of spindles, flash versus spinning media, cache size and configuration, prefetch policy and distance, and numerous others. However, anyone interested in optimizing the disk subsystem, especially system administrators, virtualization architects, and software developers, must start by knowing the characteristics of the input workload. If you can’t measure it, you can’t improve it.

Introduction to vscsiStats

[Figure: vscsiStats architecture]

vscsiStats provides essential disk I/O workload characterization for virtual machines (VMs). It collects live disk workload shapes and, if needed, SCSI traces. I implemented it in the VMware ESX/ESXi vmkernel, where it has shipped since ESX 3.5, and it has exceptionally low overheads in CPU, memory and latency. A simple command-line tool in ESX interacts with the instrumentation system. There are no guest agents or other software to install, and no waiting: results are instant. (Nice weekend read: the research paper about vscsiStats.)

Workload Shapes

[Figure: fingerprint, StefanvonHalenbach_Fingerprint]

Just like human fingerprints, each computer workload has unique disk characteristics which can be described using a few dimensions. Based on decades of disk performance tuning history, typical shape dimensions used are read/write ratio, block size, spatial locality (sequential versus random), IO interarrival period and active queue depth (outstanding IOs).
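To make these dimensions concrete, here is a minimal Python sketch that derives each of them from a tiny, made-up I/O trace. The `IO` record and its field names are hypothetical (they are not vscsiStats output), and seek distance is assumed here to be the gap between the end of the previous IO and the start of the current one, so 0 means perfectly sequential.

```python
from collections import namedtuple

# Hypothetical trace record for illustration only (not a vscsiStats format).
# Times are in microseconds; lba and blocks are in disk blocks.
IO = namedtuple("IO", "issue_us complete_us lba blocks is_read")


def workload_shape(trace):
    """Collect raw samples for each shape dimension.

    `trace` must be a non-empty list of IO records sorted by issue time.
    """
    reads = sum(1 for io in trace if io.is_read)
    shape = {
        "read_fraction": reads / len(trace),
        "io_length_blocks": [io.blocks for io in trace],
        "seek_distance": [],
        "interarrival_us": [],
        "outstanding_ios": [],
    }
    for i, io in enumerate(trace):
        if i > 0:
            prev = trace[i - 1]
            # Assumed definition: 0 means this IO starts where the last one ended.
            shape["seek_distance"].append(io.lba - (prev.lba + prev.blocks))
            shape["interarrival_us"].append(io.issue_us - prev.issue_us)
        # Earlier IOs that are still in flight when this one is issued.
        in_flight = sum(1 for other in trace[:i] if other.complete_us > io.issue_us)
        shape["outstanding_ios"].append(in_flight)
    return shape


# Made-up trace: two sequential reads, a far-away write, then a nearby read.
trace = [
    IO(0, 100, 1000, 16, True),
    IO(10, 120, 1016, 16, True),
    IO(50, 300, 500000, 8, False),
    IO(200, 260, 500008, 8, True),
]
shape = workload_shape(trace)
for name, samples in shape.items():
    print(name, samples)
```

Each list of samples is what you would then feed into a histogram, one per dimension, rather than collapsing it into a single number.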

Histograms

To be able to ‘see’ workload shapes, we need a visualization method. In vscsiStats, that technique is the histogram plot. The height of each bar shows the number of data points that fall between the previous x-axis value and this one. The higher a bar, the more frequently that range of values is found in the data. Histograms are much more informative than single numbers like the mean, the median, or the standard deviation. For example, multimodal behaviors are instantly identifiable by plotting a histogram, but obfuscated by a mean. Why take one number if you can have a distribution? This is a really important point. Take the fake data set below. Note that the average latency here is 5.3 us even though not a single data point exists at that latency. Averages are often misleading! By plotting the histogram, we get to see the bi-modality, which is a critical piece of information for performance analysis. Moral of the story: always use histograms.

[Figure: histogram of the fake example data set]
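The same point can be reproduced in a few lines of Python. This is only a sketch with its own made-up latencies (not the data behind the plot above) and an arbitrary bin width of 2 us: the mean lands in a bin that contains no samples at all, while the histogram shows the two modes right away.

```python
from collections import Counter
from statistics import mean

# Made-up bimodal latencies in microseconds: a fast cluster and a slow cluster.
latencies_us = [1, 1, 2, 1, 2, 10, 9, 10, 10, 9]


def histogram(samples, bin_width):
    """Count samples per fixed-width bin, keyed by the bin's lower edge."""
    return Counter((s // bin_width) * bin_width for s in samples)


avg = mean(latencies_us)
hist = histogram(latencies_us, bin_width=2)

print(f"mean latency: {avg} us")
for lower in range(0, max(hist) + 2, 2):
    count = hist.get(lower, 0)
    # One '#' per sample makes empty bins in the middle easy to spot.
    print(f"[{lower:2d}, {lower + 2:2d}) {'#' * count}")
```

The bars cluster at both ends and the middle bins are empty, which is exactly the information a lone average throws away.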

 

[Figure: I/O length histogram with its irregular x-axis scale]

To make vscsiStats histograms practical, I decided to use x-axis bin sizes on a rather irregular scale. To this day, I’m glad that I did, because it provides higher resolution, albeit at the expense of occasional confusion. Luckily, the confusion is trivial to remove. Let’s take a look at an example. The I/O length histogram bin limits look like this: …, 2048, 4095, 4096, 8191, 8192, … which seems rather odd. Note that the highlighted x-axis buckets show some block sizes as individual data points while others cover larger ranges. Certain block sizes are really special, since the underlying storage subsystems may optimize for them, so we single those out from the start (otherwise that precise information would be lost). For example, it is important to know whether an I/O was exactly 16KB or some other size in the interval (8KB, 16KB).
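Here is a small Python sketch of how such irregular bins behave. It uses only the five limits quoted above and treats each one as an inclusive upper bound, which is my reading for this illustration; the bins elided by the "…" are deliberately not filled in, so everything above 8192 falls into a catch-all bucket.

```python
from bisect import bisect_left
from collections import Counter

# Only the bin limits quoted in the text (bytes); the real list is longer.
LIMITS = [2048, 4095, 4096, 8191, 8192]


def bin_label(length):
    """Return the label of the first bin whose upper limit is >= length."""
    i = bisect_left(LIMITS, length)
    if i == len(LIMITS):
        return f"> {LIMITS[-1]}"  # catch-all for the bins elided here
    return f"<= {LIMITS[i]}"


def io_length_histogram(lengths):
    """Count I/O lengths (in bytes) per irregular bin."""
    return Counter(bin_label(n) for n in lengths)


hist = io_length_histogram([4096, 4096, 4000, 8192, 6000, 512, 16384])
for label in [f"<= {limit}" for limit in LIMITS] + [f"> {LIMITS[-1]}"]:
    print(f"{label:>8}: {hist.get(label, 0)}")
```

Because 4095 sits directly before 4096, the "<= 4096" bin can only ever hold I/Os of exactly 4096 bytes, while "<= 4095" collects everything from 2049 to 4095. That is how the special sizes get a bucket of their own.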

Workload Shape of Microsoft Exchange 2007

Let’s apply what we’ve learnt so far to a real installation of Microsoft Exchange, a very common workload in today’s virtualized data centers. Below, each column is a particular dimension of the workload shape. From left to right, the first two columns show histograms of Seek Distance (a measure of sequentiality) and IO length. The third and fourth columns capture histograms over time: Outstanding IOs (a measure of the level of parallelism in the disk workload) and the resulting IO latency. The rows correspond to the two types of IOs: reads and writes. (Not plotted are the interarrival time and distance from last 16 histograms, which are more advanced analysis tools not covered in this article.) As you can see, vscsiStats is very powerful. Not only can you look at the read/write ratio in the aggregate, you can also precisely tell workload sequentiality, block sizes, OIOs and latency for reads versus writes.

[Figure: annotated workload shapes for Exchange 2007]

What are the workload shapes for your critical workloads? In this particular Microsoft Exchange 2007 installation, we see very interesting patterns:

  • Reads have a bimodal spatial locality pattern (i.e. there are both highly sequential parts and large random-seek IOs in this VM). The writes show reasonable locality. This is invaluable information for a storage admin tuning this workload for maximal performance.
  • Read traffic is steady over the collection period, but writes are very bursty. Again, useful information for tuning the storage array.
  • The read/write ratio is heavily biased towards reads, so this workload might be a great candidate for caching acceleration (a topic for another article).
  • The IO block size is largely 8K. You can use this information to tune cache block sizes.

BTW: all of this analysis was completed within 5 minutes. See below for how you can use this powerful utility in a super-easy-to-use manner.

How to use vscsiStats

Log in to ESXi via SSH and use the super-simple command-line interface; the help output is self-explanatory:

$ /usr/lib/vmware/bin/vscsiStats -h

Cormac has written a nice introductory article about getting started with vscsiStats, so I’ll just point there to avoid duplication. I might write a new section here when I have more time, to provide more help on each of the commands. In the meantime, tell Cormac I said hi 🙂
