In September I published an academic paper at the IEEE International Symposium on Workload Characterization (IISWC 2007) that I want to spend some time talking about. The paper is entitled “Easy and Efficient Disk I/O Workload Characterization in VMware ESX Server”. Here’s the abstract:
Collection of detailed characteristics of disk I/O for workloads is the first step in tuning disk subsystem performance. This paper presents an efficient implementation of disk I/O workload characterization using online histograms in a virtual machine hypervisor, VMware ESX Server. This technique allows transparent and online collection of essential workload characteristics for arbitrary, unmodified operating system instances running in virtual machines. For analysis that cannot be done efficiently online, we provide a virtual SCSI command tracing framework. Our online histograms encompass essential disk I/O performance metrics including I/O block size, latency, spatial locality, I/O interarrival period and active queue depth. We demonstrate our technique on workloads of Filebench, DBT-2 and large file copy running in virtual machines and provide an analysis of the differences between ZFS and UFS filesystems on Solaris. We show that our implementation introduces negligible overheads in CPU, memory and latency and yet is able to capture essential workload characteristics.
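
To give a flavor of what an "online histogram" means here, below is a minimal Python sketch. It is not the ESX Server code, and everything in it is an assumption made for illustration: the class names `OnlineHistogram` and `DiskIOStats`, the `record` parameters, the 512-byte block size, and the bin edges are all hypothetical. The idea it shows is that each I/O command updates a handful of counters as it passes by, so no per-command trace has to be stored. It covers only two of the metrics from the abstract (I/O size and spatial locality, measured here as the distance from the end of the previous I/O to the start of the current one).

```python
from bisect import bisect_left


class OnlineHistogram:
    """Fixed-bin histogram updated one sample at a time (no samples stored)."""

    def __init__(self, upper_edges):
        # upper_edges: sorted, inclusive upper bounds of each bin.
        # One extra overflow bin catches values above the last edge.
        self.upper_edges = list(upper_edges)
        self.counts = [0] * (len(self.upper_edges) + 1)

    def add(self, value):
        # bisect_left returns the index of the first edge >= value,
        # or len(upper_edges) for the overflow bin.
        self.counts[bisect_left(self.upper_edges, value)] += 1


class DiskIOStats:
    """Hypothetical per-virtual-disk collector fed one I/O command at a time."""

    def __init__(self):
        # Illustrative bin edges (assumptions, not taken from the paper).
        self.io_length = OnlineHistogram([512, 4096, 16384, 65536, 262144])
        self.seek_distance = OnlineHistogram([-1024, -16, -1, 0, 1, 16, 1024])
        self._last_end_block = None

    def record(self, first_block, num_blocks, block_size=512):
        # I/O size in bytes.
        self.io_length.add(num_blocks * block_size)
        if self._last_end_block is not None:
            # Distance in blocks from the end of the previous I/O to the
            # start of this one; a purely sequential stream yields 1.
            self.seek_distance.add(first_block - self._last_end_block)
        self._last_end_block = first_block + num_blocks - 1


stats = DiskIOStats()
stats.record(first_block=0, num_blocks=8)       # 4 KB I/O
stats.record(first_block=8, num_blocks=8)       # sequential 4 KB I/O
stats.record(first_block=1000, num_blocks=128)  # 64 KB I/O after a forward seek
print(stats.io_length.counts)
print(stats.seek_distance.counts)
```

Memory use is fixed by the number of bins and each update is a binary search plus one increment, which is the property that makes this style of collection cheap enough to leave running. Latency, interarrival time and queue depth could be tracked the same way with additional histograms, given timestamps and an outstanding-command counter.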