VMware vSphere 7.x Memory Reclamation-Part 5: Memory Compression Cache

In this post, we discuss the memory compression cache technique used by ESXi hosts. Previous posts covered TPS and ballooning. Compression is the next technique in order, and it kicks in when an ESXi host is under memory contention.

ESXi provides a memory compression cache to improve virtual machine performance when memory overcommitment is used. If a virtual machine’s memory usage reaches the level at which host-level swapping would be required, ESXi uses memory compression to reduce the number of memory pages it needs to swap out. Because decompression latency is much lower than the latency of swapping a page in from disk, compressing memory pages has significantly less impact on performance than swapping out those pages.

Memory Compression is enabled by default. When an ESXi host’s memory becomes overcommitted, ESXi compresses memory pages and stores them in memory. You can set the maximum size for the compression cache and disable memory compression using the Advanced Settings dialog box in the vSphere Client.

Let’s see how compression helps improve the performance of virtual machines running on an overcommitted ESXi host.

How Memory Compression Works:

As discussed earlier, memory is backed by two types of pages, both of which can end up being compressed:

Large Pages (2 MB)
Small Pages (4 KB)

ESXi does not compress 2 MB large pages directly. Instead, it first breaks a 2 MB large page down into 4 KB pages, and those pages are then compressed into 2 KB half-page slots. If a page’s compression ratio is larger than 75%, ESXi stores the compressed page in a 1 KB quarter-page slot instead.
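The sizing rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the thresholds are read here as "the compressed data fits in the slot", which is an assumption about how the 50% and 75% figures are applied.

```python
PAGE_SIZE = 4 * 1024                # 4 KB small page
LARGE_PAGE_SIZE = 2 * 1024 * 1024   # 2 MB large page


def small_pages_per_large_page():
    """Number of 4 KB pages a 2 MB large page is broken into."""
    return LARGE_PAGE_SIZE // PAGE_SIZE


def cache_slot_size(compressed_bytes):
    """Return the slot size (in bytes) for a compressed 4 KB page.

    Assumption: a page goes into a 1 KB quarter-page slot if it fits,
    otherwise into a 2 KB half-page slot if it fits. None means the page
    did not compress well enough to be cached.
    """
    if compressed_bytes <= PAGE_SIZE // 4:
        return PAGE_SIZE // 4   # 1 KB quarter-page slot
    if compressed_bytes <= PAGE_SIZE // 2:
        return PAGE_SIZE // 2   # 2 KB half-page slot
    return None


print(small_pages_per_large_page())  # 512
print(cache_slot_size(900))          # 1024
print(cache_slot_size(1500))         # 2048
print(cache_slot_size(3000))         # None
```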

A page is considered for compression only if it meets both of the following conditions:

The page has been marked for swapping out to disk, and
The page can be compressed by at least 50%.

Any page that does not meet these criteria is swapped out to disk. Let’s understand how compression works with an example.
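To make the two conditions concrete, here is a minimal Python sketch of the decision. It uses zlib purely as a stand-in compressor (the algorithm ESXi actually uses is not covered in this post), and the function name reclaim_action is hypothetical.

```python
import os
import zlib

PAGE_SIZE = 4096            # 4 KB page
HALF_PAGE = PAGE_SIZE // 2  # a page must shrink to this size or less


def reclaim_action(page, is_swap_candidate):
    """Return 'keep', 'compress', or 'swap' for one 4 KB page."""
    if not is_swap_candidate:
        # Only pages already marked for swap-out are considered.
        return "keep"
    compressed = zlib.compress(page)  # stand-in for the real compressor
    if len(compressed) <= HALF_PAGE:
        return "compress"             # at least 50% smaller
    return "swap"                     # poorly compressible: goes to disk


zero_page = bytes(PAGE_SIZE)         # highly compressible
random_page = os.urandom(PAGE_SIZE)  # effectively incompressible

print(reclaim_action(zero_page, True))
print(reclaim_action(random_page, True))
print(reclaim_action(zero_page, False))
```

A page full of zeros compresses to a few bytes and is kept in the cache, while a page of random data does not shrink at all and falls through to swapping.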

Let’s assume that ESXi needs to reclaim 8 KB of physical memory (two 4 KB pages) from virtual machines. With host swapping alone, the two swap candidate pages, A and B, are swapped directly to disk, as shown in part (a) of the image below.

Image: VMware

With compression, a swap candidate page is compressed and stored using 2 KB of space in a per-VM compression cache. Hence, each compressed page yields 2 KB memory space for ESXi to reclaim.

To reclaim 8 KB of physical memory, four swap candidate pages need to be compressed, as shown in part (b) of the image above. If a memory request comes in for a compressed page, the page is decompressed and pushed back to guest memory. The page is then removed from the compression cache.
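The arithmetic behind this example can be written out as a short Python sketch. The helper names are hypothetical; the only inputs are the 4 KB page size and the 2 KB (or 1 KB) slot size described above.

```python
import math

PAGE_KB = 4  # size of one small page in KB


def pages_to_swap(reclaim_kb):
    """Pages that must be swapped out: each one frees a full 4 KB."""
    return math.ceil(reclaim_kb / PAGE_KB)


def pages_to_compress(reclaim_kb, slot_kb=2):
    """Pages that must be compressed to free the same amount of memory.

    Each compressed page still occupies slot_kb in the cache, so it only
    frees PAGE_KB - slot_kb.
    """
    freed_per_page_kb = PAGE_KB - slot_kb
    return math.ceil(reclaim_kb / freed_per_page_kb)


print(pages_to_swap(8))                 # 2
print(pages_to_compress(8))             # 4
print(pages_to_compress(8, slot_kb=1))  # 3
```

Compression touches more pages than swapping to free the same amount of memory, but none of those pages has to be read back from disk later.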

What is Per-VM Compression Cache?

The memory for the compression cache is not allocated separately. The compression cache is located in the VM’s memory space. The compression cache size starts at 0 and, by default, can increase to a maximum of 10 percent of the VM’s memory size.

If the compression cache is full, one compressed page must be replaced to make room for a new compressed page. The page that has not been accessed for the longest time is decompressed and swapped out. ESXi never swaps out pages in their compressed form.

If pages belonging to the compression cache need to be swapped out under severe memory pressure, the compression cache size is reduced and the affected compressed pages are decompressed and swapped out.
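The replacement behaviour described above can be modelled with a small toy class. This is a simplified sketch, not ESXi's implementation: the class and method names are hypothetical, the cache is sized in pages rather than bytes, and compressed data is treated as an opaque value. Because an accessed page leaves the cache immediately, the page that has gone longest without access is simply the oldest entry.

```python
from collections import OrderedDict


class CompressionCache:
    """Toy per-VM compression cache with oldest-first replacement."""

    def __init__(self, max_pages):
        self.max_pages = max_pages
        self.pages = OrderedDict()  # page id -> compressed data, oldest first
        self.swapped_out = []       # page ids decompressed and swapped to disk

    def insert(self, page_id, compressed):
        """Add a newly compressed page, evicting old pages if full."""
        if self.max_pages <= 0:
            self.swapped_out.append(page_id)  # no cache space at all
            return
        while len(self.pages) >= self.max_pages:
            self._evict_oldest()
        self.pages[page_id] = compressed

    def access(self, page_id):
        """Guest touches a compressed page: it leaves the cache."""
        return self.pages.pop(page_id)

    def shrink(self, new_max_pages):
        """Reduce the cache under severe pressure, evicting the overflow."""
        self.max_pages = new_max_pages
        while len(self.pages) > self.max_pages:
            self._evict_oldest()

    def _evict_oldest(self):
        page_id, _ = self.pages.popitem(last=False)
        self.swapped_out.append(page_id)  # decompressed, then swapped out


cache = CompressionCache(max_pages=2)
for name in ("A", "B", "C"):
    cache.insert(name, b"compressed-" + name.encode())
print(list(cache.pages), cache.swapped_out)  # ['B', 'C'] ['A']
cache.access("B")
cache.shrink(0)
print(list(cache.pages), cache.swapped_out)  # [] ['A', 'C']
```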

The maximum compression cache size is important for maintaining good VM performance. Because the compression cache is accounted for in the VM’s guest memory usage, a very large compression cache may waste VM memory and create unnecessary host memory pressure. In vSphere 7.0, the default maximum compression cache size is conservatively set to 10% of the configured VM memory size. This value can be changed through Advanced Settings by editing Mem.MemZipMaxPct, as shown in the image below.
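As a quick sizing aid, the following Python sketch computes the upper bound of the compression cache for a given VM. The function name is hypothetical, and the 0 to 100 check is only a sanity check on the percentage, not a statement about the range the setting accepts.

```python
def max_compression_cache_mb(vm_memory_mb, mem_zip_max_pct=10):
    """Upper bound of the per-VM compression cache in MB.

    mem_zip_max_pct mirrors the Mem.MemZipMaxPct advanced setting
    (default 10, as described above).
    """
    if not 0 <= mem_zip_max_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return vm_memory_mb * mem_zip_max_pct / 100


print(max_compression_cache_mb(8192))      # 8 GB VM, default 10%
print(max_compression_cache_mb(8192, 20))  # same VM with the limit raised to 20%
```

For an 8 GB VM, the default limit works out to roughly 819 MB of guest memory that can be taken up by compressed pages.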

Conclusion:

Memory compression outperforms host swapping because accessing a compressed page requires only a decompression, which can be significantly faster than a swap-in that involves disk I/O. The best practice is therefore to leave memory compression enabled.