VMware vSphere 7.x Memory Reclamation-Part 1: Basics

Welcome, everyone, to this series on memory reclamation techniques in vSphere 7.x. This will be another long series of posts, following my previous series on uncovering virtual networking. In this series, we will try to understand memory virtualization basics, memory reclamation techniques in vSphere (specifically on ESXi hosts), ESXi host memory states, and the sliding scale method.

Understanding memory virtualization and memory reclamation is important for managing ESXi performance from a memory management standpoint.

Creating a number of virtual machines and allocating memory to them on an ESXi host is one aspect. Ensuring that this does not cause performance issues through improper memory management is another. So let’s get started and understand how memory virtualization works, along with the memory reclamation techniques.

Memory virtualization:

Here we are going to discuss and understand the memory layers. To begin with, let’s have a quick look at the memory layers in a physical system, so that it makes better sense when we talk about memory layers in virtualization.

Physical system Memory layers

If you look at the image above, there are two layers in a physical system. The first layer is the actual RAM that is physically installed in the system, and the second layer is virtual memory, which is managed by the operating system. For example, Windows uses pagefile.sys as virtual memory, whereas Linux uses a swap partition. We all know how RAM is used in any system. But what about virtual memory?

How is virtual memory used?

There are a couple of use cases for virtual memory.

  • As an extension of RAM onto the physical disk when physical RAM is exhausted.
  • When an application is launched, it may be launched in virtual memory in order to provide contiguous memory blocks, and the content is then transferred to RAM for processing.

So if you look at the above image, colour coded blocks represent the content of virtual memory getting transferred onto physical memory as required for further processing.

In virtualization:

Now that we have discussed the memory layers in physical systems, let’s have a look at the memory layers in a virtualized environment.

Memory layers in virtual

If you look at the above image, the ESXi host owns the physical RAM, also known as host addressing (HA). When a VM is created, you allocate an amount of virtual RAM to the VM, which actually comes from the underlying ESXi physical RAM as part of resource sharing. However, the guest operating system treats the virtual RAM allocated to the VM as physical RAM, since it is not aware that it is running in a virtualized environment. This layer is referred to as guest physical memory or guest addressing (GA). The guest operating system then manages its own virtual memory on disk, known as virtual addressing (VA).

NOTE: Though I refer to virtual RAM or vRAM (as in the image above), memory is not virtualized, for performance reasons. The term is used just to represent the allocation of memory. This is similar to the CPU concept: though we refer to a vCPU, it is just a representation of a CPU. It is not the vCPU that processes the instructions, it is the physical CPU.

As we discussed for physical systems, virtual memory content is loaded into physical RAM. Similarly, in virtualization the content of virtual addressing (VA) is loaded into guest addressing (GA), but since this layer is also virtual, the content is ultimately loaded into the actual ESXi RAM (HA). In short, the content of VA is loaded into GA, which is ultimately backed by HA. The image below depicts this process.
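The same VA to GA to HA chain can also be sketched in a few lines of Python. This is only a simplified model of the three layers: the two dictionaries, the page numbers in them, and the translate() helper are hypothetical names made up for illustration, not how ESXi actually stores its mappings.

```python
# Hypothetical page mappings; every page number here is made up for illustration.
guest_page_table = {0: 7, 1: 3}   # VA page -> GA page (maintained by the guest OS)
host_page_map = {7: 42, 3: 15}    # GA page -> HA page (maintained by the ESXi host)


def translate(va_page):
    """Resolve a guest virtual page to a host page in two steps: VA -> GA -> HA."""
    ga_page = guest_page_table[va_page]   # step 1: the guest OS view
    ha_page = host_page_map[ga_page]      # step 2: the hypervisor view
    return ga_page, ha_page


print(translate(0))  # (7, 42)
```

The guest OS only ever sees the first dictionary, and the host only ever sees the second one. That separation is what the rest of this post builds on.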

With that discussed, I hope the basic concepts of how memory layers are used in a virtualized environment are clear. Now it is time for us to actually start talking about memory reclamation. Below are some of the basic questions you may have in mind.

  • What is memory reclamation?
  • Why do we need memory reclamation?
  • When is memory reclamation performed?
  • What are the techniques of memory reclamation?

To answer all of these questions, let’s start exploring memory reclamation in detail.

What is memory reclamation?

Well, as the name suggests, memory reclamation on ESXi is the process of claiming back memory consumed by virtual machines. Memory reclamation also helps, to the extent possible, to avoid swapping to storage, which results in degraded performance. I will talk about this at the end of this post, so until then, hold on.

If ESXi is supposed to allocate the memory to the VM on demand then why does it need to claim it back? In order to understand this, we need to explore some more concepts here.

Why do we need memory reclamation?

There are a couple of scenarios where memory reclamation is required. We will discuss these scenarios shortly.

As we all know, when a VM is created, we need to allocate an amount of memory to it. Let’s assume that we have a VM called VM-1, allocated 1 GB of memory at the time of creation.

Remember that a VM has no pre-allocated host physical memory. Memory is allocated to the VM on demand. So when VM-1 is created, it does not have pre-allocated memory on the ESXi host, even though we allocated 1 GB at the time of creation. Instead, when VM-1 is powered on, the hypervisor intercepts the VM’s memory access requests and allocates host memory to the VM on its first access to that memory, as on-demand provisioning.

As demand grows, more memory is allocated to the VM until the VM reaches its limit (default or user defined). By default, the allocated memory also acts as the limit. So in our case 1 GB will be the default limit, since we have not set a user-defined limit.
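To make the on-demand behaviour concrete, here is a minimal Python sketch of VM-1. The VirtualMachine class, its touch() method, and the MB figures passed to it are assumptions made up for this illustration. This is a toy model of the idea, not ESXi's actual allocator.

```python
class VirtualMachine:
    """Toy model: host memory backs a VM only when the guest first touches it."""

    def __init__(self, name, configured_mb, limit_mb=None):
        self.name = name
        self.configured_mb = configured_mb
        # With no user-defined limit, the configured size acts as the limit.
        self.limit_mb = configured_mb if limit_mb is None else limit_mb
        self.backed_mb = 0  # host memory actually backing the VM so far

    def touch(self, mb):
        """Guest touches `mb` of new memory; the host backs it, up to the limit."""
        granted = min(mb, self.limit_mb - self.backed_mb)
        self.backed_mb += granted
        return granted


vm1 = VirtualMachine("VM-1", configured_mb=1024)
print(vm1.backed_mb)     # 0    -> nothing is pre-allocated at creation
first = vm1.touch(256)   # guest touches 256 MB for the first time
second = vm1.touch(1024) # guest asks for more than what is left under the limit
print(first, second, vm1.backed_mb)  # 256 768 1024
```

The second call is granted only 768 MB, because the 1 GB configured size is also the default limit.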

Here I am talking about just VM-1, but we may have multiple VMs running on an ESXi host. VMs will consume host memory as demand grows across multiple virtual machines. This sounds cool and practical as well, since ESXi memory will not be wasted due to over-allocation.

However, there are other challenges due to which memory reclamation is required.

Scenario 1:

This scenario is specific to one of the memory reclamation techniques, called ballooning, which I will discuss in another post. I am covering it here because it gives you a better picture of how memory is used, and the related challenges, in a virtualized environment.

Let’s understand how operating systems manage memory allocation. The image below gives us an idea of how memory pages are handled by an operating system.

OS Memory handling

Total Memory is the amount of memory that is presented (allocated) to the OS.

  • The total amount of memory can be divided into two parts:
    • Free memory: Memory that is not assigned to the guest OS or to applications.
    • Allocated memory: Memory that is assigned to the guest OS or to applications.
  • Allocated memory can be further subdivided into two types:
    • Active memory: Memory that was recently used by applications.
    • Idle memory: Memory that was not recently used by applications.

This works really well for a physical system. However, in a virtualized environment this creates a challenge. How? Let’s understand with an example.

For example, say my current memory consumption is 1 GB. I launch the MS Outlook application in the morning after powering on my system, and it takes some time to load all the pages of that application. Once it is launched, I start using it to check my e-mails. But since Outlook also has a memory requirement, my memory consumption shoots up to 2 GB.

Now let’s say I close Outlook by mistake. Does that bring my memory consumption back to 1 GB? And if I re-open Outlook, will it take the same amount of time as the first launch? Not really, it will open much faster than the first time.

Did something change in the back end?

Well, when I launched MS Outlook for the first time, it loaded all the required pages into the active memory pages, aka MRU (Most Recently Used). But when I closed the application, the OS did not delete its pages from memory immediately. Instead, the operating system moved those pages to idle memory, aka LRU (Least Recently Used). As a result, the memory pages are not released, so the memory consumed after closing the app may still be the same as before closing.

This behaviour of the OS is understandable, considering that the application may need those pages if a request comes in again. In my example, I started Outlook again and there were no other applications demanding memory pages.
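The Outlook example can be modelled with a short Python sketch. The GuestOS class and its methods are hypothetical names for this illustration, and real operating systems track individual pages rather than per-application totals. So treat it as a rough picture of the active and idle lists only.

```python
class GuestOS:
    """Toy model of a guest OS that keeps a closed app's pages as idle memory."""

    def __init__(self):
        self.active = {}  # app name -> MB in active (MRU) pages
        self.idle = {}    # app name -> MB kept in idle (LRU) pages after close

    def launch(self, app, mb):
        """Return True when the launch is served from idle pages (faster relaunch)."""
        if app in self.idle:
            self.active[app] = self.idle.pop(app)  # pages are already in memory
            return True
        self.active[app] = mb  # first launch: pages have to be loaded
        return False

    def close(self, app):
        # Pages are not released, they just move from active to idle.
        self.idle[app] = self.active.pop(app)

    @property
    def consumed_mb(self):
        return sum(self.active.values()) + sum(self.idle.values())


guest = GuestOS()
guest.launch("baseline", 1024)        # the 1 GB already in use
cold = guest.launch("outlook", 1024)  # first launch, consumption goes to 2 GB
guest.close("outlook")
after_close = guest.consumed_mb
warm = guest.launch("outlook", 1024)  # relaunch is served from idle pages
print(cold, after_close, warm)        # False 2048 True
```

Consumption stays at 2048 MB after Outlook is closed, which is exactly the behaviour that becomes a problem once this guest runs as a VM.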

Now this is a really good approach to managing memory pages and ensuring performance by keeping pages in LRU. But this approach is good for physical systems. The challenges that we face in a virtual environment due to this approach are:

  • The ESXi host has no visibility inside the VM. That means no visibility of the free list, LRU, and MRU memory pages that are managed by the guest operating system of a virtual machine.
  • So if multiple VMs demand memory and later keep memory pages in LRU even after the workload is no longer running, this results in unnecessary consumption of ESXi host memory, which can cause memory contention when multiple VMs demand memory resources.
  • On the other hand, the guest operating system of a virtual machine is not aware that the ESXi server is under memory contention, as it has no visibility of ESXi memory consumption and cannot detect the host’s memory shortage.

That explains one of the reasons why we need memory reclamation techniques such as ballooning. There is also another scenario where the ESXi host needs memory reclamation techniques.

Scenario 2:

This scenario is the primary reason why we need memory reclamation techniques. ESXi supports memory overcommitment in order to provide better memory utilization and a higher consolidation ratio. Let’s understand it with an example.

As per the example in the above image, we have an ESXi host with 10 GB of total memory and 4 VMs, each allocated 3 GB. So the total allocated memory is 12 GB, which is beyond the total memory available. This is a typical scenario of memory overcommitment.

However, as we discussed, memory is allocated on demand, so the full demand of 12 GB may not be a frequent sight. For this simple reason, memory overcommitment is supported in virtualization.

By the way, if all VMs demand their full allocation, the total demand will be 12 GB. How will that situation be handled if no memory reclamation is available?

In such an event, the ESXi host will need to swap to storage. The ESXi host has only up to 10 GB of memory that can be used (hypothetically, as ESXi also needs memory for itself). The rest of the demand needs to be satisfied by means of swapping.
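The arithmetic of this example is simple enough to write down as a small Python sketch. The variable names are made up for illustration, and, as in the text, the calculation ignores the memory that ESXi needs for itself.

```python
host_memory_gb = 10
vm_allocations_gb = [3, 3, 3, 3]  # 4 VMs, each allocated 3 GB

total_allocated_gb = sum(vm_allocations_gb)

# Worst case: every VM demands its full allocation at the same time and
# no reclamation is available, so the shortfall has to be swapped to storage.
worst_case_swap_gb = max(0, total_allocated_gb - host_memory_gb)

# Overcommitment as a percentage of host memory.
overcommit_pct = 100 * worst_case_swap_gb / host_memory_gb

print(total_allocated_gb)  # 12
print(worst_case_swap_gb)  # 2
print(overcommit_pct)      # 20.0
```

So in this example the host is overcommitted by 2 GB, or 20% of its memory, and that 2 GB is what would have to be served from swap in the worst case.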

Here is an interesting point. The virtual addressing (VA) that we have been talking about until now is not the swap space for the ESXi host. It is inside the VM and for the guest OS only. So where exactly will the swapping be performed?

As we discussed earlier, virtual memory is used as an extension of RAM onto the disk, to swap memory pages in and out. Depending on the OS, you may have a shared location for swapping. For example, Windows uses pagefile.sys as a shared location, which is fine from the guest OS standpoint as it is shared among the applications of a single system. Read more about it here at Microsoft Docs. This concept also applies in a virtualized environment, but only within a single VM.

How about multiple VMs on ESXi host?

Due to security concerns and isolation requirements between VMs, the ESXi host does not use a shared location for swapping. Each VM has its own dedicated swapfile as its swap location, as shown in the image below. Generally, swapfiles are stored in the VM folder, but if required they can be moved to another location, such as an SSD datastore, for performance reasons.

Anyway, the point here is that swapping to shared storage has consequences: as it goes over the network, performance is impacted considerably. Memory reclamation techniques help to avoid storage-level swapping.

Note: As a practice, do not overcommit unless required. If you do plan to use overcommitment, do not overcommit beyond 20%, as it may result in unexpected performance issues.

Note: Also ensure that the required configuration for memory reclamation techniques (such as TPS) is done so that they can be used effectively.

Wrapping Up:

In this post, we discussed the points listed below.

  • Memory virtualization basics (HA, GA, and VA)
  • What is memory reclamation?
  • Why do we need memory reclamation?

Just to give you a preview of what we are going to discuss in upcoming posts in this series, below are the memory reclamation techniques.

  • Transparent Page Sharing (TPS)
  • Ballooning
  • Memory Compression
  • Host Swap Cache
  • Host level Swapping

I hope it was informative. See you in the next one. In the next post, we will try to understand ESXi memory states and the sliding scale method before we discuss each of the memory reclamation techniques. It is important to understand them, as not all the memory reclamation techniques run simultaneously. Memory reclamation techniques are initialised based on ESXi host memory states, and the sliding scale method is used to determine these memory states based on the Mem.MemMinFreePct value on the ESXi host.

Here is the next part of this series, Mem.MemMinFreePct.
