Large Pages – a problem of perception and measurement

This post is in response to Gabe’s recent post:

Large Pages, Transparent Page Sharing and how they influence the consolidation ratio

and then Frank Denneman’s reply here: Re: Impact of Large Pages on Consolidation ratios

Finished reading? Ok, let’s continue…

Let me try to summarize very briefly what happens.  When Large Pages are not in use, Transparent Page Sharing (TPS) runs across a host’s VMs periodically (every 60 minutes by default) and reclaims identical 4KB pages by collapsing them down to a single copy – think of it like de-duping your memory.  When using Large Pages (2MB), TPS does not try to reclaim those identical pages because of the “cost” of comparing these much larger pages.  That is, until there is memory contention, at which point the Large Pages are broken down into 4KB blocks and the identical pages are shared.
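To picture what TPS is doing, here is a toy Python sketch of the idea: content-based de-duplication of 4KB pages, where identical pages end up backed by a single physical copy. It is only an illustration, not how ESXi actually implements it.

```python
# Toy illustration of TPS-style page sharing: "pages" with identical contents
# are found by hashing and backed by a single physical copy.
# Purely conceptual; ESXi's real implementation (hash hints, full comparison,
# copy-on-write, periodic scanning) is far more involved.
import hashlib

PAGE_SIZE = 4096  # 4KB small pages

def share_pages(guest_pages):
    """Map every guest page to one backing copy per unique page content."""
    physical = {}      # content hash -> the single backing copy
    page_table = []    # guest page index -> hash of the copy backing it
    for page in guest_pages:
        digest = hashlib.sha1(page).hexdigest()
        physical.setdefault(digest, page)   # keep only one copy per content
        page_table.append(digest)
    return physical, page_table

if __name__ == "__main__":
    # A few "VMs" whose memory is mostly zero pages plus a little unique data.
    zero_page = bytes(PAGE_SIZE)
    unique_page = b"unique" + bytes(PAGE_SIZE - 6)
    guest_pages = [zero_page] * 300 + [unique_page] * 3
    physical, _ = share_pages(guest_pages)
    print(f"{len(guest_pages)} guest pages backed by {len(physical)} physical copies")
```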

Large Pages have been proven to offer performance improvements.  The strategy that VMware are following is technically advantageous – why incur the expense of figuring out which Large Pages are identical if the host doesn’t need to reclaim memory?  If the host can back all VM memory requests with physical memory, then it doesn’t need to worry about reclamation yet.

However, the problem I see is one of perception.  If you run lots of VMs with very similar memory contents on a host, you would expect to see the advantage of TPS kicking in and suitable memory savings.  However, if the Large Pages don’t get shared until the host thinks it is under pressure, you won’t see those savings until there appears to be a problem.  The host will wait until it hits 94% memory usage (6% free) before it deems itself under memory contention and starts to break those Large Pages into smaller 4KB ones.  So in an environment of very similar VMs, you are consistently going to run your host at around 94% memory used.  This isn’t a technical issue.  All those identical memory pages can still be reclaimed, just as before, and in the meantime you get the performance benefit of Large Pages.  This is a perception issue.
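To put some rough numbers on the perception problem, here is a quick back-of-the-envelope calculation. The host size and the shareable fraction are completely made up for illustration; only the 94% threshold comes from the behaviour described above.

```python
# Hypothetical numbers only: a 128GB host and a guess at how much of its
# "used" memory is actually identical across VMs and shareable by TPS.
host_ram_gb = 128
contention_threshold = 0.94    # large pages start being broken down here (6% free)
shareable_fraction = 0.30      # made-up guess for a host full of similar VMs

used_at_threshold = host_ram_gb * contention_threshold   # what the host reports as used
free_at_threshold = host_ram_gb - used_at_threshold      # what admins see as "left"
reclaimable = used_at_threshold * shareable_fraction     # what TPS could still hand back

print(f"Host looks full at {used_at_threshold:.1f} GB used "
      f"({free_at_threshold:.1f} GB free),")
print(f"yet roughly {reclaimable:.1f} GB of that could still be reclaimed by TPS.")
```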

Most vSphere administrators probably don’t realize this, and their managers almost certainly don’t.  All they see is their hosts running out of memory – they have less than 10% memory free! Time for some more hosts.  And even those who understand that at this level there is probably still plenty of reclaimable memory available can’t easily tell how much saving to expect or how far to push it.  Do we hide all memory usage in the vSphere client until it hits only 4% or 2% free?  Obviously not.  I think this is a measurement issue.

3 thoughts on “Large Pages – a problem of perception and measurement”

  1. Thanks for the summary and making real world sense of Large Page files and TPS.

    It’s very easy to panic at 90% memory usage, but if you use COWH in esxtop you can see how much memory will be reclaimed when TPS kicks in.

    Keep up the good work

    Navi

  2. Hi Forbes Guthrie. I understand your perception approach, but there is something that bothers me. Sometimes the host reaches 94% memory usage and crosses that boundary too fast for ESXi to do any TPS, so ballooning and swapping start. When TPS finally frees some memory, the host doesn’t empty the swap or give the ballooned memory back to the VMs. ESXi does it very slowly, or only on demand – I don’t know for sure. The problem is that when the demand arrives, the swap-in and balloon deflation are not fast enough and the VM suffers. We have seen this happening.
    A similar behavior shows up when the ESXi hosts are rebooted. The VMs (especially Windows) scan their memory on boot, which the host has to back with physical memory. If you have overcommitment, ballooning and swapping will occur. In that case, maybe even TPS is worthless.
    I’m looking into how to disable Large Pages on ESXi to see the power of TPS in our environment. I have disabled the “LPage.LPageAlwaysTryForNPT” and “Mem.AllocGuestLargePage” parameters, but I still see a high COWH value. I suppose COWH should be near zero with Large Pages disabled.
    I’d be happy to hear your opinion about this swapping and ballooning behavior.

    Regards,

    Eduardo Aguiar

    1. Hi Eduardo,
      You’re right that ESXi can be too slow to recover memory with TPS once the large pages are broken down. Here’s a VCDX who recommends keeping large pages disabled all the time because of this:
      http://bsmith9999.blogspot.ca/2013/05/getting-your-memory-sabings-back-on.html
      Over-commitment is always a balance, and if you’re too aggressive then performance will suffer. Unfortunately it’s not easy to predict how much TPS will recover. My understanding is that the COWH value shows how much TPS could share, not what it is actually sharing, so even once you’ve disabled large pages your COWH will stay high (if not higher). What you want to look at is the SHRD value, which shows guest shared pages (ZERO is already included within SHRD), and the SHRDSVD value, which shows the machine memory saved through page sharing.
      Hope that helps.
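      As a footnote to the above, here’s a rough pyVmomi sketch that checks those two advanced settings across every host, so you can confirm large pages really are disabled before judging the COWH/SHRD numbers. The vCenter address and credentials are placeholders and the function name is my own; treat it as an illustration rather than a supported procedure.

```python
# Rough sketch using pyVmomi: report the two large-page advanced settings
# (as mentioned in Eduardo's comment) for every ESXi host in the inventory.
# Placeholders: vcenter.example.com and the login details below.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

SETTINGS = ["Mem.AllocGuestLargePage", "LPage.LPageAlwaysTryForNPT"]

def report_large_page_settings(service_instance):
    """Print the current value of each large-page setting on each host."""
    content = service_instance.RetrieveContent()
    host_view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in host_view.view:
        opt_mgr = host.configManager.advancedOption
        for name in SETTINGS:
            value = opt_mgr.QueryOptions(name)[0].value
            print(f"{host.name}: {name} = {value}")

if __name__ == "__main__":
    ctx = ssl._create_unverified_context()   # lab/test use only
    si = SmartConnect(host="vcenter.example.com",
                      user="administrator@vsphere.local",
                      pwd="********",
                      sslContext=ctx)
    try:
        report_large_page_settings(si)
    finally:
        Disconnect(si)
```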
