Load average can be a tricky measure of a server’s performance. This article aims to explain what the various numbers shown by “top” and other Linux programs mean, with particular attention to the virtualization statistics that are easy to overlook on a standard Linux top listing.

top command breakdown

top

top - 17:15:19 up 32 days, 18:24, 6 users,  load average: 0.00, 0.01, 0.05
Tasks: 119 total,   1 running, 118 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32944828k total,  2474848k used, 30469980k free,      652k buffers
Swap:        0k total,        0k used,        0k free,  1843132k cached

top - 17:15:19 up 32 days, 18:24, 6 users

This shows you the command name and the current system time, followed by the uptime - in this case 32 days, 18 hours and 24 minutes - and finally the number of users logged in to the system; in this example, 6 users are logged in, whether via SSH, locally on the box, idling in a screen session, and so on.
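
If all you need is this header line, the same information is available without opening top at all; a quick illustration (output will vary from box to box):

uptime
 17:15:19 up 32 days, 18:24,  6 users,  load average: 0.00, 0.01, 0.05

who | wc -l
6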

load average: 0.00, 0.01, 0.05

This part shows you the load average, which can be a confusing topic, especially on virtual machines/the cloud.

The first number is the one-minute (or “current”) load average, the second number is the five-minute average, and the last number is the fifteen-minute load average.

Load average is a measure of the average number of processes that are either running, waiting for their turn to do something on the CPU, or (on Linux) stuck in uninterruptible sleep waiting on I/O. Like a supermarket queue/line, you have to wait your turn before you can have the cashier’s full attention. Load average goes up because of the rest of the statistics and meters below this line, so going strictly off of load average may not tell the full story.
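
A rough way to put those numbers in context is to compare them against the CPU count (a rule of thumb rather than an exact science; the output below is sample output):

nproc
8

cat /proc/loadavg
0.75 0.80 0.90 10/119 23145
# the first three fields are the 1-, 5- and 15-minute load averages, followed by
# running/total task counts and the most recently created PID; a load that sits at or
# above the CPU count for long stretches means work is queuing up behind every core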

Here’s an example taken from a distcc node:

This box, as well as being a staging environment for scripts and hosting the cloud command line tools, also provides a distributed C compiler service to various machines on our network, since it has 8 CPUs, 32 GB of RAM and a ton of ramdisk. Under normal load the load averages stay relatively low, and while running Java scripts the load can go to 2 or higher. However, while running the compiler service under full load (10 running processes at 95% or higher CPU usage) the load average sits at 0.75… How is this? Read on to discover the answer to this and other burning questions.

Tasks: 119 total, 1 running, 118 sleeping, 0 stopped, 0 zombie

Tasks: how many programs are listed when you type “ps aux” or your favorite variation.

The total number is useful to determine if you have a runaway apache server or PostgreSQL instance, but usually stays pretty steady.
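
A quick way to check whether one service is responsible for a climbing task count (command names vary by distribution - apache2 may be httpd on yours):

ps aux | wc -l
ps -eo comm= | sort | uniq -c | sort -rn | head
# the first command roughly matches the “total” figure in top; the second counts
# processes per command name, so a runaway apache2 or postgres instance shows up
# immediately at the top of the list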

The number of running processes shows you what’s currently using your CPU. Non-multithreaded applications can generally only use 1 CPU at a time, so seeing 1 process using 25% of your CPU with a load average of ~1 on a quad core server is commonplace.

The number of sleeping processes tells you what is loaded but not currently doing anything; usually the majority are background tasks, system software, printer drivers, and the like.

The number of stopped processes should generally be 0 unless you have sent a SIGSTOP (kill -STOP) to a process for troubleshooting. This number being anything other than 0 may be cause for concern on a production server.
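
For reference, stopping and resuming a process by hand looks like this (the PID 1234 is made up - substitute a real one):

kill -STOP 1234      # freeze the process; it now counts as “stopped” in top
ps -o stat= -p 1234  # a “T” in the STAT column confirms it is stopped
kill -CONT 1234      # resume it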

Zombie processes. My favorite of the processes. A zombie appears when a child process exits but its parent never reaps it (never calls wait()), leaving a dangling entry in the process table. Apache can probably make these with a vengeance if something terrible happens. Generally this should also be 0.
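
To see which zombies exist and which parent is failing to reap them (a one-liner that should work with any standard procps ps):

ps -eo stat,pid,ppid,comm | awk '$1 ~ /^Z/'
# the PPID column points at the parent that never called wait(); fixing or
# restarting that parent is what actually clears the zombies, since a zombie
# itself cannot be killed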

CPU line

I am going to break this into two parts, because these are the actual important stats for our usage.

Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 99.9%id,

The first four statistics listed here are on every Linux machine, and most people are used to them.

  • %us shows userland CPU utilization (by processes such as apache, MySQL, etc.) as a percentage of the machine’s total CPU capacity, to a maximum of 100%. So on a quad core with 1 process using 100% of 1 CPU, this will show 25%us; 12.5%us on an 8-core box means 1 core is pegged.
  • %sy means system (kernel) CPU usage. This generally stays low; consistently higher numbers here may indicate a problem with kernel configuration, a driver issue, or any number of other things.
  • %ni means the percentage of CPU used by userland processes whose priorities have been changed with the nice or renice commands - basically, they’ve been moved higher or lower than the default the scheduler assigns. When you assign a niceness to a process, a positive number means lower priority (1 being 1 step below normal) and a negative number means higher priority; 0 is the default, which means the scheduler gets to decide. (You can adjust which scheduler your system uses, but that is an advanced topic for a later article.) In addition, any percentage listed here does not overlap with %us, which only counts userland processes that haven’t had their priorities adjusted. See the short example just after this list.
  • %id is what is left after subtracting all of the other categories on this line from 100%, and measures “idle” processing power.
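
For the %ni bullet above, here is what adjusting priorities looks like in practice (the tar command, paths and PID are made-up examples):

nice -n 10 tar czf /tmp/backup.tar.gz /var/www   # start a job at lower priority (niceness 10)
renice -n -5 -p 1234                             # raise the priority of a running process; negative values require root
ps -o pid,ni,comm -p 1234                        # the NI column shows the current niceness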

0.0%wa, 0.0%hi, 0.0%si, 0.0%st

The second set of statistics covers I/O, interrupts and virtualization, and here’s where we can track down the exact issues that may be contributing to a high load average.

  • %wa - this is the iowait percentage. When a process or program requests some data, it first checks the processor caches (there are two or three levels of cache there), then goes out and checks memory, and finally hits disk. When it hits disk, it generally has to wait for the I/O to finish reading the information into RAM before it can operate on it again. The slower the disk, the higher the iowait percentage will be for each process. This also occurs on disk writes if the system buffers are full and need to be flushed by the kernel - usually seen on heavily loaded database servers. Iowait consistently above 100 / (number of CPUs × number of processes) percent means we may have a storage issue that needs to be looked into. If you see a high load average, check THIS number FIRST. If it is high, then you know the processes are bottlenecking on your disk and not for some other reason (the mpstat example after this list breaks this out per CPU).
  • %hi means “hardware interrupt” - on a circuit board, signals move through the chips in a predictable manner. For instance, when a network card receives a packet, before transferring the information in the packet to the processor via the kernel, it raises an interrupt on an interrupt line of the motherboard. The processor reports to the kernel that the network card has information for it, and the kernel decides what to do. A high hardware-interrupt figure is fairly rare on a virtual machine, but as hypervisors expose more “metal” to the VMs this may change. Extremely high network throughput, heavy USB usage or GPU computation may push this statistic well above a few percent.
  • %si is the “soft interrupt” percentage - rather than a piece of hardware or a device driver requesting an interrupt on the motherboard’s interrupt line, the Linux kernel (since version 2.4) lets software request an interrupt and have the kernel service it via its interrupt handler. This means that an application can request priority status, the kernel acknowledges that it received the request, and the software waits patiently until the interrupt can be serviced. Running tcpdump on a busy 1-gigabit link can push this to about 10%: as tcpdump’s allocated memory fills, it requests an interrupt to move the data off its buffer and onto the disk, or the screen, or wherever it is headed.
  • %st - the most important number in the entire list, if you ask me: steal time. In a virtualized environment such as ours, many logical servers run under one actual hypervisor. We assign 4-8 “virtual” CPUs to each virtual machine, even though the hypervisor itself may not have (number of VMs × virtual CPUs per VM) physical cores. The reason is that we don’t bottleneck on the CPU with our usage of VMs, so allowing a VM or two to utilize 8 CPUs on occasion will not adversely affect the pool in general. However, if the VMs’ virtual CPUs collectively demand more than the number of physical (or logical, in the case of hyperthreaded Xeons) CPUs, you will see steal time go up.
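
If the sysstat package is installed, mpstat breaks these same counters out per CPU, which makes a climbing %wa or %st much easier to spot:

mpstat -P ALL 1
# prints one row per CPU every second; the %iowait column is top’s %wa and the
# %steal column is top’s %st, so you can see whether one CPU or all of them are
# stuck waiting on disk or on the hypervisor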

Steal time (%st) is a measure of how busy the hypervisor is, and VMs on a pool that consistently show high steal (over 15%) may mean that we have to move some VMs to another part of the pool.

Iowait (%wa) is a measure of disk performance. On NetApp-backed storage, we may not be able to solve this without moving the volume to a less utilized one, or to a different NetApp. For local disk (either SSD or SAS) it may mean the local hypervisor has too many disk-intensive VMs on it, and we may need to move some VMs to another part of the pool.

Take Away

  • Load average on its own doesn’t really tell you anything; the numbers below it are what explain it.
  • %userland (%us) is important to load average because it means real computation is being performed. A single-threaded workload - one busy MySQL query, for instance - can only run on one core, so it tops out at (1 / number of vCPUs assigned to the VM) of the machine’s total CPU under full load. Workloads made up of many processes or threads, such as PostgreSQL serving several active connections, can use every CPU they are given - be aware of this when creating VMs on a hypervisor to prevent:
  • %st - steal time is a measure of how busy the hypervisor is. Stacking 4 PostgreSQL and 6 Tomcat servers on a single hypervisor may make sense from a business perspective, but those VMs are going to be competing for CPU time constantly.
  • %wa - iowait is a measure of how much time your processes spend waiting on the comparatively slow disks of whatever storage solution you are using. Disks, even SSDs, are incredibly slow next to RAM. Increasing RAM to allow larger kernel buffers and page cache may mitigate this a bit; RAM is cheaper than disk when you consider how utterly blazing fast it is in comparison (see the quick check after this list).
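
A quick way to see how much RAM the kernel is already using to absorb disk I/O (the column layout shown is from a recent procps free; older versions split buffers and cached into separate columns):

free -m
# the “buff/cache” column is memory the kernel is using for disk buffers and the
# page cache; a large “available” figure means there is headroom for the kernel to
# cache more and push %wa down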

Further reading

iostat

If you have a lot of iowait or steal you can track down precisely which disk is causing it with the iostat command. Run it thusly:

Console - user@hostname ~ $

iostat -x 1 [-k|-m]

Please see man iostat for other uses. The breakdown, printed once per interval, shows which disks are being read from and written to, while the avg-cpu header above the device list reports the same %iowait and %steal figures that top shows as %wa and %st.
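
A few columns worth watching in the extended (-x) report (names come from sysstat’s iostat; the exact set varies a little between versions):

iostat -x 1
# avg-cpu line: %iowait and %steal are system-wide, the same counters as top’s %wa and %st
# per device:   await (or r_await/w_await on newer versions) is the average time, in ms,
#               a request spends queued plus being served; %util is how busy the device is
#               - a disk pinned near 100% with rising await is the bottleneck behind a high %wa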

htop

My favorite overview of CPU and process usage on a Linux system. It doesn’t show you the virtualization statistics, but it does give you a process TREE view, a per-CPU usage breakdown, and swap and memory statistics for tracking down nasty memory leaks, all with pretty colors. It is my opinion that this package should be mandatory on all VMs.
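
It is usually one package install away (the commands below assume the distribution’s standard repositories; on older RHEL/CentOS releases htop lives in EPEL):

apt-get install htop   # Debian / Ubuntu
yum install htop       # RHEL / CentOS

htop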