Tuesday, 13 December 2011

Linux CPU/Memory Utilization - Tips/Tricks/Commands

As is commonly known, the CPU is shared among multiple processes by a scheduler that gives each one a time slice, roughly in round-robin fashion. When a process holds on to the CPU because of heavy computation, either in quality (depth of computation) or quantity, the system's load average rises and other processes have to wait in the queue for the CPU. The result can be CPU hogging and a server that looks hung. I am going to share my experience of how to fix such issues on Linux servers, plus a few more tricks around web systems.

A few commands to figure out the CPU usage
$ top
This shows the overall usage. Take note of the load average and read it against the number of CPU cores: if it stays above the core count (for example, above 2 to 3 on a dual-core machine), processes are queuing for the CPU.
Next, look at the processes at the top of the list, especially the ones whose %CPU value is high. The higher the value, the more CPU that program is consuming.
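
To put a number on the "load average versus cores" check, here is a minimal sketch. It assumes a Linux box with /proc/loadavg and the nproc command; load_per_core is just a helper name I made up. A result that stays above 1.00 suggests processes are waiting for the CPU.

```shell
#!/bin/sh
# load_per_core LOAD CORES: divide a load average by the core count.
# (Hypothetical helper name, not a standard tool.)
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# The first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _rest < /proc/loadavg
    cores=$(nproc)
    echo "load=$load1 cores=$cores per_core=$(load_per_core "$load1" "$cores")"
fi
```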

$ mpstat
Use this command to see the CPU utilization; run it as mpstat -P ALL to see each processor individually

$ atop
Use this command to see the CPU utilization of every processor (or core), along with per-process resource usage

$ apt-get install sysstat
Install this package (Debian/Ubuntu) to track the system usage at regular intervals. It provides sar, mpstat and iostat

$ sar
This shows the history of CPU utilization, which you can use to track when the CPU usage or IO wait was high

$ sar -u 2 10
Prints the current CPU usage every 2 seconds, 10 times
  • %user: Percentage of CPU utilization that occurred while executing at the user level (application).
  • %nice: Percentage of CPU utilization that occurred while executing at the user level with nice priority.
  • %system: Percentage of CPU utilization that occurred while executing at the system level (kernel).
  • %iowait: Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
  • %idle: Percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
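
If you only want one "how busy was the CPU" number out of that report, something like the sketch below could work. It assumes sar prints a summary line starting with "Average:" whose last column is %idle (true for the English locale of sysstat as far as I know). busy_from_sar is a made-up helper name and the sample line is invented for illustration.

```shell
#!/bin/sh
# busy_from_sar: read `sar -u` output on stdin and print 100 - %idle
# for the "Average:" summary line (%idle is assumed to be the last column).
busy_from_sar() {
    awk '/^Average:/ { printf "%.2f\n", 100 - $NF }'
}

# Invented sample line in sar's column order:
# CPU %user %nice %system %iowait %steal %idle
echo 'Average:        all     12.50      0.00      2.50      5.00      0.00     80.00' |
    busy_from_sar

# On a live system (takes about 20 seconds):
#   LC_ALL=C sar -u 2 10 | busy_from_sar
```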

Who the hell is eating the CPU
$ ps -eo pcpu,pid,user,args | sort -k 1 -n -r
Prints all the processes sorted by CPU consumption, highest first (-n makes the sort numeric, so 10.0 ranks above 9.0)

$ ps -eo pcpu,pid,user,args | sort -k 1 -n -r | head -10
Prints only the 10 most CPU-consuming processes
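
One thing to watch with this kind of pipeline: a plain reverse sort compares the column as text, so 9.5 would land above 10.2. Here is a small sketch of a numeric version, wrapped in a helper I named top_by_first_column (my own name, nothing standard). The sample rows are made up.

```shell
#!/bin/sh
# top_by_first_column N: keep the N rows with the largest numeric value
# in column 1. Rows are read from stdin.
top_by_first_column() {
    LC_ALL=C sort -k1,1 -nr | head -n "$1"
}

# Made-up rows in "pcpu pid user args" order.
printf '%s\n' '9.5 101 www apache2' '10.2 202 db mysqld' '0.3 303 root sshd' |
    top_by_first_column 2

# On a live system (procps ps):
#   ps -eo pcpu,pid,user,args --no-headers | top_by_first_column 10
```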

$ iostat
Prints the statistics of IO usage since the system booted

$ iostat -xtc 2 10
Prints the IO usage every 2 seconds, 10 times

Check memory usage

$ ps aux | awk '{print $2, $4, $11}' | sort -k2nr | head -n 10
Prints the PID, %MEM and command of the 10 processes consuming the most memory

You can see the free and available memory using commands like
$ free
$ atop
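
For scripts or alerts, the same numbers can be read straight from /proc/meminfo. The sketch below assumes a kernel new enough to report MemAvailable (3.14 or later, as far as I know); mem_available_pct is a helper name I made up.

```shell
#!/bin/sh
# mem_available_pct [FILE]: percentage of RAM the kernel estimates is
# available, computed from MemTotal and MemAvailable (both in kB).
# FILE defaults to /proc/meminfo.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "available: $(mem_available_pct)%"
fi
```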

---
Some concepts to resolve the common issues in web systems.
1. Most of the time, high IO wait is what makes the server slow, so check the slow query log in the DB and fix the queries listed there. Try to optimize the MySQL usage with caching/indexing.

2. Make sure that Tomcat is given enough memory, with GC parameters that use the memory efficiently
Like this (as Java runtime options; the PermSize flags apply to Java 7 and earlier)
java -Xms512M -Xmx512M -Xss128K -XX:PermSize=64M -XX:MaxPermSize=128M -XX:NewRatio=4 -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Djava.awt.headless=true -jar xyz.jar

3. $ jmap -histo:live process_id
Command to check the memory usage of a Java process. Keep an eye on it to make sure you are not loading too many objects into memory by mistake

4. $ jstack process_id
Command to print the stack traces of the threads running in a Java process. Keep an eye out for any method or exception that shows up regularly and takes a long time to execute (then fix or optimize that method)
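
A trick that goes well with jstack: top -H -p process_id lists the threads of the process with their decimal thread IDs, while jstack labels each thread with a hexadecimal nid= value. A minimal sketch of the conversion follows; tid_to_nid is my own helper name and 12345 is just an example thread ID.

```shell
#!/bin/sh
# tid_to_nid TID: convert a decimal Linux thread ID (as shown by
# `top -H`) to the hexadecimal form used in jstack's nid= field.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

tid_to_nid 12345   # prints 0x3039

# Then look up the hot thread in the dump, for example:
#   jstack process_id | grep -A 10 "nid=$(tid_to_nid 12345)"
```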