Examples of using SAR command for system monitoring in Linux

Sarath Pillai's picture
System Activity Monitor

System Activity Reporter (SAR) is an important tool that gives system administrators an overview of a server, with the status of key metrics at different points in time.

Suppose you are having an issue with the system right now, for example some of your customers are unable to list data from the database. The first thing most Linux system administrators do is recall when the same issue occurred previously. If you remember the day of the previous occurrence, you can easily compare the system statistics from that day with the current statistics.

SAR is very helpful in doing exactly that.

The first thing to do is confirm that the SAR utility is installed on the machine. You can check this by listing the installed RPMs and looking for the sysstat package.

SAR is one of the utilities in the sysstat package. You can install it easily through YUM. (But don't worry, most distributions come prepackaged with the sysstat tools.)

[root@myvm1 ~]# yum install sysstat

Make sure you have a repository that provides the package enabled (for example EPEL or RPMforge). Otherwise, your distribution DVD is a good place to look for the package.

SAR (System Activity Reporter) gives information about the following things:
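The installation check described above can be sketched in a few lines of shell. On an RPM-based system, "rpm -q sysstat" answers the question directly; the snippet below is a distribution-neutral alternative that only checks whether a sar binary is on the PATH. The helper name have_cmd is made up for this illustration.

```shell
# Return success if the named command can be found on PATH.
# have_cmd is a hypothetical helper name, not part of sysstat.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd sar; then
    echo "sar is available"
else
    echo "sar not found; install the sysstat package"
fi
```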

 

  1. System Buffer activity
  2. Information about system calls
  3. Block device information
  4. Overall paging information
  5. Semaphore and memory allocation information
  6. CPU utilization and process report

The main thing to understand about SAR is that everything is done through cron. By default, many Linux distributions ship a file named /etc/cron.d/sysstat.

Let's see how SAR really works.

A system monitoring tool must have data about every aspect of the system and must cover all time intervals. In other words, it must be able to provide the statistics of the machine for any given time.

The only way to do that is to record the metrics and statistics of the machine at a fixed time interval. Reducing the interval increases the level of detail, because more data points are collected.

SAR does exactly that: it records statistics on different aspects of the machine at a fixed time interval. That is why SAR runs through cron.

[root@myvm ~]# cat /etc/cron.d/sysstat
# run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1
# generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib64/sa/sa2 -A
 
  • The cron file above shows that SAR runs the "sa1" script located in "/usr/lib64/sa/" every 10 minutes
  • It also runs the script /usr/lib64/sa/sa2 at the end of the day, at 23:53

So the first cron entry (/usr/lib64/sa/sa1) runs every 10 minutes and in turn calls the sadc utility, which collects system stats and stores them in a binary file (one file per day).

The second cron entry (sa2) writes a daily report from that binary file into a text file and purges data older than a set number of days, 7 by default (as configured in the following file):

[root@archive ~]# cat /etc/sysconfig/sysstat
# How long to keep log files (days), maximum is a month
HISTORY=7
 

You can change the HISTORY entry by editing that file.

The system statistics are collected every 10 minutes through cron (change the cron entry to run every minute for finer-grained information). To see the stats, run the sar command as shown below.

The output of the plain sar command looks like this.
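To check the retention currently in effect without opening the file, a small helper like the one below can print the configured value. This is only a sketch: the function name sysstat_history is invented here, and it assumes the HISTORY line is a plain numeric assignment as in the listing above (the path differs on some distributions).

```shell
# Print the HISTORY value (in days) from a sysstat config file.
# sysstat_history is a hypothetical helper; the default path is the one
# shown in the article and may differ on your distribution.
sysstat_history() {
    conf="${1:-/etc/sysconfig/sysstat}"
    # Keep only uncommented HISTORY=<number> lines; use the last one found.
    sed -n 's/^[[:space:]]*HISTORY="\{0,1\}\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1
}
```

Calling sysstat_history with no argument reads /etc/sysconfig/sysstat; passing a path reads that file instead.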

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:01:01 AM     all     73.28      0.00      1.25      0.00      0.00     25.47
12:02:01 AM     all      7.83      0.00      0.44      0.00      0.00     91.73
12:03:01 AM     all     61.65      0.00      0.70      0.00      0.00     37.66
12:04:01 AM     all     57.85      0.00      0.82      0.00      0.00     41.34
12:05:01 AM     all      4.25      0.00      0.41      0.00      0.00     95.34
12:06:01 AM     all      4.20      0.00      0.22      0.00      0.00     95.58
12:07:01 AM     all      5.05      0.00      0.33      0.00      0.00     94.63
12:08:01 AM     all      4.76      0.00      0.06      0.00      0.00     95.18
12:09:01 AM     all     37.57      0.00      0.37      0.00      0.00     62.05
12:10:01 AM     all     70.04      0.00      0.80      0.00      0.00     29.16
12:11:01 AM     all      5.03      0.00      0.12      0.00      0.00     94.84

 

The output shows the collected stats at one-minute intervals (which means my cron runs every minute). It covers the whole day so far, up to the time you typed the command.

Understanding the output of SAR command

%user: the percentage of CPU time spent running user-level processes (applications).

%system: the percentage of CPU time spent on operating system (kernel) tasks, as opposed to the user-level work counted under %user.

%iowait: as the name suggests, the percentage of time the CPU was idle while waiting for outstanding I/O requests to complete.

%nice: As most of you know, a user can change the priority of a process in Linux by changing its nice value. This column shows the CPU time spent on user-level processes whose nice value has been changed.

%steal: the percentage of time a virtual CPU spent waiting involuntarily while the hypervisor was servicing another virtual processor.

%idle: the percentage of time the CPU was idle.
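Once you know what the columns mean, a quick way to spot busy periods is to filter the report on %idle. The following is a rough sketch, not part of sysstat: busy_intervals is a made-up function that reads sar CPU output on standard input and prints the rows whose %idle (the last column) is below a threshold you pass in.

```shell
# Print timestamp and %idle for every "all" row with %idle below a limit.
# busy_intervals is a hypothetical helper; it expects plain "sar" CPU output.
busy_intervals() {
    awk -v limit="$1" '
        $1 == "Average:" { next }                      # skip the summary row
        ($2 == "all" || $3 == "all") && ($NF + 0) < limit {
            # with a 12-hour clock the time spans two fields (e.g. 12:01:01 AM)
            ts = ($3 == "all") ? $1 " " $2 : $1
            print ts, $NF
        }'
}

# Example: sar | busy_intervals 50
```

Run against the sample output above with a limit of 50, this would pick out the 12:01:01 AM and 12:03:01 AM rows, among others.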

By default sar stores all of its data under /var/log/sa/, one file per day, named as shown below.

sa01 - for the first day of the month

sa02 - for the second day of the month

sa03, sa04... and so on.
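The naming scheme is easy to reproduce in a script. As a small sketch (sa_file_for_day is an invented name, and /var/log/sa is the location used in this article; some distributions use a different directory), this builds the data file path for a given day of the month:

```shell
# Build the path of the sar data file for a day of the month (1-31).
# sa_file_for_day is a hypothetical helper, not a sysstat command.
sa_file_for_day() {
    day="${1#0}"                      # drop a leading zero so 08 is not read as octal
    printf '/var/log/sa/sa%02d\n' "$day"
}

# Example: sar -f "$(sa_file_for_day "$(date +%d)")"   # today's file
```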

-d option in SAR command

The -d option reports activity for each block device attached to the system. A typical output of the sar command with the -d option is shown below.

12:00:01 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
12:01:01 AM    dev3-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:01:01 AM   dev3-64      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:01:01 AM    dev8-0     55.62      9.98   8317.87    149.72     13.68    245.96      2.61     14.52
12:01:01 AM   dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:02:01 AM    dev3-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:02:01 AM   dev3-64      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:02:01 AM    dev8-0      1.55      0.00     35.29     22.70      0.01      7.26      1.86      0.29
12:02:01 AM   dev8-16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:03:01 AM    dev3-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:03:01 AM   dev3-64      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

 

DEV: this column names the devices on the machine according to the major and minor numbers of the Linux block device. You can check these by doing an ls -l in the /dev directory, as shown below.

brw-r-----  1 root disk   8,    0 Nov 16 16:29 sda
brw-r-----  1 root disk   8,    1 Nov 16 16:29 sda1
brw-r-----  1 root disk   8,    2 Nov 16 16:29 sda2

In the above "ls -l" output for "sda", the major number is "8" and the minor number is "0", so dev8-0 in the sar output is sda. This way you can easily identify the disks mentioned in the sar command output.

tps: transfers per second, i.e. the number of I/O requests issued to that device per second.

rd_sec/s: the number of sectors read from the device per second.

wr_sec/s: likewise, the number of sectors written to the device per second.

avgrq-sz: the average size (in sectors) of the requests issued to the device.

await: the average time (in milliseconds) taken for I/O requests to the device to be served, including time spent waiting in the queue.

%util: the percentage of time during which I/O requests were issued to the device (the device's bandwidth utilization). A value close to 100% means the device is saturated.
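The mapping from a sar device label back to a disk can also be scripted. The sketch below (dev_to_majmin is a made-up helper name) turns a label such as dev8-0 into the major:minor pair. On reasonably recent kernels with sysfs mounted, that pair can then be looked up under /sys/dev/block; older kernels, such as the 2.6.18 one in this article, may not have that directory, so treat the lookup in the comment as an assumption about your system.

```shell
# Convert a sar device label (devMAJOR-MINOR) to "MAJOR:MINOR".
# dev_to_majmin is a hypothetical helper, not part of sysstat.
dev_to_majmin() {
    pair="${1#dev}"                               # dev8-16 -> 8-16
    printf '%s:%s\n' "${pair%%-*}" "${pair#*-}"   # 8-16 -> 8:16
}

# Example lookup (needs /sys/dev/block, present on newer kernels):
#   basename "$(readlink "/sys/dev/block/$(dev_to_majmin dev8-0)")"
```

The sar man page also documents a -p option to pretty-print device names together with -d, which may save you the lookup if your version supports it.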

Show Memory usage in SAR command

The -r option of the sar command is very useful. It shows memory, swap, cached memory, etc. at every collection interval or for a required time range.

02:20:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
02:30:01 AM    609500   1487652     70.94    242420    777560   1075980       364      0.03       360
02:40:01 AM    609500   1487652     70.94    242424    777568   1075980       364      0.03       360
02:50:01 AM    609500   1487652     70.94    242424    777592   1075980       364      0.03       360
03:00:01 AM    608980   1488172     70.96    242424    777600   1075980       364      0.03       360
03:10:01 AM    608584   1488568     70.98    242424    777628   1075980       364      0.03       360
03:20:01 AM    608584   1488568     70.98    242424    777648   1075980       364      0.03       360
 
 
In the above output most of the columns are self-explanatory (the kb* values are in kilobytes).
kbmemfree: the amount of free memory
kbmemused: the amount of memory used
%memused: the percentage of memory used
kbbuffers: memory used as buffers by the kernel
kbcached: memory used by the kernel to cache data
The remaining columns cover swap: kbswpfree (free), kbswpused (used), %swpused (percentage used) and kbswpcad (cached swap).
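Because kbmemused includes buffers and cache, it can be useful to subtract them to estimate what applications are really holding. The following is a sketch under stated assumptions: app_mem_used is an invented name, and the field positions match the 12-hour, nine-column layout shown above (newer sysstat versions print different columns, so adjust the field numbers accordingly).

```shell
# Estimate memory used by applications: kbmemused - kbbuffers - kbcached.
# app_mem_used is a hypothetical helper; it assumes "sar -r" output in the
# layout shown in this article (time, AM/PM, then nine numeric columns).
app_mem_used() {
    awk '
        $1 == "Average:" { next }             # skip the summary row
        NF == 11 && $3 ~ /^[0-9]+$/ {         # data rows only, not the header
            printf "%s %s %d\n", $1, $2, $4 - $6 - $7
        }'
}

# Example: sar -r | app_mem_used
```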
 

How to fetch metrics of a particular day using SAR in linux

 
As mentioned before, the metrics for each day are saved in a file named sa<day of month>. So if I want the metrics for the 27th day of the month, I can find them as shown below.
 
[root@archive ~]# sar -f /var/log/sa/sa27
Linux 2.6.18-194.el5xen (archive.r)     11/27/2012
 
02:20:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
02:30:01 AM       all      2.58      0.00      0.70      1.12      0.05     95.55
02:40:01 AM       all      2.56      0.00      0.69      1.05      0.04     95.66
02:50:01 AM       all      2.64      0.00      0.65      1.15      0.05     95.50
03:00:01 AM       all      3.27      0.00      0.71      1.12      0.04     94.86
 
In the above command we passed /var/log/sa/sa27 as an argument because I needed the stats for that day. Pass the sa<day of month> file you require.
 

How to fetch SAR metrics for a specific time on a particular date

This can be achieved by passing the -s (start time) and -e (end time) arguments as shown below.

[root@archive ~]# sar -f /var/log/sa/sa27 -s 02:20:00 -e 03:20:00
Linux 2.6.18-194.el5xen (archive.r)     11/27/2012
 
02:20:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
02:30:01 AM       all      2.58      0.00      0.70      1.12      0.05     95.55
02:40:01 AM       all      2.56      0.00      0.69      1.05      0.04     95.66
02:50:01 AM       all      2.64      0.00      0.65      1.15      0.05     95.50
03:00:01 AM       all      3.27      0.00      0.71      1.12      0.04     94.86
03:10:01 AM       all      2.72      1.06      0.75      1.09      0.04     94.33
Average:          all      2.76      0.21      0.70      1.11      0.04     95.18
 
In the above example I asked sar to fetch the metrics between 02:20:00 and 03:20:00 on the 27th day of the month.

You can also pass any other metric option along with the time interval, such as -d or -r.

Have you noticed? SAR can accurately show us the machine statistics for a particular day at a particular time, so it's much easier to identify bottlenecks.
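To compare the same time window across several days, it helps to pull out just the "Average:" row that sar prints at the end of a report. Here is a minimal sketch; average_idle is a made-up function name, and the loop in the comment assumes the /var/log/sa location used throughout this article.

```shell
# Print the %idle value from the "Average:" row of sar CPU output.
# average_idle is a hypothetical helper, not part of sysstat.
average_idle() {
    awk '$1 == "Average:" && $2 == "all" { print $NF }'
}

# Example: same one-hour window for every available day
#   for f in /var/log/sa/sa[0-9][0-9]; do
#       printf '%s ' "$f"
#       sar -f "$f" -s 02:20:00 -e 03:20:00 | average_idle
#   done
```

A day whose average %idle in that window is far below the others is a good candidate for a closer look with -d or -r.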
 
Using the -A option along with the above command will show all the metrics collected by sar.
 
sar -f /var/log/sa/sa27 -s 02:20:00 -e 03:20:00 -A
 
The output is extensive: the -A option puts almost everything sar has collected on your screen!
 

Show network statistics using sar command

The sar command even shows network statistics. This can be done with the -n DEV option.

[root@archive ~]# sar -n DEV
Linux 2.6.18-194.el5xen (archive.r)     11/27/2012
 
02:20:01 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
02:30:01 AM        lo      0.01      0.01      0.77      0.77      0.00      0.00      0.00
02:30:01 AM      eth0     12.30      0.12   1285.06     27.59      0.00      0.00      0.00
02:30:01 AM      eth1     14.45      0.00   1399.34      0.00      0.00      0.00      0.00
02:30:01 AM      sit0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
02:40:01 AM        lo      0.01      0.01      0.77      0.77      0.00      0.00      0.00
02:40:01 AM      eth0     10.65      0.12   1139.38     27.00      0.00      0.00      0.00
02:40:01 AM      eth1     13.96      0.00   1352.87      0.00      0.00      0.00      0.00

IFACE: the name of the network interface

rxpck/s: packets received per second

txpck/s: packets transmitted per second

rxbyt/s: bytes received per second

txbyt/s: bytes transmitted per second

rxcmp/s: compressed packets received per second

txcmp/s: compressed packets transmitted per second

rxmcst/s: multicast packets received per second
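Link speeds are usually quoted in bits, so it can be handy to convert the rxbyt/s column. The sketch below is only an illustration: rx_kbits is an invented name, and it assumes the 12-hour layout with an rxbyt/s column as in the output above (newer sysstat versions report rxkB/s instead, which needs a different conversion).

```shell
# Print receive throughput in kilobits per second for one interface.
# rx_kbits is a hypothetical helper; field 6 is rxbyt/s in the layout above.
rx_kbits() {
    awk -v ifc="$1" '
        $3 == ifc { printf "%s %s %.2f\n", $1, $2, $6 * 8 / 1000 }'
}

# Example: sar -n DEV | rx_kbits eth0
```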

 

Some other Metrics that can be determined using sar

-y option in sar: reports TTY device activity.

-x and -X options in sar: in older sysstat releases, -x reports statistics for a particular process and -X for its child processes. You need to pass the PID as an argument. Newer sysstat releases no longer have these options and provide the pidstat command for per-process statistics instead.

-n SOCK option in sar: reports socket usage statistics.

 

There are a few more options available in SAR. You can refer to the man pages for the complete list of arguments. Hope this introductory article was helpful in getting started with SAR.


Comments

Well written, couldn't explain in a better way. please keep doing the good work.

slashmaster's picture

Hi Udit,

Thanks for the comment.. Keep visiting slashroot..

Regards

Good job!

Really good to start with.

I have a doubt why there is different in output between sar -n DEV command vs /proc/net/dev file output?

Which is more acurate, sar output seems to be small figure compared to one in /proc/net/dev, do you have any comments on that?

Thanks
Manoj

Sarath Pillai's picture

Hi Manoj,

the proc file /proc/net/dev shows the total no of packets and bytes received by a network interface(its a total number, hence a large value). If you see the values in the columns named packets & bytes in /proc/net/dev, the values goes on increasing each time you do a cat /proc/net/dev(its a more real time data).

However the output you see in sar -n DEV command is a per second statistics calculated. If you want to compare both of these values (values in sar -n dev and /proc/net/dev ) i would suggest to calculate the difference (by subtracting the values you find each time you do a cat /proc/net/dev ).

The difference in values that you find by doing two consecutive(typed per second) cat /proc/net/dev will be somewhere near the per second values shown by sar -n dev.
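The subtraction described above can be sketched in shell. The helper names rx_bytes and rx_rate are made up for this illustration, and counter wrap-around is ignored.

```shell
# Print the cumulative received-bytes counter for an interface.
# $1 = interface name, $2 = file in /proc/net/dev format (optional).
rx_bytes() {
    # The interface name ends with ":" and may touch the first counter,
    # so turn the colon into a space before splitting into fields.
    sed 's/:/ /' "${2:-/proc/net/dev}" | awk -v ifc="$1" '$1 == ifc { print $2 }'
}

# Average receive rate in bytes per second between two readings.
# $1 = first reading, $2 = second reading, $3 = seconds between them.
rx_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Example:
#   a=$(rx_bytes eth0); sleep 10; b=$(rx_bytes eth0); rx_rate "$a" "$b" 10
```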

Thanks for your comment manoj..hope this helps..

Hi

I have a doubt regarding cpu performance monitoring in linux

"sar -u 1 1" command provide you the CPU utilization of user,system,nice,steal and idle. If we calculate all those values,then we get 100% total which is fine.

What is %CPU values of each process in the output of "ps -e -o pcpu" command?

If we add the %CPU of all the process, Shall we get the 100% cpu?

If not, how to get the %CPU with respect to ps command?

Thanks in advance

Vallinayagam.K

Sarath Pillai's picture

Hi vallinayagam.k,

If you want to sort the processes currently running on your system by percentage of CPU they are using, then a better command for that is "top", rather than PS.

However if you want to achieve the same result, of listing processes by the percentage of cpu with ps command, you can do that by the following method.

ps -eo pcpu,pid,user,args | sort -k 1 -nr | head -40

The above command will show you the top 40 processes sorted by CPU usage (the -n flag makes sort compare the percentages numerically).

The CPU percentage you see in the ps output is the CPU time the process has used divided by the time the process has been running.

Hi Sarath,

Thanks for your prompt reply.

So the %CPU shown in the top output for each processes implies the actual percentage of cpu utilization by that process for that particular time (say 3 sec), whereas %CPU shown in the "ps -eo pid,%cpu" command implies the % of processing time allocate by the CPU (i.e out of 100 cpu%, % of CPU time allocated for process) for each processes, but it may not be used by the processes.

And %CPU of "PS" will not tally with %CPU of "TOP"

Correct me, if i am wrong.

Thanks in advance
Vallinayagam.K

Sarath Pillai's picture

Hi Vallinayagam.K,

To be more accurate, the two commands measure %CPU differently. ps reports the CPU time a process has used divided by the time the process has been running (an average over its lifetime), while top reports the usage over its last refresh interval. So the values will not necessarily match.

The top command is usually the better choice because it gives you more real-time statistics than ps.

You can also get regularly refreshed output from ps, but for that you need to rerun the same ps command over and over again. The watch command can do that for you: it repeats the command passed to it at the interval (in seconds) you specify. For example,

watch -n 1 'ps -eo pcpu,pid,user,args | sort -k 1 -nr | head -40'

The above watch command reruns the ps pipeline every second and keeps updating the output on the screen (top refreshes its display like this by default).

hi ,

what is sar????

Hi Friends,

Can somebody help me on this?

I would like to know about the network status from sar output that means how exactly i know that my network traffic is bad from the "sar -n DEV" output what is the maximum threshold value from which we can identify that our network traffic is really bad?

For example i ran tcpdump command on one terminal and on another terminal i captured the sar output, Please see below & let me know the below values shown are fine that my eth0 is normal? if yes how do you say that?

[root@server ~]# sar -n DEV 1
Linux 2.6.32-71.el6.i686 (server.com) 05/10/2015 _i686_ (1 CPU)

11:28:49 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
11:28:50 AM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
11:28:50 AM eth0 277.78 526.26 17.50 137.97 0.00 0.00 1.01

if we issue the command sar we are not getting the data

ex :

sar -f /var/log/sa/sa12
Linux 2.6.32.45-0.3-default (workstation) 12/12/13 _s390x_

only the above is showing ..

Hi Sarath,
since i come from windows background and the this document is very much impressive and helped me lot.

Thanks
Mohanraj Jayaraman

Hi,
Sometimes when i check the statistics of a given period of time , the sum of %idle, %system, %user and %iowait is more tan 100%, why is that?

Thanks
Mario Ruiz

Excellent guide, thanx!!!

how can i monitor the cpu usage over the past 2 months and redirect the result to a text file or log file for future analysis

very informative, thank you :-)

what is aux .... what is its purpose????

Very thorough information. Great job. Really appreciate it!!

Hi,

sa2 -A handles the saXX files and turn them out to sarXX files. So, as sarXX are ascii files, is there any command or application to handle them?

very informative, thank you :)

That's too easy to remember also :)

curious to know if we can convert and export this data to .csv or pdf

Thank you for providing this great guide on SAR.
Keep on writing nice articles!
Added to favorites!

Very NICE tutorial, Thanks very much!!! Great job too!!!
