Network Traffic Analysis With Linux Tools

By Sarath Pillai

Commercial network packet analysis tools are often expensive to deploy, and in most cases they require changes to the existing network infrastructure.

In this article we will look at how much network packet analysis we can do with tools that are commonly available on Linux. On high-traffic servers an administrator often needs to analyze traffic patterns quickly in order to make decisions based on them.

PCAP Basics

 

PCAP (packet capture) is an API for capturing network packets for analysis. On Linux and other Unix-like operating systems it is implemented by the libpcap library; the Windows implementation is called WinPcap. Almost all packet capture tools use pcap to capture data. If tcpdump is installed on the machine, the libpcap shared library is installed as well, typically at a path such as /usr/lib/libpcap.so (the exact location varies by distribution).

The pcap API is written in the C programming language.

Application programmers normally use the pcap library inside their own applications to capture network packets. Its most notable feature is that captured traffic can be saved to a file, conventionally with a .pcap extension, which any pcap-based tool can read later.
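As a rough illustration of what such a saved file looks like on disk, the sketch below guesses the capture format from the first four bytes of a file. The function name pcap_kind is hypothetical and not part of libpcap or tcpdump; the values it checks are the magic numbers of the classic pcap and pcapng file formats.

```shell
# Guess the capture format of a file from its first four bytes.
# pcap_kind is a hypothetical helper name used only for illustration.
pcap_kind() {
    # Dump the first 4 bytes as hex and strip the whitespace od adds.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        d4c3b2a1|a1b2c3d4) echo "pcap (microsecond timestamps)" ;;
        4d3cb2a1|a1b23c4d) echo "pcap (nanosecond timestamps)" ;;
        0a0d0d0a)          echo "pcapng" ;;
        *)                 echo "unknown" ;;
    esac
}

# Usage: pcap_kind traffic.pcap
```

Each classic magic number is listed in both byte orders because the file header is written in the byte order of the capturing machine.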

Network Traffic Analysis in Linux

For this tutorial we will use tools that are freely available on Linux.

  1. The tcpdump command
  2. Bash (the default shell in most distributions)
  3. Some common shell text-processing commands

As mentioned earlier, you can capture network traffic with tcpdump (which is built on libpcap) and write it to a file. I have also shown this in practice in my post about the tcpdump command.

 

[root@myvm1 ~]# tcpdump -s0 -i any -w traffic.pcap
tcpdump: WARNING: Promiscuous mode not supported on the "any" device
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
8 packets captured
17 packets received by filter
0 packets dropped by kernel
[root@myvm1 ~]#

In the above tcpdump command, the -s0 option captures each packet in full rather than truncating it, -i any captures from all interfaces, and -w writes the raw packets to a file called traffic.pcap, which can be read later by pcap-based tools.
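If you would rather not stop the capture by hand, one option is to wrap it in timeout so that it ends after a fixed number of seconds. This is only a sketch: capture_for and the TCPDUMP override variable are hypothetical names introduced here, and it assumes the coreutils timeout command is installed.

```shell
# Capture on all interfaces for a fixed number of seconds, then stop.
# capture_for and the TCPDUMP override variable are hypothetical names
# used for illustration; run as root for a real capture.
capture_for() {
    seconds=$1
    outfile=$2
    # timeout(1) sends SIGTERM when the time is up; tcpdump then closes
    # the output file and prints its packet counters.
    timeout "$seconds" "${TCPDUMP:-tcpdump}" -s0 -i any -w "$outfile"
}

# Usage: capture_for 60 traffic.pcap
```

Note that timeout exits with status 124 when it had to stop the command, so a non-zero exit status here does not by itself mean the capture failed.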

You can also use Wireshark to do the same job.

Now we will use some of the excellent tools available on Linux to summarize the capture stored in traffic.pcap.

[root@myvm1 ~]# tcpdump -nn -r traffic.pcap | head
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
00:55:45.421932 IP 192.168.159.128.22 > 192.168.159.1.49587: P 1822700485:1822700537(52) ack 3115204597 win 202
00:55:45.421939 IP 192.168.159.128.22 > 192.168.159.1.49587: P 52:184(132) ack 1 win 202
00:55:45.421944 IP 192.168.159.1.49587 > 192.168.159.128.22: . ack 184 win 16199
00:55:46.241056 IP 192.168.159.128 > 192.168.159.128: ICMP echo request, id 6454, seq 14082, length 8

With the above command I read the contents of the captured traffic.pcap file and piped the output of that command to the head command.

Note:   The above command uses the shell's pipe feature to send the output to head. By default head prints only the first 10 lines instead of the whole output. This is very handy when you have a very large file and only want to look at the first few lines.

Now let's understand the output of the tcpdump command. Each line has four fields worth noting.

  1. The first field shows the timestamp.
  2. The second field shows the network-layer protocol (IP in the output above).
  3. The third field shows the source IP address and source port, in the form source-ip.source-port.
  4. The fourth field shows the destination IP address and destination port, in the same form destination-ip.destination-port. Because a ">" separator sits between source and destination, it is the fifth space-separated column of the line.
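The field layout described above can be sketched with a small awk filter that labels the four fields of each line. The function name split_fields is hypothetical; it assumes the `tcpdump -nn` text format shown earlier and only handles lines whose second column is IP.

```shell
# Label the four leading fields of each line of `tcpdump -nn` text output.
# split_fields is a hypothetical helper name used only for illustration.
split_fields() {
    awk '$2 == "IP" {
        dst = $5
        sub(/:$/, "", dst)   # drop the colon that follows the destination
        print "time=" $1, "proto=" $2, "src=" $3, "dst=" dst
    }'
}

# Usage: tcpdump -nn -r traffic.pcap | split_fields | head
```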

The main reason I explained these output fields is that they contain most of the traffic details you need for analysis (we will analyze some more later).

So the main approach here is to capture all network traffic on all interfaces for a brief period of time, and then analyze that file to find anomalies.

We will cut and sort the required fields from the tcpdump output of traffic.pcap (the capture needs to run for a brief period of time to give good results).

Let's look at which source IP addresses are sending the most packets. We will cut the source IP out of the rest of the output with the cut command.

[root@myvm1 ~]# tcpdump -nn -r traffic.pcap | cut -f 3 -d " " | head
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
192.168.159.128.22
192.168.159.128.22
192.168.159.1.49587
192.168.159.128
192.168.159.128
192.168.159.128.22
192.168.159.1.49587
192.168.159.128

In the above example I needed only the third field of the output, which is the source field, so it is selected with the -f 3 option of cut. The -d " " option tells cut that fields are separated by a single space.

But the above result also includes the source port, which we need to remove to get just the source IPs sending traffic. Then we need to count how many times each source IP appears.

[root@myvm1 ~]# tcpdump -nn -r traffic.pcap | cut -f 3 -d " " | cut -f 1-4 -d "." | sort | uniq -c | sort -n
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
      7 192.168.159.1
     17 192.168.159.128

In the above command we removed the port by piping the output to cut again, this time with "." as the field delimiter, which in turn displays only fields 1 to 4 (the four octets of the IP address). This output is passed to sort so that identical addresses end up on adjacent lines, then to uniq -c, which collapses each run of identical lines and prefixes it with a count, and finally to sort -n, which orders the result numerically by that count.

From the result of the above command it is clear that 17 packets came from 192.168.159.128 and 7 packets from 192.168.159.1 (note that these counts cover only the period during which traffic.pcap was captured).
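If you run this pipeline often, it can be wrapped in a small shell function. The sketch below uses the same cut, sort and uniq steps, but sorts in descending order and keeps only the top entries; the name top_sources and its optional count argument are hypothetical.

```shell
# Count packets per source IP from `tcpdump -nn` text on standard input.
# top_sources is a hypothetical helper; $1 is how many entries to keep
# (default 10), busiest source first.
top_sources() {
    cut -d ' ' -f 3 |        # third column: source-ip.source-port
        cut -d '.' -f 1-4 |  # keep the four octets, drop the port
        sort | uniq -c |     # group identical addresses and count them
        sort -rn | head -n "${1:-10}"
}

# Usage: tcpdump -nn -r traffic.pcap | top_sources 5
```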

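Now if you want to count only TCP and UDP traffic, you can add a filter expression as in the following command. The quoted 'tcp or udp' is the filter; the -p option only disables promiscuous mode and has no effect when reading from a file.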

 

[root@myvm1 ~]# tcpdump -nn -r traffic.pcap -p 'tcp or udp'| cut -f 3 -d " " | cut -f 1-4 -d "." | sort | uniq -c | sort -n
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
      7 192.168.159.1
      7 192.168.159.128

You can also count the destination ports with the same set of commands, as shown below. The trailing colon in the result is part of the destination column in tcpdump's output.

[root@myvm1 ~]# tcpdump -nn -r traffic.pcap -p 'tcp or udp'| cut -f 5 -d " " | cut -f 5 -d "." | sort | uniq -c | sort -n
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
      7 22:
      7 49587:

Count the source ports as shown below.

[root@myvm1 ~]# tcpdump -nn -r traffic.pcap -p 'tcp or udp'| cut -f 3 -d " " | cut -f 5 -d "." | sort | uniq -c | sort -n
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
      7 22
      7 49587
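The port counts above can also be produced with awk, which makes it easy to strip the trailing colon and to skip lines that carry no port (such as the ICMP packets in this capture). This is a sketch under the assumption of IPv4 addresses in the `tcpdump -nn` format shown earlier; count_dst_ports is a hypothetical name.

```shell
# Count packets per destination port from `tcpdump -nn` text on stdin.
# count_dst_ports is a hypothetical helper; it assumes IPv4 addresses.
count_dst_ports() {
    awk '$2 == "IP" && $5 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:$/ {
        n = split($5, part, ".")   # four octets plus the port
        port = part[n]
        sub(/:$/, "", port)        # remove the trailing colon
        print port
    }' | sort | uniq -c | sort -n
}

# Usage: tcpdump -nn -r traffic.pcap 'tcp or udp' | count_dst_ports
```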

How to check the data inside captured packets

Now let's ask tcpdump to read the payload and show us its contents. I will look at unencrypted FTP traffic on port 21 (the filter below also matches SMTP on port 25), and we will see what is going on inside.

[root@myvm1 ~]# tcpdump -Ann -r traffic.pcap 'dst port 25 or dst port 21' | head -20
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
02:19:57.916852 IP 192.168.159.1.50730 > 192.168.159.128.21: P 2484026965:2484026972(7) ack 2999050325 win 16325
E../..@...,Z.........*....>U...UP.?.....CWD /
 
02:19:57.917801 IP 192.168.159.1.50730 > 192.168.159.128.21: P 7:12(5) ack 38 win 16316
E..-..@...,[.........*....>\...zP.?.....PWD
.
02:19:57.918515 IP 192.168.159.1.50730 > 192.168.159.128.21: P 12:20(8) ack 47 win 16313
E..0..@...,W.........*....>a....P.?..X..TYPE A
 
02:19:57.919694 IP 192.168.159.1.50730 > 192.168.159.128.21: P 20:26(6) ack 77 win 16306
E.....@...,X.........*....>i....P.?.....PASV

As you can see, in the above command I asked tcpdump to show only those packets whose destination port is 25 or 21, so that I can see FTP and mail traffic. The -A option prints each packet's contents in ASCII.

For example, if you want to look inside a lot of mail traffic passing through the network, you can do it the same way.

Also, if you want to search for a particular string in the above output, you can do that easily by piping the output to grep with the required string (note that you will only be able to see unencrypted traffic in this manner).
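As one example of such a search, the sketch below pulls FTP command verbs out of `tcpdump -A` output and counts them. The name count_ftp_commands is hypothetical, and the verb list is only a small sample of FTP commands; a verb that happens to appear in unrelated payload bytes would be counted too.

```shell
# Count FTP command verbs seen in `tcpdump -A` output on standard input.
# count_ftp_commands is a hypothetical helper; the verb list is a sample.
count_ftp_commands() {
    # -o prints only the matched verb, -w requires it to be a whole word.
    grep -owE 'USER|PASS|CWD|PWD|TYPE|PASV|PORT|LIST|RETR|STOR|QUIT' |
        sort | uniq -c | sort -n
}

# Usage: tcpdump -Ann -r traffic.pcap 'dst port 21' | count_ftp_commands
```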

The above method of capturing packets with tcpdump and looking at their contents is very useful when you have a DNS server running in your infrastructure, because you can identify which domains are queried most often, which TLDs are requested most, which source IPs request a particular domain, and so on. An example of looking inside DNS traffic is shown below.

 tcpdump -Ann -r traffic.pcap 'dst port 53' | grep -E '(com|in|org)'
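To rank the most queried names rather than just match them, a sketch along the following lines may help. It relies on tcpdump's decoded one-line DNS summary, which is not shown in this article; the assumed format is a query type token ending in "?" (for example "A?") followed by the queried name. The function name top_queried_names is hypothetical.

```shell
# Rank queried DNS names from `tcpdump -nn` text on standard input.
# top_queried_names is a hypothetical helper. Assumed line format:
#   ... > 192.168.159.128.53: 4321+ A? example.com. (29)
top_queried_names() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^[A-Za-z0-9]+\?$/) {   # query type such as "A?" or "MX?"
                print tolower($(i + 1))      # the name follows the type
                break
            }
    }' | sort | uniq -c | sort -rn
}

# Usage: tcpdump -nn -r traffic.pcap 'dst port 53' | top_queried_names
```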
 

Now let's check all the HTTP traffic in our captured file traffic.pcap using tcpdump.

[root@myvm1 ~]# tcpdump -Ann -r traffic.pcap 'dst port 80' | head
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
02:35:12.079040 IP 192.168.159.1.51358 > 192.168.159.128.80: S 3067739367:3067739367(0) win 8192 <mss 1460,nop,wscale 2,nop,nop,sackOK>
E..4/.@................P.......... .................
02:35:12.079513 IP 192.168.159.1.51358 > 192.168.159.128.80: . ack 4028188194 win 16425
E..(/.@................P......F"P.@)..........
02:35:12.080541 IP 192.168.159.1.51358 > 192.168.159.128.80: P 0:282(282) ack 1 win 16425
E..B/.@...
............P......F"P.@)....GET / HTTP/1.1
Host: 192.168.159.128
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
[root@myvm1 ~]#

You can filter and reshape the HTTP output shown above as needed with grep, cut, sort, uniq, and so on.

For example, you can extract all the user agents from the HTTP traffic with the same command, and then sort them to find the number of requests from various browsers, as shown below.

[root@myvm1 ~]# tcpdump -Ann -r traffic.pcap 'port 80' | grep -Ei 'user-agent' | head -10
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0

Now let's sort and count that user agent output so that we can find which browser is making the most requests.

[root@myvm1 ~]# tcpdump -Ann -r traffic.pcap 'port 80' | grep -Ei 'user-agent' | sort | uniq -c | sort -n
reading from file traffic.pcap, link-type LINUX_SLL (Linux cooked)
      2 User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0
     12 User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
     23 User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
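Full user agent strings differ between browser versions, so it can help to collapse them into browser families before counting. The sketch below is a rough classification only: count_browsers is a hypothetical name, and the patterns cover just the three browsers seen in this capture, with everything else reported as Other.

```shell
# Group User-Agent headers from `tcpdump -A` output into browser families.
# count_browsers is a hypothetical helper with deliberately rough patterns.
count_browsers() {
    grep -i '^User-Agent:' | awk '
        /MSIE|Trident/ { print "Internet Explorer"; next }
        /Chrome/       { print "Chrome"; next }   # before any Safari check
        /Firefox/      { print "Firefox"; next }
                       { print "Other" }
    ' | sort | uniq -c | sort -n
}

# Usage: tcpdump -Ann -r traffic.pcap 'port 80' | count_browsers
```

Chrome is tested before anything that mentions Safari or Gecko because Chrome's user agent string contains both words.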

You can analyze a lot of the unencrypted traffic this way. Sorting and counting the results according to your requirement is the main part here. Remember that you can extract anything you need from the traffic; it all depends on your Bash skills.

An added advantage of this method is that you can capture packets for a long duration, store them for later analysis, and get a good overview of the traffic.
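For a long capture, one simple overview is the number of packets seen in each minute. The sketch below derives that from the timestamp field alone; packets_per_minute is a hypothetical name, and it assumes tcpdump's default HH:MM:SS.fraction timestamps, so a capture that spans midnight would merge the same clock minute from different days.

```shell
# Count packets per clock minute from `tcpdump -nn` text on standard input.
# packets_per_minute is a hypothetical helper used only for illustration.
packets_per_minute() {
    awk '$1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\./ {
        print substr($1, 1, 5)   # keep HH:MM, drop seconds and fraction
    }' | sort | uniq -c
}

# Usage: tcpdump -nn -r traffic.pcap | packets_per_minute
```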

I hope this post was useful for understanding packet analysis with the help of the simple tcpdump command in Linux.

 
