How Does ARP (Address Resolution Protocol) Work?

Sarath Pillai's picture
Address Resolution Protocol

Two machines in a network can communicate with each other only if they know each other's physical address. Although computer programs use IP addresses to send and receive messages, the actual underlying communication on the local network always happens over the physical address.


Let's first understand how communication happens over the wire. We will ping Google's publicly available DNS server (8.8.8.8) from a machine, capture the network packets, and look at their source and destination addresses.

tcpdump is a tool for capturing network packets and inspecting their details. If you are new to tcpdump, I would recommend going through the article below to understand the basics.

 

Read: Tcpdump command examples in Linux

 

root@ip-10-12-2-73:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=39 time=9.16 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=39 time=9.28 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=39 time=9.31 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=39 time=9.32 ms

 

 

While the above ping is running, let's capture network packets from another shell session on the same server. The command I am going to use is tcpdump -n host 8.8.8.8

The filter host 8.8.8.8 captures only packets whose source or destination is 8.8.8.8. The -n option makes tcpdump show IP addresses in the output rather than DNS names.

 

root@ip-10-12-2-73:~# tcpdump -n host 8.8.8.8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
21:39:41.531390 IP 10.12.2.73 > 8.8.8.8: ICMP echo request, id 16331, seq 1, length 64
21:39:41.540342 IP 8.8.8.8 > 10.12.2.73: ICMP echo reply, id 16331, seq 1, length 64
21:39:42.531815 IP 10.12.2.73 > 8.8.8.8: ICMP echo request, id 16331, seq 2, length 64
21:39:42.540840 IP 8.8.8.8 > 10.12.2.73: ICMP echo reply, id 16331, seq 2, length 64

 

The output of the tcpdump command is pretty straightforward. It shows a continuous series of ICMP echo requests going out from our server (10.12.2.73), and the corresponding replies coming back from Google (8.8.8.8).

 

As 8.8.8.8 is not in the same network, my server cannot reach it directly without a gateway. So my ping requests to 8.8.8.8 have to flow via my gateway. The gateway address can be found using the command route -n

 

root@ip-10-12-2-73:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.12.2.1       0.0.0.0         UG    0      0        0 eth0
10.12.2.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0

 

Our gateway here is 10.12.2.1. This is indicated by the very first entry in the above output: to reach any destination that has no more specific route (indicated by 0.0.0.0), packets must flow via the gateway address 10.12.2.1.
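The routing decision described above can be sketched in a few lines of Python. This is only an illustration, not how the kernel implements it: the ROUTES list is copied by hand from the route -n output, and the function name next_hop is made up for this example.

```python
import ipaddress

# Routes copied from the `route -n` output above: (destination network, gateway).
# A gateway of 0.0.0.0 means the network is directly connected (on-link).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("10.12.2.1")),
    (ipaddress.ip_network("10.12.2.0/24"), ipaddress.ip_address("0.0.0.0")),
]


def next_hop(destination: str) -> str:
    """Return the IP address that the packet is handed to on the local network."""
    dest = ipaddress.ip_address(destination)
    # Longest-prefix match: among all matching routes, the most specific wins.
    matching = [(net, gw) for net, gw in ROUTES if dest in net]
    _net, gateway = max(matching, key=lambda route: route[0].prefixlen)
    # On-link destinations are reached directly; everything else via the gateway.
    if gateway == ipaddress.ip_address("0.0.0.0"):
        return str(dest)
    return str(gateway)


print(next_hop("8.8.8.8"))     # 10.12.2.1
print(next_hop("10.12.2.40"))  # 10.12.2.40
```

A destination inside 10.12.2.0/24 is its own next hop, while 8.8.8.8 only matches the default route and is therefore handed to 10.12.2.1.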

 

So to reach 8.8.8.8, we need to go via 10.12.2.1 (as it is our gateway). But why is the tcpdump output not showing any trace of 10.12.2.1 (the gateway)?

 

Tcpdump shows that the source address is 10.12.2.73 and the destination is 8.8.8.8. As 8.8.8.8 is not part of our local network, the packets have to go via our gateway address of 10.12.2.1. So somewhere the destination address should be 10.12.2.1, right? Otherwise, how would our packets reach the gateway?

Our ping is working perfectly, so it is surely using the gateway to reach 8.8.8.8 (as there is no other way out). But where the hell is the gateway address in the packet? The packet shows a destination address of 8.8.8.8, so how is it reaching the gateway?

 

This is exactly where physical addresses (MAC addresses) step in.

 

While the ping to 8.8.8.8 is still running, let's execute tcpdump in another session once again, this time with the additional option -e, which prints the link-level (Ethernet) header of each packet.

 

root@ip-10-12-2-73:~# tcpdump -e -n host 8.8.8.8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
21:47:56.820194 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1, ethertype IPv4 (0x0800), length 98: 10.12.2.73 > 8.8.8.8: ICMP echo request, id 16347, seq 1, length 64
21:47:56.829102 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed, ethertype IPv4 (0x0800), length 98: 8.8.8.8 > 10.12.2.73: ICMP echo reply, id 16347, seq 1, length 64
21:47:57.821516 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1, ethertype IPv4 (0x0800), length 98: 10.12.2.73 > 8.8.8.8: ICMP echo request, id 16347, seq 2, length 64
21:47:57.830386 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed, ethertype IPv4 (0x0800), length 98: 8.8.8.8 > 10.12.2.73: ICMP echo reply, id 16347, seq 2, length 64

 

This time, along with the IP addresses, we can see physical addresses (MAC addresses) in the output as well. They are indicated by 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1 for the requests and 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed for the replies.
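If you want to pull the MAC and IP addresses out of lines like these in a script, a small parser is enough. The following is a minimal sketch that assumes the exact line layout shown in the output above; other tcpdump versions, options or protocols may print lines differently, and the names LINE_RE and parse_line are made up for this example.

```python
import re

# Matches the IPv4 lines printed by `tcpdump -e -n` in the output above.
# The layout is an assumption taken from that output, not a tcpdump guarantee.
LINE_RE = re.compile(
    r"^(?P<time>\S+) "
    r"(?P<src_mac>[0-9a-f:]{17}) > (?P<dst_mac>[0-9a-f:]{17}), "
    r"ethertype IPv4 \(0x0800\), length \d+: "
    r"(?P<src_ip>[\d.]+) > (?P<dst_ip>[\d.]+): "
)


def parse_line(line: str):
    """Return a dict with the layer 2 and layer 3 addresses, or None if no match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


sample = (
    "21:47:56.820194 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1, "
    "ethertype IPv4 (0x0800), length 98: 10.12.2.73 > 8.8.8.8: "
    "ICMP echo request, id 16347, seq 1, length 64"
)
fields = parse_line(sample)
print(fields["dst_mac"], fields["dst_ip"])
```

For the sample line, the destination MAC address and the destination IP address come out as two separate values, which is exactly the mismatch the rest of this article explains.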

 

root@ip-10-12-2-73:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 12:6e:eb:de:b3:ed  
          inet addr:10.12.2.73  Bcast:10.12.2.255  Mask:255.255.255.0
          inet6 addr: fe80::106e:ebff:fede:b3ed/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:1200693 errors:0 dropped:0 overruns:0 frame:0
          TX packets:945763 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2452613050 (2.4 GB)  TX bytes:447161879 (447.1 MB)

 

From the above ifconfig output, we can confirm that 12:6e:eb:de:b3:ed is our server's MAC address (indicated by HWaddr 12:6e:eb:de:b3:ed).

 

But what is 12:6f:56:c0:c4:c1? We can use the command arp -n -a to find out. ARP stands for Address Resolution Protocol. It does the job of translating IP addresses to MAC addresses. So arp -n -a shows all the MAC addresses, and their corresponding IP addresses, that our server is aware of. We will look at ARP and how it works shortly.

 

root@ip-10-12-2-73:~# arp -n -a
? (10.12.2.40) at 12:f7:fd:48:aa:79 [ether] on eth0
? (172.17.0.2) at 02:42:ac:11:00:02 [ether] on docker0
? (10.12.2.43) at 12:48:08:aa:a5:bb [ether] on eth0
? (10.12.2.8) at 12:ab:ed:67:34:79 [ether] on eth0
? (10.12.2.94) at 12:47:87:c2:60:8d [ether] on eth0
? (10.12.2.1) at 12:6f:56:c0:c4:c1 [ether] on eth0

 

Voila! 12:6f:56:c0:c4:c1 is the MAC address of our gateway (10.12.2.1). So even though the destination IP address is 8.8.8.8, the destination MAC address is that of the gateway, as it is for every destination outside our local network.

 

MAC addresses (physical addresses) are part of layer 2. IP addresses are part of layer 3. The layer 3 packet is encapsulated inside a layer 2 frame. The frame carries the source MAC address of our server and the destination MAC address of the gateway. This is how the packet reaches the gateway. The gateway peels off the layer 2 header and, as soon as it finds that the destination IP address is 8.8.8.8, forwards the packet to its own gateway (that is, our gateway forwards the packet to the next gateway, depending on its routes).
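The encapsulation can be made concrete by building the two headers by hand. This is a simplified sketch for illustration only: the helper names are made up, the IPv4 checksum is left at zero, and no ICMP payload is attached. The addresses are the ones seen in the captures above.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def ethernet_header(dst_mac: str, src_mac: str, ethertype: int = 0x0800) -> bytes:
    # Layer 2: destination MAC, source MAC, EtherType (0x0800 = IPv4).
    return mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + struct.pack("!H", ethertype)


def ipv4_header(src_ip: str, dst_ip: str, payload_len: int) -> bytes:
    # Layer 3: a minimal 20-byte IPv4 header (protocol 1 = ICMP).
    # The checksum field is left at 0 to keep the sketch short.
    version_ihl = (4 << 4) | 5
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, 20 + payload_len, 0, 0, 64, 1, 0,
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
    )


# The frame is addressed to the gateway's MAC, the packet inside it to 8.8.8.8.
frame = (
    ethernet_header(dst_mac="12:6f:56:c0:c4:c1", src_mac="12:6e:eb:de:b3:ed")
    + ipv4_header("10.12.2.73", "8.8.8.8", payload_len=64)
)
print("layer 2 destination:", frame[0:6].hex(":"))
print("layer 3 destination:", socket.inet_ntoa(frame[30:34]))
```

The gateway only needs to rewrite the first 14 bytes (the Ethernet header) when it forwards the packet; the IP destination inside stays 8.8.8.8 all the way.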

 

This is how the packet travels and reaches its final destination of 8.8.8.8. The last router in the path, the one that is directly connected to the network of 8.8.8.8, finds the MAC address of 8.8.8.8 using ARP.

 

The bottom line is this: to reach a particular destination IP address, the system has to translate the IP address of the next hop (the destination itself if it is on the local network, otherwise the gateway) into the equivalent MAC address, because the real communication happens using physical addresses. ARP (Address Resolution Protocol) is used to find the physical address associated with an IP address.

 

ARP Request and Response Pattern

 

The diagram above shows how a computer finds the MAC address associated with an IP address using Address Resolution Protocol. The very first request in the diagram depicts an "ARP request from 10.12.2.73" to find out the MAC address of 10.12.2.1.

This ARP request is a broadcast request. That is why the destination MAC address of the frame carrying it is set to ff:ff:ff:ff:ff:ff (the broadcast MAC address). The target MAC address field inside the ARP request itself is left as 00:00:00:00:00:00, because it is not yet known. When the network device to which all the computers in this network are connected receives a frame with the destination address ff:ff:ff:ff:ff:ff, it forwards that frame to all the computers in the network (that is what broadcast means: send it to everybody connected).

Although every computer in the network receives the request, only the computer that has the IP address 10.12.2.1 responds. Everybody else discards the request after checking the target IP address in it.

In its response, it sends its own MAC address. This is how 10.12.2.73 finds out the MAC address associated with 10.12.2.1. The following ARP terms are worth noting:
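To make the request concrete, here is a sketch that builds the raw bytes of such an ARP request. On Ethernet, the frame is sent to the broadcast address ff:ff:ff:ff:ff:ff, while the target hardware address field inside the ARP message is zeroed because it is still unknown. The function name build_arp_request is made up for this example, and the code only constructs the bytes; it does not send anything.

```python
import socket
import struct

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    # Ethernet header: broadcast destination, our MAC as source, EtherType ARP.
    ethernet = (
        mac_to_bytes(BROADCAST_MAC)
        + mac_to_bytes(sender_mac)
        + struct.pack("!H", ETHERTYPE_ARP)
    )
    # ARP message: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # address lengths 6 and 4, operation 1 (request), then the four addresses.
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1, 0x0800, 6, 4, 1,
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        bytes(6),  # target MAC is unknown, so it is all zeros
        socket.inet_aton(target_ip),
    )
    return ethernet + arp


frame = build_arp_request("12:6e:eb:de:b3:ed", "10.12.2.73", "10.12.2.1")
print(len(frame), frame.hex())
```

Actually putting this frame on the wire would need a raw socket and root privileges, which is outside the scope of this sketch.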

 

  1. ARP Cache: After finding the MAC address associated with an IP address, the computer stores it in a table for future reference. All subsequent communication with that IP address can use the MAC address from this table. This table is also called the ARP cache.
  2. ARP Cache Timeout: The entries added to the ARP table stay valid only for a specified amount of time. The ARP cache timeout is that amount of time.
  3. ARP Request: We already saw this above. It is the broadcast request sent by a computer to find out the MAC address associated with an IP address.
  4. ARP Response: As shown in the above diagram, this is the response from the host that owns the requested IP address, containing both its IP and MAC address.
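The idea of a cache with a timeout can be sketched as follows. This is a toy model for illustration only; the Linux kernel keeps a more elaborate per-entry state, and the class name ArpCache, its methods and the 60 second default are all assumptions made for this example.

```python
import time


class ArpCache:
    """Toy IP-to-MAC table whose entries expire after a fixed timeout."""

    def __init__(self, timeout_seconds: float = 60.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock      # injectable so the expiry can be demonstrated
        self._entries = {}       # ip -> (mac, time the entry was stored)

    def store(self, ip: str, mac: str) -> None:
        """Remember the MAC address learned from an ARP response."""
        self._entries[ip] = (mac, self._clock())

    def lookup(self, ip: str):
        """Return the cached MAC, or None if missing or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, stored_at = entry
        if self._clock() - stored_at > self._timeout:
            # Entry is too old: drop it, so a new ARP request would be needed.
            del self._entries[ip]
            return None
        return mac


# Demonstration with a fake clock, so no real waiting is needed.
now = [0.0]
cache = ArpCache(timeout_seconds=60.0, clock=lambda: now[0])
cache.store("10.12.2.1", "12:6f:56:c0:c4:c1")
fresh = cache.lookup("10.12.2.1")
now[0] = 61.0
expired = cache.lookup("10.12.2.1")
print(fresh, expired)
```

A lookup that returns None is the point at which a real system would send a new ARP request.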

 

 

How to View the ARP Cache Table in Linux?

Almost all Linux distributions come with a command line utility called arp. You can use it to view the ARP table entries, as shown below.

root@ip-10-12-2-73:~# arp -n -a
? (10.12.2.40) at 12:f7:fd:48:aa:79 [ether] on eth0
? (10.12.2.43) at 12:48:08:aa:a5:bb [ether] on eth0
? (10.12.2.8) at 12:ab:ed:67:34:79 [ether] on eth0
? (10.12.2.94) at 12:47:87:c2:60:8d [ether] on eth0
? (10.12.2.1) at 12:6f:56:c0:c4:c1 [ether] on eth0

 

 

You can also use the command below to view the ARP table in a tabular layout.

root@ip-10-12-2-73:~# arp -n -e
Address                  HWtype  HWaddress           Flags Mask            Iface
10.12.2.40               ether   12:f7:fd:48:aa:79   C                     eth0
10.12.2.43               ether   12:48:08:aa:a5:bb   C                     eth0
10.12.2.8                ether   12:ab:ed:67:34:79   C                     eth0
10.12.2.94               ether   12:47:87:c2:60:8d   C                     eth0
10.12.2.1                ether   12:6f:56:c0:c4:c1   C                     eth0

 

How to manually add an entry in ARP cache Table in Linux?

The arp command has the -s option for doing this in Linux. See below.

 

root@ip-10-12-2-73:~# arp  -s  10.12.67.43  12:48:08:bb:a5:bb

 

Make sure the IP address is a valid one in the above command.

 

How to manually delete an entry from ARP Cache Table in Linux?

You can delete an entry from the ARP cache table using the -d option of the arp command in Linux, as shown below.

 

root@ip-10-12-2-73:~# arp  -d  10.12.67.43

 

How to remove all entries from ARP Cache table in Linux?

The standard ip command can be used for this purpose.

 

root@ip-10-12-2-73:~# ip neigh flush all

 

Where are ARP entries stored in the system?


They are kept in memory by the kernel, so you can read them through the /proc file system. See below.

 

root@ip-10-12-2-73:~# cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
10.12.2.40       0x1         0x2         12:f7:fd:48:aa:79     *        eth0
10.12.2.43       0x1         0x2         12:48:08:aa:a5:bb     *        eth0
10.12.2.8        0x1         0x2         12:ab:ed:67:34:79     *        eth0
10.12.2.94       0x1         0x2         12:47:87:c2:60:8d     *        eth0
10.12.2.1        0x1         0x2         12:6f:56:c0:c4:c1     *        eth0
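Because /proc/net/arp is plain text, it is easy to read from a script. The sketch below parses text in the layout shown above; the function name parse_proc_net_arp is made up for this example, and the sample text is copied from the output above so that the code does not depend on the machine it runs on.

```python
def parse_proc_net_arp(text: str) -> list:
    """Parse the contents of /proc/net/arp into a list of dicts."""
    entries = []
    # The first line is the column header, so it is skipped.
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 6:
            continue  # ignore blank or unexpected lines
        ip, hw_type, flags, mac, mask, device = fields
        entries.append({
            "ip": ip,
            "hw_type": int(hw_type, 16),  # 0x1 means Ethernet
            "flags": int(flags, 16),      # 0x2 means a completed entry
            "mac": mac,
            "mask": mask,
            "device": device,
        })
    return entries


SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
10.12.2.40       0x1         0x2         12:f7:fd:48:aa:79     *        eth0
10.12.2.1        0x1         0x2         12:6f:56:c0:c4:c1     *        eth0
"""

for entry in parse_proc_net_arp(SAMPLE):
    print(entry["ip"], "->", entry["mac"], "on", entry["device"])
```

On a Linux machine, the same function can be fed the real table by reading the file first, for example with open("/proc/net/arp").read().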

 

How to add multiple entries to arp cache table at once?

The arp utility can also read entries from a file. You pass the file path to the -f option of the arp command. By default, the /etc/ethers file is used. The file content should look like this:

 

12:f7:fd:48:aa:79 10.12.2.40
12:48:08:aa:a5:bb 10.12.2.43

 

Now you can give the path of the file as an argument.

 

root@ip-10-12-2-73:~# arp -f /etc/ethers
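Before loading a long file, it can be worth checking that every line really has the MAC-then-IP layout shown above. The following is a minimal validation sketch, not part of the arp utility: the function name parse_ethers is made up, and treating # as the start of a comment is an assumption of this example.

```python
import ipaddress
import re

MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def parse_ethers(text: str) -> list:
    """Return (mac, ip) pairs, raising ValueError on the first malformed line."""
    pairs = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Assumption: anything after '#' is a comment; blank lines are skipped.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2 or not MAC_RE.match(fields[0]):
            raise ValueError(f"line {number}: expected '<mac> <ip>'")
        ipaddress.ip_address(fields[1])  # raises ValueError for a bad IP
        pairs.append((fields[0], fields[1]))
    return pairs


sample = "12:f7:fd:48:aa:79 10.12.2.40\n12:48:08:aa:a5:bb 10.12.2.43\n"
print(parse_ethers(sample))
```

If the function returns without raising, every line has a well-formed MAC address followed by a valid IP address.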

 

The Linux kernel's ARP implementation supports a lot of fine-tuning options, most of which are modified through files located inside /proc. You can find the ARP-related kernel files inside /proc/sys/net/ipv4/neigh/default.

 

root@ip-10-12-2-73:/proc/sys/net/ipv4/neigh/default# ls
anycast_delay           gc_interval    locktime       retrans_time_ms
app_solicit             gc_stale_time  mcast_solicit  ucast_solicit
base_reachable_time     gc_thresh1     proxy_delay    unres_qlen
base_reachable_time_ms  gc_thresh2     proxy_qlen     unres_qlen_bytes
delay_first_probe_time  gc_thresh3     retrans_time
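To see the current value of every setting at once, the files in that directory can simply be read one by one. This is a small sketch; the function name read_neigh_settings is made up for this example, and the directory is a parameter so the function can be pointed at any directory of small text files.

```python
from pathlib import Path


def read_neigh_settings(directory: str = "/proc/sys/net/ipv4/neigh/default") -> dict:
    """Return {file name: current value} for every readable file in the directory."""
    settings = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        try:
            settings[path.name] = path.read_text().strip()
        except OSError:
            # Some entries may not be readable by the current user; skip them.
            continue
    return settings
```

On a Linux machine, calling read_neigh_settings() with no arguments returns a dictionary keyed by the file names listed above, such as gc_stale_time and base_reachable_time_ms.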

 

 

You can find explanations of these files and their entries here: ARP Kernel Module Options
