How Does ARP (Address Resolution Protocol) Work?
Two machines on a network can communicate with each other only if they know each other’s physical address. Although computer programs use IP addresses to send and receive messages, the actual underlying communication always happens over physical addresses.
Let’s first understand how communication happens over the wire. We will ping Google's publicly available DNS server from a machine, capture the network packets, and see what the source and destination addresses are.
tcpdump is a tool for capturing network packets and inspecting their details. If you are new to tcpdump, I would recommend getting familiar with its basics first.
root@ip-10-12-2-73:~# ping 22.214.171.124
PING 126.96.36.199 (188.8.131.52) 56(84) bytes of data.
64 bytes from 184.108.40.206: icmp_seq=1 ttl=39 time=9.16 ms
64 bytes from 220.127.116.11: icmp_seq=2 ttl=39 time=9.28 ms
64 bytes from 18.104.22.168: icmp_seq=3 ttl=39 time=9.31 ms
64 bytes from 22.214.171.124: icmp_seq=4 ttl=39 time=9.32 ms
While the above ping is running, let's capture network packets from another shell session on the same server. The command I am going to use is tcpdump -n host followed by the address we are pinging.
The host filter captures only packets whose source or destination is that address, and the -n option makes tcpdump print IP addresses rather than DNS names.
root@ip-10-12-2-73:~# tcpdump -n host 220.127.116.11
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
21:39:41.531390 IP 10.12.2.73 > 18.104.22.168: ICMP echo request, id 16331, seq 1, length 64
21:39:41.540342 IP 22.214.171.124 > 10.12.2.73: ICMP echo reply, id 16331, seq 1, length 64
21:39:42.531815 IP 10.12.2.73 > 126.96.36.199: ICMP echo request, id 16331, seq 2, length 64
21:39:42.540840 IP 188.8.131.52 > 10.12.2.73: ICMP echo reply, id 16331, seq 2, length 64
The output of the tcpdump command is pretty straightforward. It shows a continuous series of ICMP echo requests going out from our server (10.12.2.73) and the corresponding replies coming back from Google.
As Google's DNS server is not in the same network, my server cannot reach it directly without a gateway, so my ping requests must flow via my gateway. The gateway address can be found using the command route -n.
root@ip-10-12-2-73:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.12.2.1       0.0.0.0         UG    0      0        0 eth0
10.12.2.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
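Our gateway here is 10.12.2.1, as the very first line of the above output shows. That line is the default route: to reach any destination that no more specific route covers (indicated by 0.0.0.0), packets must flow via the gateway address 10.12.2.1. The second line says that the 10.12.2.0/24 network is directly attached to eth0 and needs no gateway.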
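The decision the kernel makes from this routing table can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name next_hop and the ROUTES list are made up for this example, and 8.8.8.8 is used purely as a sample address outside the local network.

```python
import ipaddress

# Routing table from the `route -n` output: (destination network, gateway).
# A gateway of None means the network is directly attached (no router needed).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("10.12.2.1")),
    (ipaddress.ip_network("10.12.2.0/24"), None),
]


def next_hop(destination):
    """Return the IP address whose MAC address the frame must be sent to."""
    dest = ipaddress.ip_address(destination)
    # Longest-prefix match: among routes containing dest, pick the most specific.
    matching = [(net, gw) for net, gw in ROUTES if dest in net]
    _, gateway = max(matching, key=lambda route: route[0].prefixlen)
    # On-link destinations are reached directly; everything else uses the gateway.
    return dest if gateway is None else gateway


print(next_hop("10.12.2.40"))  # a neighbour on the local subnet
print(next_hop("8.8.8.8"))     # sample off-link address (assumption)
```

A neighbour on 10.12.2.0/24 is its own next hop, while anything else resolves to 10.12.2.1.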
So to reach Google's DNS server, we need to go via 10.12.2.1, as it is our gateway. But why does the tcpdump output show no trace of 10.12.2.1?
tcpdump shows that the source address is 10.12.2.73 and the destination is Google's address. As that address is not part of our local network, the packets have to go via our gateway, 10.12.2.1. So somewhere the destination address should be 10.12.2.1, right? Otherwise, how would our packets reach the gateway?
Our ping is working perfectly, so it is surely using the gateway to reach Google (there is no other way out). But where is the gateway address in the packet? The packet shows Google's address as the destination, so how is it reaching the gateway?
This is exactly where physical addresses (MAC addresses) step in.
While the ping is still running, let's execute tcpdump in another session once again, this time with the additional option -e, which prints the link-level (Ethernet) header of each packet.
root@ip-10-12-2-73:~# tcpdump -e -n host 22.214.171.124
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
21:47:56.820194 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1, ethertype IPv4 (0x0800), length 98: 10.12.2.73 > 126.96.36.199: ICMP echo request, id 16347, seq 1, length 64
21:47:56.829102 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed, ethertype IPv4 (0x0800), length 98: 188.8.131.52 > 10.12.2.73: ICMP echo reply, id 16347, seq 1, length 64
21:47:57.821516 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1, ethertype IPv4 (0x0800), length 98: 10.12.2.73 > 184.108.40.206: ICMP echo request, id 16347, seq 2, length 64
21:47:57.830386 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed, ethertype IPv4 (0x0800), length 98: 220.127.116.11 > 10.12.2.73: ICMP echo reply, id 16347, seq 2, length 64
This time, along with the IP addresses, we can see physical addresses (MAC addresses) in the output, indicated by 12:6e:eb:de:b3:ed > 12:6f:56:c0:c4:c1 and 12:6f:56:c0:c4:c1 > 12:6e:eb:de:b3:ed. Let's check which of them belongs to our server using ifconfig.
root@ip-10-12-2-73:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 12:6e:eb:de:b3:ed
          inet addr:10.12.2.73  Bcast:10.12.2.255  Mask:255.255.255.0
          inet6 addr: fe80::106e:ebff:fede:b3ed/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:1200693 errors:0 dropped:0 overruns:0 frame:0
          TX packets:945763 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2452613050 (2.4 GB)  TX bytes:447161879 (447.1 MB)
From the above ifconfig output, we can confirm that 12:6e:eb:de:b3:ed is our server's MAC address (indicated by HWaddr 12:6e:eb:de:b3:ed).
But what is 12:6f:56:c0:c4:c1? We can use the command arp -n -a to find out. ARP stands for Address Resolution Protocol. It does the job of translating IP addresses to MAC addresses, so arp -n -a shows all the MAC addresses, and their equivalent IP addresses, that our server is aware of. We will jump into ARP and how it works shortly.
root@ip-10-12-2-73:~# arp -n -a
? (10.12.2.40) at 12:f7:fd:48:aa:79 [ether] on eth0
? (172.17.0.2) at 02:42:ac:11:00:02 [ether] on docker0
? (10.12.2.43) at 12:48:08:aa:a5:bb [ether] on eth0
? (10.12.2.8) at 12:ab:ed:67:34:79 [ether] on eth0
? (10.12.2.94) at 12:47:87:c2:60:8d [ether] on eth0
? (10.12.2.1) at 12:6f:56:c0:c4:c1 [ether] on eth0
Voila! 12:6f:56:c0:c4:c1 is the MAC address of our gateway (10.12.2.1). So even though the destination IP address is Google's, the destination MAC address is that of the gateway, as it will be for every destination outside our local network.
MAC addresses (physical addresses) are part of layer 2, and IP addresses are part of layer 3. The content of layer 3 is encapsulated inside layer 2. The layer 2 header carries the source MAC address of our server and the destination MAC address of the gateway. This is how the packet reaches the gateway. The gateway peels off the layer 2 header and, on finding that the destination IP address is not its own, forwards the packet to its own gateway inside a new layer 2 frame (i.e. our gateway forwards the packet to the next gateway, depending on its routes).
This is how the packet travels and reaches its final destination. The last network device on the path, the one directly connected to the destination's network, finds the MAC address of the destination using ARP.
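To make the encapsulation concrete, here is a rough Python sketch that builds the first 34 bytes of such a frame and reads the addresses back out. The helper names (build_frame, describe) are invented for this example, the IP checksum is left at zero for brevity, and 8.8.8.8 is only a sample remote address.

```python
import socket
import struct


def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def build_frame(src_mac, dst_mac, src_ip, dst_ip, payload=b""):
    # Layer 2: destination MAC, source MAC, EtherType 0x0800 (IPv4).
    ethernet = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + struct.pack("!H", 0x0800)
    # Layer 3: a minimal 20-byte IPv4 header (checksum left at 0 for brevity).
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload),  # version/IHL, TOS, total length
        0, 0,                        # identification, flags/fragment offset
        64, 1, 0,                    # TTL, protocol (1 = ICMP), checksum
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
    )
    return ethernet + ip_header + payload


def describe(frame):
    # The Ethernet header is 14 bytes; the IPv4 addresses sit at offsets 26 and 30.
    return {
        "dst_mac": bytes_to_mac(frame[0:6]),
        "src_mac": bytes_to_mac(frame[6:12]),
        "src_ip": socket.inet_ntoa(frame[26:30]),
        "dst_ip": socket.inet_ntoa(frame[30:34]),
    }


frame = build_frame(
    src_mac="12:6e:eb:de:b3:ed",  # our server
    dst_mac="12:6f:56:c0:c4:c1",  # the gateway, not the final destination
    src_ip="10.12.2.73",
    dst_ip="8.8.8.8",             # sample remote address (assumption)
)
info = describe(frame)
print(info)
```

The layer 3 destination stays the remote address for the whole journey, while the layer 2 destination only ever names the next hop.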
The bottom line is: if you want to reach a particular destination IP address, the system must translate an IP address into the equivalent MAC address, because the real communication happens using physical addresses. For a destination on the local network that is the destination's own address; for anything else it is the gateway's. ARP (Address Resolution Protocol) is used to find the physical address associated with an IP address.
The above diagram explains how a computer finds out the MAC address associated with an IP address using the Address Resolution Protocol. The very first request in the diagram depicts an "ARP request from 10.12.2.73" to find out the MAC address of 10.12.2.1.
This ARP request is a broadcast request, which is why the destination MAC address in this request is set to ff:ff:ff:ff:ff:ff (the broadcast MAC address). The target MAC address field inside the ARP message itself is left as all zeros, since that is the value being asked for. When the network device to which all the computers in this network are connected receives a request with the destination address ff:ff:ff:ff:ff:ff, it forwards that request to all the computers in that network (that is what broadcast means: send it to everybody connected).
Although every computer in the network receives that request, only the computer that has the IP address 10.12.2.1 responds. Everybody else discards the request after seeing that the target IP address is not theirs.
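The request and the "who should answer" decision can be sketched in Python as follows. This is a minimal illustration, not how the kernel does it: the names build_arp_request and should_reply are made up for this example, and the layout assumes plain Ethernet and IPv4 (EtherType 0x0806, Ethernet broadcast address ff:ff:ff:ff:ff:ff as the frame destination).

```python
import socket
import struct

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac, sender_ip, target_ip):
    # Ethernet header: broadcast destination, our MAC as source, EtherType 0x0806 (ARP).
    ethernet = mac_to_bytes(BROADCAST_MAC) + mac_to_bytes(sender_mac) + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # hardware / protocol address lengths
        1,            # operation: 1 = request (2 = reply)
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        b"\x00" * 6,  # target MAC is unknown, so it is left as zeros
        socket.inet_aton(target_ip),
    )
    return ethernet + arp


def should_reply(frame, my_ip):
    # Every host receives the broadcast; only the owner of the target IP answers.
    # The target IP is the last 4 bytes of the 28-byte ARP payload (offset 38).
    return socket.inet_ntoa(frame[38:42]) == my_ip


request = build_arp_request("12:6e:eb:de:b3:ed", "10.12.2.73", "10.12.2.1")
hosts = ["10.12.2.40", "10.12.2.43", "10.12.2.1"]
responders = [ip for ip in hosts if should_reply(request, ip)]
print(responders)
```

Of the three hosts that see the broadcast, only 10.12.2.1 decides to reply.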
While responding, that computer sends its own MAC address. This way 10.12.2.73 finds out the MAC address associated with 10.12.2.1. The following ARP terms are worth noting:
- ARP Cache: After finding the MAC address associated with an IP address, the computer stores it in a table for future reference. All subsequent communication with that IP address can use the MAC address from this table. This table is called the ARP cache.
- ARP Cache Timeout: The entries added to the ARP table remain valid only for a specified amount of time. The ARP cache timeout is that duration; once it passes, the entry must be resolved again.
- ARP Request: We already saw this above. It is the broadcast request sent by a computer to find out the MAC address associated with an IP address.
- ARP Response: As shown in the above diagram, this is the response from the destination host, containing both its IP and MAC addresses.
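The cache and its timeout can be pictured with a small Python class. Treat it as a simplified sketch: the ArpCache class and the 60-second timeout are assumptions made for this example, and real kernels track entries with a more elaborate set of states than a single timer.

```python
import time


class ArpCache:
    """A toy IP-to-MAC table whose entries expire after a fixed timeout."""

    def __init__(self, timeout_seconds=60.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock    # injectable so the expiry can be demonstrated
        self._entries = {}     # ip -> (mac, time the entry was learned)

    def add(self, ip, mac):
        self._entries[ip] = (mac, self._clock())

    def lookup(self, ip):
        entry = self._entries.get(ip)
        if entry is None:
            return None        # never learned: an ARP request is needed
        mac, learned_at = entry
        if self._clock() - learned_at > self._timeout:
            del self._entries[ip]
            return None        # expired: the address must be resolved again
        return mac


# Drive the cache with a fake clock so no real waiting is needed.
now = [0.0]
cache = ArpCache(timeout_seconds=60.0, clock=lambda: now[0])
cache.add("10.12.2.1", "12:6f:56:c0:c4:c1")
fresh = cache.lookup("10.12.2.1")
now[0] = 61.0
expired = cache.lookup("10.12.2.1")
print(fresh, expired)
```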
How to View the ARP Cache Table in Linux?
Almost all Linux distributions come with a command line utility called arp. You can use it to view the ARP table entries, as shown below.
root@ip-10-12-2-73:~# arp -n -a
? (10.12.2.40) at 12:f7:fd:48:aa:79 [ether] on eth0
? (10.12.2.43) at 12:48:08:aa:a5:bb [ether] on eth0
? (10.12.2.8) at 12:ab:ed:67:34:79 [ether] on eth0
? (10.12.2.94) at 12:47:87:c2:60:8d [ether] on eth0
? (10.12.2.1) at 12:6f:56:c0:c4:c1 [ether] on eth0
You can also use the below command to view arp table.
root@ip-10-12-2-73:~# arp -n -e
Address          HWtype  HWaddress           Flags Mask    Iface
10.12.2.40       ether   12:f7:fd:48:aa:79   C             eth0
10.12.2.43       ether   12:48:08:aa:a5:bb   C             eth0
10.12.2.8        ether   12:ab:ed:67:34:79   C             eth0
10.12.2.94       ether   12:47:87:c2:60:8d   C             eth0
10.12.2.1        ether   12:6f:56:c0:c4:c1   C             eth0
How to manually add an entry in ARP cache Table in Linux?
The arp command has the -s option for doing this in Linux. See below.
root@ip-10-12-2-73:~# arp -s 10.12.67.43 12:48:08:bb:a5:bb
Make sure the IP address is a valid one in the above command.
How to manually delete an entry from ARP Cache Table in Linux?
You can delete an entry from the ARP cache table using the -d option of the arp command in Linux, as shown below.
root@ip-10-12-2-73:~# arp -d 10.12.67.43
How to remove all entries from ARP Cache table in Linux?
The standard ip command can be used for this purpose.
root@ip-10-12-2-73:~# ip neigh flush all
Where are ARP entries stored in the system?
They are held in kernel memory, and you can read them through the /proc file system. See below.
root@ip-10-12-2-73:~# cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
10.12.2.40       0x1         0x2         12:f7:fd:48:aa:79     *        eth0
10.12.2.43       0x1         0x2         12:48:08:aa:a5:bb     *        eth0
10.12.2.8        0x1         0x2         12:ab:ed:67:34:79     *        eth0
10.12.2.94       0x1         0x2         12:47:87:c2:60:8d     *        eth0
10.12.2.1        0x1         0x2         12:6f:56:c0:c4:c1     *        eth0
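Because /proc/net/arp is plain text, it is easy to read from a program. Below is a small Python sketch of that; the function name parse_proc_net_arp is made up for this example, and it assumes the six-column layout shown above, with flag bit 0x2 marking a completed entry.

```python
def parse_proc_net_arp(text):
    """Turn the contents of /proc/net/arp into a list of dictionaries."""
    entries = []
    for line in text.strip().splitlines()[1:]:  # the first line is the header
        fields = line.split()
        if len(fields) != 6:
            continue  # skip anything that does not look like an entry
        ip, _hw_type, flags, mac, _mask, device = fields
        entries.append({
            "ip": ip,
            "mac": mac,
            "device": device,
            "complete": bool(int(flags, 16) & 0x2),
        })
    return entries


SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
10.12.2.40       0x1         0x2         12:f7:fd:48:aa:79     *        eth0
10.12.2.1        0x1         0x2         12:6f:56:c0:c4:c1     *        eth0
"""

entries = parse_proc_net_arp(SAMPLE)
for entry in entries:
    print(entry["ip"], entry["mac"], entry["device"])
```

On a real Linux machine you could pass it open("/proc/net/arp").read() instead of the sample text.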
How to add multiple entries to arp cache table at once?
The arp utility supports reading entries from a file. You can pass the file path as an argument to the -f option of the arp command; if no path is given, /etc/ethers is used. The file content should look like the below:
12:f7:fd:48:aa:79 10.12.2.40
12:48:08:aa:a5:bb 10.12.2.43
Now you can give the path of the file as an argument.
root@ip-10-12-2-73:~# arp -f /etc/ethers
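Loading such a file is roughly the same as running arp -s once per line. The Python sketch below spells that out by turning the file content into the equivalent commands. The function name ethers_to_arp_commands is invented for this example, and skipping blank lines and lines starting with # is an assumption about how the file is read.

```python
def ethers_to_arp_commands(text):
    """Translate 'MAC address' lines into equivalent `arp -s` commands."""
    commands = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and comments are ignored
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed line: needs both a MAC and an address
        mac, address = fields[0], fields[1]
        commands.append(f"arp -s {address} {mac}")
    return commands


SAMPLE = """\
12:f7:fd:48:aa:79 10.12.2.40
12:48:08:aa:a5:bb 10.12.2.43
"""

commands = ethers_to_arp_commands(SAMPLE)
for command in commands:
    print(command)
```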
The Linux kernel's ARP implementation supports lots of fine-tuning options, most of which are modified through files located inside /proc. You can find the ARP-related kernel files inside /proc/sys/net/ipv4/neigh/default.
root@ip-10-12-2-73:/proc/sys/net/ipv4/neigh/default# ls
anycast_delay           gc_interval    locktime       retrans_time_ms
app_solicit             gc_stale_time  mcast_solicit  ucast_solicit
base_reachable_time     gc_thresh1     proxy_delay    unres_qlen
base_reachable_time_ms  gc_thresh2     proxy_qlen     unres_qlen_bytes
delay_first_probe_time  gc_thresh3     retrans_time
You can find explanations of these files and their entries here: ARP Kernel Module Options