How Does UDP Work?
The IETF (Internet Engineering Task Force) is a community of engineers and computer scientists who develop Internet technologies, standards and specifications. An RFC is a document published by the IETF.
It is generally written formally, for peer review. Most RFCs describe a protocol or method in full detail and serve as the standard document that engineers implement against, give feedback on, or use to propose new protocol ideas (hence the name “Request for Comments”). Why am I discussing the IETF and RFCs here?
Because RFC 768 is probably one of the shortest RFCs around: it is only a few pages long. RFCs that define protocols are generally very detailed and often run to dozens or even hundreds of pages, but this one is short and compact.
This is because the protocol it describes, UDP (User Datagram Protocol), is simple and straightforward, unlike its counterpart TCP, which is more complex because of its many reliability mechanisms.
The TCP/IP reference model, as used here, consists of five layers, each doing a different job to make network communication possible. The top layer is the application layer, which lets software use protocols such as HTTP, FTP and SMTP to communicate over the network. Next comes the transport layer, where TCP and UDP (our topic of discussion) come into the picture. TCP provides reliable communication, while UDP guarantees neither delivery nor ordering. Below that is the network (IP) layer, which provides a way to deliver data to a destination IP address. Then comes the data link layer, where physical (MAC) addresses are added so that the message can be forwarded to the gateway over the physical layer.
Layer | Functionalities |
Application Layer | This is where application layer protocols like HTTP, FTP and SMTP sit. Application programs use a supported protocol to initiate communication. |
Transport Layer | Adds features for proper delivery of messages. TCP adds reliability; UDP is the simpler, unreliable option. |
Network Layer | This is where IP addresses come into the picture. This layer does not provide any reliability of its own. |
Data Link Layer | Adds the MAC addresses of the source and the destination (gateway). |
Physical Layer | This is where networking hardware comes into the picture. |
Irrespective of whether you use TCP or UDP, IP is the protocol that makes them work over the network (TCP and UDP sit at the transport layer, and IP sits at the network layer below them). As the table above shows, communication starts at the application layer and then moves down through the layers. Each layer adds its own fields and headers on top of the data handed down by the layer above it. At the source the layers add their bits and pieces of information, and at the destination each layer peels its own information off before passing the rest to the layer above.
So you can select either TCP or UDP, depending on your requirements, but IP will be used underneath to make network communication possible.
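The adding and peeling of headers described above can be sketched in a few lines of Python. This is only a toy model: the bracketed header strings and the function names `encapsulate` and `decapsulate` are made up for illustration, and real headers are binary structures rather than text labels.

```python
# Toy model of layered encapsulation. The header contents are placeholders,
# not real protocol headers.
LAYERS = ["UDP", "IP", "ETH"]  # transport, network, data link (top to bottom)


def encapsulate(app_data: bytes) -> bytes:
    """Sender side: each layer prepends its own header to what it got from above."""
    frame = app_data
    for layer in LAYERS:
        frame = f"[{layer} hdr]".encode() + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: each layer strips its own header and passes the rest up."""
    for layer in reversed(LAYERS):
        header = f"[{layer} hdr]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


wire = encapsulate(b"dns query")
print(wire)               # b'[ETH hdr][IP hdr][UDP hdr]dns query'
print(decapsulate(wire))  # b'dns query'
```

Note how the outermost header on the wire belongs to the lowest layer, which is why the receiver has to peel from the outside in.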
Related: TCP Connection Explained
We won't be discussing TCP in this article, because it is more complex and deserves an article of its own. Of the two, UDP is the less commonly used, because most of the applications we use day to day require reliability. UDP is a message-oriented protocol: each send produces one self-contained datagram.
In a way, UDP and IP are very similar, because IP does not provide any delivery guarantee or reliability mechanism either.
Let's bring tcpdump into the picture and see what happens when we send a UDP request. Tcpdump is a tool that captures network packets entering and leaving the system. It is available on almost all Linux distributions.
Related: Network Packet Analysis with Tcpdump
Tcpdump will help us see the details and contents of the network traffic. To do that, we first need to generate a UDP request to some server and capture the network packets at the same time.
Let's send a DNS request to a remote server on one terminal, and capture the packets on another terminal to see the details.
Related: How does Domain Name System(DNS) Work?
On the first terminal, run the command below:
ubuntu@testing:~$ sudo tcpdump -n -vvv host 8.8.8.8 and port 53
On the second terminal, run the command below:
ubuntu@testing:~$ dig @8.8.8.8 google.com
You should see a stream of messages in the first terminal as soon as you run the above command on the second terminal. The messages will look like this:
18:40:39.758842 IP (tos 0x0, ttl 64, id 4636, offset 0, flags [none], proto UDP (17), length 56)
    192.168.40.27.55625 > 8.8.8.8.53: [udp sum ok] 63851+ A? google.com. (28)
18:40:39.812844 IP (tos 0x0, ttl 59, id 53901, offset 0, flags [none], proto UDP (17), length 104)
    8.8.8.8.53 > 192.168.40.27.55625: [udp sum ok] 63851 q: A? google.com. 3/0/0 google.com. [32s] A 172.217.26.174, google.com. [32s] A 172.217.26.174, google.com. [32s] A 172.217.26.174 (76)
The very first line shows the contents of the IP header. It has nothing to do with UDP itself. The string "proto UDP (17)" comes from the 8-bit Protocol field, which identifies the next-level protocol carried inside the IP packet. Each protocol has its own number: if the packet carried TCP instead of UDP, you would see 6 instead of the 17 in our output.
Without the Protocol field in the IP header, the receiving end would not know which protocol the IP packet is carrying. It could just as well be ICMP, GRE and so on. In our case it is UDP, hence the number 17. The bottom line is that this line contains nothing of UDP itself; it only says that the IP packet carries UDP data.
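As a small illustration, Python's standard `socket` module exposes these protocol numbers as constants, so the mapping can be checked directly. The `protocol_name` helper and the `PROTOCOLS` table are my own names for this sketch, and the table lists only the handful of protocols mentioned here.

```python
import socket

# Values carried in the 8-bit Protocol field of the IPv4 header.
PROTOCOLS = {
    socket.IPPROTO_ICMP: "ICMP",  # 1
    socket.IPPROTO_TCP: "TCP",    # 6
    socket.IPPROTO_UDP: "UDP",    # 17
    47: "GRE",                    # written as a literal; the constant is not defined on every platform
}


def protocol_name(number: int) -> str:
    """Map an IP protocol number to a readable name."""
    return PROTOCOLS.get(number, f"unknown ({number})")


print(protocol_name(17))  # UDP
print(protocol_name(6))   # TCP
```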
Remember that the UDP header and the application data are encapsulated inside the IP packet. As we discussed earlier, the destination peels off the information belonging to each layer and hands the rest to the next layer up, towards the application layer.
"id 4636" is the IP Identification field. It is useful when fragmentation occurs.
When an IP packet is too big for a link along the path, it has to be split into fragments, which are then sent to the destination separately. The destination needs some way to tell which received fragments belong together so that it can reassemble them. All fragments of one packet carry the same Identification value, so the receiver treats them as parts of the same packet. If no fragmentation occurs (as in our case), successive IP packets will generally carry different identification numbers.
"tos 0x0" indicates type of service.
TOS (Type Of Service) indicates how the packet should be treated. Some packets need special attention (a voice call, for example).
"ttl 64" indicates Time To Live.
This is the maximum number of routers (hops) that the IP packet can pass through before reaching its final destination. Every router decrements the value by one and discards the packet when it reaches zero. If there are 68 devices between the source and the destination, our IP packet will be dropped at the 64th device (because our TTL is 64) and will never reach the destination. The default initial value differs between operating systems.
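The TTL rule can be simulated with a short loop. This is a simplified sketch: `hop_where_dropped` is a name I made up, and it models only the decrement-and-drop behaviour, not routing itself.

```python
from typing import Optional


def hop_where_dropped(initial_ttl: int, routers_on_path: int) -> Optional[int]:
    """Return the 1-based router that discards the packet, or None if it gets through."""
    ttl = initial_ttl
    for hop in range(1, routers_on_path + 1):
        ttl -= 1      # every router decrements the TTL by one
        if ttl == 0:  # TTL exhausted: the router drops the packet
            return hop  # (a real router would also send back an ICMP Time Exceeded message)
    return None


print(hop_where_dropped(64, 68))  # 64
print(hop_where_dropped(64, 10))  # None
```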
Recommended: What Role does TTL(Time to Live) Play in Traceroute
"offset 0" is also related to fragmentation of the IP packet. For an unfragmented packet it is always 0. If the packet is fragmented, all the fragments carry the same id field (as discussed earlier), and each has an offset field that indicates where its data fits during reassembly. The offset is counted in units of 8 bytes.
Let's consider a fragmentation example. Suppose the first fragment in the series has "Identification Field: 100 & Offset Value: 0" and the second fragment has "Identification Field: 100 & Offset Value: 170". This means that during reassembly the second fragment's data starts 170 x 8 = 1360 bytes into the original packet's data, right after the first fragment's data.
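The offset arithmetic from this example can be written out as a small sketch. It is deliberately simplified and is not how a real IP stack does reassembly: it ignores the "more fragments" flag, overlapping fragments and timeouts, and the function names are hypothetical.

```python
def fragment_byte_offset(offset_field: int) -> int:
    """The Fragment Offset field counts in units of 8 bytes."""
    return offset_field * 8


def reassemble(fragments):
    """Join (offset_field, data) pairs that share one Identification value."""
    buffer = bytearray()
    for offset_field, data in sorted(fragments, key=lambda frag: frag[0]):
        start = fragment_byte_offset(offset_field)
        if start != len(buffer):
            raise ValueError("gap or overlap between fragments")
        buffer += data
    return bytes(buffer)


first = (0, b"A" * 1360)    # offset 0, carries the first 1360 bytes
second = (170, b"B" * 100)  # offset 170 -> byte position 1360

packet = reassemble([second, first])  # arrival order does not matter
print(fragment_byte_offset(170))  # 1360
print(len(packet))                # 1460
```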
Now let's get to our main topic: UDP. That is the second line in the tcpdump output shown above (192.168.40.27.55625 > 8.8.8.8.53: [udp sum ok] 63851+ A? google.com. (28)).
There are 5 main components in that line to discuss, and that is all there is to UDP. The IP addresses you see there are actually part of the IP header, not of UDP; they belong to the IP layer, which we covered while discussing the first line. The source address, 192.168.40.27, is my laptop's IP address. 8.8.8.8 is Google's public DNS address, to which we sent the DNS request from the second terminal.
The components of a UDP datagram (four header fields plus the payload) are below:
- Source Port: A port number chosen when sending the UDP request, normally a random ephemeral port picked by the operating system (in our case it is 55625, as the second line shows).
- Destination Port: The port number of the application we are sending our request to. DNS uses port 53 by default, which is what we see in our case.
- Length: The size in bytes of the UDP header plus the actual user data sent by the requesting application (in our case, the DNS request sent by the dig tool). The very last field in the second line is the number (28), which means 28 bytes of user data. A UDP header is always 8 bytes, so the Length of our UDP datagram is 28 + 8 = 36.
- Checksum: The UDP checksum is a bit complex to calculate. I will be writing a dedicated article on how to calculate the UDP and TCP checksums (the calculation is done the same way for both). Although the checksum value itself is not shown in the tcpdump output above, it is a 16-bit value that looks something like 0xaab0 or 0x8921.
Related: How Is UDP and TCP Checksum Calculated?
- Payload: This is the actual request sent by the client application. In our case it is the DNS request generated by dig ("63851+ A? google.com." is tcpdump's summary of the payload). "A?" stands for an A record query for google.com. 63851 is the DNS transaction ID, which helps the DNS client match the response to its request; the trailing + indicates that the recursion-desired flag is set.
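The field layout above can be reproduced with Python's `struct` module. The sketch below builds a plain DNS query for google.com (assuming no EDNS options, which is what a 28-byte payload implies) and puts an 8-byte UDP header in front of it. The function names are my own, and the checksum is left at zero, which over IPv4 means "no checksum computed".

```python
import struct


def build_dns_query(name: str, txid: int) -> bytes:
    """Build a minimal DNS query for an A record (no EDNS options)."""
    # DNS header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.strip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def build_udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Four 16-bit fields: source port, destination port, length, checksum."""
    length = 8 + len(payload)  # header (8 bytes) + user data
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


def parse_udp_header(datagram: bytes) -> dict:
    src, dst, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {"src_port": src, "dst_port": dst, "length": length, "checksum": checksum}


query = build_dns_query("google.com", 63851)
datagram = build_udp_header(55625, 53, query) + query

print(len(query))                  # 28 bytes of user data
print(parse_udp_header(datagram))  # length is 36 = 28 + 8
```

Adding the 20-byte IP header (without options) gives 56 bytes, which matches the "length 56" reported in the first line of the capture.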
One thing to keep in mind about the UDP checksum is that it is optional over IPv4: a sender may set the field to zero to indicate that no checksum was computed (IPv6 makes it mandatory). The checksum helps detect whether the data was accidentally altered or corrupted in transit. It provides a basic "Error Detection" mechanism in UDP.
Why does UDP have a checksum field for Error detection?
That's a good and valid question, because UDP promises to be lightweight and has no reliability or correction mechanism. If UDP is connectionless and unreliable, why does it have a checksum field?
UDP does not care about packets that are dropped or delivered out of order. But one thing it does care about is the integrity of the packets that are received (although optional, there is a provision for integrity verification). Still, what is the use of verifying integrity with a checksum if UDP can't correct an integrity problem?
Agreed, it can't correct an integrity issue. But it can discard a datagram whose checksum is invalid! The receiving end won't accept a packet that has a wrong checksum. There is no mechanism to report this back to the sender; the datagram is simply discarded silently.
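To make the "verify and silently discard" idea concrete, here is a rough sketch of the checksum as RFC 768 describes it for IPv4: a 16-bit one's complement sum over a pseudo-header (source IP, destination IP, protocol number, UDP length), the UDP header and the data. The helper names are mine, and the payload `b"hello"` is an arbitrary example rather than a real DNS message.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum, padding odd-length data with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    # source IP, destination IP, zero byte, protocol (17 = UDP), UDP length
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + struct.pack("!BBH", 0, 17, udp_length)


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Compute the checksum; the segment's checksum field must be zero."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return (~total & 0xFFFF) or 0xFFFF  # zero is reserved for "no checksum"


def is_valid(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A receiver sums everything, checksum included; a good datagram sums to 0xFFFF."""
    return ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment) == 0xFFFF


SRC, DST = "192.168.40.27", "8.8.8.8"
payload = b"hello"
length = 8 + len(payload)
csum = udp_checksum(SRC, DST, struct.pack("!HHHH", 55625, 53, length, 0) + payload)
segment = struct.pack("!HHHH", 55625, 53, length, csum) + payload

print(is_valid(SRC, DST, segment))                # True
print(is_valid(SRC, DST, segment[:-1] + b"x"))    # False: a receiver would drop this one
```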
If UDP is connection less, how does it identify a response?
8.8.8.8.53 > 192.168.40.27.55625: [udp sum ok] 63851 q: A? google.com. 3/0/0 google.com. [32s] A 172.217.26.174, google.com. [32s] A 172.217.26.174, google.com. [32s] A 172.217.26.174 (76)
Shown above is the UDP response to our DNS request (the very last line in the tcpdump output). Here the source IP and source port are 8.8.8.8 and 53. The destination IP and destination port are 192.168.40.27 and 55625.
The destination port in the response is the same as the source port in our initial request (the second line in the tcpdump output).
This means the response is directed to the exact source port from which the initial request was sent. This is how the client program identifies the correct response.
Typically the client program waits for the response, keeping the source port open. Only when the response is received is the source port (strictly speaking, the socket) closed.
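The same behaviour can be observed with two UDP sockets on the loopback interface. This is a minimal sketch, not a DNS client: the "server" simply replies to whatever address the request came from, and the function name and messages are made up for the example.

```python
import socket


def udp_round_trip(message: bytes = b"ping"):
    """Send one datagram to a local 'server' and return (client_port, port_seen_by_server, reply)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.settimeout(2)
        client.settimeout(2)
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port

        client.sendto(message, server.getsockname())
        client_port = client.getsockname()[1]  # ephemeral source port chosen by the OS

        data, sender = server.recvfrom(1024)        # sender = (client IP, client source port)
        server.sendto(b"reply to " + data, sender)  # answer goes back to that exact port

        reply, _ = client.recvfrom(1024)  # the client kept its socket open to receive this
        return client_port, sender[1], reply
    finally:
        server.close()
        client.close()


client_port, seen_port, reply = udp_round_trip()
print(client_port == seen_port)  # True
print(reply)                     # b'reply to ping'
```

No connection is ever set up here. The server learns where to send the reply only from the source address and port carried in the datagram it received.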
Use Cases Where UDP is best compared to TCP
Imagine you want to broadcast a live video stream to millions of users (a cricket match, say). TCP adds a lot of overhead for this kind of workload. For every connection, the operating system must keep all sent-but-unacknowledged data in a buffer so that it can retransmit it. With millions of viewers, that means holding un-ACKed data for millions of connections. So TCP is a poor fit in such a situation.
If a quick and simple request/response exchange is what you need, then UDP is the best choice. Examples are DNS and NTP.
Think of scenarios like gaming. Here the new state of the game continuously replaces the old state, which means the old state is of no use to the client (so there is no point in having connection-oriented TCP resend a lost packet). Here UDP is a good option.