Error detection is one of the key principles in communication. Whenever an error slips in, the communication becomes flawed and the integrity of the message is compromised. The checksum algorithm allows computers to communicate with the integrity required for data to be delivered from one point to another on a network.
A communication protocol may or may not define a way to handle errors. There is no way to guarantee that data will be delivered correctly every single time. And when millions of bits per second travel over a wire and are relayed through thousands of intermediaries built on different technologies, it becomes clear why an error-handling specification is mandatory.
Fix the error
There are multiple ways of handling errors. In the past I’ve written about the CRC (Cyclic Redundancy Check), which lets us detect not only that an error happened but, in some cases, which bit it happened on. Practical as this may seem, when an error occurs it is rarely just one bit (pun intended); it usually affects two or more bits, so trying to locate and, hopefully, fix the erroneous bits quickly becomes an impractical job.
Nah, just send it again, mate
Given the complexity of fixing errors, networks nowadays focus on delivering the information: if an error is detected, the data is sent again and that’s it. Finding a good error detection algorithm has been a challenging task for standards bodies such as the IEEE and the IETF, and the Internet protocols (IP, TCP and UDP) settled on the checksum algorithm.
If we asked ourselves for a simple algorithm to handle errors, our imagination might come up with many complex ones, but let’s enumerate the key aspects of the ideal error detection algorithm:
- Should not depend on preexisting values on either the client or the server.
- Should be easy to implement as a logical circuit (in order to perform these checks at the fastest speed possible).
- Should be fast enough to have a minimal impact on the transmission speed without requiring hardware updates.
Oh my god, who can save us now?
The checksum algorithm is very, very, very simple. Just look at it:
- Take a chunk of data with n bits.
- Split the data into 16-bit words.
- Sum every word, adding any carry out of the top bit back into the result (one's complement addition).
- Flip every bit of the sum (take its one's complement). The result is the checksum.
When the receiver runs the same operation over the data with the checksum included, the result is all 0’s if everything is correct. Basically, it generates a small hash of the information provided. This hash works as a fingerprint of the original data: if the data is modified, lost or otherwise erroneous, the fingerprint will almost certainly not match. Simple, right?
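To make the "adding carries back in" step concrete, here is a minimal sketch of one's complement addition for two 16-bit words. The function name `add_with_carry` is made up for this example and is not taken from any library.

```cpp
#include <cassert>
#include <cstdint>

// One's complement addition of two 16-bit words (hypothetical helper name).
// The words are added in a wider type, then any carry out of bit 15 is
// wrapped around and added back into bit 0 (the "end-around carry").
std::uint16_t add_with_carry(std::uint16_t a, std::uint16_t b) {
    std::uint32_t sum = static_cast<std::uint32_t>(a) + b;
    sum = (sum & 0xFFFFu) + (sum >> 16);  // fold the carry back in
    return static_cast<std::uint16_t>(sum);
}

// A few hand-checked cases, wrapped in a function so they can be called
// from any program that wants to try them.
void carry_examples() {
    assert(add_with_carry(0x1234, 0x0001) == 0x1235);  // no carry at all
    assert(add_with_carry(0xFFFF, 0x0001) == 0x0001);  // 0x10000 -> 0x0000 + 1
    assert(add_with_carry(0x8000, 0x8000) == 0x0001);  // 0x10000 -> 0x0000 + 1
}
```

A single fold is enough here because two 16-bit words can produce at most one carry bit, and adding it back cannot overflow again.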
What does the implementation look like?
When implementing the checksum algorithm, you must make sure the process is well defined in both directions. On the sender side, take your data, set the checksum field to zero, run the algorithm and store the result in that field. On the receiver side, run the same operation with the previously generated checksum in place; the result should be all 0’s.
Simple, fast, and easy to implement.
As hard as it may sound, the implementation does not require multiple functions. It doesn’t even require two functions; it is just one function to rule them all, and it is as simple as:
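A minimal sketch of such a single function, under a few assumptions: the name `internet_checksum` is made up for illustration, words are read in network (big-endian) byte order, an odd trailing byte is padded with a zero, and the buffer is small enough (well under 128 KiB) that the 32-bit accumulator cannot overflow.

```cpp
#include <cstddef>
#include <cstdint>

// Computes the 16-bit one's complement checksum over `len` bytes.
// Hypothetical name and signature, chosen for this example only.
std::uint16_t internet_checksum(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;

    // Add the data two bytes at a time as big-endian 16-bit words.
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    }

    // An odd trailing byte is treated as the high byte of a zero-padded word.
    if (len % 2 != 0) {
        sum += static_cast<std::uint32_t>(data[len - 1]) << 8;
    }

    // Fold every carry above bit 15 back into the low 16 bits.
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }

    // The checksum is the bitwise complement of the folded sum.
    return static_cast<std::uint16_t>(~sum);
}
```

The same routine serves both sides. Over the 20-byte IPv4 header `45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01 c0 a8 00 c7` (checksum bytes zeroed) it returns `0xB861`; run again with `b8 61` stored in bytes 10 and 11, it returns `0`.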
This implementation is simplified to give an overview of the algorithm. Optimized versions are used in real production environments, but this code lets us grasp how the algorithm works more easily. Obviously it is written in C++, because this is my blog and my ego looks bigger than your expanded pupils with those bitwise operations that I squeezed into the code and that, as a Jedi master, I use in my everyday coding, as my mother tongue is binary and I speak fluent Lisp. Okay, bad joke, I need friends, please keep on reading.
Where is it implemented?
As we know, the Ethernet frame (as defined in the 802.3 standard) uses a CRC. The protocols that use a checksum are IP, TCP and UDP. Each of them has a checksum field in its header; the main difference lies in what the checksum covers. Some checksums validate the entire PDU (the header plus the data coming from a higher layer in the TCP/IP model), while others validate just the header, as the data will be validated at a higher layer.
Let’s check the headers and where the checksum fields are located.
The following is the architecture of an IPv4 Header:
As you can notice, the checksum field (located from bit 80 through bit 95, that is, bytes 10 and 11 of the header) stores the checksum of the IP header only. It does not, by any means, validate the data in the PDU of the higher layers, so its calculation is pretty straightforward:
- Get the IHL field from the header and multiply it by 4 to get the length of the header in bytes.
- Read that many bytes, starting at the beginning of the header.
- Split those bytes into 16-bit words.
- Run the checksum algorithm on those words.
- If everything is correct, the words add up to 0xFFFF, so the final result (its complement) is 0.
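The steps above can be sketched as a small verification routine. This is only an illustration: `checksum16` and `ipv4_header_ok` are hypothetical names, and the code assumes `packet` points at the first byte of an IPv4 header with `available` readable bytes.

```cpp
#include <cstddef>
#include <cstdint>

// 16-bit one's complement checksum over big-endian words (hypothetical helper).
std::uint16_t checksum16(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    }
    if (len % 2 != 0) {
        sum += static_cast<std::uint32_t>(data[len - 1]) << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return static_cast<std::uint16_t>(~sum);
}

// Returns true when the IPv4 header at `packet` passes its header checksum.
bool ipv4_header_ok(const std::uint8_t* packet, std::size_t available) {
    if (available < 20) {
        return false;                         // shorter than the minimum header
    }
    std::size_t ihl = packet[0] & 0x0F;       // IHL is the low nibble of byte 0
    std::size_t header_len = ihl * 4;         // IHL counts 32-bit words
    if (ihl < 5 || header_len > available) {
        return false;                         // malformed or truncated header
    }
    // With the stored checksum left in place, a valid header yields 0.
    return checksum16(packet, header_len) == 0;
}
```

Only the header bytes are covered; anything after `header_len` is ignored, which matches the point that the IP checksum does not validate the payload.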
This only applies to Internet Protocol version 4, widely known as IPv4. The newer version (IPv6) has no header checksum at all: error detection is already performed by the link layer below and by the Transport Layer above (with TCP and/or UDP), so repeating it in the IP header at every hop would be redundant.
TCP and UDP
For TCP and UDP the analysis is almost the same, so I am going to explain them together. But first, let’s take a look at what the actual headers look like.
The TCP header:
And the UDP header:
Now, to begin the analysis at this layer, we need to review something called the pseudo-header. The pseudo-header is a dynamically generated header that isn’t transmitted with the data per se; in fact, we must assemble it ourselves. The checksum is then calculated over the pseudo-header plus the TCP/UDP PDU, as in:
And the structure of the pseudo-header is:
Which, as you can see, is basically just some fields from the IP header (source address, destination address and protocol number) plus the length of the PDU (it doesn’t matter whether it is TCP or UDP). The process is then as straightforward as for IP:
- Take the pseudo-header and prepend it to the PDU.
- Split the data into 16-bit words.
- Run the checksum algorithm with those words.
- If everything is correct, the result will be 0’s.
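As a rough sketch of those steps for IPv4, the snippet below builds the 12-byte pseudo-header (source address, destination address, a zero byte, the protocol number and the 16-bit segment length), puts the TCP/UDP segment after it, and checksums the whole buffer. The names `checksum16` and `transport_checksum` are invented for this example, and the segment is assumed to fit in 65,535 bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// 16-bit one's complement checksum over big-endian words (hypothetical helper).
std::uint16_t checksum16(const std::vector<std::uint8_t>& bytes) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        sum += (static_cast<std::uint32_t>(bytes[i]) << 8) | bytes[i + 1];
    }
    if (bytes.size() % 2 != 0) {
        sum += static_cast<std::uint32_t>(bytes.back()) << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return static_cast<std::uint16_t>(~sum);
}

// Checksums an IPv4 pseudo-header followed by a TCP or UDP segment.
// `protocol` is the IP protocol number (6 for TCP, 17 for UDP).
std::uint16_t transport_checksum(const std::array<std::uint8_t, 4>& src,
                                 const std::array<std::uint8_t, 4>& dst,
                                 std::uint8_t protocol,
                                 const std::vector<std::uint8_t>& segment) {
    std::vector<std::uint8_t> buf;
    buf.insert(buf.end(), src.begin(), src.end());   // source address
    buf.insert(buf.end(), dst.begin(), dst.end());   // destination address
    buf.push_back(0);                                // reserved zero byte
    buf.push_back(protocol);                         // protocol number
    const std::size_t len = segment.size();          // header + payload length
    buf.push_back(static_cast<std::uint8_t>((len >> 8) & 0xFF));
    buf.push_back(static_cast<std::uint8_t>(len & 0xFF));
    buf.insert(buf.end(), segment.begin(), segment.end());
    return checksum16(buf);
}
```

The sender calls this with the segment's checksum field set to zero and stores the result in that field; the receiver calls it on the segment as received and expects 0.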
As you can see, the checksum algorithm is fundamental to the implementation of multiple networking protocols. Understanding how it works, why it works and, most importantly, where it works allows us to understand how information travels across the wire without the lower layers taking specific action on errors, as the upper layers take care of this.
If you feel like something’s missing, you’d like to discuss this article, or you have any comments and/or feedback, you can find me on Twitter as @humbertowoody.