Fragmentation

In the first lesson, we found the IP address of your friend Alice and we had a nice chat. But now she asks you to send her a postcard as an image file, and you have been trying to do that—unsuccessfully. Turns out, you can't send a large file in a single network packet!

Let's see exactly why this happens and devise a way around this limitation so that your picture gets delivered. By the end of this lesson, you’ll know how to split large messages into smaller packets using Rust iterators and how to deal with out-of-order delivery.

Let's suppose that this image is 260 kilobytes in size. As we learned in the previous lesson, a single IP packet can hold at most about 65 kilobytes (65,535 bytes, header included). In real networks, though, this number is even smaller, as it is limited by the constraints of the physical Ethernet or Wi-Fi network. We call this limit the maximum transmission unit, or MTU for short. The MTU depends on the physical network configuration, and in the majority of Ethernet and Wi-Fi networks it is 1500 bytes. In practice, this means that most IP packets are no larger than 1.5 kilobytes.

What happens if we have an MTU of 1500 but send an IP packet of 65 kilobytes in size?

To find out, let's go back to the layout of the IP packet header:

Bits 0-15 | Bits 16-31
Version, Header length, Type of Service | Total length
Identification | Flags, Fragment offset
Time to live (TTL), Protocol | Header checksum
Source IP address (all 32 bits)
Destination IP address (all 32 bits)

There are three fields which are relevant to our question: identification, flags, and the fragment offset.

Because we know the MTU size in advance, we can cut an IP packet into smaller pieces called fragments. Each fragment becomes its own IP packet carrying a slice of the original payload. All fragments of one original packet share the same identification number, which tells the receiver which fragments belong together. The fragment offset says which part of the original payload a fragment carries. Together, these two fields allow the receiving end to reconstruct the original packet from its fragmented form. Finally, the flags indicate whether more fragments follow: every fragment except the last one has the "more fragments" flag set, while the final fragment has its flags set to zero and a non-zero fragment offset.

Let's see how a single 16 KB IP packet is split up when each fragment can carry at most 1536 bytes. Offsets are shown in bytes here; the real header stores them in units of 8 bytes.

Packet (1536 bytes): Identification: 0, Offset: 0, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 1536, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 3072, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 4608, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 6144, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 7680, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 9216, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 10752, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 12288, Flags: more fragments
Packet (1536 bytes): Identification: 0, Offset: 13824, Flags: more fragments
Packet (1024 bytes): Identification: 0, Offset: 15360, Flags: none
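
The split above can be reproduced with a few lines of Rust. This is only a simplified sketch of the idea, not how an operating system implements it: the `Fragment` struct and the `fragment` function are made-up names, headers are ignored, and offsets are counted in plain bytes.

```rust
/// One fragment of a larger payload (simplified: no header, offsets in bytes).
#[derive(Debug, PartialEq)]
struct Fragment {
    offset: usize,        // where this slice starts in the original payload
    len: usize,           // number of payload bytes carried
    more_fragments: bool, // true for every fragment except the last one
}

/// Splits a payload of `total` bytes into fragments of at most `max_len` bytes.
fn fragment(total: usize, max_len: usize) -> Vec<Fragment> {
    assert!(max_len > 0, "fragment size must be positive");
    (0..total)
        .step_by(max_len)
        .map(|offset| {
            let len = max_len.min(total - offset);
            Fragment { offset, len, more_fragments: offset + len < total }
        })
        .collect()
}

fn main() {
    // A 16 KB payload cut into pieces of at most 1536 bytes.
    for f in fragment(16 * 1024, 1536) {
        println!(
            "offset {:5}, {:4} bytes, more fragments: {}",
            f.offset, f.len, f.more_fragments
        );
    }
}
```

Running it lists ten full fragments followed by a final 1024-byte one whose `more_fragments` flag is false.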

This kind of fragmentation happens transparently. It's handled by the network stack of the operating system, so we don't have to deal with it at the IP level.

Moreover, on the real-world Internet, we will rarely encounter IP fragmentation. The reason is that IP fragmentation is fragile: if a single fragment is not delivered, the whole original message has to be dropped, fragmented again, and retransmitted, because there's no way to retransmit only a part of it. Instead, senders generally keep packets within the MTU of the whole route using path MTU discovery, where packets are made progressively smaller until they pass every link on the way to the destination.
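
To get a feel for path MTU discovery, here is a toy simulation. It assumes the route is simply a list of link MTUs and that a link which is too small reports its own MTU back to the sender, roughly like the "packet too big" errors real routers send. The function names and the MTU values are invented for illustration.

```rust
/// Returns the MTU of the first link that is too small for `size`, if any.
/// This stands in for the error a real router would report to the sender.
fn first_too_small(path: &[usize], size: usize) -> Option<usize> {
    path.iter().copied().find(|&mtu| mtu < size)
}

/// Shrinks the packet size until it passes every link on the path.
fn discover_path_mtu(path: &[usize], initial: usize) -> usize {
    let mut size = initial;
    while let Some(smaller) = first_too_small(path, size) {
        size = smaller; // retry with the size the bottleneck link accepts
    }
    size
}

fn main() {
    let path = [1500, 1400, 1492, 1280]; // hypothetical link MTUs along a route
    println!("path MTU: {}", discover_path_mtu(&path, 1500)); // prints "path MTU: 1280"
}
```

The sender first learns about the 1400-byte link, retries, then learns about the 1280-byte link, and finally settles on 1280 bytes.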

And remember: we're dealing with IP payloads, so if we send a UDP datagram, the IP payload includes the UDP header. Only the first fragment carries that header, so the receiver can't tell which source and destination ports the remaining fragments belong to before reconstructing the original message:

IP packet (1500 bytes)
Identification: 0
Offset: 0
Payload: UDP header (source port, destination port, UDP length: 3500, UDP checksum), followed by the first part of the UDP payload

IP packet (1500 bytes)
Identification: 0 (the same as in the first fragment)
Offset: 1500
Payload: the next part of the UDP payload, with no UDP header

These limitations led the newer Internet Protocol version (IPv6) to remove fragmentation by routers altogether: only the sending host may still fragment a packet. Naturally, you may ask: why are we even talking about it then? Because while IP fragmentation is not that useful to us, the idea behind it is one we can reuse for our own purposes: we can do the fragmentation at the level of our application. That's what we will do on the next page.

Now that we know how MTU works, let's assume that the MTU size in our virtual network is 65 kilobytes. It will make it easier to deal with larger files for the purposes of this lesson.

Now imagine you have painted the postcard you want to send to your friend.

If we look under the hood of this image file, we'll discover that it's just a sequence of bytes. There are different ways to encode images as data, but in this case we can assume that it's a bitmap, which means that every pixel is encoded with 4 component numbers, denoting the color intensity of red, green, blue, and alpha (which is used for opacity).

This image is 300 x 300 pixels in size, and each pixel is encoded with 4 bytes. This means that the entire image is 300 x 300 x 4 = 360,000 bytes in size, or about 352 kilobytes. With at most 65 kilobytes per packet, it needs 6 network packets: five full ones and a smaller sixth.

Since the image is a sequence of bytes, we can put them into an array and work with it just as with any other array of numbers. For example, let's see what will happen if we preserve only the red component of the image:
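
Here is a minimal sketch of that idea, assuming the bytes are laid out as red, green, blue, alpha for each pixel. The `keep_red` function and the two sample pixels are made up for this example.

```rust
/// Keeps only the red component of an RGBA image by zeroing green and blue.
/// Alpha is left untouched so the pixels keep their opacity.
fn keep_red(pixels: &mut [u8]) {
    for pixel in pixels.chunks_exact_mut(4) {
        pixel[1] = 0; // green
        pixel[2] = 0; // blue
    }
}

fn main() {
    // Two sample pixels: orange and light blue, both fully opaque.
    let mut image = vec![255, 165, 0, 255, 100, 150, 250, 255];
    keep_red(&mut image);
    println!("{:?}", image); // prints [255, 0, 0, 255, 100, 0, 0, 255]
}
```

`chunks_exact_mut(4)` walks over the array one pixel at a time, which is already a small preview of splitting a sequence into equally sized parts.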

The same idea applies to splitting the image into smaller parts. Rust has a powerful toolkit for working with arrays and sequences: iterators. With iterators, you can transform sequences of numbers in different ways: split them, create new sequences out of existing ones, or combine them to calculate a single result (for example, their sum). For our case of splitting an image into small parts, there's a slice method called slice::chunks.

Let's see how it works:
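
The sketch below applies `chunks` to a blank 300 x 300 image. The 65,000-byte packet limit is an assumed round number standing in for "65 kilobytes", and `split` is a made-up helper name.

```rust
const WIDTH: usize = 300;
const HEIGHT: usize = 300;
const BYTES_PER_PIXEL: usize = 4;
const MAX_PACKET: usize = 65_000; // assumed packet size limit

/// Splits `data` into consecutive pieces of at most `max_len` bytes.
/// Only the last piece may be shorter than `max_len`.
fn split(data: &[u8], max_len: usize) -> Vec<&[u8]> {
    data.chunks(max_len).collect()
}

fn main() {
    let image = vec![0u8; WIDTH * HEIGHT * BYTES_PER_PIXEL]; // 360,000 bytes
    for (number, packet) in split(&image, MAX_PACKET).iter().enumerate() {
        println!("packet {}: {} bytes", number, packet.len());
    }
}
```

With these numbers the image ends up in six packets: five of 65,000 bytes and a last one of 35,000 bytes.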

Now you can see how with a simple function call we can construct a series of packets with the size we need.

Let's see how this works within our virtual network—we already know that Alice expects us to send her the 300 x 300 image file, so let's do that, using the familiar methods from the first lesson:
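
The methods from the first lesson are not repeated here, so the sketch below uses a stand-in instead: a hypothetical `deliver` function that plays the role of the virtual network. It delivers every packet but swaps each neighbouring pair, which is just one simple way to imitate a network that does not preserve order.

```rust
/// Stand-in for the virtual network (hypothetical): every packet arrives,
/// but each neighbouring pair is swapped to imitate reordering.
fn deliver(mut packets: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    for pair in packets.chunks_exact_mut(2) {
        pair.swap(0, 1);
    }
    packets
}

/// Splits the data, "sends" it, and glues the packets together
/// in the order in which they arrive.
fn send_naively(data: &[u8], max_len: usize) -> Vec<u8> {
    let packets: Vec<Vec<u8>> = data.chunks(max_len).map(|c| c.to_vec()).collect();
    deliver(packets).concat()
}

fn main() {
    let message: Vec<u8> = (0..10).collect();
    let received = send_naively(&message, 3);
    println!("sent:     {:?}", message);  // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    println!("received: {:?}", received); // [3, 4, 5, 0, 1, 2, 9, 6, 7, 8]
}
```

All the bytes arrive, but not in the order in which they were sent.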

As we have seen on the previous page, while we have successfully sent a picture to our friend Alice, she gets it in a scrambled form.

This happens because there is no guarantee that messages sent over the network will be received in the same order they were sent. This is called packet reordering, and it can happen for a number of reasons: for example, because network packets are sent through different routes for efficiency, or because network equipment processes them in parallel.

The solution to this problem is to include a packet number at the beginning of each message. This way, Alice will know how to reorder the received sequences of bytes to reconstruct the original picture.

With Rust, we can do this by using an iterator function called enumerate, which gives us an index number for each iteration. Let's see how it works:
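
A small sketch of `enumerate` applied to chunks follows. The message text and the `number_chunks` helper are made up for this example.

```rust
/// Pairs every chunk with its index: (0, first chunk), (1, second chunk), ...
fn number_chunks(data: &[u8], max_len: usize) -> Vec<(usize, &[u8])> {
    data.chunks(max_len).enumerate().collect()
}

fn main() {
    let message = b"HELLO ALICE";
    for (index, chunk) in number_chunks(message, 4) {
        println!("packet {}: {:?}", index, String::from_utf8_lossy(chunk));
    }
}
```

This prints three packets numbered 0, 1, and 2, carrying "HELL", "O AL", and "ICE".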

Now you can see how each chunk is indexed. On the receiving end, we can restore the original order of packets by collecting them into a vector and calling the slice method sort_by:
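
As a sketch, assume each received packet is represented as a pair of its number and its payload. The `reorder` function is a made-up helper that sorts the pairs by number and joins the payloads.

```rust
/// Sorts packets by their number and joins the payloads back together.
fn reorder(mut packets: Vec<(usize, Vec<u8>)>) -> Vec<u8> {
    packets.sort_by(|a, b| a.0.cmp(&b.0));
    packets.into_iter().flat_map(|(_, payload)| payload).collect()
}

fn main() {
    // The packets arrive in the wrong order.
    let arrived = vec![
        (2, b"ICE".to_vec()),
        (0, b"HELL".to_vec()),
        (1, b"O AL".to_vec()),
    ];
    let message = reorder(arrived);
    println!("{}", String::from_utf8_lossy(&message)); // prints "HELLO ALICE"
}
```

`sort_by_key(|p| p.0)` would work just as well here; `sort_by` is used to match the text.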

Let's put all these techniques together to finally send Alice an image file in multiple sequenced parts:
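
One possible way to combine the pieces is sketched below. The packet format is an assumption made for this example: each packet starts with a 2-byte big-endian packet number, followed by the payload. The function names are invented, and reversing the packets stands in for the network reordering them.

```rust
const HEADER_LEN: usize = 2; // assumed header: packet number as a big-endian u16

/// Splits `data` into packets that each start with their packet number.
fn to_packets(data: &[u8], max_payload: usize) -> Vec<Vec<u8>> {
    data.chunks(max_payload)
        .enumerate()
        .map(|(index, chunk)| {
            assert!(index <= u16::MAX as usize, "too many packets for a u16 number");
            let mut packet = (index as u16).to_be_bytes().to_vec();
            packet.extend_from_slice(chunk);
            packet
        })
        .collect()
}

/// Reads the packet number from the header (assumes at least 2 bytes).
fn number_of(packet: &[u8]) -> u16 {
    u16::from_be_bytes([packet[0], packet[1]])
}

/// Sorts packets by number, strips the headers, and joins the payloads.
fn reassemble(mut packets: Vec<Vec<u8>>) -> Vec<u8> {
    packets.sort_by(|a, b| number_of(a).cmp(&number_of(b)));
    packets
        .iter()
        .flat_map(|p| p[HEADER_LEN..].iter().copied())
        .collect()
}

fn main() {
    // A stand-in for the 300 x 300 image: 360,000 bytes of varying values.
    let image: Vec<u8> = (0..360_000u32).map(|i| (i % 251) as u8).collect();
    let mut packets = to_packets(&image, 65_000 - HEADER_LEN);
    packets.reverse(); // imitate out-of-order delivery
    let received = reassemble(packets);
    println!("image restored correctly: {}", received == image);
}
```

Note that the header takes up 2 bytes of every packet, so the payload size is reduced accordingly to stay within the assumed 65,000-byte limit.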

Congratulations, you have finished the second lesson in the TCP/IP series!

In the next one, we will learn how to deal with unreliable networks. In real-world networks, not all packets we send are guaranteed to be delivered, so we need to make sure that the receiver gets a complete message. We will learn how we can find out which packets were not delivered and how to retransmit them.

