Up to now I have reviewed message encoding and association initialisation. Now it's time to see how SCTP does some real work - user data transfer. It is implemented via DATA and SACK chunks. Two peers can exchange user data only when the association is established, which means it should be in the ESTABLISHED, SHUTDOWN-PENDING or SHUTDOWN-SENT state. An SCTP receiver must be able to receive at least 1500 bytes in a single packet, which means its initial a_rwnd value (in the INIT or INIT ACK chunk) must not be set to a lower value. Similar to TCP, SCTP supports fragmentation of user data when it exceeds the MTU of the link. The receiver of the fragmented data will reassemble all chunks before passing the message to the user. On the other hand, more than one DATA chunk can be bundled in a single packet. Data fragmentation will be discussed in more detail later.
The DATA chunk is described in Section 3.3.1 of RFC 4960. This is the chunk that does the actual work - transferring user data to the peer. A sample packet with a DATA chunk can be seen on fig. 1. It has the following flags:
E-bit - Ending fragment bit: If set, indicates that the chunk contains the last fragment of a user message.
B-bit - Beginning fragment bit: If set, indicates that the chunk contains the first fragment of a user message.
U-bit - Unordered bit: Indicates unordered data chunk, if set to 1. When set, the stream sequence number parameter must be ignored.
I-bit - Immediate bit: Described in RFC 7053. Won't be discussed in this article.
DATA chunk parameters are:
TSN: The Transmission Sequence Number for the DATA chunk. As described in the association initialisation post, each DATA chunk has a unique TSN, which identifies it for acknowledgement, retransmission and duplicate detection. DATA chunks from the same sender have consecutive TSN values.
Stream Identifier: identifies the stream to which the payload belongs.
Stream sequence number: represents the sequence number of the payload in the stream.
Payload protocol identifier: identifies the payload type. The value is optional and is not used by the SCTP protocol itself.
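To make the layout above more concrete, here is a minimal Python sketch that decodes a single DATA chunk from raw bytes. The field order and the flag bit positions follow Section 3.3.1 of RFC 4960. The names FLAG_E, FLAG_B, FLAG_U, FLAG_I, DataChunk and parse_data_chunk are made up for this illustration and do not come from any real SCTP library.

```python
import struct
from typing import NamedTuple

# Bit masks inside the DATA chunk flags byte (hypothetical constant names).
FLAG_E = 0x01  # ending fragment
FLAG_B = 0x02  # beginning fragment
FLAG_U = 0x04  # unordered delivery
FLAG_I = 0x08  # immediate (RFC 7053)

# type, flags, length, TSN, stream identifier, stream sequence number, PPID
DATA_HEADER = struct.Struct("!BBHIHHI")


class DataChunk(NamedTuple):
    flags: int
    tsn: int
    stream_id: int
    stream_seq: int
    ppid: int
    payload: bytes


def parse_data_chunk(raw: bytes) -> DataChunk:
    """Decode one DATA chunk (chunk type 0) from its wire format."""
    ctype, flags, length, tsn, sid, ssn, ppid = DATA_HEADER.unpack_from(raw)
    if ctype != 0:
        raise ValueError("not a DATA chunk")
    # 'length' covers the 16-byte header plus the payload, without padding.
    payload = raw[DATA_HEADER.size:length]
    return DataChunk(flags, tsn, sid, ssn, ppid, payload)
```

A chunk carrying a complete, ordered user message has both the B-bit and the E-bit set and the U-bit cleared, so `chunk.flags & FLAG_U` is zero and `chunk.flags & (FLAG_B | FLAG_E)` equals 0x03.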
The SACK chunk is defined in Section 3.3.4. Its purpose is to acknowledge a series of DATA chunks. It may also modify the advertised receiver window (a_rwnd) or indicate duplicated DATA chunks. An important feature of SACK chunks is that they can indicate a gap in the transmission. This means that one or more DATA chunks are not received by the peer and they need to be retransmitted. Gaps will be discussed in the next section. On fig. 2 you can see the SACK chunk which acknowledges the DATA from the previous section. It contains the following parameters:
Cumulative TSN ACK: This parameter indicates the TSN of the last DATA chunk received in sequence, before the first gap (in case there is any).
Advertised receiver window credit (a_rwnd): The new value of the a_rwnd credit.
Number of gap acknowledgement blocks: How many gap acknowledgements are included in the chunk. On fig. 2 there are no gaps.
Number of duplicated TSNs: How many duplicated TSNs are included in the chunk. None on fig. 2.
Data transfer procedures
In this section we will review some data transfer related procedures, implemented with SACK chunks.
How gaps are reported
It is possible to have gaps in the received data. This means that the receiver has received a DATA chunk with a TSN bigger than expected. In this case it will send a SACK chunk with 'Cumulative TSN ACK' set to the TSN of the DATA chunk just before the first gap. Then 'Number of gap acknowledgement blocks' will be set to the number of gaps. All gap acknowledgement blocks will be inserted right after the 'Number of duplicated TSNs' parameter. Each block has two integers:
Gap Ack Block Start: It contains the offset (please note that the value is an offset, not a TSN) of the first DATA chunk in this block. This means that when we add the value of the 'Cumulative TSN Ack' parameter to this offset, we get the TSN of the first received DATA chunk after the gap.
Gap Ack Block End: It contains the offset (again - an offset) of the last DATA chunk in the block. When we add the offset to the value of the 'Cumulative TSN Ack' parameter, we get the TSN of the last received DATA chunk in the block.
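The SACK layout described so far can be sketched as a small Python parser. It follows the field order from Section 3.3.4 of RFC 4960 and converts each gap block offset back to a real TSN by adding it to the Cumulative TSN Ack. The names SACK_HEADER and parse_sack_chunk, as well as the keys of the returned dictionary, are my own choice for this illustration. TSNs are 32-bit values, so the addition wraps around at 2**32.

```python
import struct

# type, flags, length, cumulative TSN ack, a_rwnd,
# number of gap ack blocks, number of duplicated TSNs
SACK_HEADER = struct.Struct("!BBHIIHH")


def parse_sack_chunk(raw: bytes) -> dict:
    """Decode one SACK chunk (chunk type 3) from its wire format."""
    ctype, _flags, _length, cum_tsn, a_rwnd, n_gaps, n_dups = \
        SACK_HEADER.unpack_from(raw)
    if ctype != 3:
        raise ValueError("not a SACK chunk")

    offset = SACK_HEADER.size
    gap_blocks = []
    for _ in range(n_gaps):
        start, end = struct.unpack_from("!HH", raw, offset)
        # Start and end are offsets from the cumulative TSN, not TSNs.
        gap_blocks.append(((cum_tsn + start) % 2**32,
                           (cum_tsn + end) % 2**32))
        offset += 4

    # The duplicated TSNs follow the gap ack blocks, 4 bytes each.
    dup_tsns = []
    for _ in range(n_dups):
        (tsn,) = struct.unpack_from("!I", raw, offset)
        dup_tsns.append(tsn)
        offset += 4

    return {"cum_tsn": cum_tsn, "a_rwnd": a_rwnd,
            "gap_blocks": gap_blocks, "dup_tsns": dup_tsns}
```

Each entry in `gap_blocks` is a (first TSN, last TSN) pair of a block that was received after a gap, so everything between the cumulative TSN and the first block, and between two blocks, is what the sender has to retransmit.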
Let's see how this works in the real world with a simple example. There is an established association between hosts A and B, and A sends eight DATA chunks to B. The initial TSN value for A is 100, so the TSNs of the DATA chunks are from 100 to 107. Let's say that chunks 100, 101, 104, 105 and 107 are delivered successfully and for some reason chunks 102, 103 and 106 are not received by B. This means that there are two gaps in the transmission. The whole data flow is shown on fig. 3 and the gaps are marked. At this point B decides to reply with a SACK chunk. Let's see what should be set in the chunk parameters:
Cumulative TSN Ack: As you already know, this parameter contains the TSN of last received DATA chunk, before the gaps. In our case this is the TSN of the second message - 101.
Advertised receiver window credit (a_rwnd): We are not interested in this parameter right now, so we'll just skip it.
Number of gap acknowledgement blocks: We have got two gaps, so the value of this parameter is 2.
- First gap acknowledgement block:
Gap Ack Block Start: The first received DATA chunk after the gap is with TSN 104. Cumulative TSN Ack is 101. So the value of this parameter is 104 - 101 = 3.
Gap Ack Block End: The block contains two DATA chunks with consecutive TSNs. The TSN of the last chunk is 105. This means that the value of this parameter is 105 - 101 = 4.
- Second gap acknowledgement block:
Gap Ack Block Start: The same logic works here. There is one lost DATA chunk, after the previous block. The TSN of the received chunk is 107. The value of the parameter is 107 - 101 = 6.
Gap Ack Block End: This block contains only one chunk, so the value of the parameter is again 6. Both start and end values are 6, which indicates that the block contains only one chunk.
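The same calculation can be written down as a short Python sketch. It takes the first TSN the receiver expects and the collection of TSNs it has actually received, and returns the Cumulative TSN Ack together with the gap ack blocks as (start offset, end offset) pairs. The function name sack_fields is hypothetical, and to keep the example short it assumes the TSNs do not wrap around the 32-bit limit.

```python
from typing import Iterable, List, Tuple


def sack_fields(first_expected: int,
                received: Iterable[int]) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (cumulative TSN ack, gap ack blocks as offset pairs)."""
    got = set(received)

    # Advance the cumulative ack over every TSN received in sequence.
    cum_tsn = first_expected - 1
    while cum_tsn + 1 in got:
        cum_tsn += 1

    blocks: List[List[int]] = []
    for tsn in sorted(t for t in got if t > cum_tsn):
        offset = tsn - cum_tsn
        if blocks and blocks[-1][1] == offset - 1:
            blocks[-1][1] = offset            # extends the current block
        else:
            blocks.append([offset, offset])   # a new block after a gap

    return cum_tsn, [(start, end) for start, end in blocks]


# The example from the text: 102, 103 and 106 are lost.
print(sack_fields(100, [100, 101, 104, 105, 107]))
# prints (101, [(3, 4), (6, 6)])
```

The result matches the values we calculated by hand: Cumulative TSN Ack 101, a first block from 3 to 4 and a second block from 6 to 6.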
How duplicate TSNs are reported
You can also have duplicated DATA chunks received for some reason (e.g. a lost SACK). SCTP uses the SACK chunks to report duplicated DATA chunks to the sender. You remember that each SACK has a 'Number of duplicated TSNs' parameter, which specifies how many Duplicated TSN blocks the chunk contains. These blocks are always added right after the Gap Ack blocks. The Duplicated TSN block has only one value - the duplicated TSN itself.
The SCTP stack keeps the count of the duplicated TSNs internally. When a SACK chunk is prepared, all duplicates are added to the message and the internal count is reset. The Duplicated TSN block hasn't got a field, which shows the count, so if a chunk is received more than twice, there are multiple occurrences of the TSN in the Duplicated TSN block.
Let's see how this works in practice. Fig. 4 shows an example flow between two nodes - A and B. We assume the association has already been established and the diagram contains only DATA and SACK chunks.
A sends the DATA chunk with TSN 100 to B two times. After that B decides to send a SACK and includes the duplicated TSN in it - the chunk was received two times, which makes one duplicate. Then A sends the DATA chunk with TSN 100 one more time, followed by a DATA chunk with TSN 101 three times, and B sends a SACK again. Because 100 was already received, it is included in the list of duplicated TSNs. Because the count of duplicated TSNs was reset after the last SACK, TSN 100 appears only once in the Duplicated TSNs block. The DATA chunk with TSN 101 was received three times. This makes two duplicates, so 101 is added twice to the Duplicated TSNs block.
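The bookkeeping from this flow can be sketched in a few lines of Python. This is only an illustration of the idea under simplified assumptions, not how any particular SCTP stack implements it: the class DuplicateTracker and its methods are hypothetical, and a real stack would not remember every TSN forever.

```python
class DuplicateTracker:
    """Remembers received TSNs and collects duplicates for the next SACK."""

    def __init__(self) -> None:
        self.seen = set()        # TSNs received at least once
        self.pending_dups = []   # duplicates seen since the last SACK

    def on_data(self, tsn: int) -> None:
        if tsn in self.seen:
            # One entry per extra copy, so the list may repeat a TSN.
            self.pending_dups.append(tsn)
        else:
            self.seen.add(tsn)

    def take_dups_for_sack(self) -> list:
        """Return the duplicates for the SACK and reset the internal list."""
        dups, self.pending_dups = self.pending_dups, []
        return dups


b = DuplicateTracker()
for tsn in (100, 100):
    b.on_data(tsn)
print(b.take_dups_for_sack())   # prints [100]

for tsn in (100, 101, 101, 101):
    b.on_data(tsn)
print(b.take_dups_for_sack())   # prints [100, 101, 101]
```

The second SACK lists 100 once and 101 twice, exactly as in the flow on fig. 4.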
Ordered and unordered data delivery
Usually SCTP delivers the user's data in a specific stream 'in order', which means that the receiver gets the messages in the same order in which they were sent. For example if A sends messages 1, 2 and 3, B will receive them in exactly the same order - 1, 2 and 3, and not, for example, 2, 1, 3. All ordered DATA chunks have their U-bit (in the chunk flags) set to 0, which means ordered delivery, and the 'Stream sequence number' parameter is incremented by one for each new message. You can see the DATA chunk on fig. 1 - the U-bit is set to zero and 'Stream sequence number' is also zero, meaning that this is the first message from the sender on this stream. The sequence number of the next message will be 1 and so on. If the SCTP stack has delivered a DATA chunk with Stream sequence number zero to the user and after that receives a new chunk with sequence number 2, the stack is supposed to hold it until the chunk with sequence number 1 is received and then deliver both to the user in order.
SCTP also supports unordered data delivery. In this case the U-bit is set to 1 and the Stream sequence number parameter is ignored. The receiver is supposed to deliver each unordered chunk to the user as soon as possible, without holding it for any reason.
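Here is a minimal sketch of the receiving side of a single stream, assuming the messages are already reassembled. Ordered messages are held until all previous sequence numbers have arrived, while unordered ones are passed through immediately. The class StreamReceiver and its method on_message are hypothetical names used only for this example. The Stream sequence number is a 16-bit field, so it wraps around after 65535.

```python
class StreamReceiver:
    """Delivery logic for one stream: hold ordered messages, pass unordered."""

    def __init__(self) -> None:
        self.next_ssn = 0   # next stream sequence number to deliver
        self.held = {}      # ssn -> message, waiting for earlier messages

    def on_message(self, ssn: int, message: str,
                   unordered: bool = False) -> list:
        """Return the messages that can be delivered to the user right now."""
        if unordered:
            # U-bit set: the sequence number is ignored, deliver at once.
            return [message]

        self.held[ssn] = message
        ready = []
        while self.next_ssn in self.held:
            ready.append(self.held.pop(self.next_ssn))
            self.next_ssn = (self.next_ssn + 1) % 65536
        return ready


rx = StreamReceiver()
print(rx.on_message(0, "first"))                   # prints ['first']
print(rx.on_message(2, "third"))                   # prints []
print(rx.on_message(7, "urgent", unordered=True))  # prints ['urgent']
print(rx.on_message(1, "second"))                  # prints ['second', 'third']
```

The message with sequence number 2 is held until number 1 arrives, and the unordered message is delivered in between without touching the ordered queue.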
Fragmentation and reassembly of user data
If the user message size exceeds the MTU of the association, SCTP should fragment the message, or return an error if it doesn't support fragmentation. The payload is divided so that the size of each fragment plus the SCTP protocol overhead does not exceed the MTU of the association. Each chunk has its own TSN and the TSNs of the fragments are in sequence. In the chunk flags of the first DATA chunk, the B-bit is set to 1 and the E-bit is set to 0. For the last chunk, the B-bit is set to 0 and the E-bit is set to 1. All chunks between the first and the last one have both bits set to 0. A message which fits in a single DATA chunk has both bits set to 1. The receiver uses the same logic to detect the first and the last chunk of the fragmented message, then reassembles it and delivers it to the user.
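The following Python sketch shows the B-bit and E-bit logic on both sides. It works on simplified (TSN, flags, payload) tuples instead of real chunks. The functions fragment and reassemble and the max_payload parameter are assumptions made for this illustration, where max_payload stands for the MTU minus the protocol overhead. TSN wrap-around is ignored to keep it short.

```python
FLAG_E = 0x01  # ending fragment
FLAG_B = 0x02  # beginning fragment


def fragment(message: bytes, max_payload: int, first_tsn: int) -> list:
    """Split a user message into (tsn, flags, payload) fragments."""
    if not message or max_payload <= 0:
        raise ValueError("need a non-empty message and a positive size")
    pieces = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)]
    fragments = []
    for i, piece in enumerate(pieces):
        flags = 0
        if i == 0:
            flags |= FLAG_B            # first fragment
        if i == len(pieces) - 1:
            flags |= FLAG_E            # last fragment
        fragments.append((first_tsn + i, flags, piece))
    return fragments


def reassemble(fragments: list) -> bytes:
    """Rebuild the user message from all of its fragments, in any order."""
    ordered = sorted(fragments, key=lambda f: f[0])
    tsns = [tsn for tsn, _, _ in ordered]
    if tsns != list(range(tsns[0], tsns[0] + len(tsns))):
        raise ValueError("a fragment is missing")
    if not ordered[0][1] & FLAG_B or not ordered[-1][1] & FLAG_E:
        raise ValueError("first or last fragment is missing")
    return b"".join(payload for _, _, payload in ordered)


parts = fragment(b"abcdefghij", max_payload=4, first_tsn=200)
print(parts)
# prints [(200, 2, b'abcd'), (201, 0, b'efgh'), (202, 1, b'ij')]
print(reassemble(list(reversed(parts))))   # prints b'abcdefghij'
```

A message shorter than max_payload produces a single tuple with flags 3, which is the unfragmented case where both bits are set.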
In this post we reviewed the user data transfer procedures in SCTP, including some error indications like gaps in the transmission and duplicated messages. I hope that by now you have a pretty good idea of how an SCTP association is established and how user messages are transferred. Of course, if you feel you need more information about any of the topics, read the specification yourself. This is the best way to get comfortable with the protocol.
In the following post we will review how SCTP heartbeating and association teardown work. Thanks for reading.
This post is part of my "SCTP in Theory and Practice: A quick introduction to the SCTP protocol and its socket interface in Linux" e-book. If you find the content in this post interesting - I think you will like it.
The book covers two topics - how SCTP works in theory and how to use it in Linux.
The best way to learn how SCTP works is to read and understand its specification - RFC 4960. However this document is not an easy read - the purpose of the document is to describe a full SCTP implementation and contains details which you usually don't need, unless you plan to write your own SCTP stack.
The role of the first five chapters of the book is to give you a structured and easy-to-read explanation of how the different parts of the protocol work. You will see how an SCTP association is established on the network packet level, how data transfer works, how multi-homing is implemented and so on. Additionally each section contains references to the specific sections of RFC 4960 which cover the topics in question. This approach will save you a lot of time reading the document.
The rest of the book focuses on SCTP from a programmer's point of view. You will learn how to write client-server applications in Linux. You will learn the difference between one-to-one and one-to-many style sockets and how to implement multi-homing. Each chapter contains a working client and/or server implementation in C and a line-by-line code review.
All source code and PCAP files used in the book are available as extra content.
Think you will like it? You can buy it on Leanpub. I really appreciate your support!