This might seem weird, but I’m a little confused, and I’d like an explanation, and perhaps an example, of the whole process of TCP data transfer from the moment there is a stream of data that needs to be sent.
I’m currently reading TCP documentation (RFC 9293), and in [Section 3.4](https://www.rfc-editor.org/rfc/rfc9293.html#name-sequence-numbers) it says:
>A fundamental notion in the design is that **every octet of data sent over a TCP connection has a sequence number**
So far, so good. In [Section 3.7](https://www.rfc-editor.org/rfc/rfc9293.html#name-segmentation) it says:
>The term “segmentation” refers to the activity TCP performs when ingesting a stream of bytes from a sending application and packetizing that stream of bytes into TCP segments.
The section later states that the segment’s size varies, but cannot exceed the [maximum segment size](https://www.rfc-editor.org/rfc/rfc9293.html#name-maximum-segment-size-option). I was confused and wondered if the segments are the octets, so I googled which is which. [RFC 879](https://www.ietf.org/rfc/rfc0879.txt) states that:
>The rule must match the default case. If the TCP Maximum Segment Size option is not transmitted then the data sender is allowed to send IP datagrams of maximum size (576) with a minimum IP header (20) and a minimum TCP header (20) and thereby be able to stuff **536 octets** of data into **each TCP segment**.
Meaning that octets go into segments, and since octets have sequence numbers, each segment would carry multiple sequence numbers, one for each octet it contains. However, googling “do TCP segments have sequential numbers” gives me:
>**At offset 32 into the TCP header is the sequence number**. The sequence number is a counter used to keep track of every byte sent outward by a host. If a TCP packet contains 1400 bytes of data, then the sequence number will be increased by 1400 after the packet is transmitted.
That’s from [IBM](https://www.ibm.com/docs/en/zos-basic-skills?topic=4-transmission-control-protocol-tcp). Does that mean the segments themselves also have their own sequence numbers?
My current understanding of the whole process is this:
1. A stream of data needs to be sent
2. TCP divides them into octets
3. TCP then packages them into segments
4. Segments get sequence numbers
5. Segments are sent sequentially.
Is my understanding correct? Please write an example if possible.
Segments/packets are the lowest level in TCP, and they are the only unit that carries a sequence number. A segment is a TCP header plus data. So if you send 1000 bytes and the maximum segment size is 512, it gets sent as two segments/packets, each with its own sequence number.
The sequence number is used by the receiving TCP system to help reassemble packets in order. They don’t necessarily have to be sent out sequentially (and may not arrive sequentially).
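A minimal sketch of that idea (hypothetical helper functions, not real TCP code): split a 1000-byte stream into segments of at most 512 bytes, stamp each segment with one sequence number (that of its first byte), and reassemble by sorting on that number even when segments arrive out of order.

```python
MSS = 512  # maximum segment size assumed for this example

def segmentize(data, isn=0):
    """Return (sequence_number, payload) pairs covering `data`."""
    segments = []
    seq = isn
    for off in range(0, len(data), MSS):
        chunk = data[off:off + MSS]
        segments.append((seq, chunk))
        seq += len(chunk)  # next segment starts where this one ended
    return segments

stream = bytes(i % 256 for i in range(1000))   # the application's byte stream
segments = segmentize(stream)
print([(seq, len(chunk)) for seq, chunk in segments])  # [(0, 512), (512, 488)]

# Segments may arrive in any order; the sequence numbers restore it.
arrived = list(reversed(segments))
reassembled = b"".join(chunk for seq, chunk in sorted(arrived))
assert reassembled == stream
```

Note that each segment carries exactly one sequence number; the positions of the remaining bytes in it are implied.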
An octet and a byte are the same thing, so everywhere you read ‘octet’ you can replace it with ‘byte’ and it is still valid. The data from the application is a stream of bytes, which then gets split into segments by the TCP driver. Each byte in the stream is identified by its sequential sequence number. Each TCP packet contains the sequence number of the latest byte that has been sent, as well as the latest byte that has been received. So each side knows at all times how much data has been sent and received by the other end. This allows it to identify any sequences of data that have been lost and resend them.
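A toy sketch of that bookkeeping, under this answer's convention (each side tracks the latest byte it has sent and the latest byte it has received, and every packet carries both counters; the `Endpoint` class is made up for illustration):

```python
# Each side keeps two counters; every packet's header carries both
# (roughly, the SEQ and ACK fields).

class Endpoint:
    def __init__(self):
        self.sent = 0        # latest byte number I have sent
        self.received = 0    # latest byte number I have received

    def send(self, payload, peer):
        seq, ack = self.sent, self.received   # header fields for this packet
        self.sent += len(payload)
        peer.received += len(payload)         # assume the packet arrives
        return seq, ack

a, b = Endpoint(), Endpoint()
print(a.send(b"x" * 100, b))   # (0, 0)
print(b.send(b"y" * 50, a))    # (0, 100) — b has seen 100 of a's bytes
print(a.send(b"x" * 10, b))    # (100, 50) — a has sent 100, seen 50
```

Because both counters travel in every packet, either side can spot a gap between what it sent and what the other end reports having received.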
Every TCP packet has 2 main sequence numbers in its header. One is labelled “SEQ” and the other is labelled “ACK”. These represent “my sequence” and “your sequence” numbers, respectively, as seen from the point of view of the sender. When I send data, my sequence number goes up, but each packet carries only one SEQ value: the sequence number of the data I have already sent to you, not including what’s in this packet’s payload.
Suppose I send you a packet with 1400 bytes of data, and the value of SEQ is exactly 100,000. The first byte has sequence number 100,001, the second byte has sequence number 100,002, and so on. These are just implied. From your standpoint, you acknowledge that you received all these bytes by sending me a single TCP packet, containing 0 bytes of actual data (unless you have something to send to me), and with an ACK value of 101,400. This tells me you’ve seen my sequence numbers up to 101,400 and I can update my records and clear out some memory of data that was waiting for acknowledgement.
If I sent another 1400 bytes, this second packet will have an SEQ value of 101,400 and I will expect to receive an ACK value of 102,800 to know you got both packets.
This allows for a few simple tricks. If I sent you 2 packets rapidly of 1400 bytes each, you may acknowledge both at once in a single packet whose ACK value is 102,800.
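The arithmetic above can be written out directly. Note this follows the answer's convention (SEQ = bytes already sent before this packet, ACK = latest byte number seen), which differs by one from RFC 9293's "sequence number of the first data byte":

```python
seq = 100_000                      # my SEQ before the first packet

# First packet carries 1400 bytes, implicitly numbered 100,001..101,400.
first_byte, last_byte = seq + 1, seq + 1400
assert (first_byte, last_byte) == (100_001, 101_400)

ack = last_byte                    # your data-less reply carries ACK = 101,400

# A second 1400-byte packet starts where the first left off...
seq2 = ack                         # SEQ = 101,400
expected_ack = seq2 + 1400         # ...and a single ACK of 102,800 covers both
print(seq2, expected_ack)          # 101400 102800
```

This is why one ACK can acknowledge several packets at once: it is just a running byte count, not a per-packet receipt.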
TCP is a “stream” protocol. Data comes as a sequence of bytes to the application, but the edges of packets don’t mean anything. So if TCP transmitted 2 different packets each 200 bytes long, and both got lost on the network, the sender is completely allowed to retransmit both as a single 400-byte packet when it tries again. In either case, the receiver might acknowledge either 200 or 400 bytes at a time, depending on whether it eventually received the original 200-byte packets or just the combined 400-byte packet. It doesn’t matter; the sender has to be able to deal with that.
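A quick sketch of why packetization doesn't matter to the receiver: the ACK is a cumulative byte count, so two 200-byte packets and one combined 400-byte retransmission end in the same place (the `cumulative_acks` helper is invented for illustration).

```python
def cumulative_acks(packets):
    """Return the ACK value sent after each in-order packet arrives."""
    acked, acks = 0, []
    for payload in packets:
        acked += len(payload)
        acks.append(acked)
    return acks

data = bytes(400)   # 400 bytes of stream data

print(cumulative_acks([data[:200], data[200:]]))  # [200, 400]
print(cumulative_acks([data]))                    # [400]
# Either way, the sender eventually sees all 400 bytes acknowledged.
```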
The “maximum segment size” specifies the largest amount of data a TCP packet is allowed to carry. Typically it’s the network MTU (often 1500 bytes) minus 40: 20 bytes for the IPv4 header and 20 for the TCP header, leaving 1460 bytes of data. If a packet is too big, it either won’t arrive at all or a router may fragment it into multiple smaller packets, but the re-assembly process sucks. It’s better for TCP to send more, smaller packets instead. So it needs to know what the right size would be.
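As a back-of-the-envelope check (assuming minimal 20-byte IPv4 and TCP headers with no options), the MSS arithmetic is just subtraction, and the same calculation reproduces RFC 879's 536-byte default:

```python
MTU = 1500          # common Ethernet MTU, in bytes
IPV4_HEADER = 20    # minimal IPv4 header
TCP_HEADER = 20     # minimal TCP header

mss = MTU - IPV4_HEADER - TCP_HEADER
print(mss)          # 1460 data bytes per segment

# RFC 879's default: a 576-byte IP datagram minus both minimal headers.
assert 576 - IPV4_HEADER - TCP_HEADER == 536
```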
Segments are usually sent sequentially, since that’s the order in which data arrives from the application. Also, because of how most systems respond to lost data, and because “out of order” data can cause confusion, it is generally in the sender’s best interest to send data in order and hope it arrives in order. It’s best for the receiver, and least likely to confuse either party when some data does get lost in transit.
“Octet” is basically another word for “byte”. It literally means “8 thingies that are related”. When a large amount of data is sent through the TCP/IP stack, it is indeed chopped up into segments of appropriate size. There isn’t much “division into octets” involved, because that’s how the driver *gets* the data: usually as an array of bytes in memory, straight from the OS or whatever program.
> every octet of data sent over a TCP connection has a sequence number
Taken literally, that would be stupid and double the size of the message for no purpose. The sequence number is a feature of one whole TCP segment, and marks the number of the first byte it contains. The rest of the bytes in the packet follow sequentially, so there is no need to explicitly attach a number to each; the sequence number and the length together identify exactly where the chunk of data in question should go when reassembling the message. There is no such thing as partial packet loss: if something is wrong with the packet, the entire thing is requested to be re-sent.
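A sketch of that placement logic (the `reassemble` helper is invented for illustration): one sequence number per segment is enough, because the receiver drops each chunk at offset `seq - isn` and the rest of its bytes line up implicitly.

```python
def reassemble(isn, segments, total_len):
    """Place each (seq, chunk) at its offset relative to the initial
    sequence number `isn` and return the rebuilt byte stream."""
    buf = bytearray(total_len)
    for seq, chunk in segments:
        offset = seq - isn                 # where this chunk starts
        buf[offset:offset + len(chunk)] = chunk
    return bytes(buf)

isn = 1000                                 # initial sequence number
message = b"hello, world"
out_of_order = [(1006, b" world"), (1000, b"hello,")]  # arrival order

assert reassemble(isn, out_of_order, len(message)) == message
```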