TCP Protocol
Window size
- Window scale factor - a value from 0 to 14
- Multiplier - 2^Scale factor
- Calculated window size - (Window size) * 2^Scale factor
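- Maximum segment size (MSS) - each side announces its own value in its SYN, so it may differ between client and server (it is not negotiated)
If Wireshark shows a window scale factor of -1, the capture missed the handshake, so the scale factor is unknown and the calculated window size cannot be trusted.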
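The calculation above can be sketched in a few lines of Python. The function name and the sample values are illustrative assumptions, not fields taken from a real capture.

```python
def calculated_window(window_field: int, scale_factor: int) -> int:
    """Return the effective receive window in bytes.

    window_field is the raw 16-bit Window size from the TCP header,
    scale_factor is the Window scale option seen in the handshake.
    """
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= scale_factor <= 14:
        raise ValueError("window scale factor must be between 0 and 14")
    # Multiplying by 2^scale_factor is the same as a left shift.
    return window_field << scale_factor


print(calculated_window(502, 7))     # 64256
print(calculated_window(65535, 14))  # 1073725440, the largest possible window
```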
Sequence numbers
Sequence numbers help to identify retransmissions.
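A SYN packet carries no payload but still consumes one sequence number (the "ghost byte"), so the peer acknowledges it with the initial sequence number plus one. FIN behaves the same way.
If a missing packet arrives sooner than the initial round-trip time (iRTT) after the gap was noticed, Wireshark marks it as Out-Of-Order. Otherwise, it is marked as a retransmission. When the handshake was not captured and the iRTT is unknown, Wireshark falls back to a default threshold of 3 ms.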
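As a rough illustration of the ghost byte, the sketch below computes the acknowledgement number a receiver is expected to send back for a segment. The function name and arguments are hypothetical; the only rule it encodes is that SYN and FIN each consume one sequence number on top of the payload length, and that sequence numbers wrap at 2^32.

```python
SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around


def next_expected_ack(seq: int, payload_len: int,
                      syn: bool = False, fin: bool = False) -> int:
    """Return the ACK number that acknowledges this whole segment."""
    # SYN and FIN carry no data but each takes one sequence number.
    consumed = payload_len + int(syn) + int(fin)
    return (seq + consumed) % SEQ_SPACE


print(next_expected_ack(1000, 0, syn=True))  # 1001
print(next_expected_ack(1001, 500))          # 1501
```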
Receive window
The TCP receive window is the free space left in the receiver’s buffer for data the application has not read yet. The receiver advertises it in every segment.
A zero window means the receiver cannot accept more data. It is a symptom of a performance issue, a stuck process, or an internal application problem, and usually shows up during a large file transfer.
The optimal receive window is the bandwidth-delay product, that is, the bandwidth multiplied by the latency:
- 10 Mbps with 20 ms latency
Optimal receive window = 10 Mbps * 0.02 s / 8 = 25 KB
- 10 Gbps with 100 ms latency
Optimal receive window = 10 Gbps * 0.1 s / 8 = 125 MB
Wireshark marks packets that repeat the same Ack number but advertise a different Window size as [TCP Window Update].
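The two calculations above can be reproduced with a small helper. This is a minimal sketch: the function name is made up for illustration, and it uses integer bits per second and milliseconds to avoid floating-point rounding.

```python
def optimal_receive_window(bandwidth_bps: int, latency_ms: int) -> int:
    """Return the bandwidth-delay product in bytes."""
    # bits/s * ms / 1000 gives bits in flight; dividing by 8 gives bytes.
    return bandwidth_bps * latency_ms // 8000


print(optimal_receive_window(10_000_000, 20))       # 25000 bytes = 25 KB
print(optimal_receive_window(10_000_000_000, 100))  # 125000000 bytes = 125 MB
```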
Retransmission
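Retransmission happens when the sender does not receive an ACK for a sequence number before the retransmission timer (RTO) expires.
Fast retransmission does not wait for the timer. After three Dup ACKs, the server immediately retransmits the missing packet. SACK is not required for this, but it helps: it tells the server exactly which blocks arrived, so only the missing ones are resent.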
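The three-Dup-ACK rule can be sketched as a counter over the ACK numbers the sender sees. This is a simplified illustration with a hypothetical function name: a real TCP stack also checks that a duplicate ACK carries no data and does not change the window, which is ignored here.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which a fast retransmit would fire.

    acks is the list of ACK numbers in the order the sender receives them.
    """
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire once, exactly when the threshold is reached.
            if dup_count == threshold:
                triggers.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return triggers


# One original ACK for 2001 followed by three duplicates.
print(fast_retransmit_points([1001, 2001, 2001, 2001, 2001, 5001]))  # [2001]
```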
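Spurious retransmission happens when the server retransmits data that the client has already acknowledged, typically because the ACK was lost or delayed on its way to the server. We see it when we capture traffic on the client: the original packet, the client’s ACK, and then the same data arriving again.
Hundreds of duplicate acknowledgment packets for the same loss point to a high RTT: the server keeps sending for a full round trip before it learns about the gap, and every packet sent in that time produces another Dup ACK.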
2 major causes for retransmissions:
- Congested Network Connections - Discards and Buffer Bloat
- Link level errors - Hardware issues, cabling.
iPerf3 - a tool to test the network throughput
Congestion control
Congestion control is sender-side throttling, the counterpart of the receiver-side throttling done with the receive window. Even if the client advertises a large receive window, the server still has to work out how much data the network path can carry.
The congestion window (cwnd) is measured in MSS-sized segments or in bytes in flight. It is never advertised to the peer and changes constantly with the network load.
The maximum amount of data in flight is the smaller of the congestion window and the receive window.
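The rule about data in flight can be written as a one-line check. The function and variable names below are illustrative assumptions, with all values in bytes.

```python
def sendable_bytes(cwnd: int, rwnd: int, bytes_in_flight: int) -> int:
    """Return how many more bytes the sender may put on the wire right now."""
    # The effective send window is capped by both the network (cwnd)
    # and the receiver (rwnd).
    effective_window = min(cwnd, rwnd)
    return max(0, effective_window - bytes_in_flight)


print(sendable_bytes(cwnd=14600, rwnd=65535, bytes_in_flight=10000))  # 4600
print(sendable_bytes(cwnd=90000, rwnd=65535, bytes_in_flight=65535))  # 0
```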
Different algorithms tweak the following components:
- Initial window - how many segments we send at the beginning
- Slow start - how we increase the number of MSS
- Congestion avoidance - what happens above the slow start threshold (ssthresh), where the sender switches from fast growth to cautious probing
- Fast recovery - how the sender reacts to packet loss
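How the four components interact is easier to see in a toy simulation. The sketch below is a heavily simplified Reno-style model, not the behavior of any specific algorithm listed in these notes: cwnd is counted in MSS per round trip, the initial window of 10 and the starting ssthresh of 64 are assumed values, and loss is injected at chosen rounds.

```python
def simulate_cwnd(rounds, loss_rounds, initial_window=10, ssthresh=64):
    """Return the cwnd (in MSS) at the start of each round trip."""
    cwnd = initial_window
    history = []
    for rnd in range(rounds):
        history.append(cwnd)
        if rnd in loss_rounds:
            # Fast recovery: halve the window and continue from there.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: double every round trip, up to ssthresh.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: grow by one MSS per round trip.
            cwnd += 1
    return history


print(simulate_cwnd(8, loss_rounds={4}))
# [10, 20, 40, 64, 65, 32, 33, 34]
```

The output shows exponential growth until ssthresh, linear growth after it, and a halved window after the loss in round 4.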
Popular algorithms:
- NewReno - still in use, performs poorly on high-bandwidth networks
- CTCP (Compound TCP) - Windows 7
- Cubic - Windows 10, macOS, and the Linux default; aggressive utilization
- Westwood - wireless and other lossy networks
- BBR - developed by Google, available on Linux
Congestion detection signals:
- Packet loss
- Latency increase