The Misconception That TCP Lets You Receive in the Same Units You Send — Designing Reception Around a Byte Stream

Tags: TCP, Socket, Network, .NET, C#, Protocol Design, Operations, Legacy Asset Reuse

1. What to Understand First

There is a very common misconception in TCP programming.

It is this:

For every unit the sender passes to Send / Write, the receiver can read that same unit with Receive / Read.

For example, suppose the sender transmits like this:

Send("LOGIN\n")
Send("GET /items\n")
Send("QUIT\n")

You then assume the receiver can read it back in three calls:

Receive() => "LOGIN\n"
Receive() => "GET /items\n"
Receive() => "QUIT\n"

But with TCP, that is not guaranteed.

In reality, any of the following can happen:

Receive() => "LOGIN\nGET /items\nQUIT\n"
Receive() => "LOG"
Receive() => "IN\nGET /ite"
Receive() => "ms\nQUIT\n"
Receive() => "LOGIN\nGET /items\n"
Receive() => "QUIT"
Receive() => "\n"

All of these are perfectly normal for TCP.

What TCP guarantees, roughly speaking, is that “the bytes you send arrive in order, without duplication, without loss.” What it does not guarantee is that “the units the application passed to Send are preserved as the units the receiver gets from Receive.”

That is why any application using TCP needs a mechanism on the receiving side to determine “of the bytes I just received, where does one message start and end?”

This is called framing in the application protocol.

This article lays out the misconceptions that arise around TCP’s Send and Receive, and the correct way to handle them in .NET / C#.

The code in this article is published on GitHub as a complete buildable, runnable sample set (a library, a loopback TCP demo, and unit tests reproducing fragmentation, coalescing, and mid-stream disconnection).

tcp-send-receive-message-framing - komurasoft-blog-samples (GitHub)

2. TCP Carries Bytes, Not “Messages”

The first thing is to stop thinking of TCP as a message queue.

TCP treats the data handed to it by the application as one continuous byte stream.

Even if the sender calls Send three times like this:

Send("ABC")
Send("DEF")
Send("GHI")

From TCP’s point of view, this is ultimately a flow of 9 bytes:

ABCDEFGHI

The application-level boundaries

ABC | DEF | GHI

are not preserved anywhere in that flow.

The receiver, at some moment, reads “whatever is in the receive buffer right now.” So the reception results can look like any of these:

| Sender’s calls | Example of what the receiver sees |
| --- | --- |
| Send("ABC"), Send("DEF") | One Receive() returning "ABCDEF" |
| Send("ABCDEF") | Two Receive() calls returning "AB", "CDEF" |
| Send("ABC"), Send("DEF"), Send("GHI") | Three Receive() calls returning "A", "BCDEFG", "HI" |
| Send("あ") (one character, 3 bytes in UTF-8) | Two Receive() calls, with the bytes of the character split between them |

The important point is that none of this is “abnormal.”

Many of the bugs that look like “received data is occasionally missing,” “multiple messages get glued together,” or “text gets garbled” are not TCP malfunctions — they are design mistakes where the receiver treats TCP as message-oriented.

3. Why It Looks Like You Can Receive per Send

The reason this misconception persists is that, in local environments and with small data, things often happen to look as expected.

In development environments, these conditions tend to line up:

  • Client and server are on the same machine or a nearby network
  • The data volume is small
  • The peer reads promptly
  • CPU and network have headroom
  • Testing is manual, with little timing jitter
  • Receive happens immediately after Send

Under these conditions, it can look as if one Send maps to one Receive.

But conditions change in production:

  • Data accumulates in OS send/receive buffers
  • Multiple small sends get batched together
  • Large sends get split by TCP segments or receive buffer limits
  • The receiving thread’s scheduling is delayed
  • Layers like TLS, proxies, load balancers, and VPNs sit in between
  • Network latency and congestion occur
  • The Nagle algorithm and delayed ACK come into play

The result is the nasty kind of bug: “it worked in development but occasionally breaks in production.”

In network programming, this “happens to work” state is the most dangerous one.

4. Common Fragile Reception Code

For example, code like this is dangerous:

byte[] buffer = new byte[4096];
int read = await stream.ReadAsync(buffer, cancellationToken);

if (read == 0)
{
    // The peer closed the connection normally
    return;
}

string message = Encoding.UTF8.GetString(buffer, 0, read);
await HandleMessageAsync(message, cancellationToken);

This code assumes “one ReadAsync yields one message” — but TCP does not uphold that assumption.

There are three main problems.

First, a single message can be fragmented.

Sent:      {"command":"login","user":"komura"}\n
Received1: {"command":"login",
Received2: "user":"komura"}\n

In this case, trying to parse Received1 alone as JSON fails.

Second, multiple messages can be coalesced.

Sent1:    {"command":"login"}\n
Sent2:    {"command":"get"}\n
Received: {"command":"login"}\n{"command":"get"}\n

In this case, trying to parse it as a single JSON document fails.

Third, the split can land in the middle of a character encoding boundary.

In UTF-8, a single character can span multiple bytes. There is no guarantee that the boundary of a ReadAsync coincides with a character boundary.

So if you immediately convert every received byte chunk with Encoding.UTF8.GetString, the result can be corrupted whenever a multibyte character is split.

The rule is not “stringify as soon as it arrives,” but “accumulate bytes until a message boundary is identified, and decode only once a full message is assembled.”
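The difference between the two orders can be reproduced without any socket. The following is a minimal sketch, not part of the published sample set: Utf8SplitDemo and its method names are hypothetical, and the split position stands in for wherever a ReadAsync boundary happens to fall.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8SplitDemo
{
    // Decodes each chunk on its own, the way the fragile code above does.
    // If splitAt falls inside a multibyte character, both halves are invalid
    // UTF-8 on their own and decode to replacement characters (U+FFFD).
    public static string DecodePerChunk(byte[] bytes, int splitAt)
    {
        string first = Encoding.UTF8.GetString(bytes, 0, splitAt);
        string second = Encoding.UTF8.GetString(bytes, splitAt, bytes.Length - splitAt);
        return first + second;
    }

    // Accumulates both chunks as bytes first and decodes only once,
    // so the position of the split no longer matters.
    public static string DecodeAfterAccumulating(byte[] bytes, int splitAt)
    {
        using var accumulated = new MemoryStream();
        accumulated.Write(bytes, 0, splitAt);
        accumulated.Write(bytes, splitAt, bytes.Length - splitAt);
        return Encoding.UTF8.GetString(accumulated.ToArray());
    }
}
```

With the 3 bytes of "あ" split after the first byte, DecodePerChunk returns a string containing U+FFFD instead of the original character, while DecodeAfterAccumulating returns "あ" unchanged.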

5. Never Use DataAvailable to Detect End of Message

Code like this is also common:

var ms = new MemoryStream();
byte[] buffer = new byte[4096];

while (stream.DataAvailable)
{
    int read = await stream.ReadAsync(buffer, cancellationToken);
    if (read == 0)
    {
        break;
    }

    ms.Write(buffer, 0, read);
}

byte[] message = ms.ToArray();

This is dangerous too. What DataAvailable represents is “at this instant, is there readable data in the local receive buffer” — it does not mean an application-level message has completed.

For example, if a message is 100 bytes, DataAvailable may become true the moment the first 40 bytes arrive, and may temporarily turn false right after you read those 40 bytes. The remaining 60 bytes might arrive a moment later.

If you interpret DataAvailable == false as “end of message” here, you end up processing a partial message as a whole one.

DataAvailable may have uses for optimizing read loops or non-blocking-style checks, but it is safer never to use it for protocol boundary decisions.

6. The Correct Mindset: Separate “Receiving” from “Interpreting”

TCP reception becomes much easier to design when you separate these two concerns:

Receiving:    read the bytes arriving from TCP and append them to a buffer
Interpreting: carve one application-level message out of the buffer

Receive / Read is purely a “read bytes” operation. Where one message ends, by contrast, must be decided by the application protocol.

The four standard approaches:

| Approach | Description | Suited for |
| --- | --- | --- |
| Fixed length | Every message is a fixed number of bytes | Legacy equipment, binary telegrams, control systems |
| Delimiter | A message runs until a specific byte sequence such as \n | Commands, logs, NDJSON, simple protocols |
| Length prefix | Put the payload length first, then read exactly that many bytes | Binary, JSON, MessagePack, Protocol Buffers, etc. |
| Self-describing format | The format itself carries length or terminators, like HTTP’s Content-Length or chunked encoding | Existing protocols, communication needing extensibility |

Personally, when building a custom protocol, the length prefix approach is the first one I consider. The payload can contain newlines or arbitrary binary, the receiver implementation is unambiguous, and it is easy to enforce a maximum size.

7. Length Prefix Basics

In the length prefix approach, messages take this form:

[4-byte payload length][payload]

For example, if the payload is UTF-8 JSON and is 31 bytes long, you send:

00 00 00 1F 7B 22 63 6F 6D 6D 61 6E 64 ...
^---------^ ^------------------------------^
   length                payload

The receiver processes in this order:

  1. Read exactly 4 bytes
  2. Extract the payload length from those 4 bytes
  3. Validate that the length is not invalid
  4. Read exactly that many payload bytes
  5. Process the completed payload as one message
  6. Read the next frame

The crucial point here: even the 4-byte header can be fragmented.

Received1: 00 00
Received2: 00 1F 7B 22 63 ...

So being “just the header” does not mean one Read will yield 4 bytes.

Same for the payload. It is entirely normal for Read to return fewer bytes than requested. When the required byte count is known, you must write a loop that reads until it is fully satisfied.

8. Reception in .NET: A Length Prefix Implementation

Here is an example of reading length-prefixed frames in .NET / C#.

The first 4 bytes are interpreted as a big-endian int payload length.

using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class LengthPrefixedProtocol
{
    private const int HeaderSize = 4;
    private const int MaxPayloadSize = 1024 * 1024; // 1 MiB. Choose based on your use case

    public static async ValueTask<byte[]?> ReadFrameAsync(
        Stream stream,
        CancellationToken cancellationToken)
    {
        byte[] header = new byte[HeaderSize];

        int headerBytes = await ReadUntilFullOrEndAsync(
            stream,
            header,
            cancellationToken);

        if (headerBytes == 0)
        {
            // The peer terminated normally before the next frame started, not mid-frame
            return null;
        }

        if (headerBytes != HeaderSize)
        {
            throw new EndOfStreamException("Frame header was truncated.");
        }

        int payloadLength = BinaryPrimitives.ReadInt32BigEndian(header);

        if (payloadLength < 0 || payloadLength > MaxPayloadSize)
        {
            throw new InvalidDataException(
                $"Invalid payload length: {payloadLength} bytes.");
        }

        byte[] payload = new byte[payloadLength];

        int payloadBytes = await ReadUntilFullOrEndAsync(
            stream,
            payload,
            cancellationToken);

        if (payloadBytes != payloadLength)
        {
            throw new EndOfStreamException("Frame payload was truncated.");
        }

        return payload;
    }

    private static async ValueTask<int> ReadUntilFullOrEndAsync(
        Stream stream,
        Memory<byte> buffer,
        CancellationToken cancellationToken)
    {
        int totalRead = 0;

        while (totalRead < buffer.Length)
        {
            int read = await stream.ReadAsync(
                buffer[totalRead..],
                cancellationToken);

            if (read == 0)
            {
                break;
            }

            totalRead += read;
        }

        return totalRead;
    }
}

The caller looks like this:

while (true)
{
    byte[]? payload = await LengthPrefixedProtocol.ReadFrameAsync(
        stream,
        cancellationToken);

    if (payload is null)
    {
        // The peer disconnected cleanly at a frame boundary
        break;
    }

    await HandleMessageAsync(payload, cancellationToken);
}

With this implementation, it does not matter how many bytes each ReadAsync returns. Even if bytes come back one at a time, the loop continues until the header and payload are fully read.

Conversely, if data for several messages has accumulated in the OS receive buffer, only the first frame is carved out according to its payload length, and the next loop iteration reads the next frame.

Note that on .NET 7 and later, Stream.ReadExactly / ReadExactlyAsync are available, so you can delegate the read-until-complete logic to the standard API. These methods throw EndOfStreamException when the stream ends before the buffer is filled, so you still need to design how the application distinguishes the two kinds of connection termination: normal termination before a frame starts versus abnormal termination mid-frame.
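One possible way to combine the standard APIs with that distinction is sketched below. It assumes .NET 7 or later; ReadExactlyFraming is a hypothetical name, and the 1 MiB limit simply mirrors the example above. The header is read with Stream.ReadAtLeastAsync and throwOnEndOfStream: false so the byte count can be inspected, and the payload is read with ReadExactlyAsync, which throws if the stream ends early.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class ReadExactlyFraming
{
    private const int HeaderSize = 4;
    private const int MaxPayloadSize = 1024 * 1024; // assumed limit

    // Returns null when the stream ends cleanly at a frame boundary.
    public static async ValueTask<byte[]?> ReadFrameAsync(
        Stream stream,
        CancellationToken cancellationToken = default)
    {
        byte[] header = new byte[HeaderSize];

        // With throwOnEndOfStream: false, the return value tells us how many
        // header bytes arrived before the end of the stream.
        int headerBytes = await stream.ReadAtLeastAsync(
            header, HeaderSize, throwOnEndOfStream: false, cancellationToken);

        if (headerBytes == 0)
        {
            return null; // clean end before the next frame started
        }

        if (headerBytes < HeaderSize)
        {
            throw new EndOfStreamException(
                $"Frame header was truncated. expected={HeaderSize} actual={headerBytes}");
        }

        int payloadLength = BinaryPrimitives.ReadInt32BigEndian(header);
        if (payloadLength < 0 || payloadLength > MaxPayloadSize)
        {
            throw new InvalidDataException(
                $"Invalid payload length: {payloadLength} bytes.");
        }

        byte[] payload = new byte[payloadLength];

        // Throws EndOfStreamException if the stream ends mid-payload.
        await stream.ReadExactlyAsync(payload, cancellationToken);
        return payload;
    }
}
```

The trade-off is that the exception thrown by ReadExactlyAsync does not say how many payload bytes had arrived, so the hand-written loop remains useful when that detail is wanted in logs.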

9. The Sending Side

The sender writes following the same frame format.

using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class LengthPrefixedProtocolWriter
{
    private const int HeaderSize = 4;
    private const int MaxPayloadSize = 1024 * 1024;

    public static async ValueTask WriteFrameAsync(
        Stream stream,
        ReadOnlyMemory<byte> payload,
        CancellationToken cancellationToken)
    {
        if (payload.Length > MaxPayloadSize)
        {
            throw new InvalidDataException(
                $"Payload is too large: {payload.Length} bytes.");
        }

        byte[] header = new byte[HeaderSize];
        BinaryPrimitives.WriteInt32BigEndian(header, payload.Length);

        await stream.WriteAsync(header, cancellationToken);
        await stream.WriteAsync(payload, cancellationToken);
    }
}

This code calls WriteAsync separately for the header and the payload. Here is where another misconception easily arises: even though the sender writes the header and payload in two WriteAsync calls, the receiver will not necessarily read them in two pieces.

The receiver might see this:

Read() => [4-byte header + part of the payload]
Read() => [rest of the payload]

Or perhaps this:

Read() => [first 2 bytes of the header]
Read() => [last 2 bytes of the header + entire payload + the next frame's header]

That is exactly why the receiver decides based on “how many bytes have been read according to the frame format,” not “how many times Read was called.”

10. When Using Socket.Send Directly, the Sender Must Check the Return Value Too

If you are using NetworkStream.Write / WriteAsync, you can treat it as an API that either writes the whole specified range or fails with an exception, so no retry loop is needed on your side.

When using Socket.Send directly, however, the return value demands attention.

Socket.Send returns “the number of bytes that were sent.” Especially with non-blocking sockets, it can succeed having sent fewer bytes than requested.

So if you use Socket.Send directly, the sender also needs a “repeat until everything is sent” loop.

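using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Wrapper class so the method compiles on its own; the class name is arbitrary.
public static class SocketSendHelper
{
    public static async ValueTask SendAllAsync(
        Socket socket,
        ReadOnlyMemory<byte> buffer,
        CancellationToken cancellationToken)
    {
        while (!buffer.IsEmpty)
        {
            // SendAsync may accept fewer bytes than requested
            int sent = await socket.SendAsync(
                buffer,
                SocketFlags.None,
                cancellationToken);

            if (sent == 0)
            {
                throw new IOException("Socket was closed while sending data.");
            }

            // Drop the bytes that were sent and retry with the remainder
            buffer = buffer[sent..];
        }
    }
}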

Be aware, though, that “sent” here does not mean “the peer application has processed that message.” Success of the send API is distinct from a success response at the application protocol level.

If, in business terms, you need to confirm “the order was accepted,” “the file was saved,” or “the command was executed,” you must define an ACK or response message in your protocol — TCP send success is not it.

11. Caveats for the Delimiter Approach

Text protocols sometimes use newline delimiters.

LOGIN komura secret\n
GET item-001\n
QUIT\n

This approach is easy to understand and fits well with logs and command formats.

But you must take care of the following:

  • Define escaping rules for when the delimiter appears inside the payload
  • Decide how to handle \r\n versus \n
  • Decide the maximum length of a line
  • Do not buffer unboundedly while waiting for a delimiter
  • Make sure a split mid-UTF-8-character does not corrupt anything

In particular, avoid this code:

int read = await stream.ReadAsync(buffer, cancellationToken);
string text = Encoding.UTF8.GetString(buffer, 0, read);

foreach (string line in text.Split('\n'))
{
    await HandleLineAsync(line, cancellationToken);
}

This code accounts neither for the possibility that the received range ends mid-line, nor for the possibility of a split in the middle of a UTF-8 character.

If you use newline delimiting, do at least one of two things: accumulate bytes, search for the newline byte, and decode only once a full line is assembled; or use an API that reads lines over a stream, such as StreamReader.ReadLineAsync.

Even with StreamReader.ReadLineAsync, though, you should still design for maximum line length, timeouts, cancellation, and connection termination.
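The "accumulate bytes, search for the newline byte, decode afterwards" option can be sketched as follows. This is an illustration under assumptions, not the published sample code: LineFrameReader is a hypothetical name, the 8 KiB line limit is arbitrary, and a trailing \r is stripped so that both \n and \r\n are accepted.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public sealed class LineFrameReader
{
    private const int MaxLineBytes = 8 * 1024; // assumed limit, excluding the '\n'

    private readonly Stream _stream;
    private readonly byte[] _buffer = new byte[MaxLineBytes + 1];
    private int _count; // number of valid bytes currently held in _buffer

    public LineFrameReader(Stream stream) => _stream = stream;

    // Returns null when the stream ends cleanly between lines.
    public async ValueTask<string?> ReadLineAsync(CancellationToken cancellationToken = default)
    {
        int searched = 0;

        while (true)
        {
            // 0x0A never appears inside a UTF-8 multibyte sequence,
            // so searching the raw bytes for '\n' is safe.
            int newline = Array.IndexOf(_buffer, (byte)'\n', searched, _count - searched);

            if (newline >= 0)
            {
                int lineLength = newline;
                if (lineLength > 0 && _buffer[lineLength - 1] == (byte)'\r')
                {
                    lineLength--; // accept "\r\n" as well as "\n"
                }

                // Decode only now that the whole line is present.
                string line = Encoding.UTF8.GetString(_buffer, 0, lineLength);

                // Keep the leftover bytes as the start of the next line.
                int consumed = newline + 1;
                Array.Copy(_buffer, consumed, _buffer, 0, _count - consumed);
                _count -= consumed;
                return line;
            }

            searched = _count;

            if (_count == _buffer.Length)
            {
                throw new InvalidDataException($"Line exceeds {MaxLineBytes} bytes.");
            }

            int read = await _stream.ReadAsync(_buffer.AsMemory(_count), cancellationToken);

            if (read == 0)
            {
                if (_count == 0)
                {
                    return null; // clean end at a line boundary
                }

                throw new EndOfStreamException("Connection ended in the middle of a line.");
            }

            _count += read;
        }
    }
}
```

Because the buffer has a fixed size, a peer that never sends a newline causes an InvalidDataException instead of unbounded memory growth.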

12. Caveats for the Fixed-Length Approach

With fixed-length telegrams, you decide something like “a message is always exactly 128 bytes.” You see this in older business systems, control systems, and device integration.

The principle is the same with fixed length.

If

1 message = 128 bytes

then the receiver loops until it has read all 128 bytes.

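byte[] message = new byte[128];

// ReadUntilFullOrEndAsync is the read-until-complete helper shown in section 8
// (it must be made accessible to this code)
int read = await ReadUntilFullOrEndAsync(stream, message, cancellationToken);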

if (read != message.Length)
{
    throw new EndOfStreamException("Fixed-length message was truncated.");
}

await HandleMessageAsync(message, cancellationToken);

Here too, a single ReadAsync does not necessarily return 128 bytes.

Fixed length is easy to implement because the boundaries are explicit, but it has drawbacks: handling variable-length data is awkward, future extension is hard, padding is a nuisance, and character encoding conversions change byte counts.

13. Disabling Nagle Does Not Solve the Message Boundary Problem

When you want small data sent immediately, you may consider Socket.NoDelay = true. This disables the Nagle algorithm.

But NoDelay is a setting about send latency and efficiency — “how small sends get batched” — not a setting that “preserves Send units as Receive units.”

In other words, even with NoDelay = true, none of these problems goes away:

  • One Send splitting across multiple Receive calls
  • Multiple Send calls coalescing into one Receive
  • Splits in the middle of a character
  • The receiver being unable to determine message boundaries

NoDelay is meaningful as a latency tuning knob, but it is no substitute for framing.

14. The Same Thinking Applies with TLS and SslStream

When you add TLS with SslStream, the handling from the application’s point of view is fundamentally the same.

TLS has an internal unit called the TLS record, but that is not an application message boundary.

With SslStream.ReadAsync too, there is no guarantee that one call returns exactly one application message.

Therefore, with or without TLS, design one of these at the application layer:

  • Length prefix
  • A delimiter such as newline
  • Fixed length
  • An existing protocol format

TLS is a layer for encryption and authentication — not a layer that creates message boundaries for you.

15. Error Handling to Keep in Mind in the Receive Loop

In TCP reception, it is important to handle not just the happy path but disconnections and premature termination explicitly.

When Read / Receive returns 0 for a non-empty buffer, it generally means the peer has closed its sending side and no more data will arrive.

At the application protocol level, however, you need to distinguish two cases:

| State | Treatment |
| --- | --- |
| 0 bytes before reading the next frame | Can sometimes be treated as normal termination |
| Termination mid-header or mid-payload | A partial telegram: treat as an error |

With the length prefix approach, think of it like this:

Disconnect at a frame boundary:
  may be treated as normal termination

Disconnect after receiving only 2 of the 4 header bytes:
  protocol error

Disconnect after receiving only 60 of a declared 100-byte payload:
  protocol error

Building in this distinction makes log investigation considerably easier.

Instead of just “the peer disconnected,” being able to emit

Frame payload was truncated. expected=100 actual=60

makes it much easier to suspect an abnormal termination on the peer side, a timeout, or a protocol mismatch.
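A read helper that produces exactly that kind of message might look like the sketch below. TruncationReporting and its part parameter (a label such as "header" or "payload") are hypothetical names introduced for this illustration.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class TruncationReporting
{
    // Fills the whole buffer or throws with the expected and actual byte counts.
    // "part" is only a label for the log message, for example "header" or "payload".
    public static async ValueTask ReadFullyOrThrowAsync(
        Stream stream,
        Memory<byte> buffer,
        string part,
        CancellationToken cancellationToken = default)
    {
        int totalRead = 0;

        while (totalRead < buffer.Length)
        {
            int read = await stream.ReadAsync(buffer[totalRead..], cancellationToken);

            if (read == 0)
            {
                throw new EndOfStreamException(
                    $"Frame {part} was truncated. expected={buffer.Length} actual={totalRead}");
            }

            totalRead += read;
        }
    }
}
```

Note that this helper treats every early end of stream as an error, so the "0 bytes at a frame boundary" case still has to be checked by the caller before the header is handed to it.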

16. Always Set a Maximum Size

In the length prefix approach, the payload length comes first.

The danger here is the peer specifying an enormous length.

FF FF FF FF

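Read as an unsigned 32-bit value, this declares a payload of about 4 GiB; read as a signed int, as in the code above, it is -1, which is not a valid array length at all. A header such as 7F FF FF FF is positive even as a signed int and declares about 2 GiB. If you use a declared length directly for array allocation, the application either fails on the spot or tries to allocate a massive amount of memory and becomes unstable.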

So the receiver must always enforce a maximum size.

private const int MaxPayloadSize = 1024 * 1024;

if (payloadLength < 0 || payloadLength > MaxPayloadSize)
{
    throw new InvalidDataException(
        $"Invalid payload length: {payloadLength} bytes.");
}

The maximum is a business decision. For commands, 64 KiB may be plenty; for images or files, you should probably consider a different transfer mechanism or streaming. What matters is never designing for “accept arbitrarily large input in theory.”
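If the header is defined as an unsigned length instead, the negative case disappears and only the upper bound needs checking. The following is a small sketch of that variant; LengthHeaderCheck is a hypothetical name, and the 1 MiB limit is the same assumed value as in the example above.

```csharp
using System;
using System.Buffers.Binary;

public static class LengthHeaderCheck
{
    private const int MaxPayloadSize = 1024 * 1024; // assumed limit

    // Interprets a 4-byte big-endian header as an unsigned length and
    // rejects anything above the limit before any allocation happens.
    public static bool TryGetPayloadLength(ReadOnlySpan<byte> header, out int payloadLength)
    {
        uint declared = BinaryPrimitives.ReadUInt32BigEndian(header);

        if (declared > MaxPayloadSize)
        {
            payloadLength = 0;
            return false;
        }

        // Safe cast: declared is at most MaxPayloadSize here.
        payloadLength = (int)declared;
        return true;
    }
}
```

Whichever interpretation you choose, sender and receiver have to agree on it, and the check has to run before new byte[payloadLength].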

17. In String Protocols, Count Bytes, Not Characters

TCP carries bytes, not strings.

So when you put a payload length into a length-prefixed frame, you normally use the byte count, not the character count.

For example, take this string encoded as UTF-8:

こんにちは

That is 5 characters, but 15 bytes in UTF-8.

If you put 5 as the protocol length, the receiver reads only 5 bytes of payload and cuts off in the middle of a character.

The sender must always compute the length from the encoded byte array.

string json = "{\"message\":\"こんにちは\"}";
byte[] payload = Encoding.UTF8.GetBytes(json);

await LengthPrefixedProtocolWriter.WriteFrameAsync(
    stream,
    payload,
    cancellationToken);

The receiver fully reads the frame payload as bytes before turning it back into a string.

byte[]? payload = await LengthPrefixedProtocol.ReadFrameAsync(
    stream,
    cancellationToken);

if (payload is not null)
{
    string json = Encoding.UTF8.GetString(payload);
    await HandleJsonAsync(json, cancellationToken);
}

In this order, it does not matter if a Read happens to split in the middle of a UTF-8 character.

18. Also Watch for Application-Level Interleaving from Concurrent Writes

One more thing that is surprisingly often overlooked: concurrent writes.

Suppose multiple tasks write frames to the same TCP connection at the same time:

_ = WriteFrameAsync(stream, messageA, cancellationToken);
_ = WriteFrameAsync(stream, messageB, cancellationToken);

Without coordination, application-level interleaving like this can occur:

A's header
B's header
A's payload
B's payload

The receiver, having read A’s header, expects A’s payload next. When B’s header cuts in, the protocol breaks.

So it is safest to serialize writes to a single connection. Use, for example, a SemaphoreSlim or a send queue so that frame-level writes never interleave.

private readonly SemaphoreSlim _sendLock = new(1, 1);

public async ValueTask SendFrameSafelyAsync(
    Stream stream,
    byte[] payload,
    CancellationToken cancellationToken)
{
    await _sendLock.WaitAsync(cancellationToken);

    try
    {
        await LengthPrefixedProtocolWriter.WriteFrameAsync(
            stream,
            payload,
            cancellationToken);
    }
    finally
    {
        _sendLock.Release();
    }
}

TCP preserves byte ordering — so if the application writes interleaved byte sequences from multiple tasks, TCP will faithfully deliver that “interleaved order.”
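The send queue alternative mentioned above can be sketched with System.Threading.Channels. This is one possible shape under assumptions: FrameSendQueue is a hypothetical class, the capacity of 64 is arbitrary, and payload size validation is left out for brevity. Any number of tasks may enqueue, but only the single RunAsync loop ever touches the stream, so a header is always followed by its own payload.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class FrameSendQueue
{
    // Bounded so that producers wait instead of queueing without limit
    // when the connection is slower than they are.
    private readonly Channel<byte[]> _queue = Channel.CreateBounded<byte[]>(
        new BoundedChannelOptions(capacity: 64) { SingleReader = true });

    // Safe to call from multiple tasks concurrently.
    public ValueTask EnqueueAsync(byte[] payload, CancellationToken cancellationToken = default)
        => _queue.Writer.WriteAsync(payload, cancellationToken);

    // Signals that no more frames will be enqueued.
    public void Complete() => _queue.Writer.Complete();

    // The only place that writes to the stream. Run it once per connection.
    public async Task RunAsync(Stream stream, CancellationToken cancellationToken = default)
    {
        byte[] header = new byte[4];

        await foreach (byte[] payload in _queue.Reader.ReadAllAsync(cancellationToken))
        {
            BinaryPrimitives.WriteInt32BigEndian(header, payload.Length);

            // Each write is awaited before the next starts, so frames never interleave.
            await stream.WriteAsync(header, cancellationToken);
            await stream.WriteAsync(payload, cancellationToken);
        }

        await stream.FlushAsync(cancellationToken);
    }
}
```

Compared with the SemaphoreSlim version, callers here do not learn whether their frame was actually written; if that matters, the queue items would need to carry a completion signal.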

19. Deliberately Fragment and Coalesce in Tests

If you test TCP reception the obvious way, you tend to miss the “happens to work” state.

So in tests, deliberately create these patterns:

| Test aspect | Example |
| --- | --- |
| Bytes arrive one at a time | Both header and payload are Read one byte at a time |
| Disconnect mid-header | Only 2 of the 4 header bytes arrive before termination |
| Disconnect mid-payload | Only 60 of a declared 100 payload bytes arrive before termination |
| Multiple frames coalesced | Two frames sit in a single internal buffer |
| Enormous declared size | Send a payload length exceeding the maximum |
| Zero-byte payload | Confirm whether payload length 0 is allowed |
| UTF-8 splits | Byte sequences of Japanese text or emoji are split mid-character |

In unit tests, you do not need real TCP sockets — substituting a Stream that “can only be read in a specified chunk size” makes the reception logic easy to verify.

Integration tests over real TCP are needed too, but first extracting the receive parser as pure logic over a Stream makes everything more testable.
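A Stream that "can only be read in a specified chunk size" takes only a few lines. The class below is a sketch for test code; ChunkedReadStream is a hypothetical name and is not claimed to match the helper in the published sample set. It overrides only the array-based Read, which the base Stream class also uses to serve the span-based and asynchronous overloads.

```csharp
using System;
using System.IO;

// Test helper: serves fixed data, but never more than maxChunk bytes per read.
public sealed class ChunkedReadStream : Stream
{
    private readonly MemoryStream _inner;
    private readonly int _maxChunk;

    public ChunkedReadStream(byte[] data, int maxChunk)
    {
        if (maxChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxChunk));
        }

        _inner = new MemoryStream(data, writable: false);
        _maxChunk = maxChunk;
    }

    // Returns fewer bytes than requested on purpose, and 0 once the data runs out,
    // which looks like the peer closing the connection.
    public override int Read(byte[] buffer, int offset, int count)
        => _inner.Read(buffer, offset, Math.Min(count, _maxChunk));

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();

    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _inner.Dispose();
        }

        base.Dispose(disposing);
    }
}
```

Passing new ChunkedReadStream(frameBytes, 1) to a frame reader exercises the one-byte-at-a-time case, and passing truncated data exercises the mid-header and mid-payload disconnect cases.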

The quality bar for network code is not “it works when sent normally” — it is “it behaves as designed when fragmented, when coalesced, and when cut off mid-stream.”

20. A Checklist for Fixing Existing Code

When reviewing existing TCP code, these angles make problems easy to find:

| Aspect | What to check |
| --- | --- |
| Receive units | Is one Read / Receive being treated as one message? |
| Return values | Is the byte count returned by Read / Receive always used? |
| Accumulation | Are bytes accumulated until a full message is assembled? |
| Boundaries | Is there a rule, such as fixed length, delimiter, or length prefix? |
| Character encoding | Is data stringified before the message is complete? |
| Maximum length | Are lengths and line lengths bounded? |
| Disconnection | Are frame-boundary disconnects distinguished from mid-frame ones? |
| Sending | Is the return value of Socket.Send being ignored? |
| Concurrency | Can writes from multiple tasks to the same connection interleave? |
| Logging | Can expected / actual byte counts be emitted? |
| Tests | Are there tests for fragmentation, coalescing, and mid-stream disconnects? |

The most dangerous code looks like this:

int read = socket.Receive(buffer);
string message = Encoding.UTF8.GetString(buffer);
Handle(message);

There are multiple problems:

  • The value of read is not used
  • The entire buffer is stringified
  • One Receive is treated as one message
  • There is no message boundary
  • Mid-character splits are not considered

At minimum, you need to move to this way of thinking:

Append only the read bytes obtained by Receive to the receive buffer
  ↓
Check whether one frame can be carved out of the buffer per the protocol
  ↓
If it can, process it
  ↓
Keep leftover bytes as the start of the next frame
  ↓
If not enough, wait for the next Receive
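For length-prefixed frames, that flow can be sketched as a small accumulator that is independent of any socket. FrameAccumulator is a hypothetical name, and the 4-byte big-endian header and 1 MiB limit are the same assumptions as in the earlier examples.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public sealed class FrameAccumulator
{
    private const int HeaderSize = 4;
    private const int MaxPayloadSize = 1024 * 1024; // assumed limit

    private byte[] _buffer = new byte[4096];
    private int _count; // number of valid bytes currently held in _buffer

    // Step 1: append only the bytes that Receive actually returned.
    public void Append(ReadOnlySpan<byte> received)
    {
        if (_count + received.Length > _buffer.Length)
        {
            Array.Resize(ref _buffer, Math.Max(_buffer.Length * 2, _count + received.Length));
        }

        received.CopyTo(_buffer.AsSpan(_count));
        _count += received.Length;
    }

    // Steps 2 to 5: carve out one frame if a complete one is present.
    // Returns false when more bytes are needed.
    public bool TryReadFrame(out byte[] payload)
    {
        payload = Array.Empty<byte>();

        if (_count < HeaderSize)
        {
            return false; // header still incomplete
        }

        int length = BinaryPrimitives.ReadInt32BigEndian(_buffer.AsSpan(0, HeaderSize));
        if (length < 0 || length > MaxPayloadSize)
        {
            throw new InvalidDataException($"Invalid payload length: {length} bytes.");
        }

        int frameSize = HeaderSize + length;
        if (_count < frameSize)
        {
            return false; // payload still incomplete
        }

        payload = _buffer.AsSpan(HeaderSize, length).ToArray();

        // Keep leftover bytes as the start of the next frame.
        int leftover = _count - frameSize;
        Array.Copy(_buffer, frameSize, _buffer, 0, leftover);
        _count = leftover;
        return true;
    }
}
```

After each Receive, the caller appends buffer.AsSpan(0, read) and then calls TryReadFrame in a loop until it returns false, since one receive can complete several frames at once. Draining after every append also keeps the internal buffer from growing much beyond one frame.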

21. Conclusion

In TCP communication, you cannot count on receiving in the same units you sent. This is not exceptional behavior — it is fundamental to using TCP.

The key points:

  • TCP provides an ordered byte stream, not messages
  • The units of Send / Write calls are not preserved as the receiver’s Receive / Read units
  • One send can split across multiple receives, and multiple sends can coalesce into one receive
  • The receiver must define message boundaries as part of the application protocol
  • For custom protocols, the length prefix approach is often the easiest to work with
  • Include in your design: loops that read the required byte count to completion, maximum sizes, mid-stream disconnects, character encodings, and concurrent writes
  • NoDelay and DataAvailable are not substitutes for message boundaries

Network code looks easy when you only watch the happy path. In reality, communication only becomes stable once you have decided where to draw the boundaries, how to wait when data is short, how to keep the remainder when there is too much, and how to handle being cut off midway.

If you use TCP, Receive does not return messages — it returns just a portion of a byte stream. The responsibility for turning that into messages lies with your application’s protocol design.

Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.
