Shared Memory Pitfalls and Practical Best Practices
Go Komura · Shared Memory, IPC, Concurrency, C++, C#, Windows Development
Image frames, inspection results, time-series logs, market depth data, huge buffers. When you want to exchange large data at low latency within the same machine, shared memory looks very attractive.
The slightly dangerous part, though, is that shared memory approaches you wearing the face of “fast IPC.” In reality, shared memory is “IPC that reduces copies in exchange for pushing the responsibility for consistency back onto your application.”
- Fast
- Flexible
- But the protocol is yours to build
- And when it fails, the symptoms are spectacular
That is roughly the four-piece set you get.
In this article, with Windows file mappings and POSIX shm_open / mmap in mind, we sort out where shared memory trips you up in practice and how to design so that the accident rate goes down.
Whether you use C/C++ or C#’s MemoryMappedFile, the essentials are almost identical. [1]
1. The Conclusion First (In One Breath)
Stated rather bluntly, but in a way that is useful in practice:
- Shared memory is a mechanism that shows the same byte sequence to multiple processes; it is not synchronization itself [2][3]
- What it is fast at is moving large data within the same machine. If all you have is small control messages, pipe / socket / named pipe / queue is very often easier
- With shared memory, being visible and being safe to read are separate problems
- Do not build your design on volatile. Atomicity, ordering, and waiting need to be considered separately [4][5]
- Putting raw pointers, HANDLEs, file descriptors, std::string, std::vector, std::mutex directly into shared memory will, almost always, make you cry later
- Data placed in shared memory is safer when pushed toward fixed-width integers + explicit layout + a versioned header
- Just putting magic / version / size / state / generation / heartbeat in a leading header dramatically changes how easy incidents are to investigate
- The hard parts of shared memory are not speed but initialization, lifetime, recovery, permissions, and ABI
- On Windows the skeleton is CreateFileMapping / OpenFileMapping / MapViewOfFile; on POSIX it is shm_open / ftruncate / mmap [6][3]
- The least accident-prone starting point is an SPSC (single-producer single-consumer) ring buffer or a double buffer
In short: shared memory is fast, but if you use it carelessly, you catch the “it feels like it synchronizes itself” disease. Avoiding that is the first battle.
2. What Shared Memory Shares — and What It Does Not
Roughly speaking, shared memory is a mechanism that maps the same physical pages into the virtual address spaces of multiple processes.
Windows uses a file mapping object and views; POSIX mmaps a shared memory object. [2][7][3]
Two points matter here.
- What is shared is the byte content, not the virtual addresses themselves
- Being coherent and being synchronized are different things
The Windows documentation also says that views created from the same file mapping object are coherent at a given point in time. But that does not mean a reader can always read a consistent, fully-updated record. [8]
For example, even if the writer intends to write
- length
- then payload
- then the ready flag
in that order, a reader that reads with no synchronization at all may see the new length combined with the old payload.
Shared memory does not fix this for you automatically.
So what shared memory shares is bytes. What it does not share is meaning, ordering, completion notification, and recovery policy. All of those you must design yourself.
3. Where Shared Memory Fits — and Where It Does Not
| Situation | Fit | Reason |
|---|---|---|
| Passing large frames or buffers within the same machine | Good fit | Easy to reduce the number of copies |
| High-frequency sensor values, images, audio, market depth, etc. | Good fit | Easy to aim for low latency and high throughput |
| Exchanging only small commands and responses | Poor fit | The synchronization cost of control is relatively heavy |
| Communicating with other machines | Not a fit | Shared memory fundamentally assumes a single host |
| Long-term coexistence of different languages and versions | Hard | Requires ABI and versioning design |
| Persistence is also required | Depends on the goal | File-backed mappings are viable, but persistence and IPC responsibilities mix easily |
In practice, the split “control via messaging, data payload via shared memory” is a very strong pattern. For example:
- The UI process notifies the worker process “use the next frame” via an event / pipe / socket
- The actual frame payload lives in shared memory
That configuration tends to be peaceful.
4. The Four Things to Decide First
When designing shared memory, the first four things to decide are these.
4.1 Separate the Control Plane from the Data Plane
Decide up front what goes into shared memory.
- data plane: images, audio, record sequences, bulk data
- control plane: start, stop, errors, reconnection, reinitialization, notifications
Just separating these two makes the shared-memory side of the design considerably simpler.
4.2 Narrow the Concurrency Model
- SPSC: 1 producer / 1 consumer
- MPSC: many writers / 1 consumer
- SPMC: 1 writer / many readers
- MPMC: many writers / many readers
The difficulty rises in roughly that order. Going straight to MPMC is quite valiant. The memory-ordering goblins usually show up later.
4.3 Decide Ownership and Lifetime
- Who creates it
- Who initializes it
- Who deletes it
- Who recovers it when a participant dies midway
If this is vague, every startup-order issue and every restart gets murky.
4.4 Decide the ABI and Versioning
- Layout
- Type sizes
- Alignment
- Reserved areas
- Version / feature flags
- Compatibility guarantees
Shared memory is not an API; it is an ABI (binary interface) problem. Get this wrong and you end up with the nasty kind of failure where source compatibility exists but things only break at runtime.
5. Common Pitfalls
5.1 Not Synchronizing
This is the most common one.
“We’re looking at the same memory, so if I write it, they can read it.”
Sometimes they can. But that does not mean they can read it at the right time, in the right units, in the right order.
On both Windows and POSIX, access to shared memory is assumed to be combined with a separate synchronization mechanism. The Windows documentation says access to shared views should be coordinated with mutexes / semaphores / events. [2] The POSIX material also says access to shared memory requires synchronization. [9]
5.2 Trying to Fix It with volatile
volatile is not a magic spell that rescues your shared-memory design.
At minimum, atomicity and mutual exclusion are separate problems that volatile does not solve for you. [4][5]
For example, a design that places volatile bool ready; and busy-loops on it:
- wastes CPU
- leaves the ordering guarantee between payload and ready vague
- is not portable
- easily picks up intermediate states
Pretty much nothing good comes of it.
Furthermore, Windows’s WaitOnAddress is for threads within the same process.
It is safer not to think of it as a cross-process waiting mechanism. [10]
5.3 Letting Readers See Intermediate States
When shared memory fails, the symptoms look quite mundane.
- Only the header is new
- Only the payload is old
- Only the length has been updated
- A pair of two fields is inconsistent
If all you do is atomically update a single scalar, things are relatively simple, but if you publish a record made of multiple fields, you need a commit procedure.
Typically one of the following:
- Protect the whole thing with a mutex
- Use a double buffer and flip “the currently valid buffer index” at the end
- Use a ring buffer with per-slot state / sequence
- For 1 writer / many readers, take snapshots with a sequence counter
Even “set the ready flag last” is still an immature design unless you decide with what memory ordering that flag is written and read. In shared memory, the publication timing itself is the protocol.
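As a minimal sketch of "set the ready flag last" with the memory ordering spelled out, the following single-slot handoff uses a release store on the writer side and an acquire load on the reader side. The names Slot, publish_slot, and consume_slot are hypothetical, and the sketch assumes std::atomic<uint32_t> is lock-free on the target platform so that it behaves the same across processes as it does across threads.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical single-slot record; names are illustrative only.
struct Slot {
    std::atomic<uint32_t> ready;  // 0 = empty, 1 = published
    uint32_t length;              // number of valid bytes in payload
    uint8_t  payload[256];
};

// Assumption: a lock-based atomic would not be usable across processes.
static_assert(std::atomic<uint32_t>::is_always_lock_free,
              "lock-free 32-bit atomics are required for this sketch");

// Writer: fill the record first, then publish it with a release store.
inline bool publish_slot(Slot& s, const void* data, uint32_t len) {
    if (len > sizeof(s.payload)) return false;
    if (s.ready.load(std::memory_order_acquire) != 0) return false;  // not consumed yet
    std::memcpy(s.payload, data, len);
    s.length = len;
    s.ready.store(1, std::memory_order_release);  // payload and length become visible first
    return true;
}

// Reader: the acquire load pairs with the writer's release store.
inline bool consume_slot(Slot& s, void* out, uint32_t cap, uint32_t& len) {
    if (s.ready.load(std::memory_order_acquire) != 1) return false;
    // Treat the shared length as untrusted input and bound it.
    if (s.length > sizeof(s.payload) || s.length > cap) return false;
    len = s.length;
    std::memcpy(out, s.payload, len);
    s.ready.store(0, std::memory_order_release);  // hand the slot back to the writer
    return true;
}
```

The flag alone only says when the record is safe to read; waking the reader up is still better done through a separate waitable primitive.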
5.4 Placing Pointers or Complex Objects As Is
This is another frequent pattern.
- Raw pointers
- HANDLE
- File descriptors
- std::string
- std::vector
- std::unordered_map
- std::mutex
- CRITICAL_SECTION
Place these straight into shared memory and try to use them from another process, and a small hell usually begins.
The reason is simple: virtual addresses and process-local resources only have meaning within that process’s context. On Windows as well, mapping the same file mapping object in another process does not guarantee that the view lands at the same virtual address. [7][11]
So if you need references, the basic approach is to hold them as offsets from the base address.
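#include <stdint.h>

/* A reference into the segment, stored as an offset rather than a pointer. */
typedef struct ShmRef {
    uint64_t offset;  // position relative to the start of the segment
    uint32_t length;  // length of the referenced region
    uint32_t kind;    // application-defined kind tag
} ShmRef;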
This way, each process can resolve base + offset to its own address.
5.5 Breaking the ABI
Shared memory is a binary promise, not source code. Which means every one of the following differences matters.
- Sizes of int / long
- Representation of bool
- The underlying type of enums
- Size of wchar_t
- 32-bit / 64-bit differences
- #pragma pack
- Compiler / language differences
- Alignment / padding
- Little-endian / big-endian
Within a single host, endianness is usually consistent, but just adding ARM64 support or a mixed toolchain can cause perfectly ordinary mismatches.
So for structures placed in shared memory, we strongly recommend:
- Fixed-width integers such as uint32_t / uint64_t
- Explicit padding / reserved fields
- A header with version, header_size, record_size, total_size
- static_assert(sizeof(...)) where needed
- No non-trivial objects
5.6 Initialization Races
Shared memory breaks easily on the assumption that “whoever created it must have initialized it.”
On Windows, if CreateFileMapping hits an existing name it returns the existing object, and GetLastError() reports ERROR_ALREADY_EXISTS.
The initial pages of a pagefile-backed mapping start out zeroed. [8]
On POSIX, a new shared memory object starts with length 0 and gets its size via ftruncate. Newly allocated bytes are zero-initialized. Creation with O_CREAT | O_EXCL is atomic. [3]
If, without knowing these differences, you
- use it immediately after opening
- have no initialization-complete flag
- let participants initialize concurrently
- never check for version mismatch
then it breaks depending on startup order.
At minimum, place these states in the leading header:
- INITIALIZING
- READY
- BROKEN
And only the creator initializes; joiners wait for READY.
This etiquette alone makes the world considerably quieter.
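The creator / joiner etiquette can be sketched roughly as follows. InitHeader, kInitMagic, and the function names are hypothetical; the sketch assumes fresh pages are zero-filled (so the state starts at INITIALIZING = 0) and that std::atomic<uint32_t> is lock-free on the target. The short polling loop is only meant for the one-time attach and carries a deadline so a dead creator cannot hang the joiner.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

enum : uint32_t { kInitializing = 0, kReady = 1, kBroken = 2 };
constexpr uint32_t kInitMagic = 0x4D485331u;  // hypothetical magic value

struct InitHeader {
    uint32_t magic;
    uint32_t abi_version;
    std::atomic<uint32_t> state;  // zero-filled pages read as kInitializing
};

// Creator only: fill the plain fields, then publish READY with release semantics.
inline void creator_initialize(InitHeader& h, uint32_t abi_version) {
    h.magic = kInitMagic;
    h.abi_version = abi_version;
    h.state.store(kReady, std::memory_order_release);
}

// Joiner: wait until READY (or give up), then validate what was published.
inline bool joiner_wait_ready(const InitHeader& h, uint32_t expected_abi,
                              std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        const uint32_t st = h.state.load(std::memory_order_acquire);
        if (st == kReady) {
            return h.magic == kInitMagic && h.abi_version == expected_abi;
        }
        if (st == kBroken) return false;
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}
```

On Windows the creator is the caller for which CreateFileMapping did not report ERROR_ALREADY_EXISTS; on POSIX it is the caller whose shm_open with O_CREAT | O_EXCL succeeded.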
5.7 Not Thinking About Crash Recovery
What do you do when the writer dies in the middle of updating shared data? Ship to production with this undefined and the look on everyone’s face during an outage suddenly gets serious.
A Windows mutex becomes abandoned when the owning thread exits without releasing it, and waiters receive WAIT_ABANDONED. That means the shared resource may be in an indeterminate state. [12]
With POSIX robust mutexes too, when the owner dies, EOWNERDEAD is returned, and after repairing the state you call pthread_mutex_consistent(). [13][14]
What matters here is not to “just keep going.” Recovery requires at least one of the following:
- A generation number
- The last committed sequence
- A heartbeat
- A dirty / clean flag
- Journal-style two-phase commit
- A full reinitialization procedure for corruption
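One way to combine a dirty flag, a committed sequence, and a generation number is sketched below. RecoveryMeta and the function names are hypothetical. The sketch assumes the writer calls begin_update / commit_update while holding the cross-process mutex, and that on_owner_died is called by whichever process receives WAIT_ABANDONED or EOWNERDEAD.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical recovery metadata kept in the shared header.
struct RecoveryMeta {
    std::atomic<uint32_t> dirty;  // 1 while an update is in progress
    uint32_t reserved;
    uint64_t committed_seq;       // number of fully committed updates
    uint64_t generation;          // bumped on every full reinitialization
};

enum class Recovery { kContinue, kReinitialize };

// Writer: mark the segment dirty before touching any data.
inline void begin_update(RecoveryMeta& m) {
    m.dirty.store(1, std::memory_order_relaxed);
    // Keep the dirty mark ordered before the data writes that follow.
    std::atomic_thread_fence(std::memory_order_release);
}

// Writer: record the commit, then clear the dirty mark last.
inline void commit_update(RecoveryMeta& m) {
    ++m.committed_seq;
    m.dirty.store(0, std::memory_order_release);
}

// New mutex owner after the previous owner died: decide instead of "just keep going".
inline Recovery on_owner_died(const RecoveryMeta& m) {
    return m.dirty.load(std::memory_order_acquire) != 0 ? Recovery::kReinitialize
                                                        : Recovery::kContinue;
}

// Full reinitialization: readers notice the new generation and re-attach.
inline void reinitialize(RecoveryMeta& m) {
    m.committed_seq = 0;
    ++m.generation;
    m.dirty.store(0, std::memory_order_release);
}
```

If the dirty mark is clear, the last committed state is intact and work can continue; if it is set, the writer died mid-update and the data cannot be trusted.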
5.8 False Sharing and Cache-Line Contention
Shared memory is often said to be fast. But if hot counters are packed into the same cache line, the line ping-pongs between CPUs and things slow down merrily.
The classic example:
- The producer updates write_index
- The consumer updates read_index
- Both sit on the same cache line
In that case:
- Split hot fields onto separate cache lines
- Separate frequently updated fields from rarely updated ones
- Aim for one writer per cache line
These alone change things considerably. You often hear about aligning to 64 bytes; treat that as “64 bytes is a common value on many CPUs, not an absolute law.”
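A sketch of splitting the two hot indices onto separate cache lines, with the rarely changing field kept apart from both. RingIndices and kCacheLine are hypothetical names, and 64 is an assumed line size, not a measured one.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumption: common, not universal

struct alignas(kCacheLine) RingIndices {
    alignas(kCacheLine) std::atomic<uint64_t> write_index;  // written by the producer only
    alignas(kCacheLine) std::atomic<uint64_t> read_index;   // written by the consumer only
    alignas(kCacheLine) uint64_t capacity;                  // set once, then read-only
};

// Each hot field starts its own cache line, so the two writers never share one.
static_assert(offsetof(RingIndices, write_index) == 0, "write_index starts the struct");
static_assert(offsetof(RingIndices, read_index) == kCacheLine, "read_index on its own line");
static_assert(offsetof(RingIndices, capacity) == 2 * kCacheLine, "capacity on its own line");
static_assert(sizeof(RingIndices) == 3 * kCacheLine, "one line per field");
```

The alignment only holds if the struct itself is placed at a suitably aligned offset inside the segment; the start of a mapped view is page-aligned, so offset 0 is safe.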
5.9 Taking Names, Permissions, and Security Lightly
Named shared memory is convenient, but careless names and permissions cause accidents.
On Windows:
- There are Global\ and Local\ namespaces
- Creating a Global\ file mapping from outside session 0 requires SeCreateGlobalPrivilege
- Object names share a namespace with events / semaphores / mutexes / waitable timers / jobs
In other words:
- You name it "Global\\MyApp" and figure the service and the desktop app can share it
- But it fails on permissions
- And on top of that, a mutex with the same name was created first and you get ERROR_INVALID_HANDLE
That is the kind of very Windows-flavored mud that comes up.
On the POSIX side too, treating shm_open’s mode or umask lightly makes things unintentionally visible too broadly, or conversely unopenable. [3]
Shared memory is not “just memory, therefore safe.” From any process with read permission, it is quite plainly visible. If you put confidential information in it, you need to think about paging / swap / dumps / permissions, just as with ordinary memory.
5.10 Resizing and Upgrading Carelessly
“I’d like to grow the shared memory a bit later” is a fairly dangerous request.
- A Windows mapping object has a size fixed at creation [8]
- On POSIX too, unless you keep ftruncate and mmap consistent, participants’ mapped lengths stop matching [3][16]
In practice, it is safer to make the size immutable within a generation. If you need to grow:
- Create a segment with a new version / name / generation
- Switch participants over
- Close the old segment
That lowers the accident rate.
5.11 Cramming Even Notifications into Shared Memory
A common pattern:
- Write ready = 1 into shared memory
- The peer does while (!ready) Sleep(1);
This works at first. But later it comes back as:
- Wasted CPU
- Latency jitter from Sleep(1)
- Missed updates that are hard to notice
- Timeouts and shutdown notifications that are hard to write cleanly
Push shared memory toward the data side, and move notification to primitives you can wait on.
- Windows: event / semaphore / mutex / named pipe, etc. [2][17]
- POSIX: semaphores / process-shared mutex + condvar, etc. [18][19]
5.12 Thinking “This Lets Me Share Across Machines Too”
There is a moment when you are tempted to think: if I use a file-backed mapping and map a network share file, maybe I can do shared memory across machines.
This is dangerous.
The documentation for Windows CreateFileMapping also says coherence is not guaranteed for remote files.
If two machines map the same page as writable, each sees only its own writes, and nothing is merged when the disk is updated. [8]
Shared memory is fundamentally a single-host mechanism. If you need to cross machines, choosing socket / RPC / message broker outright is far better for your sanity.
6. Best Practices
6.1 Separate the Control Plane from the Data Plane
Put only bulk data in shared memory; move notifications and state transitions to a separate channel.
- Shared memory: frame, sample, batch, snapshot
- Event / semaphore / pipe / socket: ready, consumed, stop, error, reconnect
This separation improves design clarity even before it improves performance.
6.2 Put a Fixed Header at the Front
At minimum, we strongly recommend a leading header like this.
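#include <stdint.h>

typedef struct SharedHeader {
    uint32_t magic;           // rejects foreign or uninitialized segments
    uint16_t abi_version;     // rejects layout differences
    uint16_t header_size;     // sizeof(SharedHeader) as seen by the creator
    uint32_t state;           // 0=initializing, 1=ready, 2=broken
    uint32_t flags;
    uint64_t total_size;
    uint64_t generation;      // detects re-creation
    uint64_t heartbeat_ns;    // liveness monitoring
    uint64_t payload_offset;
    uint64_t payload_size;
    uint64_t write_seq;
    uint64_t read_seq;
    uint8_t  reserved[64];    // escape hatch for future extension
} SharedHeader;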
The points are:
- magic rejects foreign or uninitialized segments
- abi_version and header_size reject layout differences
- state rejects mid-initialization
- generation detects re-creation
- heartbeat_ns monitors liveness
- reserved leaves an escape hatch for future extension
What is painful about shared memory is that “it’s hard to see what is happening.” That is exactly why you give it observability metadata from the start.
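A sketch of how a joiner might check such a header right after mapping. The struct repeats the layout above so the block stands alone; kHeaderMagic, kAbiVersion, HeaderCheck, and validate_header are hypothetical names and values. The header is copied into a private snapshot first so that every check sees the same bytes.

```cpp
#include <cstdint>
#include <cstring>

struct SharedHeader {
    uint32_t magic;
    uint16_t abi_version;
    uint16_t header_size;
    uint32_t state;  // 0=initializing, 1=ready, 2=broken
    uint32_t flags;
    uint64_t total_size;
    uint64_t generation;
    uint64_t heartbeat_ns;
    uint64_t payload_offset;
    uint64_t payload_size;
    uint64_t write_seq;
    uint64_t read_seq;
    uint8_t  reserved[64];
};

constexpr uint32_t kHeaderMagic = 0x4D485331u;  // hypothetical magic value
constexpr uint16_t kAbiVersion = 1;             // hypothetical ABI version

enum class HeaderCheck { kOk, kTooSmall, kBadMagic, kBadVersion, kNotReady, kBadBounds };

inline HeaderCheck validate_header(const void* base, uint64_t mapped_size) {
    if (base == nullptr || mapped_size < sizeof(SharedHeader)) return HeaderCheck::kTooSmall;
    SharedHeader h;
    std::memcpy(&h, base, sizeof(h));  // validate a private snapshot, not live memory
    if (h.magic != kHeaderMagic) return HeaderCheck::kBadMagic;
    if (h.abi_version != kAbiVersion || h.header_size != sizeof(SharedHeader)) {
        return HeaderCheck::kBadVersion;
    }
    if (h.state != 1) return HeaderCheck::kNotReady;
    if (h.total_size > mapped_size) return HeaderCheck::kBadBounds;
    if (h.payload_offset < h.header_size || h.payload_offset > h.total_size) {
        return HeaderCheck::kBadBounds;
    }
    // Subtraction form avoids overflow in payload_offset + payload_size.
    if (h.payload_size > h.total_size - h.payload_offset) return HeaderCheck::kBadBounds;
    return HeaderCheck::kOk;
}
```

Returning a distinct code per failure is deliberate: logging which check failed is often the only clue available when an attach goes wrong.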
6.3 Use Offset References
Hold references as offsets, not pointers.
- Resolve via base + offset
- Add range checks on offset + length
- Define a sentinel for invalid values
This alone substantially reduces address-mismatch incidents.
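A minimal sketch of resolving an offset reference with a range check and a sentinel. The ShmRef layout follows the struct shown earlier in the article; kInvalidOffset and resolve are hypothetical names.

```cpp
#include <cstdint>

struct ShmRef {
    uint64_t offset;  // position relative to the start of the segment
    uint32_t length;
    uint32_t kind;
};

constexpr uint64_t kInvalidOffset = UINT64_MAX;  // hypothetical sentinel for "no reference"

// Turns a reference into an address in this process, or nullptr if it is
// the sentinel or does not fit inside the mapped segment.
inline const uint8_t* resolve(const uint8_t* base, uint64_t segment_size, const ShmRef& ref) {
    if (ref.offset == kInvalidOffset) return nullptr;
    if (ref.offset > segment_size) return nullptr;
    // Subtraction form avoids overflow in offset + length.
    if (ref.length > segment_size - ref.offset) return nullptr;
    return base + ref.offset;
}
```

Each process passes its own base address, so the same ShmRef bytes resolve correctly even when the views are mapped at different virtual addresses.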
6.4 Narrow the Concurrency Model
Shared memory gets abruptly harder as writers multiply. So one of these two is the strong starting point.
- SPSC ring buffer
- 1-writer / many-reader snapshot
If you need multiple writers, things usually go better when you reduce the number of consistency responsibility points, for example:
- Only the enqueue is lock-free / atomic
- Actual data updates are funneled to a single consumer
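For reference, an SPSC ring buffer with fixed-size slots can be sketched as below. SpscRing, kSlots, kSlotBytes, try_push, and try_pop are hypothetical names; the sketch assumes exactly one producer process and one consumer process, lock-free 64-bit atomics, and a 64-byte cache line.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr uint64_t kSlots = 8;          // hypothetical slot count
constexpr std::size_t kSlotBytes = 64;  // hypothetical fixed record size

static_assert(std::atomic<uint64_t>::is_always_lock_free,
              "lock-free 64-bit atomics are required for this sketch");

struct SpscRing {
    alignas(64) std::atomic<uint64_t> write_index;  // advanced only by the producer
    alignas(64) std::atomic<uint64_t> read_index;   // advanced only by the consumer
    alignas(64) uint8_t slots[kSlots][kSlotBytes];
};

// Producer side: copy the record into the next slot, then publish it.
inline bool try_push(SpscRing& r, const void* rec) {
    const uint64_t w = r.write_index.load(std::memory_order_relaxed);
    const uint64_t rd = r.read_index.load(std::memory_order_acquire);
    if (w - rd >= kSlots) return false;  // full: the consumer has not caught up
    std::memcpy(r.slots[w % kSlots], rec, kSlotBytes);
    r.write_index.store(w + 1, std::memory_order_release);  // slot contents visible first
    return true;
}

// Consumer side: copy the oldest record out, then release its slot.
inline bool try_pop(SpscRing& r, void* out) {
    const uint64_t rd = r.read_index.load(std::memory_order_relaxed);
    const uint64_t w = r.write_index.load(std::memory_order_acquire);
    if (rd == w) return false;  // empty
    std::memcpy(out, r.slots[rd % kSlots], kSlotBytes);
    r.read_index.store(rd + 1, std::memory_order_release);  // slot may be reused now
    return true;
}
```

Each index has exactly one writer, which is what keeps the protocol small enough to reason about; adding a second producer breaks that property and needs a different design.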
6.5 Make the Commit Protocol Explicit
A design where you cannot explain in words “from which moment is it safe to read” is dangerous.
For a double buffer, for example, you define the publication ritual:
- Write to the non-published buffer
- Finalize the checksum and length
- Switch the active buffer index with release semantics
- The reader reads the active index with acquire semantics
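- After reading, verify the index has not changed, and retry if it has (a monotonically increasing sequence number makes this check robust even when the writer flips twice during one read)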
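The ritual can be sketched for one writer and any number of readers as follows. Instead of a bare 0/1 index, this sketch uses a monotonically increasing sequence whose low bit selects the active buffer, so a reader can also detect the case where the writer has come back around to the buffer being read. DoubleBuffer, kWords, publish_snapshot, and try_read_snapshot are hypothetical names; the payload is stored as relaxed atomic words so that a concurrent overwrite is detected rather than being a data race.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWords = 16;  // hypothetical payload size in 32-bit words

struct DoubleBuffer {
    std::atomic<uint64_t> seq;  // 0 = nothing published yet; low bit selects the buffer
    std::atomic<uint32_t> words[2][kWords];
};

// Single writer: fill the non-published buffer, then switch with release semantics.
inline void publish_snapshot(DoubleBuffer& db, const uint32_t (&src)[kWords]) {
    const uint64_t s = db.seq.load(std::memory_order_relaxed);
    const std::size_t next = (s + 1) & 1;
    // Orders the previous publication before the overwrite that starts here.
    std::atomic_thread_fence(std::memory_order_release);
    for (std::size_t i = 0; i < kWords; ++i) {
        db.words[next][i].store(src[i], std::memory_order_relaxed);
    }
    db.seq.store(s + 1, std::memory_order_release);
}

// Reader: returns false if nothing is published or the copy may be torn; retry then.
inline bool try_read_snapshot(const DoubleBuffer& db, uint32_t (&dst)[kWords]) {
    const uint64_t s1 = db.seq.load(std::memory_order_acquire);
    if (s1 == 0) return false;
    const std::size_t idx = s1 & 1;
    for (std::size_t i = 0; i < kWords; ++i) {
        dst[i] = db.words[idx][i].load(std::memory_order_relaxed);
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return db.seq.load(std::memory_order_relaxed) == s1;  // unchanged means consistent
}
```

A reader that keeps failing the final check is a sign that the writer publishes faster than the reader can copy, which is worth counting as an observability metric.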
6.6 Fix the Size per Generation
Rather than resizing in place, cutting generations like
- name = MyShm.v3
- abi_version = 3
- generation = 42
is easier to maintain.
Shared memory does not “type-check at call time” the way an API does. That is why not breaking an ABI once decided is so important.
6.7 Build In Observability
At minimum, having these around helps a lot.
- Last update time
- Last successful sequence
- Drop count / overwrite count
- Version mismatch count
- Attach / detach count
- Last error code
- Heartbeat
When shared memory breaks, the logs are usually thin. Adding your own counters makes incident response considerably easier.
6.8 Write the Failure-Path Tests First
The happy path alone is not enough. At the very least, look at these.
- Force-kill the writer mid-update
- Reader stalls and the ring overflows
- Connecting with a version mismatch
- Mixed 32-bit / 64-bit
- Opening across sessions
- Insufficient permissions
- A predecessor process restarts while holding an old generation
- Cache misses / NUMA effects under continuous huge-data transfer
With shared memory, breakage tests are worth more than happy-path tests.
7. What to Check on Windows vs. POSIX
| Aspect | Windows | POSIX |
|---|---|---|
| Create / open | CreateFileMapping / OpenFileMapping / MapViewOfFile [6] | shm_open / ftruncate / mmap [3] |
| Sharing without a disk file | Pagefile-backed mapping with INVALID_HANDLE_VALUE [6][8] | POSIX shared memory object + mmap [3] |
| Initial values | Pagefile-backed pages are zero-initialized [8] | New objects start at length 0. Newly allocated bytes are zero-initialized [3] |
| Synchronization | mutex / semaphore / event / interlocked, etc. [2][5] | Process-shared mutex / condvar / semaphore [20][18] |
| Must not use cross-process | CRITICAL_SECTION, WaitOnAddress [21][10] | Mutex / condvar left as PTHREAD_PROCESS_PRIVATE [20][19] |
| Owner death | WAIT_ABANDONED [12] | Robust mutex + EOWNERDEAD / pthread_mutex_consistent() [13][14] |
| Name deletion | Disappears when the last handle / view is released [2][8] | shm_unlink removes the name. The object persists as long as references remain [22][23] |
| Namespace / permissions | Global\ / Local\, ACLs, SeCreateGlobalPrivilege [15][24] | mode, umask, namespace, O_CREAT \| O_EXCL [3] |
C#’s MemoryMappedFile is essentially a wrapper over the Windows file mapping too.
So the basics do not change:
- Open by the same name
- Use a separate mutex / event
- Read views with an explicit layout
- Never place object references directly
8. The Checklist to Run First
- Do you really need shared memory? Is it large data on the same host?
- Have you separated the control plane from the data plane?
- Can the concurrency model be reduced to SPSC / 1 writer, many readers?
- Does the leading header have magic / version / size / state / generation / heartbeat?
- Are you placing any pointer / HANDLE / fd / STL object / std::mutex?
- Is there a commit protocol so readers never see intermediate states?
- Is exactly one initializer designated?
- Is there a recovery procedure for abnormal termination?
- Are names and permissions explicit?
- Is Global\ really necessary?
- Are you assuming resize in place?
- Have you tried writer kill / reader stall / version mismatch / insufficient permissions?
9. Summary
Used well, shared memory is genuinely powerful. Especially for large data within a single machine, such as:
- Images
- Audio
- Sensor streams
- Large batches
- High-frequency snapshots
it really pays off.
But the essence of shared memory is less “speed” than a transfer of responsibility. In exchange for fewer copies and less kernel-mediated messaging, you take on:
- Synchronization
- Visibility
- Initialization
- ABI
- Recovery
- Permissions
- Observability
So for your first implementation, the safe shape is:
- An SPSC ring buffer or a double buffer
- A fixed leading header
- Offset references
- Notification over a separate channel
- With version / generation / heartbeat
- With failure-path tests
Start from this shape, and shared memory becomes a fairly well-behaved tool. Treat it from day one as “fast common memory where anything goes,” and over time it stops being an application and becomes archaeology.
10. References
- Windows: file mapping and named shared memory basics [6][8][2]
- Windows: namespace / security / synchronization [15][24][5][12]
- POSIX: shm_open, shm_unlink, mmap, process-shared / robust synchronization [3][22][16][20][13]
- .NET: MemoryMappedFile overview [1]
[1] Microsoft Learn, “Memory-Mapped Files” / Microsoft Learn, “MemoryMappedFile Class”
[4] Microsoft Learn, “/volatile (volatile Keyword Interpretation)” / Microsoft Learn, “volatile (C++)”
[5] Microsoft Learn, “Interlocked Variable Access” / Microsoft Learn, “MemoryBarrier function”
[6] Microsoft Learn, “Creating Named Shared Memory”
[7] Microsoft Learn, “Scope of Allocated Memory”
[8] Microsoft Learn, “CreateFileMappingA function”
[9] man7.org, “POSIX Shared Memory” training slides
[10] Microsoft Learn, “WaitOnAddress function”
[11] Microsoft Learn, “MapViewOfFileEx function” / Microsoft Learn, “MapViewOfFile function”
[12] Microsoft Learn, “Mutex Objects”
[13] man7.org, “pthread_mutex_lock(3p)” / man7.org, “pthread_mutexattr_setrobust(3)”
[14] man7.org, “pthread_mutex_consistent(3)” / man7.org, “pthread_mutex_consistent(3p)”
[15] Microsoft Learn, “Kernel object namespaces”
[17] Microsoft Learn, “Using Mutex Objects”
[18] man7.org, “sem_init(3)” / man7.org, “sem_init(3p)”
[21] Microsoft Learn, “Critical Section Objects”
[22] man7.org, “shm_unlink(3p)”
[23] man7.org, “shm_open(3)” (shm_unlink semantics)
[24] Microsoft Learn, “File Mapping Security and Access Rights”
Author Profile
Go Komura
Representative of KomuraSoft LLC
Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.