Mutual Exclusion Fundamentals for File-Based Integration - Best Practices for File Locks and Atomic Claims
· Go Komura · File Integration, Locking, Design, Windows Development
Mutual exclusion in file-based integration becomes an issue in almost every setup involving shared folders, nightly batches, or cross-process hand-offs. The questions people search for most are: is a file lock alone enough, how do I stop multiple workers from picking up the same file, and how do I avoid reading files that are still being written?
In this article, we look at mutual exclusion for file integration through the lenses of file locks, atomic claims, temp -> rename, and idempotency.
Table of Contents
- 1. The Conclusion First (In One Line)
- 2. Race Patterns That Occur in File Integration (Diagrams)
  - 2.1. Reading a File Mid-Write
  - 2.2. Multiple Workers Picking Up the Same File Simultaneously
  - 2.3. Everyone Stalls on a Stale Lock
- 3. Anti-Patterns
  - 3.1. The Two-Step Exists -> Create Check
  - 3.2. Writing Directly to the Final File Name
  - 3.3. Treating a File as Done When Its Size Stops Changing
  - 3.4. Everyone Updating a Shared File
  - 3.5. Believing Lock APIs Are Almighty
- 4. Best Practices
  - 4.1. Publish via temp -> close -> rename / replace
  - 4.2. Make Completeness Explicit With a done File / Manifest
  - 4.3. The Receiver Takes a Claim Atomically
  - 4.4. If You Rely on Lock Files, Make Them Leases
  - 4.5. Assume Idempotency
- 5. Pseudocode (Excerpts)
- 6. A Rough Guide to Choosing
- 7. Conclusion
- 8. References
File integration is a field where the “hand-off agreement” breaks more easily than the code itself. Things pass in unit tests, yet occasionally break only in the production shared folder or the nightly batch - and the failure is hard to reproduce. This is entirely common.
The cause is usually not the file I/O APIs themselves, but ambiguity in these three things:
- When is it OK to read?
- Who holds the right to process?
- How do we recover when something fails?
In this article, rather than ending the discussion at OS locks, we organize file-integration mutual exclusion as a hand-off protocol.
The code in this article is published on GitHub as a complete buildable and runnable sample set (a library, a demo that demonstrates claim contention between two workers and lease takeover, and unit tests that reproduce contention, corruption, and stale locks).
file-integration-locking-best-practices-komurasoft-style - komurasoft-blog-samples (GitHub)
1. The Conclusion First (In One Line)
- The most important thing in file integration is to ensure that the moment the final file name becomes visible, the file is “safe to read”
- Express generating / published / processing / processed states through file names and directories
- If there are multiple workers, take a claim atomically before reading
- Use lock files and OS locks as aids, and let idempotency catch whatever slips through
In short, the real substance of file integration is not so much mutual exclusion as the design of a hand-off protocol. It is never as simple as calling one lock function and being done.
2. Race Patterns That Occur in File Integration (Diagrams)
2.1. Reading a File Mid-Write
If you start writing directly under the final file name, this accident happens. A JSON file is missing its closing brace, a CSV is short on rows, and a ZIP is simply corrupt.
sequenceDiagram
participant Sender as Sender
participant Share as Shared folder
participant Receiver as Receiver
Sender->>Share: Create orders.csv under its final name
Sender->>Share: Writing rows 1 through 5000
Receiver->>Share: Detects orders.csv
Receiver->>Share: Starts reading immediately
Note over Receiver: Still incomplete
Sender->>Share: Writes the rest
Note over Receiver: Missing rows / parse failure / partial processing
2.2. Multiple Workers Picking Up the Same File Simultaneously
With a “list the directory, open anything unprocessed” flow, two workers can grab the same file. This is how double counting and duplicate sends begin.
sequenceDiagram
participant W1 as Worker 1
participant W2 as Worker 2
participant Dir as incoming
W1->>Dir: Finds a.csv
W2->>Dir: Finds a.csv
W1->>Dir: Starts reading
W2->>Dir: Starts reading
Note over W1,W2: The same input is processed twice
2.3. Everyone Stalls on a Stale Lock
A design that just drops a lock file tends to jam up after abnormal termination. If you cannot tell whose lock it is, whether the owner is still alive, or how long it is valid, everyone downstream waits forever.
sequenceDiagram
participant A as Worker A
participant Lock as lock file
participant B as Worker B
A->>Lock: Creates the lock
Note over A: Crashes here
B->>Lock: Checks whether the lock exists
B->>Lock: Holds off on processing
B->>Lock: Keeps waiting
Note over B,Lock: Cannot tell if it is stale - everyone stops
3. Anti-Patterns
3.1. The Two-Step Exists -> Create Check
The problem here is that “checking” and “acquiring” are separate operations. Another process can squeeze in between them, so this is not mutual exclusion at all.
sequenceDiagram
participant A as Process A
participant B as Process B
participant FS as File system
A->>FS: Checks that no lock exists
B->>FS: Checks that no lock exists
FS-->>A: None
FS-->>B: None
A->>FS: Creates the lock
B->>FS: Creates the lock
Note over A,B: Both proceed
The typical bad example looks like this.
if (!File.Exists(lockPath))
{
    // Another process can create the lock between the check above and the write below
    File.WriteAllText(lockPath, Environment.ProcessId.ToString());
    ProcessFile();
}
What you need is to make “create if absent” a single operation.
In .NET that means the FileMode.CreateNew family; on POSIX systems, atomic creation such as O_CREAT | O_EXCL.
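As a minimal sketch of "create if absent" as a single operation, the following uses FileMode.CreateNew. The class and method names (AtomicLock, TryCreateLock) are hypothetical and are not taken from the article's sample repository. The sketch assumes .NET 5 or later for Environment.ProcessId.

```csharp
using System;
using System.IO;
using System.Text;

public static class AtomicLock
{
    // Returns true only for the one caller that actually created the file.
    public static bool TryCreateLock(string lockPath)
    {
        FileStream stream;
        try
        {
            // CreateNew fails if the file already exists, so the existence
            // check and the creation happen as one operation.
            stream = new FileStream(
                lockPath, FileMode.CreateNew, FileAccess.Write, FileShare.None);
        }
        catch (IOException) when (File.Exists(lockPath))
        {
            // Someone else created the lock first.
            return false;
        }

        using (stream)
        {
            // Record the owner so the lock is not just an anonymous empty file.
            byte[] owner = Encoding.UTF8.GetBytes(Environment.ProcessId.ToString());
            stream.Write(owner, 0, owner.Length);
        }
        return true;
    }
}
```

Other I/O failures, such as a missing directory, are deliberately left to propagate, so that "lost the race" and "something is broken" stay distinguishable.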
3.2. Writing Directly to the Final File Name
If the receiver’s interpretation is “once that name is visible, it is safe to read,” you have already lost the moment you start writing directly under the final name. The basic rule is: do not equate being visible with being safe to read.
flowchart LR
A[Final name becomes visible] --> B[Receiver detects it]
B --> C[Sender is still writing]
C --> D[Incomplete data gets read]
using var writer = OpenForWrite(finalPath); // finalPath becomes visible here
foreach (var row in rows)
{
    writer.WriteLine(row);
}
This approach directly invites the accident described in 2.1.
3.3. Treating a File as Done When Its Size Stops Changing
This looks convenient but is quite precarious. Copies over the network, sender-side pauses, buffering, and retries can all hold the size steady for a while even though the transfer is not finished.
sequenceDiagram
participant Sender as Sender
participant Share as Shared folder
participant Receiver as Receiver
Sender->>Share: Starts copying data.zip
Sender->>Share: Pauses partway
Receiver->>Share: Size unchanged for 10 seconds
Note over Receiver: Misjudges as complete
Receiver->>Share: Starts reading
Sender->>Share: Resumes the copy
if (currentLength == lastLength && stableSeconds >= 10)
{
    return Ready; // A guess: the sender may only be paused
}
If you determine completion by guessing, shared folders and large files will trip you up. Completion is far more stable when stated explicitly via a manifest or a done file.
3.4. Everyone Updating a Shared File
A design where everyone reads and updates a single status.csv or counter.json usually ends in “last writer wins.”
When file integration starts being used as a makeshift database, this is where it starts to hurt.
sequenceDiagram
participant A as Batch A
participant B as Batch B
participant F as status.csv
A->>F: Reads v1
B->>F: Reads v1
A->>F: Writes v2-A
B->>F: Writes v2-B
Note over F: A's update is lost
There is the escape hatch of going append-only, but whether concurrent appends stay intact depends on the file system and deployment layout. If shared updates are required, it is better not to strain file integration here.
3.5. Believing Lock APIs Are Almighty
Lock APIs matter, but they only work when every participant plays by the same rules. In heterogeneous system integration, it is safer not to over-trust them.
Additional notes:
- flock on Linux is an advisory lock, so a party that ignores the agreement can simply write anyway
- Windows byte-range locks are ignored by memory-mapped files
- In other words, do not make OS locks alone carry the design of completion notification and ownership
4. Best Practices
4.1. Publish via temp -> close -> rename / replace
The classic approach. Keep the file under a temp name while it is being generated, and switch it to the final name only after closing it. The receiver watches only final names.
flowchart LR
A[Create a unique temp name] --> B[Write the full content to temp]
B --> C[Flush / close]
C --> D[Rename / replace to the final name in the same directory]
D --> E[Receiver watches only final names]
Key points:
- Put temp and final in the same directory - at minimum the same volume / file system
- On Windows / .NET, the File.Replace family is worth considering
- Make it the agreed contract that once the final name is visible, the content is complete
If you put temp on a different drive, the rename degrades into a copy followed by a delete, which is no longer a single atomic step, or Replace fails.
This prerequisite is unglamorous but very important.
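The flow above can be sketched roughly as follows. AtomicPublisher and Publish are hypothetical names, and the sketch assumes .NET Core 3.0 or later for the File.Move overload that takes an overwrite flag. It also assumes the receiver filters on the final name pattern, so the .tmp sibling is never picked up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class AtomicPublisher
{
    public static void Publish(string finalPath, IEnumerable<string> rows)
    {
        // The temp file lives next to the final file, so the move below stays
        // a rename on the same volume / file system.
        string tempPath = finalPath + "." + Guid.NewGuid().ToString("N") + ".tmp";
        try
        {
            using (var stream = new FileStream(
                tempPath, FileMode.CreateNew, FileAccess.Write, FileShare.None))
            using (var writer = new StreamWriter(stream))
            {
                foreach (string row in rows)
                {
                    writer.WriteLine(row);
                }
                writer.Flush();                  // push buffered text into the stream
                stream.Flush(flushToDisk: true); // ask the OS to write it to disk
            }

            // The final name becomes visible only after the content is complete.
            // overwrite: false means an existing final file raises IOException.
            File.Move(tempPath, finalPath, overwrite: false);
        }
        catch
        {
            File.Delete(tempPath); // no-op if the temp file is already gone
            throw;
        }
    }
}
```

Refusing to overwrite is a design choice for this sketch: if a final file with the same name already exists, failing loudly is usually easier to diagnose than replacing data a receiver may be reading.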
4.2. Make Completeness Explicit With a done File / Manifest
Beyond the data itself, explicitly stating “what has been completed” in a separate file stabilizes the receiver. This is especially effective in heterogeneous system integration.
flowchart TD
A[Generate data.tmp] --> B[Publish as data.csv]
B --> C[Create data.done / manifest.json]
C --> D[Receiver detects the done file / manifest]
D --> E[Verify file name, size, and hash]
Items worth putting in the manifest include:
- Target file name
- Size
- Hash
- Record count
- Integration ID / idempotency key
- Generation timestamp
Order matters too.
If you place the done file before publishing the payload, it is not a completion notice - it is an advance notice of an accident.
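A minimal sketch of writing and checking a done file, assuming .NET 5 or later. The Manifest record, the DoneFile class, and the ".done" suffix convention are illustrative assumptions; only a subset of the fields listed above is included.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text.Json;

public sealed record Manifest(
    string FileName, long Size, string Sha256,
    string IdempotencyKey, DateTimeOffset GeneratedAt);

public static class DoneFile
{
    public static string ComputeSha256(string path)
    {
        using var stream = File.OpenRead(path);
        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(stream));
    }

    // Sender: call only after the payload is visible under its final name.
    public static void Write(string payloadPath, string idempotencyKey)
    {
        var manifest = new Manifest(
            Path.GetFileName(payloadPath),
            new FileInfo(payloadPath).Length,
            ComputeSha256(payloadPath),
            idempotencyKey,
            DateTimeOffset.UtcNow);

        // The done file is itself published via temp -> rename.
        string donePath = payloadPath + ".done";
        string tempPath = donePath + "." + Guid.NewGuid().ToString("N") + ".tmp";
        File.WriteAllText(tempPath, JsonSerializer.Serialize(manifest));
        File.Move(tempPath, donePath, overwrite: false);
    }

    // Receiver: throws if the payload does not match what the sender declared.
    public static Manifest ReadAndVerify(string payloadPath)
    {
        string json = File.ReadAllText(payloadPath + ".done");
        Manifest manifest = JsonSerializer.Deserialize<Manifest>(json)
            ?? throw new InvalidDataException("Empty manifest.");

        if (manifest.FileName != Path.GetFileName(payloadPath)
            || manifest.Size != new FileInfo(payloadPath).Length
            || manifest.Sha256 != ComputeSha256(payloadPath))
        {
            throw new InvalidDataException("Payload does not match its manifest.");
        }
        return manifest;
    }
}
```

The receiver would watch for *.done only and derive the payload name from it, so a payload without its done file is simply ignored.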
4.3. The Receiver Takes a Claim Atomically
If multiple workers watch the same incoming, “move it to your own area before reading” is the clearest approach.
Only the worker whose rename from incoming to processing/<worker>/ succeeds gets to process the file.
sequenceDiagram
participant W1 as Worker 1
participant W2 as Worker 2
participant IN as incoming
participant PR as processing
W1->>IN: Finds a.csv
W2->>IN: Finds a.csv
W1->>PR: Renames a.csv
W2->>PR: Renames a.csv
Note over W1,W2: Only the one that succeeds first takes ownership
Operationally, separating the directories also makes things easier to trace.
flowchart LR
T[temp] -->|publish| I[incoming]
I -->|claim| P[processing]
P -->|success| A[archive]
P -->|failure| E[error]
The claim rename, too, must happen on the same file system - that is a prerequisite.
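A rough sketch of the claim step, where the rename itself is the mutual exclusion. Claimer and TryClaim are hypothetical names, and the directory layout (processing/<worker>/) follows the description above.

```csharp
using System;
using System.IO;

public static class Claimer
{
    // Returns true if this worker now owns the file; claimedPath is where it lives.
    public static bool TryClaim(
        string incomingDir, string processingRoot, string workerId,
        string fileName, out string claimedPath)
    {
        string workerDir = Path.Combine(processingRoot, workerId);
        Directory.CreateDirectory(workerDir);
        claimedPath = Path.Combine(workerDir, fileName);

        try
        {
            // No Exists check beforehand: only the move decides who wins.
            File.Move(Path.Combine(incomingDir, fileName), claimedPath);
            return true;
        }
        catch (FileNotFoundException)
        {
            // The file is no longer in incoming: another worker claimed it first.
            claimedPath = string.Empty;
            return false;
        }
    }
}
```

Usage would look like `if (!Claimer.TryClaim(incoming, processing, "w1", "a.csv", out var path)) return;`. Other IOExceptions (for example, a same-named file already sitting in the worker's directory) are left to propagate, since they indicate a leftover from an earlier run rather than a lost race.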
4.4. If You Rely on Lock Files, Make Them Leases
If you use lock files, make them ownership records with expiration, not mere empty files. A lock whose owner is unknown will inevitably cause disputes later.
flowchart TD
L[lock.json] --> A[ownerId]
L --> B[host]
L --> C[pid]
L --> D[acquiredAt]
L --> E[expiresAt]
L --> F[heartbeatAt]
Key points:
- Create it atomically
- Treat a heartbeat that has stopped updating (past expiresAt) as the evidence that a lock is stale
- Deletion is performed, as a rule, only by the creator
- Assume release failures will happen, and decide the recovery procedure in advance
A lock file is, in the end, a token for cooperation. Trying to guarantee full consistency with that one slip of paper usually ends badly.
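The points above can be sketched as a small lease helper. This is an illustrative sketch assuming .NET 5 or later, not the implementation from the sample repository; Lease, LeaseFile, and all method names are hypothetical. The current time is passed in as a parameter so the stale judgment can be exercised without waiting. Takeover of a stale lease is intentionally left out, because doing it safely needs its own agreed procedure.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text.Json;

public sealed record Lease(
    string OwnerId, string Host, int Pid,
    DateTimeOffset AcquiredAt, DateTimeOffset ExpiresAt, DateTimeOffset HeartbeatAt);

public static class LeaseFile
{
    public static bool TryAcquire(string lockPath, string ownerId, TimeSpan ttl, DateTimeOffset now)
    {
        FileStream stream;
        try
        {
            // Atomic creation: only one caller can create the lock file.
            stream = new FileStream(lockPath, FileMode.CreateNew, FileAccess.Write, FileShare.None);
        }
        catch (IOException) when (File.Exists(lockPath))
        {
            return false;
        }
        using (stream)
        {
            var lease = new Lease(ownerId, Environment.MachineName,
                Environment.ProcessId, now, now + ttl, now);
            byte[] json = JsonSerializer.SerializeToUtf8Bytes(lease);
            stream.Write(json, 0, json.Length);
        }
        return true;
    }

    // Heartbeat: only the recorded owner may extend the lease.
    public static bool TryRenew(string lockPath, string ownerId, TimeSpan ttl, DateTimeOffset now)
    {
        Lease? current = ReadOrNull(lockPath);
        if (current is null || current.OwnerId != ownerId) return false;
        Lease renewed = current with { ExpiresAt = now + ttl, HeartbeatAt = now };
        string temp = lockPath + "." + Guid.NewGuid().ToString("N") + ".tmp";
        File.WriteAllBytes(temp, JsonSerializer.SerializeToUtf8Bytes(renewed));
        File.Move(temp, lockPath, overwrite: true);
        return true;
    }

    // A lease that cannot be read is reported as not stale: no evidence, no verdict.
    public static bool IsStale(string lockPath, DateTimeOffset now)
    {
        Lease? lease = ReadOrNull(lockPath);
        return lease is not null && now > lease.ExpiresAt;
    }

    // Deletion is performed only by the creator.
    public static bool TryRelease(string lockPath, string ownerId)
    {
        Lease? lease = ReadOrNull(lockPath);
        if (lease is null || lease.OwnerId != ownerId) return false;
        File.Delete(lockPath);
        return true;
    }

    private static Lease? ReadOrNull(string lockPath)
    {
        try { return JsonSerializer.Deserialize<Lease>(File.ReadAllText(lockPath)); }
        catch (Exception e) when (e is IOException || e is JsonException) { return null; }
    }
}
```

TryRenew and TryRelease still read and then act, so they are cooperative rather than airtight, and clock skew between hosts affects the stale judgment. That is consistent with treating the lease as a token for cooperation and letting idempotency catch what slips through.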
4.5. Assume Idempotency
Mutual exclusion matters, but in real operation you can never reduce “occasionally arrives twice” or “re-run midway” to zero. In the end, a design that does not break when fed the same input again is what saves you.
flowchart LR
A[Input + idempotency key] --> B{Already processed?}
B -- Yes --> C[Treat as success without re-executing]
B -- No --> D[Execute the processing]
D --> E[Record in the processed ledger]
For example, give each received file an integration ID and record it in a processed ledger. If the design ensures results are never double-counted even when exclusion is broken once, operations become considerably easier.
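One way to sketch such a processed ledger is a directory of marker files, one per idempotency key. ProcessedLedger and its members are hypothetical names, and a marker directory is only one possible storage choice (a database table would serve the same role). The sketch assumes .NET 5 or later.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public sealed class ProcessedLedger
{
    private readonly string _dir;

    public ProcessedLedger(string dir)
    {
        _dir = dir;
        Directory.CreateDirectory(dir);
    }

    // Hash the key so that any string becomes a safe file name.
    private string MarkerPath(string key) =>
        Path.Combine(_dir,
            Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(key))) + ".processed");

    public bool AlreadyProcessed(string key) => File.Exists(MarkerPath(key));

    public void RecordProcessed(string key)
    {
        try
        {
            new FileStream(MarkerPath(key), FileMode.CreateNew).Dispose();
        }
        catch (IOException) when (AlreadyProcessed(key))
        {
            // Already recorded: recording twice is harmless.
        }
    }

    // Returns true if the work ran, false if it was skipped as already done.
    public bool RunOnce(string key, Action process)
    {
        if (AlreadyProcessed(key)) return false;
        process();
        RecordProcessed(key);
        return true;
    }
}
```

If the process crashes between process() and RecordProcessed, the same input runs again on retry, so the work itself still has to tolerate a repeat, or its result and the ledger entry have to be committed together.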
5. Pseudocode (Excerpts)
5.1. The Typical Failure Pattern
var lockPath = finalPath + ".lock";
if (!File.Exists(lockPath))
{
    File.WriteAllText(lockPath, "");
    using var writer = OpenForWrite(finalPath); // Writes directly to the final name
    WritePayload(writer);
    File.Delete(lockPath);
}
There are three problems.
- Exists and WriteAllText are separate operations
- finalPath becomes visible while it is still being written
- The lock is left behind on abnormal termination
5.2. An Example in the Right Direction (Roughly Sketched)
// Sender side: write under a temp name, publish, then announce completion
var tempPath = MakeTempPathSameDirectory(finalPath);
WritePayload(tempPath);
FlushAndClose(tempPath);
PublishByRenameOrReplace(tempPath, finalPath); // Assumes same FS / same volume
PublishDoneFile(finalPath + ".done", new
{
    FileName = Path.GetFileName(finalPath),
    Size = GetFileSize(finalPath),
    Hash = ComputeHash(finalPath),
    IdempotencyKey = integrationId
});

// Receiver side: claim first, then verify, then process
if (!TryClaimBundleByRename(baseName, incomingDir, processingDir))
{
    return; // Another worker claimed it first
}
var manifest = ReadDoneFile(Path.Combine(processingDir, baseName + ".done"));
VerifyPayload(Path.Combine(processingDir, baseName), manifest);
if (AlreadyProcessed(manifest.IdempotencyKey))
{
    // Same input arrived again: treat as success without re-executing
    MoveBundle(processingDir, archiveDir, baseName);
    return;
}
Process(Path.Combine(processingDir, baseName));
RecordProcessed(manifest.IdempotencyKey);
MoveBundle(processingDir, archiveDir, baseName);
What matters here is the ordering rather than the implementation details. Keeping “write,” “publish,” “take ownership,” and “record as processed” unmixed makes things much harder to break.
6. A Rough Guide to Choosing
- Single writer / single reader / same host: just temp -> rename already gets you quite far
- Multiple consumers: add the incoming -> processing claim rename
- Heterogeneous systems, NAS, shared folders: safer to go all the way to manifest / done files and idempotency
- Multiple writers updating the same logical state: do not over-stretch file integration - also consider a DB or a queue
- OS locks are effective within a homogeneous set of apps sharing the same assumptions, but they are no substitute for a hand-off protocol
The multiple-writers item is also a withdrawal criterion. Some problems genuinely become painful when done with files.
7. Conclusion
Mutual exclusion in file integration is not about calling a lock function - it is about defining state transitions. That is the backbone of this article. Express generating / published / processing / processed through names and directories, and avoid the two-step Exists -> Create check, direct writes to the final name, waiting for size stability, mutual updates of shared files, and over-trusting lock APIs. On top of that, combining temp -> close -> rename / replace, done files / manifests, claim renames, leases, and idempotency prevents most shared-folder integration accidents.
The trick in file integration is to never equate “can be read” with “may be read.” Just separating those two dramatically reduces the kind of accident that only happens in the middle of the night.
8. References
- Complete sample code for this article (library, demo, unit tests) - komurasoft-blog-samples (GitHub)
- LockFileEx function (Win32)
- Locking and Unlocking Byte Ranges in Files (Win32)
- Moving and Replacing Files (Win32)
- File.Replace Method (.NET)
- rename - POSIX
- open - POSIX (O_CREAT | O_EXCL)
- flock(2) - Linux manual page
- open(2) - Linux manual page
Author Profile
Go Komura
Representative of KomuraSoft LLC
Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.