Mutual Exclusion Fundamentals for File-Based Integration - Best Practices for File Locks and Atomic Claims

Tags: File Integration, Locking, Design, Windows Development

Mutual exclusion in file-based integration becomes an issue in almost every setup involving shared folders, nightly batches, or cross-process hand-offs. The questions people search for most are: is a file lock alone enough, how do I stop multiple workers from picking up the same file, and how do I avoid reading files that are still being written?

In this article, we look at mutual exclusion for file integration through the lenses of file locks, atomic claims, temp -> rename, and idempotency.

Table of Contents

  1. The Conclusion First (In One Line)
  2. Race Patterns That Occur in File Integration (Diagrams)
    • 2.1. Reading a File Mid-Write
    • 2.2. Multiple Workers Picking Up the Same File Simultaneously
    • 2.3. Everyone Stalls on a Stale Lock
  3. Anti-Patterns
    • 3.1. The Two-Step Exists -> Create Check
    • 3.2. Writing Directly to the Final File Name
    • 3.3. Treating a File as Done When Its Size Stops Changing
    • 3.4. Everyone Updating a Shared File
    • 3.5. Believing Lock APIs Are Almighty
  4. Best Practices
    • 4.1. Publish via temp -> close -> rename / replace
    • 4.2. Make Completeness Explicit With a done File / Manifest
    • 4.3. The Receiver Takes a Claim Atomically
    • 4.4. If You Rely on Lock Files, Make Them Leases
    • 4.5. Assume Idempotency
  5. Pseudocode (Excerpts)
  6. A Rough Guide to Choosing
  7. Conclusion
  8. References

File integration is a field where the “hand-off agreement” breaks more easily than the code itself. Things pass in unit tests, yet occasionally break only in the production shared folder or the nightly batch - and the failure is hard to reproduce. This is entirely common.

The cause is usually not the file I/O APIs themselves, but ambiguity in these three things:

  • When is it OK to read?
  • Who holds the right to process?
  • How do we recover when something fails?

In this article, rather than ending the discussion at OS locks, we organize file-integration mutual exclusion as a hand-off protocol.

The code in this article is published on GitHub as a complete buildable and runnable sample set (a library, a demo that demonstrates claim contention between two workers and lease takeover, and unit tests that reproduce contention, corruption, and stale locks).

file-integration-locking-best-practices-komurasoft-style - komurasoft-blog-samples (GitHub)

1. The Conclusion First (In One Line)

  • The most important thing in file integration is to ensure that the moment the final file name becomes visible, the file is “safe to read”
  • Express generating / published / processing / processed states through file names and directories
  • If there are multiple workers, take a claim atomically before reading
  • Use lock files and OS locks as aids, and let idempotency catch whatever slips through

In short, the real substance of file integration is not so much mutual exclusion as the design of a hand-off protocol. It is never as simple as calling one lock function and being done.

2. Race Patterns That Occur in File Integration (Diagrams)

2.1. Reading a File Mid-Write

If you start writing directly under the final file name, this accident happens. A JSON file is missing its closing brace, a CSV is short on rows, and a ZIP is simply corrupt.

Sequence (Sender / Shared folder / Receiver):

  1. Sender: creates orders.csv under its final name
  2. Sender: writing rows 1 through 5000 (still incomplete)
  3. Receiver: detects orders.csv
  4. Receiver: starts reading immediately
  5. Sender: writes the rest
  6. Result: missing rows / parse failure / partial processing

2.2. Multiple Workers Picking Up the Same File Simultaneously

With a “list the directory, open anything unprocessed” flow, two workers can grab the same file. This is how double counting and duplicate sends begin.

Sequence (Worker 1 / Worker 2 / incoming):

  1. Worker 1: finds a.csv
  2. Worker 2: finds a.csv
  3. Worker 1: starts reading
  4. Worker 2: starts reading
  5. Result: the same input is processed twice

2.3. Everyone Stalls on a Stale Lock

A design that just drops a lock file tends to jam up after abnormal termination. If you cannot tell whose lock it is, whether the owner is still alive, or how long it is valid, everyone downstream waits forever.

Sequence (Worker A / lock file / Worker B):

  1. Worker A: creates the lock, then crashes
  2. Worker B: checks whether the lock exists
  3. Worker B: holds off on processing
  4. Worker B: keeps waiting
  5. Result: cannot tell if it is stale - everyone stops

3. Anti-Patterns

3.1. The Two-Step Exists -> Create Check

The problem here is that “checking” and “acquiring” are separate operations. Another process can squeeze in between them, so this is not mutual exclusion at all.

Sequence (Process A / Process B / File system):

  1. Process A: checks that no lock exists (answer: none)
  2. Process B: checks that no lock exists (answer: none)
  3. Process A: creates the lock
  4. Process B: creates the lock
  5. Result: both proceed

The typical bad example looks like this.

if (!File.Exists(lockPath))
{
    File.WriteAllText(lockPath, Environment.ProcessId.ToString());
    ProcessFile();
}

What you need is to make “create if absent” a single operation. In .NET that means opening the file with FileMode.CreateNew, which throws an IOException if the file already exists; on POSIX systems, it means atomic creation such as O_CREAT | O_EXCL.
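As a minimal sketch of the single-operation version, assuming .NET 6 or later: the helper name TryCreateLockFile and the demo paths are hypothetical, and the exception filter is only one way to tell “already exists” apart from other I/O failures.

```csharp
using System;
using System.IO;

// Hypothetical helper: returns true only for the caller that actually created the file.
static bool TryCreateLockFile(string lockPath)
{
    try
    {
        // FileMode.CreateNew fails if the file already exists, so the
        // existence check and the creation happen as one operation.
        using var stream = new FileStream(lockPath, FileMode.CreateNew, FileAccess.Write, FileShare.None);
        using var writer = new StreamWriter(stream);
        writer.Write(Environment.ProcessId);
        return true;
    }
    catch (IOException) when (File.Exists(lockPath))
    {
        return false; // Someone else created it first
    }
}

// Demo in a throwaway temp directory (assumed layout, not from the article)
var demoDir = Path.Combine(Path.GetTempPath(), "lock-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(demoDir);
var demoLock = Path.Combine(demoDir, "orders.lock");
Console.WriteLine(TryCreateLockFile(demoLock)); // True: first caller wins
Console.WriteLine(TryCreateLockFile(demoLock)); // False: second caller is refused
Directory.Delete(demoDir, recursive: true);
```

Other I/O errors, such as a missing directory, still propagate as exceptions, so they are not mistaken for “the lock is held.”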

3.2. Writing Directly to the Final File Name

If the receiver’s interpretation is “once that name is visible, it is safe to read,” you have already lost the moment you start writing directly under the final name. The basic rule is: do not equate being visible with being safe to read.

Final name becomes visible -> Receiver detects it -> Sender is still writing -> Incomplete data gets read

using var writer = OpenForWrite(finalPath); // finalPath becomes visible here
foreach (var row in rows)
{
    writer.WriteLine(row);
}

This approach directly invites the accident described in 2.1.

3.3. Treating a File as Done When Its Size Stops Changing

This looks convenient but is quite precarious. Copies over the network, sender-side pauses, buffering, and retries all routinely produce stretches where the size holds steady even though the file is still incomplete.

Sequence (Sender / Shared folder / Receiver):

  1. Sender: starts copying data.zip
  2. Sender: pauses partway
  3. Receiver: size unchanged for 10 seconds (misjudges as complete)
  4. Receiver: starts reading
  5. Sender: resumes the copy

if (currentLength == lastLength && stableSeconds >= 10)
{
    return Ready;
}

If you determine completion by guessing, shared folders and large files will trip you up. Completion is far more stable when stated explicitly via a manifest or a done file.

3.4. Everyone Updating a Shared File

A design where everyone reads and updates a single status.csv or counter.json usually ends in “last writer wins.” When file integration starts being used as a makeshift database, this is where it starts to hurt.

Sequence (Batch A / Batch B / status.csv):

  1. Batch A: reads v1
  2. Batch B: reads v1
  3. Batch A: writes v2-A
  4. Batch B: writes v2-B
  5. Result: A's update is lost

There is the escape hatch of going append-only, but its semantics wobble depending on the file system and deployment layout. If shared updates are required, it is better not to strain file integration here.

3.5. Believing Lock APIs Are Almighty

Lock APIs matter, but they only work when every participant plays by the same rules. In heterogeneous system integration, it is safer not to over-trust them.

Additional notes:

  • flock on Linux is an advisory lock, so a party that ignores the agreement can simply write anyway
  • Windows byte-range locks (LockFile / LockFileEx) do not prevent reads or writes made through a memory-mapped view of the file
  • In other words, do not make OS locks alone carry the design of completion notification and ownership

4. Best Practices

4.1. Publish via temp -> close -> rename / replace

The classic approach. Keep the file under a temp name while it is being generated, and switch it to the final name only after closing it. The receiver watches only final names.

  1. Create a unique temp name
  2. Write the full content to temp
  3. Flush / close
  4. Rename / replace to the final name in the same directory
  5. Receiver watches only final names

Key points:

  • Put temp and final in the same directory - at minimum the same volume / file system
  • On Windows / .NET, the File.Replace family is worth considering
  • Make it the agreed contract that once the final name is visible, the content is complete

If you put temp on a different drive, the rename degrades into a copy followed by a delete, which is no longer atomic, or Replace fails outright. This prerequisite is unglamorous but very important.
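The sender side of this can be sketched as follows, assuming .NET 6 or later. PublishAtomically and the temp naming scheme are hypothetical, and the sketch uses File.Move with overwrite rather than File.Replace; whether the rename is truly atomic still depends on the file system underneath, especially on network shares.

```csharp
using System;
using System.IO;

// Hypothetical helper: writes the payload under a temp name, then publishes it.
static void PublishAtomically(string finalPath, string content)
{
    var dir = Path.GetDirectoryName(Path.GetFullPath(finalPath)) ?? ".";
    // The temp file lives in the same directory so the rename stays on one volume.
    var tempPath = Path.Combine(dir, $"{Path.GetFileName(finalPath)}.{Guid.NewGuid():N}.tmp");

    try
    {
        using (var stream = new FileStream(tempPath, FileMode.CreateNew, FileAccess.Write, FileShare.None))
        using (var writer = new StreamWriter(stream))
        {
            writer.Write(content);
            writer.Flush();
            stream.Flush(flushToDisk: true); // push OS buffers to disk before publishing
        }

        // The final name only appears after the content is complete and closed.
        File.Move(tempPath, finalPath, overwrite: true);
    }
    catch
    {
        File.Delete(tempPath); // leave no half-written temp file behind
        throw;
    }
}

// Demo in a throwaway temp directory (assumed layout, not from the article)
var demoDir = Path.Combine(Path.GetTempPath(), "publish-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(demoDir);
var demoFinal = Path.Combine(demoDir, "orders.csv");
PublishAtomically(demoFinal, "id,amount\n1,100\n");
Console.WriteLine(File.Exists(demoFinal)); // True
Directory.Delete(demoDir, recursive: true);
```

For this to hold, the receiver has to ignore anything ending in .tmp and react only to final names.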

4.2. Make Completeness Explicit With a done File / Manifest

Beyond the data itself, explicitly stating “what has been completed” in a separate file stabilizes the receiver. This is especially effective in heterogeneous system integration.

  1. Generate data.tmp
  2. Publish as data.csv
  3. Create data.done / manifest.json
  4. Receiver detects the done file / manifest
  5. Verify file name, size, and hash

Items worth putting in the manifest include:

  • Target file name
  • Size
  • Hash
  • Record count
  • Integration ID / idempotency key
  • Generation timestamp

Order matters too. If you place the done file before publishing the payload, it is not a completion notice - it is an advance notice of an accident.
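A rough sketch of writing and checking such a done file, assuming .NET 6 or later, a JSON manifest, and SHA-256 as the hash. The helper names and the ".done" suffix are illustrative choices, and only a subset of the manifest items listed above is included.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text.Json;

static string ComputeSha256Hex(string path)
{
    using var stream = File.OpenRead(path);
    using var sha = SHA256.Create();
    return Convert.ToHexString(sha.ComputeHash(stream));
}

// Sender side: call only after the payload is visible under its final name.
static void WriteDoneFile(string payloadPath, string integrationId)
{
    var manifest = new
    {
        FileName = Path.GetFileName(payloadPath),
        Size = new FileInfo(payloadPath).Length,
        Hash = ComputeSha256Hex(payloadPath),
        IdempotencyKey = integrationId
    };
    var donePath = payloadPath + ".done";
    var tempPath = donePath + ".tmp";
    File.WriteAllText(tempPath, JsonSerializer.Serialize(manifest));
    File.Move(tempPath, donePath, overwrite: true); // the done file is published by rename too
}

// Receiver side: returns the idempotency key if the payload matches its manifest.
static string VerifyAgainstDoneFile(string payloadPath)
{
    using var doc = JsonDocument.Parse(File.ReadAllText(payloadPath + ".done"));
    var root = doc.RootElement;
    var size = root.GetProperty("Size").GetInt64();
    var hash = root.GetProperty("Hash").GetString();
    if (new FileInfo(payloadPath).Length != size || ComputeSha256Hex(payloadPath) != hash)
        throw new InvalidDataException($"Payload does not match manifest: {payloadPath}");
    return root.GetProperty("IdempotencyKey").GetString() ?? "";
}

// Demo in a throwaway temp directory (assumed layout, not from the article)
var demoDir = Path.Combine(Path.GetTempPath(), "done-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(demoDir);
var demoPayload = Path.Combine(demoDir, "data.csv");
File.WriteAllText(demoPayload, "id,amount\n1,100\n");
WriteDoneFile(demoPayload, "order-batch-001");
Console.WriteLine(VerifyAgainstDoneFile(demoPayload)); // order-batch-001
Directory.Delete(demoDir, recursive: true);
```

The receiver starts from the done file, never from the payload, so a payload without a matching done file is simply not yet eligible.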

4.3. The Receiver Takes a Claim Atomically

If multiple workers watch the same incoming, “move it to your own area before reading” is the clearest approach. Only the worker whose rename from incoming to processing/<worker>/ succeeds gets to process the file.

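Sequence (Worker 1 / Worker 2 / incoming / processing):

  1. Worker 1: finds a.csv
  2. Worker 2: finds a.csv
  3. Worker 1: renames a.csv from incoming to processing
  4. Worker 2: renames a.csv from incoming to processing
  5. Result: only the one that succeeds first takes ownership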
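The claim step might look like the following sketch, assuming .NET 6 or later and that incoming and processing sit on the same volume. TryClaim and the worker IDs are hypothetical names used for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper: moves the file into processing/<worker>/ and reports whether this worker won.
static bool TryClaim(string incomingPath, string processingDir, string workerId, out string claimedPath)
{
    var workerDir = Path.Combine(processingDir, workerId);
    Directory.CreateDirectory(workerDir);
    claimedPath = Path.Combine(workerDir, Path.GetFileName(incomingPath));
    try
    {
        // No overwrite: on the same volume this is a rename, and only one
        // worker can successfully move a given source file.
        File.Move(incomingPath, claimedPath);
        return true;
    }
    catch (FileNotFoundException)
    {
        return false; // Another worker already moved it out of incoming
    }
}

// Demo in a throwaway temp directory (assumed layout, not from the article)
var demoRoot = Path.Combine(Path.GetTempPath(), "claim-demo-" + Guid.NewGuid().ToString("N"));
var demoIncoming = Path.Combine(demoRoot, "incoming");
var demoProcessing = Path.Combine(demoRoot, "processing");
Directory.CreateDirectory(demoIncoming);
var demoFile = Path.Combine(demoIncoming, "a.csv");
File.WriteAllText(demoFile, "id\n1\n");
Console.WriteLine(TryClaim(demoFile, demoProcessing, "worker1", out _)); // True
Console.WriteLine(TryClaim(demoFile, demoProcessing, "worker2", out _)); // False
Directory.Delete(demoRoot, recursive: true);
```

A worker reads the file only from the claimed path it got back, never from incoming. Other I/O errors are deliberately left to propagate rather than being treated as a lost race.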

Operationally, separating the directories also makes things easier to trace.

  • temp -> incoming (publish)
  • incoming -> processing (claim)
  • processing -> archive (success)
  • processing -> error (failure)

The claim rename, too, must happen on the same file system - that is a prerequisite.
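The success / failure routing out of processing can be sketched the same way, assuming .NET 6 or later and that archive and error share a volume with processing. ProcessClaimed is a hypothetical helper, and logging to standard error stands in for whatever logging the real system uses.

```csharp
using System;
using System.IO;

// Hypothetical helper: runs the handler on a claimed file, then files it by outcome.
static bool ProcessClaimed(string claimedPath, string archiveDir, string errorDir, Action<string> handler)
{
    bool ok;
    try
    {
        handler(claimedPath);
        ok = true;
    }
    catch (Exception ex)
    {
        Console.Error.WriteLine($"Processing failed for {claimedPath}: {ex.Message}");
        ok = false;
    }

    // The directory a file ends up in records what happened to it.
    var targetDir = ok ? archiveDir : errorDir;
    Directory.CreateDirectory(targetDir);
    File.Move(claimedPath, Path.Combine(targetDir, Path.GetFileName(claimedPath)), overwrite: true);
    return ok;
}

// Demo in a throwaway temp directory (assumed layout, not from the article)
var demoRoot = Path.Combine(Path.GetTempPath(), "route-demo-" + Guid.NewGuid().ToString("N"));
var demoProcessing = Path.Combine(demoRoot, "processing");
Directory.CreateDirectory(demoProcessing);
var demoClaimed = Path.Combine(demoProcessing, "a.csv");
File.WriteAllText(demoClaimed, "id\n1\n");
var demoOk = ProcessClaimed(demoClaimed, Path.Combine(demoRoot, "archive"), Path.Combine(demoRoot, "error"),
    path => Console.WriteLine($"Handling {Path.GetFileName(path)}"));
Console.WriteLine(demoOk); // True
Directory.Delete(demoRoot, recursive: true);
```

With this layout, anything that stays in processing for a long time is a visible sign of a crashed worker rather than a silent loss.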

4.4. If You Rely on Lock Files, Make Them Leases

If you use lock files, make them ownership records with expiration, not mere empty files. A lock whose owner is unknown will inevitably cause disputes later.

lock.json fields: ownerId, host, pid, acquiredAt, expiresAt, heartbeatAt

Key points:

  • Create it atomically
  • Treat a lock whose heartbeat has stopped being updated as evidence that it is stale
  • Deletion is performed, as a rule, only by the creator
  • Assume release failures will happen, and decide the recovery procedure in advance

A lock file is, in the end, a token for cooperation. Trying to guarantee full consistency with that one slip of paper usually ends badly.
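One possible shape for such a lease, assuming .NET 6 or later and the lock.json fields listed above. The helper names are hypothetical, the current time is passed in so the behavior is easy to test, and takeover of a stale lease is left out because it needs its own atomic step.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical helper: atomically creates lock.json with an owner and an expiry.
static bool TryAcquireLease(string lockPath, string ownerId, TimeSpan ttl, DateTimeOffset now)
{
    var lease = new
    {
        ownerId,
        host = Environment.MachineName,
        pid = Environment.ProcessId,
        acquiredAt = now,
        expiresAt = now + ttl,
        heartbeatAt = now
    };
    try
    {
        using var stream = new FileStream(lockPath, FileMode.CreateNew, FileAccess.Write, FileShare.None);
        var bytes = JsonSerializer.SerializeToUtf8Bytes(lease);
        stream.Write(bytes, 0, bytes.Length);
        return true;
    }
    catch (IOException) when (File.Exists(lockPath))
    {
        return false; // Someone else holds the lease
    }
}

// A lease is stale once its expiry has passed, i.e. the owner stopped renewing it.
static bool IsLeaseStale(string lockPath, DateTimeOffset now)
{
    using var doc = JsonDocument.Parse(File.ReadAllText(lockPath));
    return doc.RootElement.GetProperty("expiresAt").GetDateTimeOffset() <= now;
}

// Heartbeat: only the recorded owner may extend the lease.
static bool TryRenewLease(string lockPath, string ownerId, TimeSpan ttl, DateTimeOffset now)
{
    using var doc = JsonDocument.Parse(File.ReadAllText(lockPath));
    var root = doc.RootElement;
    if (root.GetProperty("ownerId").GetString() != ownerId) return false; // not ours
    var renewed = new
    {
        ownerId,
        host = root.GetProperty("host").GetString(),
        pid = root.GetProperty("pid").GetInt32(),
        acquiredAt = root.GetProperty("acquiredAt").GetDateTimeOffset(),
        expiresAt = now + ttl,
        heartbeatAt = now
    };
    var tempPath = lockPath + "." + ownerId + ".tmp";
    File.WriteAllBytes(tempPath, JsonSerializer.SerializeToUtf8Bytes(renewed));
    File.Move(tempPath, lockPath, overwrite: true); // readers never see a half-written lease
    return true;
}

// Demo in a throwaway temp directory (assumed layout, not from the article)
var demoDir = Path.Combine(Path.GetTempPath(), "lease-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(demoDir);
var demoLock = Path.Combine(demoDir, "lock.json");
var demoNow = DateTimeOffset.UtcNow;
Console.WriteLine(TryAcquireLease(demoLock, "worker-a", TimeSpan.FromSeconds(60), demoNow)); // True
Console.WriteLine(IsLeaseStale(demoLock, demoNow.AddSeconds(120)));                          // True
Directory.Delete(demoDir, recursive: true);
```

Clock skew between hosts eats directly into the expiry margin, so the lease duration should be generous compared with the heartbeat interval.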

4.5. Assume Idempotency

Mutual exclusion matters, but in real operation you can never reduce “occasionally arrives twice” or “re-run midway” to zero. In the end, a design that does not break when fed the same input again is what saves you.

  1. Input + idempotency key
  2. Already processed?
     • Yes: treat as success without re-executing
     • No: execute the processing, then record in the processed ledger

For example, give each received file an integration ID and record it in a processed ledger. If the design ensures results are never double-counted even when exclusion is broken once, operations become considerably easier.
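As an illustration only, a processed ledger can be as simple as one marker file per idempotency key. The sketch below assumes .NET 6 or later and keys that are safe to use as file names. The helper names are hypothetical, and a real system might keep the ledger in a database instead.

```csharp
using System;
using System.IO;

// Hypothetical ledger: one marker file per idempotency key in a ledger directory.
static bool AlreadyProcessed(string ledgerDir, string key) =>
    File.Exists(Path.Combine(ledgerDir, key + ".processed"));

static void RecordProcessed(string ledgerDir, string key)
{
    Directory.CreateDirectory(ledgerDir);
    try
    {
        // CreateNew keeps the record itself free of check-then-create races.
        new FileStream(Path.Combine(ledgerDir, key + ".processed"), FileMode.CreateNew).Dispose();
    }
    catch (IOException) when (AlreadyProcessed(ledgerDir, key))
    {
        // Already recorded by an earlier run; nothing to do.
    }
}

// Returns true if the action ran, false if the key had already been processed.
static bool ProcessOnce(string ledgerDir, string key, Action action)
{
    if (AlreadyProcessed(ledgerDir, key)) return false; // treat as success without re-executing
    action();
    RecordProcessed(ledgerDir, key);
    return true;
}

// Demo in a throwaway temp directory (assumed layout, not from the article)
var demoLedger = Path.Combine(Path.GetTempPath(), "ledger-demo-" + Guid.NewGuid().ToString("N"));
Console.WriteLine(ProcessOnce(demoLedger, "order-batch-001", () => Console.WriteLine("processing"))); // True
Console.WriteLine(ProcessOnce(demoLedger, "order-batch-001", () => Console.WriteLine("processing"))); // False
Directory.Delete(demoLedger, recursive: true);
```

A crash between the action and the ledger write still causes one re-execution, so the action itself should tolerate a repeat, or the result and the ledger entry should be committed together.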

5. Pseudocode (Excerpts)

5.1. The Typical Failure Pattern

var lockPath = finalPath + ".lock";

if (!File.Exists(lockPath))
{
    File.WriteAllText(lockPath, "");
    using var writer = OpenForWrite(finalPath); // Writes directly to the final name
    WritePayload(writer);

    File.Delete(lockPath);
}

There are three problems.

  • Exists and WriteAllText are separate operations
  • finalPath becomes visible while it is still being written
  • The lock is left behind on abnormal termination

5.2. An Example in the Right Direction (Roughly Sketched)

// Sender side
var tempPath = MakeTempPathSameDirectory(finalPath);
WritePayload(tempPath);
FlushAndClose(tempPath);

PublishByRenameOrReplace(tempPath, finalPath); // Assumes same FS / same volume
PublishDoneFile(finalPath + ".done", new
{
    FileName = Path.GetFileName(finalPath),
    Size = GetFileSize(finalPath),
    Hash = ComputeHash(finalPath),
    IdempotencyKey = integrationId
});

// Receiver side (runs in a separate worker process)
if (!TryClaimBundleByRename(baseName, incomingDir, processingDir))
{
    return; // Another worker claimed it first
}

var manifest = ReadDoneFile(Path.Combine(processingDir, baseName + ".done"));
VerifyPayload(Path.Combine(processingDir, baseName), manifest);

if (AlreadyProcessed(manifest.IdempotencyKey))
{
    MoveBundle(processingDir, archiveDir, baseName);
    return;
}

Process(Path.Combine(processingDir, baseName));
RecordProcessed(manifest.IdempotencyKey);
MoveBundle(processingDir, archiveDir, baseName);

What matters here is the ordering rather than the implementation details. Keeping “write,” “publish,” “take ownership,” and “record as processed” unmixed makes things much harder to break.

6. A Rough Guide to Choosing

  • Single writer / single reader / same host: just temp -> rename already gets you quite far
  • Multiple consumers: add the incoming -> processing claim rename
  • Heterogeneous systems, NAS, shared folders: safer to go all the way to manifest / done files and idempotency
  • Multiple writers updating the same logical state: do not over-stretch file integration - also consider a DB or a queue
  • OS locks are effective within a homogeneous set of apps sharing the same assumptions, but they are no substitute for a hand-off protocol

The multiple-writers item is also an exit criterion: a signal to step back from file integration. Some problems genuinely become painful when done with files.

7. Conclusion

Mutual exclusion in file integration is not about calling a lock function - it is about defining state transitions. That is the backbone of this article. Express generating / published / processing / processed through names and directories, and avoid the two-step Exists -> Create check, direct writes to the final name, waiting for size stability, mutual updates of shared files, and over-trusting lock APIs. On top of that, combining temp -> close -> rename / replace, done files / manifests, claim renames, leases, and idempotency prevents most shared-folder integration accidents.

The trick in file integration is to never equate “can be read” with “may be read.” Just separating those two dramatically reduces the kind of accident that only happens in the middle of the night.

8. References


Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.
