Do UUIDs Collide? Implementation and Operational Patterns That Invite Duplicates

You used UUIDs as primary keys, and one day a duplicate key error shows up. At that moment, there is a good chance someone says, “So UUIDs do collide after all.”

In practice, however, most UUID duplicates are not a problem with the UUID specification itself; they arise when the implementation or operations break the generation conditions the spec assumes. Under RFC 9562, UUIDv4 has 122 bits of random space,¹ and UUIDv7 is likewise defined on the assumption that the 74 bits outside the timestamp, version, and variant fields carry randomness or counters that provide uniqueness.² UUIDv8, on the other hand, is explicitly described as implementation-specific, and its uniqueness must not be assumed.³ The Python standard library also documents that uuid4() is generated using a cryptographically secure method, so as long as you “use a proper implementation in the normal way,” the guarantees on the UUID side are quite strong.⁴

In this article, we organize the typical patterns where incorrect operations or implementations make UUIDs collide, together with measures to prevent recurrence. The content is based on RFC 9562, the official Python documentation, and the official PostgreSQL documentation, as they could be verified in March 2026.⁵,⁴,⁶

1. The Conclusion First

To summarize up front, these are the dangerous patterns.

Pattern	What happens	First countermeasure
Hand-rolling UUIDv4-like values with a fixed seed or weak PRNG	The same sequence is reproduced in another process or node	Use the OS / runtime standard UUID API
Carrying over generator state as is after fork, VM snapshot, or container cloning	Random or counter state rewinds and duplicates appear	Re-seed after fork, re-initialize after clone, review how persistent state is handled
Using UUIDv3 / v5 under the misconception that they yield “a new ID every time”	The same UUID is regenerated from the same namespace and same name	Understand they are deterministic IDs and restrict their use
Implementing UUIDv1 / v6 / v7 / v8 yourself and handling clock rollback or node/counter carelessly	Duplicates become likely under high-frequency generation or across multiple nodes	Use existing libraries and reduce custom generators
Truncating UUIDs midway or squashing them into another format	You throw away the original 128-bit uniqueness yourself	Store and compare at full length
Not placing UNIQUE / PRIMARY KEY on the DB side	Duplicates slip in silently and root-cause analysis is delayed	Keep a uniqueness constraint at the storage layer

In short, rather than “the UUID collided,” it is usually that the uniqueness you expected from the UUID was shaved away somewhere in the design.
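As a rough sanity check on why the math is rarely the culprit, the birthday approximation gives a feel for the scale. The sketch below is an illustration rather than anything taken from the RFC; the function name and the sample counts are assumptions chosen for the example.

```python
import math


def collision_probability(n_ids: int, random_bits: int) -> float:
    """Approximate chance of at least one collision among n_ids values
    drawn uniformly from a space of 2**random_bits (birthday approximation)."""
    space = 2 ** random_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


# One billion properly generated UUIDv4 values (122 random bits).
p_v4 = collision_probability(1_000_000_000, 122)

# The same formula for an ID cut down to its first 8 hex characters (32 bits).
p_prefix = collision_probability(100_000, 32)

print(f"v4, 1e9 IDs:     {p_v4:.3e}")
print(f"32-bit, 1e5 IDs: {p_prefix:.3f}")
```

Under these assumptions the first figure is far below anything you would observe in production, while the truncated case is more likely than not to collide. The gap between the two is the uniqueness that a design can shave away.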

2. Suspect the Generation and Operations First, Not the “Math of UUIDs”

UUID discussions get confusing because the properties differ by version.

UUIDv4 is random-based. Under RFC 9562, the 122 bits other than version / variant are filled with random data.¹
UUIDv7 has a structure that sorts well chronologically; in addition to a Unix millisecond timestamp, the rest is composed of randomness or a carefully seeded counter.²
UUIDv3 / v5 are name-based. Given the same namespace and the same canonical name, producing the same UUID is the correct behavior.⁷
UUIDv8 is for experimental and vendor-specific use, and its uniqueness is implementation-specific. The RFC says uniqueness must not be assumed.³

So even if you say “we use UUIDs,” the story changes completely depending on whether it is

the standard library’s uuid4()
a homemade timestamp + random
uuid5(namespace, name)
or a custom format that merely looks like UUIDv8.

flowchart TD
    A[UUID duplicate is found] --> B{Where did the same value really come from}
    B --> C[Weak generator]
    B --> D[State was rewound]
    B --> E[Misuse of name-based UUID]
    B --> F[Truncated at storage time]
    B --> G[No uniqueness constraint on the DB side]
    C --> H[Implementation mistake]
    D --> H
    E --> H
    F --> H
    G --> H

In practice, working from the right side of this diagram is faster.
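When triaging, a quick first step is to check which version the duplicated values claim to be. The following is a minimal sketch using Python's uuid module; the helper name describe is hypothetical. Keep in mind that the version field only shows what a value claims: a hand-rolled generator can stamp version 4 on anything.

```python
import uuid


def describe(text: str) -> str:
    """Report the version and variant a stored UUID string claims to have."""
    value = uuid.UUID(text)
    return f"version={value.version} variant={value.variant}"


# NAMESPACE_DNS is a well-known version 1 UUID shipped with the uuid module.
print(describe(str(uuid.NAMESPACE_DNS)))
print(describe(str(uuid.uuid4())))
print(describe(str(uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/users/42"))))
```

If the duplicated values turn out to be version 5, for instance, the investigation can go straight to how names and namespaces are chosen instead of to the random source.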

3. Pattern 1: Calling It UUIDv4 While Actually Using a Weak PRNG

This is the most common one.

Building 128 bits with a general-purpose PRNG equivalent to Math.random()
Seeding at startup with time() or the PID
Hand-assembling “32 hex digits that look like UUID format”

It may look like a UUID, but if the random source is weak, the same sequence is reproduced in another process or another node.

RFC 9562 says a CSPRNG should be used, both for UUID uniqueness and for unguessability. This is a recommendation (SHOULD), so exceptions can be designed for some use cases, but if you hand-roll UUIDs with a general-purpose PRNG you should be able to explain why. Furthermore, it states that the CSPRNG state should be properly re-seeded upon state changes such as a process fork.⁸ Python’s uuid.uuid4() is likewise documented as generating random UUIDs in a cryptographically secure way.⁴
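The failure mode is easy to reproduce. The sketch below deliberately builds the anti-pattern: weak_uuid4 is a hypothetical helper, and the seed 12345 stands in for a PID or coarse startup time shared by two nodes.

```python
import random
import uuid


def weak_uuid4(rng: random.Random) -> uuid.UUID:
    """Anti-pattern: a v4-shaped value built from a general-purpose PRNG."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


# Two "nodes" that seed with the same startup value replay the same sequence.
node_a = random.Random(12345)
node_b = random.Random(12345)
ids_a = [weak_uuid4(node_a) for _ in range(3)]
ids_b = [weak_uuid4(node_b) for _ in range(3)]
print(ids_a == ids_b)

# In CPython, uuid.uuid4() draws fresh bytes from the OS random source per call.
print(uuid.uuid4() == uuid.uuid4())
```

Every value in ids_a carries a valid version 4 marker, so nothing about the format reveals the problem. Only the random source differs.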

The practical conclusion here is simple.

Do not hand-roll UUIDs
Do not fiddle with random seeds by hand
Use the standard library or a widely used implementation as is

Keeping a custom generator around “because it’s lightweight” or “because we’ve always used it” is what ends up costing the most later.

4. Pattern 2: Rewinding Generator State via fork, Snapshot, or Clone

The next most dangerous pattern is any operation that duplicates or rewinds the generator state.

RFC 9562 explicitly recommends re-seeding after fork, and explains that implementations without stable storage have to generate clock sequences, counters, and random data more frequently, which raises the probability of duplicates.⁸,⁹

From this it follows that operations like the following deserve suspicion.

Restoring multiple instances of the same image after taking a VM snapshot
A custom generator starting from the same initial state every time a container image boots
Sharing PRNG state or counter state across worker forks

Under these operations, the UUID generation sequence can be unintentionally reproduced. The RFC does not literally say “snapshots are dangerous,” but this is a very practical caution you can derive from its notes on re-seeding after fork and on handling generator state.⁸,⁹

Countermeasures look like this.

Do not hold custom UUID generator state for long
Re-initialize immediately after fork / clone / restore
Where possible, lean on implementations that draw OS-provided randomness every time
For high-frequency generators, document the state management and re-seeding specification explicitly
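The effect of a restored state, and of re-seeding after the restore, can be simulated in a single process. This is a minimal sketch under stated assumptions: StatefulGenerator is a hypothetical custom generator, and getstate / setstate stand in for a VM snapshot, an image bake, or a fork that copies memory.

```python
import os
import random
import uuid


class StatefulGenerator:
    """Hypothetical custom generator that keeps PRNG state in memory."""

    def __init__(self) -> None:
        self._rng = random.Random(os.urandom(16))

    def next_id(self) -> uuid.UUID:
        return uuid.UUID(int=self._rng.getrandbits(128), version=4)

    def snapshot(self):
        # Stands in for a VM snapshot, image bake, or fork copying memory.
        return self._rng.getstate()

    def restore(self, state) -> None:
        self._rng.setstate(state)

    def reseed(self) -> None:
        # Discard inherited state and start from fresh OS randomness.
        self._rng.seed(os.urandom(16))


gen = StatefulGenerator()
state = gen.snapshot()
original = gen.next_id()

gen.restore(state)
replayed = gen.next_id()   # same state, so the same ID comes out again

gen.restore(state)
gen.reseed()
fresh = gen.next_id()      # re-seeded after restore, so the sequence diverges

print(original == replayed, original == fresh)

# On POSIX, a fork hook can apply the same re-seed in every child process.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=gen.reseed)
```

The simpler design is to hold no such state at all and let each call read OS randomness, which is what the third countermeasure above amounts to.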

5. Pattern 3: Misreading UUIDv3 / v5 as “a New ID Every Time”

UUIDv3 / v5 are not random IDs that resist collision. They are deterministic IDs that can regenerate the same ID from the same name.

RFC 9562 states that UUIDs generated from the same name in canonical format within the same namespace must be equal.⁷ So with usage like the following, duplicates are not an accident — they are the specified behavior.

Using uuid5(NAMESPACE_URL, "https://example.com/users/42") as “fresh ID assignment” every time
Issuing IDs from a namespace shared across all customers plus an email, without putting the tenant into the namespace
Assuming that re-issuing the same logical name on each retry will yield a different ID

Conversely, if the canonicalization of the name is inconsistent, you get different UUIDs for the same subject. The RFC repeatedly stresses the handling of the canonical representation.⁷,¹⁰

Three things matter for this family:

UUIDv3 / v5 are not “collision-free ID assignment” but “same input, same ID”
Do not leave the namespace design vague
Specify the canonicalization of names explicitly
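These three points can be sketched with uuid.uuid5. Everything named here is an assumption for illustration: the APP_NAMESPACE root, the per-tenant namespace derivation, and the canonical email form (NFC, trimmed, lower-cased), which is a policy choice rather than something the RFC prescribes.

```python
import unicodedata
import uuid

# Hypothetical root namespace for this application (any fixed UUID works).
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "ids.example.com")


def canonical_email(raw: str) -> str:
    """Assumed canonical form: NFC-normalized, trimmed, lower-cased."""
    return unicodedata.normalize("NFC", raw).strip().lower()


def tenant_namespace(tenant_id: str) -> uuid.UUID:
    """Derive one namespace per tenant so equal names in different tenants differ."""
    return uuid.uuid5(APP_NAMESPACE, f"tenant:{tenant_id}")


def user_id(tenant_id: str, email: str) -> uuid.UUID:
    return uuid.uuid5(tenant_namespace(tenant_id), canonical_email(email))


# Same tenant, same canonical name: the same UUID, by design.
print(user_id("acme", "Alice@Example.com") == user_id("acme", " alice@example.com"))

# Same email under another tenant: a different UUID.
print(user_id("acme", "alice@example.com") == user_id("globex", "alice@example.com"))
```

Whatever canonical form you pick, it has to be written down and applied at every call site, because changing it later silently changes every derived ID.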

6. Pattern 4: Hand-Implementing Time-Based UUIDs or UUIDv8

It is dangerous to imitate UUIDv1 / v6 / v7 / v8 in appearance only.

6.1 Handling node or clock sequence carelessly in UUIDv1 / v6

Under RFC 9562, UUIDv6 is a field-reordered UUIDv1 designed to improve DB locality, and it keeps the clock sequence and node fields.¹¹ The RFC also carries multiple cautions about state retention and about node collision resistance in distributed environments.⁹,¹²

Moreover, the RFC goes as far as saying that with the advent of virtual machines and containers, the uniqueness of MAC addresses can no longer be guaranteed.⁵

So designs like

assuming “it’s a MAC address, so it must be unique”
replicating a node ID baked into an image
resetting the clock sequence to a fixed value on every restart

are dangerous.

6.2 Hand-rolling UUIDv7 and ignoring counter rollover or clock rollback

UUIDv7 is quite practical, but the RFC is careful about monotonicity and counter handling under high-frequency generation. It also explicitly states that implementations must not knowingly return duplicates on clock rollback or counter rollover.²,¹³

Which means implementations like

issuing large volumes within the same millisecond with no counter design
continuing to generate without doing anything when the clock goes backwards
multiple processes each initializing the same internal counter independently

are risky.
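To make concrete what a generator has to handle, here is a minimal single-process sketch of a UUIDv7-style generator that uses the 12 bits after the version field as a counter. It is an illustration of the concerns, not a recommended implementation: the class name, the injectable clock, and the choices to hold the last timestamp on rollback and to step the timestamp forward on counter rollover are all assumptions. It does nothing to coordinate multiple processes, which still rely on the 62 random bits.

```python
import os
import threading
import time
import uuid


class MonotonicV7Sketch:
    """Illustrative UUIDv7-style generator; not a replacement for a library."""

    def __init__(self, clock_ms=None) -> None:
        # clock_ms is injectable so that rollback can be simulated in tests.
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ts = -1
        self._counter = 0

    def next_id(self) -> uuid.UUID:
        with self._lock:
            now = self._clock_ms()
            if now > self._last_ts:
                # New millisecond: adopt it and re-seed the 12-bit counter,
                # leaving the top bit clear as rollover headroom.
                self._last_ts = now
                self._counter = int.from_bytes(os.urandom(2), "big") & 0x7FF
            else:
                # Same millisecond or clock rollback: keep the last timestamp
                # and advance the counter instead of going backwards.
                self._counter += 1
                if self._counter > 0xFFF:
                    # Counter rollover: move the timestamp ahead by one tick.
                    self._last_ts += 1
                    self._counter = 0
            ts, counter = self._last_ts, self._counter

        rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)
        value = (
            (ts & ((1 << 48) - 1)) << 80   # unix_ts_ms
            | 0x7 << 76                    # version 7
            | counter << 64                # rand_a used as a counter
            | 0b10 << 62                   # RFC variant
            | rand_b                       # remaining 62 random bits
        )
        return uuid.UUID(int=value)


gen = MonotonicV7Sketch()
first, second = gen.next_id(), gen.next_id()
print(first < second, first.version)
```

Even this short version needs a lock, a rollback rule, and a rollover rule, which is a fair indication of why an existing library is the safer default.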

6.3 Treating UUIDv8 lightly, as if it were just “the new UUID spec”

UUIDv8 looks convenient, but RFC 9562 is quite clear: the uniqueness of UUIDv8 is implementation-specific and must not be assumed.³

So consider a “company-proprietary UUID” that

embeds a timestamp
embeds a shard ID
embeds some business meaning
and fills the rest with whatever randomness

For such a format, the design document is itself your UUID uniqueness specification. It is far too dangerous to introduce without review.

7. Pattern 5: Shortening the UUID Along the Way

Even if generation is done correctly, things can be broken at the storage or comparison stage.

Typical examples:

Using only the first 8 characters as a stand-in for a foreign key
Squashing a 128-bit UUID into a 64-bit integer
A string column too short, so the tail gets cut off
Treating the shortened representation used in logs or on screen as the unique key

What matters here is that changing the representation is not inherently bad.

Removing hyphens
Normalizing to lower / upper case
Storing as 16 binary bytes

Transformations like these, which do not drop any of the 128 bits, are fine. What is dangerous is a transformation that shaves away the very material of uniqueness.
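The difference between the two kinds of transformation can be checked directly. In this sketch the two UUID literals are made-up values chosen so that they differ only in the last digit.

```python
import uuid

a = uuid.UUID("12345678-0000-4000-8000-000000000001")
b = uuid.UUID("12345678-0000-4000-8000-000000000002")

# Lossless changes of representation: all 128 bits survive the round trip.
assert uuid.UUID(a.hex) == a             # hyphens removed
assert uuid.UUID(str(a).upper()) == a    # case changed
assert uuid.UUID(bytes=a.bytes) == a     # 16 raw bytes

# Lossy changes: distinct UUIDs become indistinguishable.
print(str(a)[:8] == str(b)[:8])          # first 8 characters as a key
print(a.int >> 64 == b.int >> 64)        # upper 64 bits as an integer
```

A round-trip assertion like the first three is cheap to add to a storage layer's tests and catches a too-short column or a lossy cast early.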

In particular, a design where a separate “human-friendly short ID” was created and then quietly started taking precedence over the real UUID is prone to incidents.

8. Pattern 6: No Uniqueness Constraint on the DB Side

And this one is especially important.

Even if UUIDs are sufficiently collision-resistant, if you truly cannot tolerate duplicates, the storage layer should also carry a uniqueness constraint.

The official PostgreSQL documentation explains that a unique constraint guarantees that the value of a column or group of columns is unique across the whole table, and that a primary key is a row identifier that is unique and not null.⁶

RFC 9562 also says that while UUIDs can provide sufficient uniqueness in practice, true global uniqueness can never be absolutely guaranteed, and that for uses where the impact of a collision is high, stronger countermeasures should be taken.¹⁴

In practice, this combination is the baseline.

Use UUIDs as IDs that are unlikely to collide
Keep UNIQUE / PRIMARY KEY in the DB as the last line of defense
Design retry / idempotency / incident logging for the duplicate case

Using UUIDs does not mean you can omit the uniqueness constraint.
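The combination of constraint, retry, and logging can be sketched as follows. To stay self-contained the example uses Python's built-in sqlite3 module as a stand-in for PostgreSQL; the items table, the insert_with_retry helper, and the retry limit are assumptions made for illustration.

```python
import sqlite3
import uuid


def insert_with_retry(conn, payload, new_id=uuid.uuid4, max_attempts=3):
    """Insert a row under a fresh ID, logging and retrying on a duplicate key."""
    for attempt in range(1, max_attempts + 1):
        candidate = new_id()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO items (id, payload) VALUES (?, ?)",
                    (candidate.bytes, payload),
                )
            return candidate
        except sqlite3.IntegrityError as exc:
            # Do not swallow this: record enough context to trace the generator.
            print(f"duplicate id {candidate} on attempt {attempt}: {exc}")
    raise RuntimeError("could not allocate a unique id")


conn = sqlite3.connect(":memory:")
# The PRIMARY KEY is the last line of defense, whatever the generator does.
conn.execute("CREATE TABLE items (id BLOB PRIMARY KEY, payload TEXT NOT NULL)")

print(insert_with_retry(conn, "first row"))
```

In a real service the print call would be structured logging that records the node, process, and deployment, so that a duplicate points back at the generator that produced it.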

9. A Practical Checklist

Finally, here is a form you can use directly for adoption or audits.

Check whether you are generating UUIDs yourself: if you can move to standard APIs like uuid4() / uuid7(), do that first.
Decide the UUID version as part of the specification: state explicitly that v4 is random-based, v7 is time-ordered with random or counter bits, v3/v5 are deterministic, and v8 is a custom specification.
Inventory how seeds and generator state are handled: make sure the same state is not carried over after fork, worker restart, snapshot, or clone.
Confirm full length is preserved at storage time: do not use prefix comparison or shortened displays as the actual key.
Place UNIQUE / PRIMARY KEY in the DB: a UUID is a mechanism that lowers the probability; it is not a constraint itself.
Make duplicates observable: do not swallow duplicate key errors; make it traceable which generator / node / deployment produced them.

10. Summary

UUID collision incidents usually start not because the UUID is weak, but because the implementation or operations break the assumptions the UUID relies on.

Hand-rolling with weak randomness
Rewinding state after fork or snapshot
Using name-based UUIDs for fresh ID assignment
Casually hand-implementing v7 or v8
Truncating along the way and discarding uniqueness
Omitting the uniqueness constraint on the DB side

Do any of these, and it is almost the same as actively constructing a situation where collisions become likely.

When you find a duplicate, what you should suspect first is not the math of UUIDs but the generator, state management, storage format, and constraint design. Look in that order, and the cause usually narrows down quickly.

11. References

¹ IETF RFC 9562, Section 5.4 UUID Version 4. On the 122-bit random space of UUIDv4.
² IETF RFC 9562, Section 5.7 UUID Version 7. On the design of UUIDv7’s timestamp, random bits, and counters.
³ IETF RFC 9562, Section 5.8 UUID Version 8. On UUIDv8 uniqueness being implementation-specific and not to be assumed.
⁴ Python 3.14 documentation, uuid module. On uuid4()’s cryptographically-secure generation, uuid5()’s deterministic behavior, and the properties of uuid7() / uuid8().
⁵ IETF RFC 9562, Universally Unique IDentifiers (UUIDs). The baseline document for the UUID format, each version, and best practices overall.
⁶ PostgreSQL documentation, Constraints. On guaranteeing uniqueness via UNIQUE constraints and PRIMARY KEY.
⁷ IETF RFC 9562, Section 6.5 Name-Based UUID Generation. On the same namespace + same name yielding the same UUID, and the importance of canonicalization.
⁸ IETF RFC 9562, Section 6.9 Unguessability. On CSPRNG use and re-seeding after fork.
⁹ IETF RFC 9562, Section 6.3 UUID Generator States. On handling stable storage and generator state.
¹⁰ IETF RFC 9562, Section 5.5 UUID Version 5. On the specification of name-based UUIDs built from a namespace plus a canonical name.
¹¹ IETF RFC 9562, Section 5.6 UUID Version 6. On UUIDv6’s node / clock sequence / DB locality.
¹² IETF RFC 9562, Section 6.4 Distributed UUID Generation. On node collision resistance in distributed environments.
¹³ IETF RFC 9562, Section 6.2 Monotonicity and Counters. On cautions around clock rollback, counter rollover, and batch generation.
¹⁴ IETF RFC 9562, Sections 6.7 and 6.8. On the thinking behind collision resistance and global uniqueness.

Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.
