UUID Collision Probability — How Likely Are Duplicate UUIDs?
UUID v4 collisions are theoretically possible — they’re random numbers, not guaranteed unique by construction. But the probability is so vanishingly small that collisions are effectively impossible in practice. Here’s the actual math and what it means for your application.
Use the UUID Generator to generate UUID v4 identifiers.
The math
UUID v4 has 122 bits of randomness (6 bits reserved for version and variant). The total number of possible UUIDs:
2^122 = 5,316,911,983,139,663,491,615,228,241,121,378,304
≈ 5.3 × 10^36 (5.3 undecillion)
That’s 5.3 trillion trillion trillion distinct UUIDs.
Birthday problem: collision probability
The collision probability calculation uses the birthday problem formula. For n randomly generated UUIDs, the probability of at least one collision:
P(collision) ≈ 1 - e^(-n²/(2N))
Where N = 2^122 (total possible UUIDs)
The threshold n at which there’s a 50% chance of collision:
n₅₀ ≈ 2.71 × 10^18 (2.71 quintillion UUIDs)
To reach 50% collision probability, you need to generate 2.71 quintillion UUIDs (2,710,000,000,000,000,000).
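The threshold follows from inverting the birthday approximation: solving P ≈ 1 - e^(-n²/(2N)) for n gives n ≈ sqrt(2N · ln(1/(1 - P))). A minimal Python sketch of that inversion is below; the function name is illustrative, not part of any library.

```python
import math

def uuids_for_collision_probability(p, bits=122):
    """Approximate number of random IDs at which the collision probability reaches p.

    Inverts P ≈ 1 - exp(-n² / (2N)) to n ≈ sqrt(2N · ln(1 / (1 - p))).
    """
    N = 2 ** bits  # total possible UUIDs
    # -log1p(-p) equals ln(1 / (1 - p)) and stays accurate for very small p
    return math.sqrt(2 * N * -math.log1p(-p))

for p in (0.5, 0.01, 1e-9):
    print(f"P = {p}: n ≈ {uuids_for_collision_probability(p):.2e}")
```

For P = 0.5 this reproduces the 2.71 × 10^18 figure; smaller target probabilities give the budget of UUIDs you can generate while staying under that risk.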
Real-world collision scenarios
| Scale | UUIDs generated | Collision probability |
|---|---|---|
| Small app (100K users) | 100,000 | 1 in 10^27 (essentially zero) |
| Medium app (10M users) | 10,000,000 | 1 in 10^23 |
| Large platform (1B users, 100 IDs each) | 10^11 | 1 in 10^15 |
| All websites (10 billion, 1M records each) | 10^16 | 1 in 10^5 |
| Generating 1B UUIDs per second for 100 years | ~3 × 10^18 | ~60% |
To put these numbers in perspective: your chance of winning a major lottery jackpot is roughly 1 in 10^8. A collision on a large platform (1 in 10^15) is about ten million times less likely than winning the lottery. The risk only becomes meaningful at hypothetical scales: 10^16 UUIDs gives about 1 in 100,000, and past the 2.71 × 10^18 threshold a collision is more likely than not.
Python: calculating collision probability
import math

def uuid_collision_probability(n, bits=122):
    """
    Calculate probability of at least one collision among n UUIDs.
    Uses the birthday problem approximation.
    """
    N = 2 ** bits  # Total possible UUIDs
    # For large N and small n, approximation: P ≈ n²/(2N)
    # expm1 keeps precision for tiny probabilities, where 1 - exp(-x) rounds to 0.0
    p = -math.expm1(-(n * (n - 1)) / (2 * N))
    return p

# Examples:
scenarios = [
    ("1 million UUIDs", 1_000_000),
    ("1 billion UUIDs", 1_000_000_000),
    ("1 trillion UUIDs", 1_000_000_000_000),
    ("1 quadrillion UUIDs", 1_000_000_000_000_000),
    ("2.71 quintillion UUIDs (50% threshold)", 2_710_000_000_000_000_000),
]
for label, n in scenarios:
    p = uuid_collision_probability(n)
    print(f"{label}: {p:.2e} probability")
Output:
1 million UUIDs: 9.40e-26 probability
1 billion UUIDs: 9.40e-20 probability
1 trillion UUIDs: 9.40e-14 probability
1 quadrillion UUIDs: 9.40e-08 probability
2.71 quintillion UUIDs (50% threshold): 4.99e-01 probability
JavaScript UUID generation quality
The collision probability assumes cryptographically random UUIDs. Most UUID v4 implementations use OS-level cryptographic random number generators:
// crypto.randomUUID() — uses crypto.getRandomValues() internally:
crypto.randomUUID()
// Cryptographically random → full 122 bits of entropy
// Math.random() — do NOT use for UUIDs:
// Math.random() produces ~52 bits of randomness
// Manual UUID construction with Math.random reduces uniqueness significantly
Many home-grown UUID implementations use Math.random(), which provides far less entropy:
// POOR IMPLEMENTATION (commonly seen, but wrong):
function badUUID() {
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, (c) => {
    const r = Math.random() * 16 | 0; // Math.random() is not a CSPRNG: ≈ 52 bits at best, not 122
    return (c === 'x' ? r : (r & 0x3 | 0x8)).toString(16);
  });
}
// CORRECT — use crypto:
crypto.randomUUID() // Browser + Node.js
// or
const { v4: uuidv4 } = require('uuid'); // Uses crypto internally
With Math.random(), the effective entropy drops from 122 bits to roughly 52 bits. For the same number of IDs, that raises the collision probability by a factor of 2^70 ≈ 10^21 (about a sextillion times higher).
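To see what that entropy loss means in practice, the same birthday formula can be evaluated at both bit widths. This is a sketch in Python that takes the 52-bit figure above as an assumption; the real effective entropy of a Math.random()-based generator depends on the engine's PRNG and how it is seeded.

```python
import math

def collision_probability(n, bits):
    """Birthday-problem approximation for n IDs drawn from 2**bits possible values."""
    return -math.expm1(-(n * (n - 1)) / (2 * 2 ** bits))

n = 1_000_000
p_full = collision_probability(n, 122)  # CSPRNG-backed UUID v4
p_weak = collision_probability(n, 52)   # assumed effective entropy of a weak generator

print(f"122 bits: {p_full:.2e}")
print(f" 52 bits: {p_weak:.2e}")
print(f"ratio:    {p_weak / p_full:.2e}")
```

At one million IDs the 52-bit case is already around 1 in 10,000, which is a realistic failure rate rather than a theoretical one.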
Seeded random number generators
If your UUID generator is accidentally seeded with a predictable value (e.g., the current time), separate processes can generate identical sequences:
// Never do this:
const seed = Date.now(); // Predictable: millisecond timestamp
// Feeding this seed to a seedable non-CSPRNG
// Two processes started within the same millisecond would generate identical sequences
// Use system entropy:
crypto.randomUUID() // Seeded from OS /dev/urandom or equivalent
This is the real-world cause of UUID collisions: bad implementations, not inherent randomness limitations.
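The failure mode is easy to reproduce. The Python sketch below builds v4-shaped UUIDs from a seeded, non-cryptographic PRNG; the helper `seeded_uuid4` and the seed value are hypothetical and exist only for demonstration.

```python
import random
import uuid

def seeded_uuid4(rng):
    """Build a v4-shaped UUID from a non-cryptographic PRNG (demonstration only)."""
    # version=4 makes the constructor set the version and variant bits
    return uuid.UUID(int=rng.getrandbits(128), version=4)

seed = 1_700_000_000  # hypothetical shared seed, e.g. a coarse timestamp

# Two "processes" that happened to start with the same seed
process_a = random.Random(seed)
process_b = random.Random(seed)

ids_a = [seeded_uuid4(process_a) for _ in range(3)]
ids_b = [seeded_uuid4(process_b) for _ in range(3)]
print(ids_a == ids_b)  # True: every ID collides

# uuid.uuid4() draws from the OS entropy source instead
safe = [uuid.uuid4() for _ in range(3)]
print(len(set(safe)))
```

Both lists are valid-looking version 4 UUIDs, so nothing in the format reveals the problem; only the source of randomness differs.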
When UUID collision probability actually matters
Cryptographic security: If UUIDs are used as secrets (session tokens, API keys, password reset tokens), collision resistance is secondary to unpredictability. Guessing a specific CSPRNG-generated UUID v4 takes on the order of 2^122 attempts, which is adequate, but purpose-built tokens from crypto.randomBytes(32) (256 bits) are better for secrets.
Extreme scale: At Google/Meta/Amazon scale (billions of daily writes), UUID v4 remains safe. Twitter processes ~500B events/day. At that rate across 10 years (about 1.8 × 10^15 IDs), the collision probability is about 3 × 10^-7, or roughly 1 in 3 million.
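For comparison, here is a rough Python counterpart to the crypto.randomBytes(32) suggestion, using the standard library's `secrets` module. It is a sketch of generating a 256-bit token, not a complete session or API-key scheme.

```python
import secrets

# 32 random bytes = 256 bits from the OS CSPRNG
raw_token = secrets.token_bytes(32)

# URL-safe text form of 32 random bytes, convenient for links and headers
url_token = secrets.token_urlsafe(32)

print(len(raw_token), len(url_token))
```

When checking a submitted token against a stored one, `secrets.compare_digest` avoids leaking information through comparison timing.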
Testing: Use UUIDs freely in tests. Generate millions. Collision probability is negligible.
Short IDs for human use: If you need shorter IDs visible to users (order numbers, short codes), use sequential IDs or a shorter format like NanoID with appropriate length for your scale.
UUID v1 collision properties
UUID v1 embeds a timestamp and MAC address. Collision risk comes from:
- Same timestamp tick (100 ns resolution) on same machine: the generator tracks its last timestamp, and the clock sequence covers clock rollbacks
- Same microsecond on different machines with same MAC: Rare, but can happen in VMs
- Virtualized environments: Multiple VMs may report the same MAC address
For v1, the real risk is MAC address spoofing or VMs sharing MAC addresses — not random collisions.
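The v1 fields can be inspected with Python's `uuid` module. In this sketch the node value and clock sequence are hypothetical, chosen to mimic two cloned VMs that report the same MAC address.

```python
import uuid

shared_node = 0x001122334455  # hypothetical MAC address shared by two cloned VMs
clock_seq = 0x1234            # hypothetical clock sequence, also identical on both

vm_a = uuid.uuid1(node=shared_node, clock_seq=clock_seq)
vm_b = uuid.uuid1(node=shared_node, clock_seq=clock_seq)

for u in (vm_a, vm_b):
    # node and clock_seq match, so only the 100 ns timestamp separates the two IDs
    print(u, hex(u.node), u.clock_seq, u.time)
```

Within a single process the timestamps differ, so the two values are distinct. On separate machines with the same node and clock sequence, nothing coordinates the timestamps, which is where v1 collisions come from.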
Related tools
- UUID Generator — generate UUID v4 values
- UUID v4 Generator — UUID v4 format and usage
- UUID Format — understanding UUID structure
Written by Mian Ali Khalid. Part of the Dev Productivity pillar.