Vol. I  ·  Pattern §2 ← back to cover MMXXVI
Pattern · Reference

Token & ID Generation

Distributed systems need unique IDs without coordination overhead — your choice of approach shapes correctness, ordering, and collision risk.

i. The Problem

Distributed systems need unique IDs generated across multiple nodes with no coordination overhead. Requirements typically include:

ii. Approach 1 — Database Auto-Increment

INSERT INTO urls (...) → returns BIGINT auto_increment id

iii. Approach 2 — UUID v4 (Random)

550e8400-e29b-41d4-a716-446655440000

iv. Approach 3 — UUID v7 (Time-ordered)

018e-xxxx-xxxx-xxxx-xxxxxxxxxxxx  (48-bit timestamp prefix)

v. Approach 4 — Snowflake ID (Twitter)

[41-bit timestamp ms] [10-bit machine ID] [12-bit sequence]
= 64-bit integer

Variants: Sonyflake: 39-bit timestamp (10ms resolution) + 16-bit sequence + 8-bit machine. Instagram: Postgres sequence per shard + timestamp + shard ID.

vi. Approach 5 — MD5 / SHA Hash of Content

short_code = base62(SHA256(long_url))[:7]

Collision handling: On write: INSERT ... ON CONFLICT or check-then-insert. On collision: append salt, re-hash, retry.

vii. Approach 6 — Counter + Base62 Encoding

counter = 1000000
base62(1000000) = "4c92"  →  7-char short code

Base62 alphabet: a-z A-Z 0-9 (62 chars)

CharsMax valuesCoverage at 100M/day
656 billion~1.5 years
73.5 trillion~95 years
8218 trillion~5,800 years

Counter Sources

MethodHowTrade-off
Single DB sequenceSELECT nextval('url_seq')Simple; DB bottleneck
Redis INCRAtomic incrementFast; single Redis = SPOF without clustering
Zookeeper rangesEach server claims range (e.g., 1M–2M)No coordination per ID; Zookeeper only needed for range claims

viii. Approach 7 — Pre-generated Token Pool

Background job generates N random tokens
Stores in "available" table
Write service pops one atomically

ix. Base62 Implementation Reference

ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BASE = 62

def encode(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    result = []
    while num:
        result.append(ALPHABET[num % BASE])
        num //= BASE
    return ''.join(reversed(result))

def decode(s: str) -> int:
    result = 0
    for char in s:
        result = result * BASE + ALPHABET.index(char)
    return result

x. Decision Framework

RequirementBest approach
Human-readable + shortBase62 counter
Time-sortable + compactSnowflake / UUID v7
No coordination at allUUID v4
Deterministic from contentHash (SHA/MD5) + truncate
Pre-validated randomnessToken pool
Simplest possibleDB auto-increment

xi. Key Points