Split business identity from surrogate; spend clustering where it pays.
TL;DR
- Don’t cluster on row IDs.
- Keep a surrogate id for FKs (UUIDv7/
NEWSEQUENTIALID()
/INT). - Make the business key the PK or
UNIQUE
. - Choose clustering/indexes per engine (SQL Server/InnoDB vs Postgres/SQLite).
- Store IDs in 16 bytes (
uuid
/BINARY(16)
), notCHAR(36)
. - Worst practice: random UUIDv4 as your clustered index (or as PK where PK is clustered). Slower than
INT IDENTITY
.
The Most Expensive Default in SQL
CREATE TABLE users (
id INT IDENTITY(1,1) PRIMARY KEY, -- ❌ Expensive mistake
username VARCHAR(50) UNIQUE, -- ✅ Actual business identifier
email VARCHAR(255) UNIQUE,
created_date DATETIME
);
You’re conflating system identity with business identity, and paying for it in reads, writes, and indexes.
Truth: primary key, clustered index, and surrogate key are different. Treat them separately.
Relational Grounding
- Codd: the primary key uniquely identifies each tuple.
- Date: the key captures why the tuple exists.
- Fagin: model a single concept; the key is the minimal identifier.
An auto-increment integer is implementation convenience, not business meaning.
The Three Things People Mix Up
- Primary key (business identity): what makes the row unique (
username
,email
,(order_id, line_item)
). - Clustering (physical leaf order): only on engines that maintain it (SQL Server, InnoDB). Postgres/SQLite don’t.
- Surrogate key (stable system id): used for FKs/logs/cross-system refs (INT,
UUIDv7
,NEWSEQUENTIALID()
).
UUID Myths vs Facts (60 seconds)
- “UUIDs kill inserts.” Only when you use random UUIDv4 as a clustered structure on clustered engines. Use UUIDv7 (time-ordered) or cluster on your hot range column.
- “UUIDs are huge.” Binary UUID is 16 bytes. Use native/binary (
uuid
,BINARY(16)
), notCHAR(36)
. - “Postgres clusters by PK.” No. It’s heap; use B-tree/BRIN.
- “Pick natural or UUIDs.” Use both: natural/business key for identity, UUID as the stable surrogate FK target.
Your Primary Key ≠ Your Clustered Index
They’re different. Treat them separately.
Primary Key (PK): logical/business identity; enforces uniqueness.
Clustered Index (CI): physical order of the main storage (when the engine maintains one). It drives range scans and lookup costs.
Engine realities:
SQL Server: PK is clustered by default unless you say otherwise. Make it NONCLUSTERED
when the business key isn’t your best clustering key.
MySQL/MariaDB (InnoDB): the PK is the clustered index. Your PK choice is your clustering choice.
PostgreSQL: heap tables; PK is a unique B-tree, not maintained clustering. CLUSTER
is one-time and drifts.
SQLite: INTEGER PRIMARY KEY
equals rowid
; non-integer PKs are separate unique indexes.
Takeaways: pick the PK for business identity; pick the CI (where supported) for your dominant range scans. Worst practice is random UUIDv4 as the clustered index. Prefer UUIDv7/NEWSEQUENTIALID()
or cluster on a range column.
The Worst of Both Worlds: Random UUIDv4 as Your Clustered Index
A sure way to make things slower than INT IDENTITY
is to make your clustered structure random. UUIDv4 does exactly that.
SQL Server: NEWID()
as clustered PK causes random-leaf inserts, page splits, fragmentation, and write amplification. Even if the PK is nonclustered, putting the clustered index on a v4 column has the same problem. Fix: PRIMARY KEY NONCLUSTERED
on the business key and cluster on your hot range (often time), or use NEWSEQUENTIALID()
/app-side UUIDv7.
InnoDB (MySQL/MariaDB): the PK is clustered. UUIDv4 as PK means random clustering and larger secondary indexes (every secondary stores the PK). Fix: if you need time locality, make the PK (created_utc, id)
and keep id
UNIQUE
for FKs, or use UUIDv7 as the tiebreaker.
PostgreSQL: no maintained clustering, but B-tree inserts on a v4-leading index are still random, causing page splits and poor cache locality. Fix: prefer UUIDv7 for locality; align indexes to access paths (BRIN for large time ranges).
SQLite: rowid
integers are monotonic and fast. GUID PKs force separate indexes with random insert patterns. Fix: keep rowid
for storage; use GUIDs as UNIQUE
if needed.
Store IDs in 16 bytes (uuid
/BINARY(16)
), not CHAR(36)
. Fill factor only delays splits; it doesn’t cure randomness.
Engine-Scoped Patterns
SQL Server (clustered storage)
CREATE TABLE dbo.Users(
Id UNIQUEIDENTIFIER NOT NULL
CONSTRAINT DF_Users_Id DEFAULT NEWSEQUENTIALID(), -- surrogate FK target
Username NVARCHAR(50) NOT NULL,
Email NVARCHAR(255) NOT NULL,
CreatedUtc DATETIME2 NOT NULL
CONSTRAINT DF_Users_CreatedUtc DEFAULT (SYSUTCDATETIME()),
CONSTRAINT PK_Users PRIMARY KEY NONCLUSTERED (Username), -- business identity
CONSTRAINT UQ_Users_Id UNIQUE (Id),
CONSTRAINT UQ_Users_Email UNIQUE (Email)
);
CREATE CLUSTERED INDEX CX_Users_CreatedUtc ON dbo.Users(CreatedUtc);
MySQL / MariaDB (InnoDB clustered by PK)
CREATE TABLE users (
id BINARY(16) NOT NULL, -- UUIDv7 (binary)
username VARCHAR(50) NOT NULL,
email VARCHAR(255) NOT NULL,
created_utc DATETIME(6) NOT NULL,
PRIMARY KEY (created_utc, id), -- clustered on time + tiebreaker
UNIQUE KEY uq_users_id (id), -- stable FK target
UNIQUE KEY uq_users_username (username),
UNIQUE KEY uq_users_email (email)
);
PostgreSQL (heap; no maintained clustering)
CREATE TABLE users(
id uuid NOT NULL, -- surrogate FK target
username text NOT NULL,
email text NOT NULL,
created_utc timestamptz NOT NULL,
CONSTRAINT uq_users_id UNIQUE (id),
CONSTRAINT uq_users_username UNIQUE (username),
CONSTRAINT uq_users_email UNIQUE (email)
-- or: CONSTRAINT pk_users PRIMARY KEY (username)
);
CREATE INDEX ix_users_username ON users (username);
CREATE INDEX ix_users_created_utc ON users (created_utc);
-- For massive append-only tables:
CREATE INDEX ix_users_created_utc_brin ON users USING brin (created_utc);
-- Optional one-time reheap (not maintained):
-- CLUSTER users USING ix_users_created_utc;
SQLite
CREATE TABLE products (
id TEXT NOT NULL, -- surrogate (UUIDv7 as text/binary)
sku TEXT NOT NULL,
name TEXT,
created_utc TEXT NOT NULL,
PRIMARY KEY (sku)
) WITHOUT ROWID;
Query Path: What Actually Improves
-- Orders for a given username in last month
SELECT o.*
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE u.username = 'john_doe'
AND o.order_date >= '2024-07-01';
Why it’s faster: user lookup aligns to an index (or clustering where supported). Time predicates align to clustered/index order (SQL Server/InnoDB) or BRIN/B-tree (Postgres). Fewer nonaligned lookups → fewer page reads → lower p95/p99.
pengdow.crud: Stop Conflating PK, Clustered Index, and Row ID
Most DB layers/ORMs encourage the confusion: they auto-treat Id
as the primary key, couple it to clustering (when present), and ignore the actual reason the row exists.
pengdow.crud separates concerns and lets you do CRUD by either key.
- Primary key = business identity (
[PrimaryKey]
) - Row id = surrogate/system identifier (
[Id]
) - Clustering is an engine decision, not a model hack
- CRUD by business primary key or by surrogate row id
public class User {
[Id, Column("id")]
public Guid Id { get; set; } // surrogate (UUIDv7/NEWSEQUENTIALID())
[PrimaryKey, Column("username")]
public string Username { get; set; } // business identity
[Column("email")]
public string Email { get; set; }
[Column("created_utc")]
public DateTime CreatedUtc { get; set; }
}
Example operations:
// By business primary key
var u1 = await helper.ReadAsync<User>(pk: "john_doe");
await helper.DeleteAsync<User>(pk: "john_doe");
// By surrogate row id
var u2 = await helper.ReadByIdAsync<User>(id: userId);
await helper.DeleteByIdAsync<User>(id: userId);
// Upsert by business key
await helper.UpsertAsync(new User { Username = "john_doe", Email = "j@x.com" });
Deterministic + Time-Ordered IDs
Use UUIDv7 where clustering/locality matters (timestamp-ordered). Use deterministic, hash-derived UUIDs (e.g., BLAKE3→GUID) when natural inputs are stable to make merges and FK hygiene trivial.
Security note: deterministic IDs can leak information if inputs are guessable. Use keyed hashing (keyed-BLAKE3/HMAC), tenant namespaces, or reserve for internal systems.
Further reading (your piece): Onions in Milkshakes — GUIDs and Deterministic IDs with BLAKE3
https://www.pengdows.com/2025/03/28/onions-in-milkshakes-a-case-for-guids-and-deterministic-ids-with-blake3/
When to Break the Rule
- Unstable business keys (e.g., social handles): surrogate PK +
UNIQUE
on the business identifier. - Logs, queues, time-series: monotonic clustering (
IDENTITY
,(timestamp, id)
, or UUIDv7). - ORM constraints: keep a surrogate PK but enforce business uniqueness separately.
Unicode/Collation Gotchas (Natural Keys)
Define case rules, accent/width normalization, canonicalization, and ensure DB collation matches app rules. Normalize consistently in both layers.
Time Sync + UTC (Ops that matter)
As long as I’m ranting: one of the most overlooked failures is bad time. Your DB server needs accurate, monotonic time. Reasons: replication and PITR, time-ordered IDs (UUIDv7/COMB/NEWSEQUENTIALID()
), accounting/ledgers, TTL/leases/locks, auth token validity, and incident timelines.
Do this (Linux servers):
sudo apt-get update && sudo apt-get install -y chrony
sudo systemctl enable --now chrony
chronyc tracking
chronyc sources -v
sudo timedatectl set-timezone UTC
timedatectl status
Prefer chrony (or full NTPd) over lightweight SNTP. Ensure exactly one time client runs on the host/VM. Don’t run NTP clients inside app containers; keep sync at the host.
Store UTC in the DB; convert at the edge.
CREATE TABLE audit_logs (
event_id UNIQUEIDENTIFIER NOT NULL, -- surrogate
event_type VARCHAR(50) NOT NULL,
occurred_utc DATETIME2 NOT NULL, -- always UTC
user_id UNIQUEIDENTIFIER NULL,
CONSTRAINT PK_audit_logs PRIMARY KEY (event_id)
);
App pattern (pengdow.crud):
public class AuditableEntity {
[CreatedOn, Column("created_utc")] public DateTime CreatedUtc { get; set; }
[LastUpdatedOn, Column("updated_utc")] public DateTime UpdatedUtc { get; set; }
}
Bottom Line
Model business identity with a PK/UNIQUE
. Choose clustering per engine. Use a surrogate key for FKs and cross-system references. Tune for inserts. UTC everywhere with monitored time sync. Stop designing around ORM defaults. Design for how you actually query data. Tools—including pengdow.crud—should respect your choices, not dictate them.
References
Codd, E.F. (1970). A Relational Model of Data for Large Shared Data Banks. Communications of the ACM 13(6): 377–387.
Date, C.J. (2003). An Introduction to Database Systems (8th ed.). Addison-Wesley.
Fagin, R. (1977). Multivalued Dependencies and a New Normal Form for Relational Databases. ACM TODS 2(3): 262–278.
RFC 9562 (2024). Universally Unique IDentifiers (UUIDs) — UUIDv7. https://www.rfc-editor.org/rfc/rfc9562
Microsoft Docs — NEWSEQUENTIALID()
. https://learn.microsoft.com/sql/t-sql/functions/newsequentialid-transact-sql
BLAKE3 — keyed hashing (MAC/PRF) mode. https://github.com/BLAKE3-team/BLAKE3
Pengdows (2025). Onions in Milkshakes: A Case for GUIDs and Deterministic IDs with BLAKE3. https://www.pengdows.com/2025/03/28/onions-in-milkshakes-a-case-for-guids-and-deterministic-ids-with-blake3/
SQL Server — Clustered vs nonclustered indexes; constraints. https://learn.microsoft.com/sql/relational-databases/indexes/clustered-and-nonclustered-indexes-described, https://learn.microsoft.com/sql/relational-databases/tables/primary-and-foreign-key-constraints
MySQL InnoDB — clustered by PK; secondary indexes store PK. https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html, https://dev.mysql.com/doc/refman/8.0/en/innodb-index-design.html
PostgreSQL — CLUSTER
, BRIN, uuid
type. https://www.postgresql.org/docs/current/sql-cluster.html, https://www.postgresql.org/docs/current/brin-intro.html, https://www.postgresql.org/docs/current/datatype-uuid.html
SQLite — rowid
, INTEGER PRIMARY KEY
, WITHOUT ROWID
. https://www.sqlite.org/rowidtable.html, https://www.sqlite.org/withoutrowid.html