Consistency Is a Per-Operation Decision
The spectrum, PACELC, and why 'strong vs eventual' is the wrong question
“Is our system strongly consistent or eventually consistent?” is the wrong question, asked at the wrong altitude. Consistency is not a property of a database. It is a property of an operation — and a well-designed system runs strong, causal, session, and eventual guarantees side by side, choosing per call. This module is about making that choice deliberately, and about the one law (PACELC, not CAP) that tells you what each choice costs on an ordinary Tuesday.
The spectrum is a dial, not a switch
Between “every read sees the latest write” and “reads see something, eventually” lies a graded spectrum, and each stop buys a guarantee at a price. The mistake most teams make is treating it as a binary they set once, globally, in a config file — when the right mental model is a dial they turn per endpoint. Sweep it:
Session (read-your-writes + monotonic reads)
Guarantee: within one session, you always see your own writes and never go backward in time (monotonic reads). Other sessions may lag.
Cost: cheap. Route a session's reads to a replica that has its writes, or carry a version token. No cross-client coordination.
Gives up: cross-session freshness. Your friend may not see your post for a few seconds, but you always do.
Notice what happens around the middle. Linearizable is what people mean when they say “strongly consistent”: one global, real-time order, at the cost of a coordination round trip and unavailability during a partition. You need it far less than you think — for the username uniqueness check, the lock, the balance gate before a withdrawal. Eventual is the cheap end: perfect for a like counter, lethal for the username your user just set. The two stops that quietly run most user-facing software live in between.
Session guarantees: the 90% that isn't linearizability
The display-name bug doesn’t need a global order. It needs one promise: a session sees its own writes (read-your-writes) and never sees time run backward (monotonic reads). Both are session guarantees, scoped to one client rather than the whole system, which makes them cheap. You don’t coordinate across replicas; you just steer one session’s reads to a replica fresh enough to honor what it has already seen.
That’s the entire trick the reference implementation uses, and it’s why session guarantees are the highest-leverage idea in this module: they deliver the consistency users actually perceive, at a fraction of linearizability’s cost.
CAP is a footnote. PACELC is the bill you pay daily.
Everyone quotes CAP — under a network Partition, choose Availability or Consistency. True, and almost irrelevant to your day, because real partitions are rare. CAP describes a decision you make for a few seconds a year. It says nothing about the cost you pay on every request when the network is perfectly healthy.
PACELC completes it, and it’s the version worth memorizing: if Partition, then A or C; Else, then L or C. The else clause is the one that bills you continuously. Even with a flawless network, stronger consistency costs latency — a read that must confirm it has the newest value has to talk to a quorum or the leader, and that round trip is on your p99 forever. The real architectural question isn’t “what do we do during a partition?” It’s “what latency are we paying for consistency right now, and is this operation worth it?”
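To put a number on the else clause, here is a rough sketch. It assumes a quorum read fans out to all replicas in parallel and completes when the quorum-th fastest reply arrives, so its latency is the quorum-th smallest round-trip time. The round-trip times and function names are invented for illustration; they are not measurements from any real system.

```typescript
// Hypothetical round-trip times (ms) from a coordinator to each of 5 replicas.
// These numbers are made up for illustration only.
const replicaRttMs = [2, 3, 40, 45, 90];

// A read that the nearest replica may answer alone (the "EL" choice).
function localReadLatency(rtts: number[]): number {
  return Math.min(...rtts);
}

// A read that must hear from `quorum` replicas (the "EC" choice).
// Requests go out in parallel, so the read finishes when the
// quorum-th fastest reply arrives.
function quorumReadLatency(rtts: number[], quorum: number): number {
  if (quorum < 1 || quorum > rtts.length) {
    throw new RangeError("quorum must be between 1 and the number of replicas");
  }
  const sorted = [...rtts].sort((a, b) => a - b);
  return sorted[quorum - 1] as number;
}

const majority = Math.floor(replicaRttMs.length / 2) + 1; // 3 of 5
const elLatency = localReadLatency(replicaRttMs);
const ecLatency = quorumReadLatency(replicaRttMs, majority);
console.log({ elLatency, ecLatency });
```

With these invented numbers the local read costs 2 ms and the majority read costs 40 ms, because the third-fastest replica sits far away. Nothing is broken in this scenario; that gap is simply what the consistent read pays on a healthy network.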
The PACELC Decision Grid
Classify any datastore — or any single operation — by what it gives up during a partition, and what it gives up the rest of the time.
PA/EL: Gives up consistency during partitions AND trades it for latency normally. The default for high-scale user data.
PC/EC: Refuses to serve wrong answers: unavailable during partitions, slower normally. The default for money and metadata.
PA/EC: Stays up when the network splits, but pays for consistency (latency) when it's healthy. The pragmatic middle.
PC/EL: Rare and deliberate: hold consistency when it matters most (the partition) but optimize latency when calm.
The screenshot-worthy move: classify not your database but each operation. “Withdraw money” is PC/EC. “Show the feed” is PA/EL. They can live in the same service, on the same store, because the consistency dial is set at the call site.
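One way to picture "the dial is set at the call site" is a small policy table keyed by operation. The sketch below is an assumption-laden illustration: the type names, the `choosePath` function, and the three read paths are hypothetical, and "refuse" simplifies a partition to the side that cannot reach a quorum. Only the two classifications (withdraw is PC/EC, feed is PA/EL) come from the text.

```typescript
interface Pacelc {
  onPartition: "A" | "C"; // availability or consistency when the network splits
  otherwise: "L" | "C";   // latency or consistency when the network is healthy
}

type OperationName = "withdrawMoney" | "showFeed";
type ReadPath = "nearest-replica" | "leader-or-quorum" | "refuse";

const PA_EL: Pacelc = { onPartition: "A", otherwise: "L" };
const PC_EC: Pacelc = { onPartition: "C", otherwise: "C" };

// Same service, same store, different class per operation.
const operationPolicy: Record<OperationName, Pacelc> = {
  withdrawMoney: PC_EC,
  showFeed: PA_EL,
};

// Hypothetical router: picks where an operation is served from.
// "refuse" models the side of a partition that cannot reach a quorum.
function choosePath(op: OperationName, partitioned: boolean): ReadPath {
  const policy = operationPolicy[op];
  if (partitioned) {
    return policy.onPartition === "A" ? "nearest-replica" : "refuse";
  }
  return policy.otherwise === "L" ? "nearest-replica" : "leader-or-quorum";
}

console.log(choosePath("withdrawMoney", false), choosePath("showFeed", true));
```

The point of the table is that adding an operation forces someone to write down its class, instead of inheriting whatever the datastore's default happens to be.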
Building read-your-writes without going linearizable
Here is the core of the session router: a per-session high-water mark (the newest version this session has observed) and a read that refuses any replica too stale to honor it, falling back to the leader. No global coordination — just one session’s freshness floor.
```typescript
read(sessionId: string, key: string): ReadResult {
  const hw = this.highWater.get(sessionId) ?? 0

  // only serve from a replica fresh enough to honor what this session saw
  const replica = this.replicas.find((r) => r.appliedVersion >= hw)
  const [result, servedBy] = replica
    ? [replica.get(key), replica.id]
    : [this.leader.latest(key), "leader"] // nobody's fresh -> the leader is

  const version = result?.version ?? 0
  // a RYW / monotonic-read violation IS exactly: handing back something
  // older than this session already observed.
  const staleViolation = version < hw
  if (result) this.bump(sessionId, version)
  return { value: result?.value, version, servedBy, staleViolation }
}
```
courses/distributed-systems/reference-impl/02-session-guarantees/
A single-leader cluster with lagging replicas and the high-water-mark router. The demo reproduces the opening ticket: naive routing serves the stale city=Austin after the user committed city=Berlin; the session router serves the leader and keeps the promise. npm run demo, and 4 passing tests covering leader fallback, replica reuse, and cross-session isolation.
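If you want to poke at the routing logic without the repository, the following is a self-contained approximation. It is not the reference implementation: the `Leader` and `Replica` classes, `syncFrom`, and `snapshot` are stand-ins invented here, replication lag is modeled by simply not calling `syncFrom`, and the `staleViolation` flag is left out to keep the sketch short.

```typescript
interface Versioned { value: string; version: number }
interface ReadOutcome { value: string | undefined; version: number; servedBy: string }

// Stand-in leader: one global version counter over a key-value map.
class Leader {
  private version = 0;
  private data = new Map<string, Versioned>();
  write(key: string, value: string): number {
    this.version += 1;
    this.data.set(key, { value, version: this.version });
    return this.version;
  }
  latest(key: string): Versioned | undefined { return this.data.get(key); }
  snapshot(): { version: number; data: Map<string, Versioned> } {
    return { version: this.version, data: new Map(this.data) };
  }
}

// Stand-in replica: only changes when syncFrom() is called, which models lag.
class Replica {
  readonly id: string;
  appliedVersion = 0;
  private data = new Map<string, Versioned>();
  constructor(id: string) { this.id = id; }
  syncFrom(leader: Leader): void {
    const snap = leader.snapshot();
    this.data = snap.data;
    this.appliedVersion = snap.version;
  }
  get(key: string): Versioned | undefined { return this.data.get(key); }
}

class SessionRouter {
  private highWater = new Map<string, number>();
  private leader: Leader;
  private replicas: Replica[];
  constructor(leader: Leader, replicas: Replica[]) {
    this.leader = leader;
    this.replicas = replicas;
  }
  private bump(sessionId: string, version: number): void {
    const hw = this.highWater.get(sessionId) ?? 0;
    this.highWater.set(sessionId, Math.max(hw, version));
  }
  write(sessionId: string, key: string, value: string): number {
    const version = this.leader.write(key, value);
    this.bump(sessionId, version);
    return version;
  }
  read(sessionId: string, key: string): ReadOutcome {
    const hw = this.highWater.get(sessionId) ?? 0;
    // Only a replica at or past this session's floor may answer.
    const replica = this.replicas.find((r) => r.appliedVersion >= hw);
    const result = replica ? replica.get(key) : this.leader.latest(key);
    const servedBy = replica ? replica.id : "leader";
    const version = result?.version ?? 0;
    if (result) this.bump(sessionId, version);
    return { value: result?.value, version, servedBy };
  }
}

const leader = new Leader();
const replica = new Replica("replica-1");
const router = new SessionRouter(leader, [replica]);

router.write("alice", "city", "Austin");
replica.syncFrom(leader);                 // replica catches up to Austin
router.write("alice", "city", "Berlin");  // replica now lags by one version

const naive = replica.get("city");        // naive routing reads the lagging replica
const own = router.read("alice", "city"); // alice's floor forces the leader
const other = router.read("bob", "city"); // bob has no floor, the replica serves him
console.log(naive?.value, own, other);
```

In this run the naive read returns Austin, alice's routed read returns Berlin from the leader, and bob still gets Austin from the replica. That last result is the design choice in action: only the session that wrote pays for freshness.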
| Dimension | Linearizable | Causal | Session (RYW) | Eventual |
|---|---|---|---|---|
| Staleness a reader can see | None — newest write, always | Bounded by causal order; concurrent ops may reorder | Your own writes never stale; others can be | Unbounded until convergence |
| Available under partition? | No (C over A) | Yes | Yes | Yes |
| Added latency (healthy network) | High — quorum/leader round trip | Low — track deps, no global vote | Low — route by high-water mark | None |
| Coordination cost | Global consensus per op | Causal metadata (version vectors) | Per-session token / sticky routing | None |
| Where it fits | Locks, uniqueness, balances | Comments, chat, collaborative edits | Most user-facing reads | Counts, caches, analytics |
| Choose when | Correctness depends on a single global truth at the instant of the operation — money, locks, 'is this taken?'. | Order matters between related events but unrelated events can diverge — anything with a reply-after-comment shape. | One user must see their own and a non-regressing view, others can lag. The default for app reads. | Staleness of seconds is imperceptible and availability/latency dominate. |
Default user-facing reads to session guarantees, not eventual — it kills the “my edit vanished” class of bug for almost nothing. Reserve linearizable for the handful of operations whose correctness is global, and pay its latency knowingly. Reaching for linearizability everywhere is the most common way teams burn their latency budget on a guarantee no user asked for.
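The table's "causal metadata (version vectors)" cell is easier to weigh once you see how little it is. Below is a minimal sketch of comparing two version vectors, assuming each vector maps a node or writer id to a counter and that a missing entry means zero. The type names and the `compare` function are illustrative, not taken from the course code.

```typescript
type VersionVector = Record<string, number>;
type Ordering = "before" | "after" | "equal" | "concurrent";

// Compares two version vectors entry by entry.
// "before" means a happened-before b; "concurrent" means neither saw the other.
function compare(a: VersionVector, b: VersionVector): Ordering {
  let aAhead = false;
  let bAhead = false;
  const nodes = Object.keys(a).concat(Object.keys(b)); // duplicates are harmless
  for (const node of nodes) {
    const av = a[node] ?? 0;
    const bv = b[node] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// A reply written after reading a comment carries the comment's history.
const comment: VersionVector = { alice: 1 };
const reply: VersionVector = { alice: 1, bob: 1 };
// An unrelated post from someone who saw neither.
const unrelated: VersionVector = { carol: 1 };

console.log(compare(comment, reply), compare(reply, unrelated));
```

The comment orders before the reply, so a causal store must never show the reply without it. The unrelated post is concurrent with both, which is exactly where replicas are allowed to disagree on order.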
The 24-hour data-consistency incident, 21 October 2018
Consistency is chosen whether or not you choose it — a failover timeout is a consistency decision in disguise. If correctness needs a single writer, you need a mechanism that enforces it during the partition (a real leader lease with fencing, Module 8), not an availability-optimized promotion that hopes two primaries never overlap. Decide the A-vs-C trade explicitly, per system, before an incident decides it for you.
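For a feel of what "fencing" buys, here is a generic sketch of the idea, assuming every promotion hands the new primary a strictly larger epoch number and the storage layer rejects writes stamped with an older one. The `FencedStore` class and the epoch values are hypothetical; this is neither Module 8's mechanism nor a description of what any real system did during the incident.

```typescript
// Hypothetical storage layer that remembers the highest lease epoch it has seen.
class FencedStore {
  private highestEpoch = 0;
  private data = new Map<string, string>();

  // Accepts the write only if the writer's epoch is not older than one already seen.
  write(epoch: number, key: string, value: string): boolean {
    if (epoch < this.highestEpoch) return false; // stale primary: rejected
    this.highestEpoch = epoch;
    this.data.set(key, value);
    return true;
  }

  get(key: string): string | undefined {
    return this.data.get(key);
  }
}

const store = new FencedStore();
const oldPrimaryEpoch = 1;
const newPrimaryEpoch = 2; // issued when failover promoted a new primary

const beforeFailover = store.write(oldPrimaryEpoch, "order-1", "pending");
const afterFailover = store.write(newPrimaryEpoch, "order-1", "paid");
// The old primary wakes up still believing it holds the lease.
const lateWrite = store.write(oldPrimaryEpoch, "order-1", "cancelled");

console.log(beforeFailover, afterFailover, lateWrite, store.get("order-1"));
```

The old primary's late write is refused and the value stays "paid". The single-writer rule is enforced by the store itself, so it does not depend on both primaries agreeing about who is in charge.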
What this sounds like in an interview
A user updates their profile and immediately refreshes. Sometimes they see the old value. How do you fix it?
The interviewer wants to see whether you reach for the biggest hammer (make it strongly consistent) or the right-sized one.
I'd make sure the read goes to the primary database instead of a replica, so it always has the latest data.
The replica is lagging. I'd either read from the primary for this endpoint, or add a short cache of the user's own recent writes so their reads reflect them.
This is a read-your-writes problem, and it doesn't need global strong consistency — just a session guarantee. I'd give each session a high-water mark (the version of its last write) and route its reads only to a replica caught up to that version, falling back to the primary if none is. Other users can still read slightly-stale replicas, so I keep the read-scaling benefit and only pay for freshness where the same user is involved.
Same session-guarantee mechanism, but I'd frame it as a per-operation consistency decision and be explicit about the cost model. Read-your-writes via a version token is cheap and is the right default for user-facing reads; I'd push back on anyone proposing to make the whole service linearizable, because that puts a coordination round trip on every read's p99 to fix a problem that's scoped to one session. I'd also name the failure mode: if I store the high-water mark client-side, I have to make sure it can't be forged to pin reads to the leader and defeat replica scaling, and if I store it server-side I've added session affinity that complicates load balancing. The trade I'm making is a small amount of routing complexity for a large latency saving versus global consistency.
Named the precise guarantee needed (read-your-writes), refused the oversized fix (global linearizability) with a cost argument, and surfaced the second-order operational trade-offs of the token mechanism. That's someone who has shipped this and felt where it bites.
Don't make the whole system linearizable to fix one stale read
Linearizability puts a coordination round trip on the critical path of every operation and makes you unavailable during partitions. Using it to fix a read-your-writes bug is like rebuilding the highway because one driver missed an exit. Scope the guarantee to the session that needs it.
Don't use session guarantees where you need global truth
Read-your-writes is per-session. It does nothing for invariants that span users: two people racing for the same username, a seat-booking system, a balance check. Those need real linearizable operations. Session guarantees are a scalpel, not a substitute for consensus.
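The username race can be sketched in a few lines. This assumes a single authority whose check-and-set happens as one step, standing in for a linearizable conditional write; the `UsernameRegistry` class and the stale replica view are hypothetical, and a real system would back `claim` with consensus or a conditional write on the leader.

```typescript
// Single authority standing in for a linearizable register per username.
class UsernameRegistry {
  private owners = new Map<string, string>();

  // Check and set happen as one step, so exactly one claimant can win.
  claim(username: string, userId: string): boolean {
    if (this.owners.has(username)) return false;
    this.owners.set(username, userId);
    return true;
  }

  ownerOf(username: string): string | undefined {
    return this.owners.get(username);
  }
}

const registry = new UsernameRegistry();
// A lagging replica's view of taken names: it has not seen any claim yet.
const staleReplicaView = new Set<string>();

// Both users ask the replica "is it taken?" and both are told it is free.
const aliceSeesFree = !staleReplicaView.has("neo");
const bobSeesFree = !staleReplicaView.has("neo");

// Only the atomic claim at the authority decides who actually gets the name.
const aliceWon = registry.claim("neo", "alice");
const bobWon = registry.claim("neo", "bob");

console.log({ aliceSeesFree, bobSeesFree, aliceWon, bobWon });
```

Each session's view is perfectly consistent with its own history, and both were told the name was free. No per-session promise can arbitrate between them; only the one shared decision point can.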
Don't pin the high-water mark client-side without thinking
A version token the client controls can be forged or replayed to force every read to the leader, quietly destroying your read scaling — or set artificially low to read stale data it shouldn’t. Treat it as untrusted input: sign it, bound it, or keep it server-side behind session affinity.
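"Sign it, bound it" might look like the sketch below. It assumes a Node.js runtime (it uses `createHmac` and `timingSafeEqual` from `node:crypto`), a server-held secret, and a token format of `<version>.<hex signature>`. The function names and the format are made up for illustration, and falling back to "no floor" on a bad token is one policy choice among several; rejecting the request is another.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: a server-side secret. In practice it would come from configuration.
const SECRET = "demo-secret-do-not-use";

function sign(sessionId: string, version: number): string {
  return createHmac("sha256", SECRET).update(`${sessionId}:${version}`).digest("hex");
}

// Token handed to the client after a write: "<version>.<signature>".
function issueToken(sessionId: string, version: number): string {
  return `${version}.${sign(sessionId, version)}`;
}

// Returns the freshness floor to use for this request.
// A token that fails verification yields 0 (no floor); a valid one is
// still capped at the leader's current version.
function floorFromToken(sessionId: string, token: string, leaderVersion: number): number {
  const [versionPart, signature] = token.split(".");
  const version = Number(versionPart);
  if (!Number.isInteger(version) || version < 0 || !signature) return 0;
  const expected = Buffer.from(sign(sessionId, version), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on unequal lengths, so check length first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return 0;
  return Math.min(version, leaderVersion);
}

const token = issueToken("alice", 7);
const forged = "999." + "0".repeat(64);
console.log(floorFromToken("alice", token, 10), floorFromToken("alice", forged, 10));
```

Signing stops a client from inventing a huge version to pin every read to the leader, and binding the signature to the session id stops tokens from being swapped between users. It does not stop a client from replaying an older valid token to lower its own floor; closing that hole needs an expiry or server-side state.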
Don't choose eventual consistency for anything a user authored
Counts, recommendations, and other people’s data tolerate eventual consistency beautifully. A user’s own post, comment, or setting does not: “I did that and it disappeared” is the most corrosive bug a product can ship. Author-visible data is session-consistent at minimum.
Exercises
Design the consistency strategy for a Twitter-like service with four operations: (1) post a tweet, (2) read your own profile timeline, (3) read a global trending-topics list, (4) follow another user, which must be reflected the next time you load your home feed. For each, name the consistency level you’d choose and the mechanism, and classify it on the PACELC grid.
In 02-session-guarantees, add a writes-follow-reads guarantee: a write must be applied on top of a state that already reflects everything the session has read (so a reply can’t be ordered before the comment the user just read). Extend SessionRouter.write to require the target’s applied version ≥ the session high-water mark, and add a test where a write routed to a lagging replica would violate it.
The write below bumps the session’s high-water mark so the user can immediately read their own write back. It introduces a way for a session to believe it observed a version that never actually committed. Find the window.
```typescript
write(sessionId: string, key: string, value: string): number {
  // optimistically record that we'll have observed this write
  const nextVersion = this.leader.peekNextVersion()
  this.bump(sessionId, nextVersion) // <- bump BEFORE the write commits
  const version = this.leader.write(key, value)
  return version
}
```
- 01 Consistency is a property of an operation, not a database. The right design runs linearizable, causal, session, and eventual guarantees side by side, chosen per call site.
- 02 PACELC > CAP. Partitions are rare; the else-latency clause is the bill you pay on every healthy request. Ask “what latency am I paying for consistency here?” not just “what happens in a partition?”
- 03 Session guarantees (read-your-writes + monotonic reads) deliver the consistency users actually perceive, scoped to one client, for almost nothing: a high-water mark and freshness-aware routing. Make them the default for user-facing reads.
- 04 Reserve linearizability for operations whose correctness is genuinely global (locks, uniqueness, balances), and pay its coordination latency on purpose.
- 05 A failover timeout is a consistency decision in disguise (GitHub 2018). Decide A-vs-C explicitly, per system, before an incident decides it for you.