
“Is our system strongly consistent or eventually consistent?” is the wrong question, asked at the wrong altitude. Consistency is not a property of a database. It is a property of an operation — and a well-designed system runs strong, causal, session, and eventual guarantees side by side, choosing per call. This module is about making that choice deliberately, and about the one law (PACELC, not CAP) that tells you what each choice costs on an ordinary Tuesday.

The spectrum is a dial, not a switch

Between “every read sees the latest write” and “reads see something, eventually” lies a graded spectrum, and each stop buys a guarantee at a price. The mistake most teams make is treating it as a binary they set once, globally, in a config file — when the right mental model is a dial they turn per endpoint. Sweep it:

The Consistency Spectrum Dial

← stronger · weaker →

Guarantee

Within one session, you always see your own writes and never go backward in time (monotonic reads). Other sessions may lag.

What you pay

Cheap: route a session's reads to a replica that has its writes, or carry a version token. No cross-client coordination.

What breaks here

Cross-session freshness. Your friend may not see your post for a few seconds — but you always do.

Reach for it when · Post a comment and see it on refresh; update a profile and see the new value; the 90% case for user-facing apps.

Notice what happens around the middle. Linearizable is what people mean when they say “strongly consistent”: one global, real-time order, at the cost of a coordination round trip and unavailability during a partition. You need it far less than you think — for the username uniqueness check, the lock, the balance gate before a withdrawal. Eventual is the cheap end: perfect for a like counter, lethal for the username your user just set. The two stops that quietly run most user-facing software live in between.

Mental model

Session guarantees: the 90% that isn't linearizability

The display-name bug doesn’t need a global order. It needs one promise: a session sees its own writes (read-your-writes) and never sees time run backward (monotonic reads). Both are session guarantees, scoped to one client rather than the whole system, which makes them cheap. You don’t coordinate across replicas; you just steer one session’s reads to a replica fresh enough to honor what it has already seen.

That’s the entire trick the reference implementation uses, and it’s why session guarantees are the highest-leverage idea in this module: they deliver the consistency users actually perceive, at a fraction of linearizability’s cost.

Use it when: Any read that the same user must see correctly, but other users can see a few seconds late — which is most reads in a typical app.

CAP is a footnote. PACELC is the bill you pay daily.

Everyone quotes CAP — under a network Partition, choose Availability or Consistency. True, and almost irrelevant to your day, because real partitions are rare. CAP describes a decision you make for a few seconds a year. It says nothing about the cost you pay on every request when the network is perfectly healthy.

PACELC completes it, and it’s the version worth memorizing: if Partition, then A or C; Else, then L or C. The else clause is the one that bills you continuously. Even with a flawless network, stronger consistency costs latency — a read that must confirm it has the newest value has to talk to a quorum or the leader, and that round trip is on your p99 forever. The real architectural question isn’t “what do we do during a partition?” It’s “what latency are we paying for consistency right now, and is this operation worth it?”
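To put a rough number on the else clause, here is a small sketch. It assumes three replicas with made-up round-trip times and models a quorum read as waiting for the k-th fastest response to parallel requests. The function names and the RTT values are illustrative assumptions, not measurements from any real system.

```typescript
// Hypothetical round-trip times (ms) from one client to three replicas.
// These numbers are invented for illustration.
const replicaRttMs = [2, 38, 71];

// An eventually consistent read can be answered by the nearest replica alone.
function nearestReadLatencyMs(rtts: number[]): number {
  return Math.min(...rtts);
}

// A quorum read waits for `quorum` replicas to answer. With requests sent in
// parallel, its latency is the quorum-th fastest round trip.
function quorumReadLatencyMs(rtts: number[], quorum: number): number {
  if (quorum < 1 || quorum > rtts.length) {
    throw new RangeError("quorum must be between 1 and the replica count");
  }
  const sorted = [...rtts].sort((a, b) => a - b);
  return sorted[quorum - 1];
}

const majority = Math.floor(replicaRttMs.length / 2) + 1; // 2 of 3
const eventualMs = nearestReadLatencyMs(replicaRttMs);
const strongMs = quorumReadLatencyMs(replicaRttMs, majority);

console.log(`eventual: ${eventualMs} ms, quorum: ${strongMs} ms`);
// prints "eventual: 2 ms, quorum: 38 ms"
```

In this toy setup the network is perfectly healthy and the consistent read still costs 36 ms more per request. That gap is the "L or C" trade.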

Framework · 2×2

The PACELC Decision Grid

Classify any datastore — or any single operation — by what it gives up during a partition, and what it gives up the rest of the time.

PA / EL

Available + fast, never strict

Gives up consistency during partitions AND trades it for latency normally. The default for high-scale user data.

e.g. Dynamo, Cassandra, Riak

PC / EC

Consistent, always, at a price

Refuses to serve wrong answers — unavailable during partitions, slower normally. The default for money and metadata.

e.g. HBase, VoltDB, Spanner (with TrueTime)

PA / EC

Available under partition, strict otherwise

Stays up when the network splits, but pays for consistency (latency) when it's healthy. The pragmatic middle.

e.g. MongoDB (majority reads), many tuned SQL replicas

PC / EL

Strict under partition, fast otherwise

Rare and deliberate: hold consistency when it matters most (the partition) but optimize latency when calm.

e.g. Yahoo PNUTS, some Cosmos DB tiers

The screenshot-worthy move: classify not your database but each operation. “Withdraw money” is PC/EC. “Show the feed” is PA/EL. They can live in the same service, on the same store, because the consistency dial is set at the call site.
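One way to make the per-operation classification concrete is a small policy table keyed by operation. This is a sketch under assumptions: the operation names `withdrawMoney` and `showFeed` are hypothetical identifiers for the two examples above, and the helper functions simply read the two halves of the PACELC label.

```typescript
type Pacelc = "PA/EL" | "PA/EC" | "PC/EL" | "PC/EC";
type OperationName = "withdrawMoney" | "showFeed"; // hypothetical names

// The classification lives next to the operation, not in a global config.
const policy: Record<OperationName, Pacelc> = {
  withdrawMoney: "PC/EC",
  showFeed: "PA/EL",
};

// During a partition: does the operation refuse rather than risk a wrong answer?
function refusesDuringPartition(op: OperationName): boolean {
  return policy[op].slice(0, 2) === "PC";
}

// On a healthy network: does it pay a coordination round trip for consistency?
function paysLatencyWhenHealthy(op: OperationName): boolean {
  return policy[op].slice(3) === "EC";
}

for (const op of Object.keys(policy) as OperationName[]) {
  console.log(op, {
    refusesDuringPartition: refusesDuringPartition(op),
    paysLatencyWhenHealthy: paysLatencyWhenHealthy(op),
  });
}
```

A real service would hang routing decisions off these answers (leader versus nearest replica, fail closed versus serve stale). The table itself is the point: the dial is visible at the call site.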

Building read-your-writes without going linearizable

Here is the core of the session router: a per-session high-water mark (the newest version this session has observed) and a read that refuses any replica too stale to honor it, falling back to the leader. No global coordination — just one session’s freshness floor.

session-router.ts — read-your-writes via a freshness floor

read(sessionId: string, key: string): ReadResult {
  const hw = this.highWater.get(sessionId) ?? 0

  // only serve from a replica fresh enough to honor what this session saw
  const replica = this.replicas.find((r) => r.appliedVersion >= hw)
  const [result, servedBy] = replica
    ? [replica.get(key), replica.id]
    : [this.leader.latest(key), "leader"]   // nobody's fresh -> the leader is

  const version = result?.version ?? 0
  // a RYW / monotonic-read violation IS exactly: handing back something
  // older than this session already observed.
  const staleViolation = version < hw
  if (result) this.bump(sessionId, version)
  return { value: result?.value, version, servedBy, staleViolation }
}

Runnable reference implementation

TypeScript

courses/distributed-systems/reference-impl/02-session-guarantees/

A single-leader cluster with lagging replicas and the high-water-mark router. The demo reproduces the opening ticket: naive routing serves the stale city=Austin after the user committed city=Berlin, while the session router falls back to the leader and keeps the promise. Run it with npm run demo. Four passing tests cover leader fallback, replica reuse, and cross-session isolation.
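For readers without the repository at hand, the Austin/Berlin scenario can be approximated in a few lines. This is a stand-alone sketch, not the course's reference implementation: the `Store` class, the free functions, and the session ids `alice` and `bob` are all assumptions made for illustration, with one leader and one replica that lags by a version.

```typescript
interface Versioned { value: string; version: number }

// A store that applies writes in version order; `appliedVersion` is the
// newest version it has applied so far.
class Store {
  appliedVersion = 0;
  private data = new Map<string, Versioned>();
  constructor(readonly id: string) {}
  apply(key: string, value: string, version: number): void {
    this.data.set(key, { value, version });
    this.appliedVersion = version;
  }
  get(key: string): Versioned | undefined {
    return this.data.get(key);
  }
}

const leader = new Store("leader");
const replica = new Store("replica-1");
const highWater = new Map<string, number>(); // sessionId -> newest version seen

function bump(sessionId: string, version: number): void {
  highWater.set(sessionId, Math.max(highWater.get(sessionId) ?? 0, version));
}

function write(sessionId: string, key: string, value: string): number {
  const version = leader.appliedVersion + 1;
  leader.apply(key, value, version);
  bump(sessionId, version);
  return version;
}

// Naive routing: always read the replica, ignoring what the session has seen.
function naiveRead(key: string): string | undefined {
  return replica.get(key)?.value;
}

// Session routing: use the replica only if it has reached the session's floor.
function sessionRead(sessionId: string, key: string) {
  const floor = highWater.get(sessionId) ?? 0;
  const source = replica.appliedVersion >= floor ? replica : leader;
  const result = source.get(key);
  if (result) bump(sessionId, result.version);
  return { servedBy: source.id, value: result?.value };
}

const v1 = write("alice", "city", "Austin");
replica.apply("city", "Austin", v1);  // the first write has replicated
write("alice", "city", "Berlin");     // the second has not reached the replica

const naive = naiveRead("city");
const alice = sessionRead("alice", "city");
const bob = sessionRead("bob", "city");
console.log(naive);  // "Austin" (stale for alice)
console.log(alice);  // { servedBy: "leader", value: "Berlin" }
console.log(bob);    // { servedBy: "replica-1", value: "Austin" }
```

Only Alice's read pays the trip to the leader. Bob, who has no reason to expect Berlin yet, is still served by the replica, which is the cross-session isolation that keeps read scaling intact.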

Dimension	Linearizable	Causal	Session (RYW)	Eventual
Staleness a reader can see	None — newest write, always	Bounded by causal order; concurrent ops may reorder	Your own writes never stale; others can be	Unbounded until convergence
Available under partition?	No (C over A)	Yes	Yes	Yes
Added latency (healthy network)	High — quorum/leader round trip	Low — track deps, no global vote	Low — route by high-water mark	None
Coordination cost	Global consensus per op	Causal metadata (version vectors)	Per-session token / sticky routing	None
Where it fits	Locks, uniqueness, balances	Comments, chat, collaborative edits	Most user-facing reads	Counts, caches, analytics
Choose when	Correctness depends on a single global truth at the instant of the operation: money, locks, 'is this taken?'.	Order matters between related events but unrelated events can diverge: anything with a reply-after-comment shape.	One user must see their own writes and a non-regressing view; others can lag. The default for app reads.	Staleness of seconds is imperceptible and availability/latency dominate.
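The table's "causal metadata (version vectors)" entry is easier to picture with a comparison function. The sketch below is a generic version-vector comparison, not code from the reference implementation. The node ids (`a`, `b`, `c`) and the comment/reply example are hypothetical.

```typescript
// One counter per node; a missing entry counts as 0.
type VersionVector = Record<string, number>;
type Order = "equal" | "before" | "after" | "concurrent";

// Returns how event `a` relates to event `b`:
// "before" means a causally precedes b, "concurrent" means neither saw the other.
function compare(a: VersionVector, b: VersionVector): Order {
  let aAhead = false;
  let bAhead = false;
  const nodes = Object.keys(a).concat(Object.keys(b)); // duplicates are harmless
  for (const node of nodes) {
    const av = a[node] ?? 0;
    const bv = b[node] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

const comment: VersionVector = { a: 1 };
const reply: VersionVector = { a: 1, b: 1 };  // written after seeing the comment
const unrelated: VersionVector = { c: 1 };    // written without seeing either

console.log(compare(comment, reply));    // "before"
console.log(compare(reply, unrelated));  // "concurrent"
```

A causally consistent store must never show the reply without the comment, because the vectors prove the order. It is free to show the unrelated event on either side of them, and that freedom is why no global vote is needed.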

Verdict

Default user-facing reads to session guarantees, not eventual — it kills the “my edit vanished” class of bug for almost nothing. Reserve linearizable for the handful of operations whose correctness is global, and pay its latency knowingly. Reaching for linearizability everywhere is the most common way teams burn their latency budget on a guarantee no user asked for.

How this fails in production · GitHub

The 24-hour data-consistency incident, 21 October 2018

The setup

GitHub ran MySQL with a primary in one US region and replicas in another, connected over the WAN. Routine network maintenance caused a 43-second loss of connectivity, and that was enough for the automated failover system (Orchestrator) to promote a replica in the second region to primary.

What happened

For those 43 seconds, both regions believed they could accept writes. When the partition healed, there were two divergent histories: writes had landed on the old primary that the newly-promoted one had never seen, and vice versa. The system could not automatically reconcile them without risking data loss, so engineers fell back to a slow, careful, partly-manual repair. User-visible inconsistency and degraded service stretched past 24 hours.

The moment it went wrong

The failover was tuned to optimize availability — promote fast, stay writable — without a mechanism to guarantee a single writer across the partition. That is a CAP choice (A over C) made implicitly by a timeout, and PACELC’s warning made flesh: the cross-region replication that bought low latency in the “else” case was exactly what made the partition case unrecoverable.

The transferable lesson

Consistency is chosen whether or not you choose it — a failover timeout is a consistency decision in disguise. If correctness needs a single writer, you need a mechanism that enforces it during the partition (a real leader lease with fencing, Module 8), not an availability-optimized promotion that hopes two primaries never overlap. Decide the A-vs-C trade explicitly, per system, before an incident decides it for you.
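The fencing idea named above can be sketched in miniature. This assumes a single storage layer that every would-be primary must write through, and an epoch number that increases with each promotion. `FencedStore` and its methods are hypothetical names, and a real lease protocol (timeouts, clock assumptions, how epochs are issued) is out of scope here.

```typescript
// Storage that only accepts writes carrying the newest leader epoch it has seen.
class FencedStore {
  private highestEpoch = 0;
  private data = new Map<string, string>();

  // Returns false when the writer's epoch is stale, i.e. a newer leader exists.
  write(epoch: number, key: string, value: string): boolean {
    if (epoch < this.highestEpoch) return false; // stale leader: fenced off
    this.highestEpoch = epoch;
    this.data.set(key, value);
    return true;
  }

  read(key: string): string | undefined {
    return this.data.get(key);
  }
}

const store = new FencedStore();

const beforeFailover = store.write(1, "owner", "old-primary"); // epoch 1 is current
const afterPromotion = store.write(2, "owner", "new-primary"); // failover bumps the epoch
const lateWrite = store.write(1, "owner", "old-primary");      // old primary wakes up

console.log(beforeFailover, afterPromotion, lateWrite); // true true false
console.log(store.read("owner"));                       // "new-primary"
```

The old primary is not asked to notice that it was deposed; the store refuses it. That is the difference between enforcing a single writer and hoping for one.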

GitHub: October 21 post-incident analysis

What this sounds like in an interview

Calibration ladder · L3 → L6

A user updates their profile and immediately refreshes. Sometimes they see the old value. How do you fix it?

The interviewer wants to see whether you reach for the biggest hammer (make it strongly consistent) or the right-sized one.

L3 · Junior

I'd make sure the read goes to the primary database instead of a replica, so it always has the latest data.

Missed: Solves it by sending everything to the primary — correct for this user, but throws away read scaling for the whole system and doesn't generalize.

L4 · Mid

The replica is lagging. I'd either read from the primary for this endpoint, or add a short cache of the user's own recent writes so their reads reflect them.

Missed: On the right track, but treats it ad hoc per endpoint instead of recognizing the general pattern (session guarantees) that solves the whole class.

L5 · Senior

This is a read-your-writes problem, and it doesn't need global strong consistency — just a session guarantee. I'd give each session a high-water mark (the version of its last write) and route its reads only to a replica caught up to that version, falling back to the primary if none is. Other users can still read slightly-stale replicas, so I keep the read-scaling benefit and only pay for freshness where the same user is involved.

Missed: Strong and correct. Missing only the altitude: naming it as a per-operation consistency choice with an explicit latency cost model, and the second-order concerns (token forgery, session affinity) that show production scars.

L6 · Staff

Same session-guarantee mechanism, but I'd frame it as a per-operation consistency decision and be explicit about the cost model. Read-your-writes via a version token is cheap and is the right default for user-facing reads; I'd push back on anyone proposing to make the whole service linearizable, because that puts a coordination round trip on every read's p99 to fix a problem that's scoped to one session. I'd also name the failure mode: if I store the high-water mark client-side, I have to make sure it can't be forged to pin reads to the leader and defeat replica scaling, and if I store it server-side I've added session affinity that complicates load balancing. The trade I'm making is a small amount of routing complexity for a large latency saving versus global consistency.

What scored L6

Named the precise guarantee needed (read-your-writes), refused the oversized fix (global linearizability) with a cost argument, and surfaced the second-order operational trade-offs of the token mechanism. That's someone who has shipped this and felt where it bites.

When NOT to use this

Don't make the whole system linearizable to fix one stale read

Linearizability puts a coordination round trip on the critical path of every operation and makes you unavailable during partitions. Using it to fix a read-your-writes bug is like rebuilding the highway because one driver missed an exit. Scope the guarantee to the session that needs it.

Don't use session guarantees where you need global truth

Read-your-writes is per-session. It does nothing for invariants that span users: two people racing for the same username, a seat-booking system, a balance check. Those need real linearizable operations. Session guarantees are a scalpel, not a substitute for consensus.

Don't pin the high-water mark client-side without thinking

A version token the client controls can be forged or replayed to force every read to the leader, quietly destroying your read scaling, or set artificially low so the session is served stale data it shouldn’t see. Treat it as untrusted input: sign it, bound it, or keep it server-side behind session affinity.

Don't choose eventual consistency for anything a user authored

Counts, recommendations, and other people’s data tolerate eventual consistency beautifully. A user’s own post, comment, or setting does not: “I did that and it disappeared” is the most corrosive bug a product can ship. Author-visible data should be session-consistent at minimum.

Exercises

Exercise · Design scenario

Design the consistency strategy for a Twitter-like service with four operations: (1) post a tweet, (2) read your own profile timeline, (3) read a global trending-topics list, (4) follow another user, which must be reflected the next time you load your home feed. For each, name the consistency level you’d choose and the mechanism, and classify it on the PACELC grid.

Exercise · Implementation task

In 02-session-guarantees, add a writes-follow-reads guarantee: a write must be applied on top of a state that already reflects everything the session has read (so a reply can’t be ordered before the comment the user just read). Extend SessionRouter.write to require the target’s applied version ≥ the session high-water mark, and add a test where a write routed to a lagging replica would violate it.

Exercise · Find the race

This write bumps the session’s high-water mark so the user can immediately read their own write back. It introduces a way for a session to believe it observed a version that never actually committed. Find the window.

session-router.ts — shipped, subtly broken

write(sessionId: string, key: string, value: string): number {
  // optimistically record that we'll have observed this write
  const nextVersion = this.leader.peekNextVersion()
  this.bump(sessionId, nextVersion)   // <- bump BEFORE the write commits
  const version = this.leader.write(key, value)
  return version
}

Walk away with this

01 Consistency is a property of an operation, not a database. The right design runs linearizable, causal, session, and eventual guarantees side by side, chosen per call site.
02 PACELC > CAP. Partitions are rare; the else-latency clause is the bill you pay on every healthy request. Ask “what latency am I paying for consistency here?” not just “what happens in a partition?”
03 Session guarantees (read-your-writes + monotonic reads) deliver the consistency users actually perceive, scoped to one client, for almost nothing: a high-water mark and freshness-aware routing. Make them the default for user-facing reads.
04 Reserve linearizability for operations whose correctness is genuinely global (locks, uniqueness, balances), and pay its coordination latency on purpose.
05 A failover timeout is a consistency decision in disguise (GitHub 2018). Decide A-vs-C explicitly, per system, before an incident decides it for you.