Idempotency & 'Exactly-Once' That Survives Contact
Idempotency keys, the transactional outbox, and effectively-once delivery
Exactly-once delivery is impossible over an unreliable network — the sender can never know its message arrived, so it must either risk losing it (at-most-once) or risk duplicating it (at-least-once). What you can build is exactly-once effect: at-least-once delivery plus deduplication, assembled from three specific parts. This module builds all three and names the trap in between.
Idempotency keys: making a retry a no-op
The fix for the double charge is to make the operation idempotent: running it twice has the same effect as running it once. For naturally idempotent operations (set the address to X, delete order 7) you get this free — repeating them changes nothing. For operations with side effects that aren’t naturally idempotent — charge a card, send an email, increment a balance — you make them idempotent with a client-supplied idempotency key: a unique token the client generates per logical operation and reuses on every retry of it.
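The client half of that contract can be sketched as follows. This is a minimal illustration, not the course's reference code: `flakyTransport`, `chargeWithRetry`, `demoRetry`, and the `payments.example` URL are all hypothetical names, and the fake transport stands in for a network that gives no answer on the first attempt. The point it shows is that the key is generated once per logical operation and reused unchanged on every retry.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// newIdempotencyKey returns a random 128-bit token, hex-encoded.
func newIdempotencyKey() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// flakyTransport (hypothetical) stands in for the network: the first call
// gets no answer, later calls succeed. It records every key it sees.
type flakyTransport struct {
	keys []string
}

func (t *flakyTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if req.Body != nil {
		req.Body.Close()
	}
	t.keys = append(t.keys, req.Header.Get("Idempotency-Key"))
	if len(t.keys) == 1 {
		return nil, errors.New("simulated timeout: no answer")
	}
	return &http.Response{
		StatusCode: http.StatusOK,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader("ok")),
		Request:    req,
	}, nil
}

// chargeWithRetry generates ONE key for the logical operation and reuses it
// on every attempt, so the server can recognise a retry as a retry.
func chargeWithRetry(client *http.Client, url, body string, attempts int) (int, error) {
	key := newIdempotencyKey()
	var lastErr error
	for i := 0; i < attempts; i++ {
		req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
		if err != nil {
			return 0, err
		}
		req.Header.Set("Idempotency-Key", key)
		resp, err := client.Do(req)
		if err != nil {
			lastErr = err // no answer: retry with the SAME key
			continue
		}
		resp.Body.Close()
		return resp.StatusCode, nil
	}
	return 0, lastErr
}

// demoRetry runs one charge through the flaky transport and reports the
// final status plus every key the "server" saw.
func demoRetry() (int, []string, error) {
	transport := &flakyTransport{}
	client := &http.Client{Transport: transport}
	status, err := chargeWithRetry(client, "http://payments.example/charges", `{"amount":500}`, 3)
	return status, transport.keys, err
}

func main() {
	status, keys, err := demoRetry()
	fmt.Println(status, err, len(keys), keys[0] == keys[1])
}
```

Generating the key inside the retry loop instead of before it would defeat the pattern: each attempt would look like a new operation.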
The server records the key the first time it sees it, performs the side effect, and stores the response — all atomically. Every subsequent request with that key returns the stored response without re-running the side effect. The lifecycle of that key is the whole pattern:
The Idempotency-Key Lifecycle
Four states, one ironclad rule: claim the key, do the side effect, and store the response inside a single transaction.
| State | How you reach it | What a retried request does |
|---|---|---|
| NEW | Key never seen. Insert (key → IN_PROGRESS) via a UNIQUE constraint — the insert itself is the lock. | N/A — this is the first request; proceed to do the work. |
| IN_PROGRESS | Key inserted, side effect not yet committed (original still running, or crashed mid-flight). | Return 409 / 'retry shortly'. Do NOT run the side effect — the original may still complete. |
| COMPLETED | Side effect committed AND the response persisted in the same transaction. | Return the stored response verbatim. No side effect. This is the idempotent replay. |
| EXPIRED | TTL passed (keys can't live forever). Record garbage-collected. | Treated as NEW — so set the TTL longer than any client will plausibly retry. |
The one rule that makes it correct: the side effect and the response-storage must commit together. If you charge the card and then store the response in a second step, a crash between them leaves a charged card with no stored response — and the retry charges again. Atomicity (Module 5) is load-bearing here.
```go
func (s *Store) Charge(key string, amount int) ChargeResult {
	if rec, ok := s.idem[key]; ok {
		if rec.status == statusCompleted {
			return ChargeResult{Response: rec.response, Replayed: true} // replay
		}
		return ChargeResult{Conflict: true} // original still running -> 409
	}
	// claim the key, do the side effect, enqueue the event, store the
	// response; modeled single-threaded, but in production ONE transaction:
	s.idem[key] = &idemRecord{status: statusInProgress}
	s.ledger = append(s.ledger, fmt.Sprintf("charged %d (key=%s)", amount, key))
	s.outbox.Add(fmt.Sprintf("charge.created amount=%d key=%s", amount, key))
	resp := fmt.Sprintf("ok: charged %d", amount)
	s.idem[key].status = statusCompleted
	s.idem[key].response = resp
	return ChargeResult{Response: resp}
}
```

The Effectively-Once Triangle
Idempotency keys handle the synchronous request path. But the charge probably also emits an event — charge.created — that other services consume (email a receipt, update analytics, credit a referral). Now you have two new ways to leak a duplicate or lose an event entirely, and fixing them requires two more pieces. Together they form a triangle: remove any vertex and duplicates (or losses) leak through.
Exactly-once effect isn't one mechanism — it's three working together. Idempotent producer, transactional outbox, idempotent consumer.
The transactional outbox solves the dual-write problem: instead of “write the DB, then publish to the broker” (two systems, no atomicity — a crash between them loses or duplicates the event), you write the event to an outbox table in the same database transaction as the side effect. A separate poller publishes outbox rows and marks them sent. The event is published if and only if the side effect committed.
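A minimal in-memory sketch of that arrangement, assuming a single process where one function call stands in for one database transaction. The names `db`, `chargeTx`, `pollOnce`, and `demoOutbox` are hypothetical and not taken from the reference implementation.

```go
package main

import (
	"errors"
	"fmt"
)

type outboxRow struct {
	id      int
	payload string
	sent    bool
}

// db models one database holding both the business table and the outbox.
type db struct {
	ledger []string
	outbox []outboxRow
}

// chargeTx models ONE transaction: the ledger row and the outbox row are
// written together, or (on rollback) neither is written.
func (d *db) chargeTx(amount int, rollback bool) error {
	if rollback {
		return errors.New("rolled back: no ledger row, no outbox row")
	}
	d.ledger = append(d.ledger, fmt.Sprintf("charged %d", amount))
	d.outbox = append(d.outbox, outboxRow{
		id:      len(d.outbox) + 1,
		payload: fmt.Sprintf("charge.created amount=%d", amount),
	})
	return nil
}

// pollOnce publishes each unsent row and then marks it sent. If publish
// fails, the row stays unsent and the next poll retries it.
func (d *db) pollOnce(publish func(row outboxRow) error) {
	for i := range d.outbox {
		if d.outbox[i].sent {
			continue
		}
		if err := publish(d.outbox[i]); err != nil {
			return
		}
		d.outbox[i].sent = true
	}
}

// demoOutbox commits one charge, rolls back another, and polls twice.
func demoOutbox() []string {
	d := &db{}
	var published []string
	publish := func(row outboxRow) error {
		published = append(published, row.payload)
		return nil
	}
	_ = d.chargeTx(500, false) // commits: ledger row + outbox row
	_ = d.chargeTx(700, true)  // rolls back: nothing to publish
	d.pollOnce(publish)
	d.pollOnce(publish) // row already marked sent, so nothing new
	return published
}

func main() {
	fmt.Println(demoOutbox()) // prints: [charge.created amount=500]
}
```

The rolled-back charge never produces an event because its outbox row never existed; there is no separate publish step that could succeed or fail on its own.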
The idempotent consumer closes the loop: because the poller delivers at-least-once (it may crash after publishing but before marking sent), consumers dedup by message ID, so a redelivered event has no extra effect.
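A sketch of the consumer side, assuming each event carries a stable ID and that, in a real system, recording the ID and applying the effect would share one transaction. The `event`, `consumer`, and `handle` names are hypothetical.

```go
package main

import "fmt"

type event struct {
	ID      string
	Payload string
}

// consumer remembers which event IDs it has already processed.
type consumer struct {
	processed map[string]bool
	receipts  []string
}

func newConsumer() *consumer {
	return &consumer{processed: make(map[string]bool)}
}

// handle applies the effect at most once per event ID and reports whether
// the effect ran on this call.
func (c *consumer) handle(e event) bool {
	if c.processed[e.ID] {
		return false // redelivery: already applied, do nothing
	}
	// In production these two writes would commit together, so a crash
	// cannot record the ID without the effect or the effect without the ID.
	c.processed[e.ID] = true
	c.receipts = append(c.receipts, "receipt for "+e.Payload)
	return true
}

func main() {
	c := newConsumer()
	e := event{ID: "evt-1", Payload: "charge.created amount=500"}
	fmt.Println(c.handle(e))     // first delivery: true
	fmt.Println(c.handle(e))     // redelivery after a poller crash: false
	fmt.Println(len(c.receipts)) // one receipt: 1
}
```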
courses/distributed-systems/reference-impl/06-idempotency-outbox/
All three vertices, runnable. The demo charges the same idempotency key three times (one real charge, two replays), then crashes mid-publish so the broker receives the event twice, and the idempotent consumer still applies the effect exactly once. Run it with `go run .`; each vertex has tests.
Delivery vs. effect: stop trying to deliver once
The most expensive confusion in messaging is treating “exactly once” as a delivery guarantee to configure. It isn’t one. A sender that gets no ack cannot know if the message arrived; it must choose to resend (maybe duplicate) or not (maybe lose). There is no third option at the delivery layer — that’s the FLP/two-generals reality from Module 1.
So you stop trying. You choose at-least-once delivery (never lose), accept that duplicates will happen, and make the effect idempotent so duplicates don’t matter. “We handle exactly-once” should always decode to “at-least-once delivery plus idempotent processing.”
| Dimension | At-most-once | At-least-once | At-least-once + dedup | Broker EOS (Kafka) |
|---|---|---|---|---|
| Can lose messages? | Yes (fire and forget) | No | No | No |
| Can duplicate effects? | No | Yes (every duplicate applies) | No (dedup absorbs them) | No (within the Kafka boundary) |
| Implementation cost | Lowest | Low (just retry) | Moderate (keys + outbox + dedup) | High; only within one broker's transactions |
| What it really is | Send once, never retry | Retry until acked | Exactly-once EFFECT | The triangle, productized |
| Choose when | Loss is acceptable and duplicates are not — metrics, telemetry samples, best-effort notifications. | Loss is unacceptable and the consumer is naturally idempotent already (e.g. setting a value), so duplicates are harmless. | Loss is unacceptable AND the effect isn't naturally idempotent (payments, emails, balance changes). The default for business-critical events. | Your producers and consumers all live inside one Kafka cluster and you can adopt its transactional API end to end. |
For anything with a side effect that matters, build at-least-once delivery + deduplication — the triangle. Don’t chase “exactly-once delivery” as a config flag; it doesn’t exist, and the time spent looking for it is time not spent making your effects idempotent. Broker EOS is real but bounded to one cluster’s transactions — the moment an effect leaves that boundary (a card charge, an email), you’re back to the triangle.
Retries that double-charge — and the idempotency key that stops them
Stripe's API accepts an Idempotency-Key header, and Stripe guarantees that replaying a request with the same key returns the original response and performs the side effect at most once. The hard part is the implementation they describe: claiming the key, recording request parameters, persisting the response, and handling the in-progress and crashed-midway states correctly, all without a window where a retry slips a second charge through.

Make every non-idempotent write endpoint accept an idempotency key, and treat the key’s lifecycle as part of the same transaction as the side effect. The duplicate request is not an edge case to log; it is the contract. Build for it, and the double-charge ticket never gets written.
What this sounds like in an interview
How do you make a 'create payment' endpoint safe for clients to retry?
The interviewer wants to see if you treat duplicate requests as the normal case and know how to dedup atomically.
I'd check if a payment with the same details already exists before creating a new one, and skip it if so.
I'd have the client send an idempotency key, store it, and if I've seen it before, return the previous result instead of charging again.
Client-generated idempotency key, and the key handling has to be atomic with the side effect. On the first request I insert the key with a UNIQUE constraint — the insert is the lock — do the charge, and store the response, all in one transaction. A retry hits the key: if it's completed I replay the stored response; if it's still in progress I return a 409 so I don't run the charge twice. The subtle part is that storing the response and doing the charge must commit together, or a crash between them re-charges on retry.
Same atomic idempotency-key design, but I'd cover the full blast radius. Beyond the synchronous charge, the endpoint emits events — so I'd use a transactional outbox to publish 'payment.created' in the same commit, avoiding the dual-write problem, and make downstream consumers dedup by event ID, because delivery is at-least-once. I'd bound the idempotency key with a TTL longer than any client retry window and store a hash of the request body with the key, so a client reusing a key with different parameters gets a 422 instead of silently getting the old response. I'd also be explicit that this gives exactly-once effect, not delivery — there's no such thing as exactly-once delivery, and I'd push back if someone specced it. The trade is a bit of storage and a dedup table for making an unavoidable property of networks — duplicates — harmless.
Made the idempotency atomic with the side effect, extended it to the async event path with an outbox + deduping consumers, guarded against key reuse with a body fingerprint, and explicitly reframed 'exactly-once delivery' as effect. That's someone who has built a payments path.
Don't add idempotency keys to naturally idempotent operations
A PUT that sets a resource to a fixed value, a delete by ID, a “mark as read” — these are already idempotent: running them twice changes nothing. Bolting an idempotency-key table onto them adds storage, a dedup lookup, and a TTL to manage, for a property you already had. Spend the mechanism on the operations that actually accumulate (charges, increments, sends).
Don't claim or design for exactly-once delivery
It doesn’t exist over an unreliable network. Speccing it sends a team hunting for a config flag that isn’t there, instead of building at-least-once + dedup. If a requirement says “exactly-once,” translate it to “never lose, and make the effect idempotent” before you design.
Don't store idempotency keys forever
An unbounded key table grows without limit and eventually dominates your storage and lookup cost. Set a TTL longer than any plausible client retry window (hours to a day, not years), and accept that a request retried after expiry is treated as new — which is fine, because no real client retries a day later.
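A TTL sweep can be as small as the sketch below. It assumes keys live in an in-memory map with a creation timestamp; `keyRecord`, `sweep`, and `demoSweep` are hypothetical names, and the 24-hour TTL is only an example value within the range suggested above.

```go
package main

import (
	"fmt"
	"time"
)

type keyRecord struct {
	response  string
	createdAt time.Time
}

// sweep deletes every record older than ttl and returns how many it removed.
func sweep(keys map[string]keyRecord, now time.Time, ttl time.Duration) int {
	removed := 0
	for k, rec := range keys {
		if now.Sub(rec.createdAt) > ttl {
			delete(keys, k) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}

// demoSweep runs one sweep over an old key and a fresh key.
func demoSweep() (removed, remaining int) {
	now := time.Date(2024, 1, 2, 12, 0, 0, 0, time.UTC)
	keys := map[string]keyRecord{
		"old":   {response: "ok: charged 500", createdAt: now.Add(-25 * time.Hour)},
		"fresh": {response: "ok: charged 700", createdAt: now.Add(-1 * time.Hour)},
	}
	removed = sweep(keys, now, 24*time.Hour)
	return removed, len(keys)
}

func main() {
	fmt.Println(demoSweep()) // prints: 1 1
}
```

Passing `now` in as a parameter rather than calling `time.Now()` inside `sweep` keeps the expiry logic testable with fixed timestamps.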
Don't dedup on payload instead of an explicit key
Hashing the request body to detect duplicates breaks two ways: two legitimately-distinct operations with identical payloads (two $5 coffees) collapse into one, and a single operation whose payload varies slightly on retry (a new timestamp) looks like two. Use an explicit client-generated key for the logical operation; reserve the body hash for detecting key reuse with different params.
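The "body hash as a guard, not as the key" idea might look like the sketch below. It assumes a SHA-256 fingerprint of the raw request body is stored next to each key; `storedKey`, `bodyHash`, and `lookup` are hypothetical names, and returning 422 on a mismatch is one reasonable choice rather than a standard.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
)

// storedKey is what the server keeps for a completed idempotency key.
type storedKey struct {
	bodyHash string
	response string
}

// bodyHash fingerprints the raw request body.
func bodyHash(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

// lookup decides what a request gets, given what is stored under its key.
// seen=false means the key is new and the caller should do the real work.
func lookup(store map[string]storedKey, key string, body []byte) (seen bool, status int, response string) {
	rec, ok := store[key]
	if !ok {
		return false, 0, ""
	}
	if rec.bodyHash != bodyHash(body) {
		// Same key, different parameters: refuse rather than replay.
		return true, http.StatusUnprocessableEntity, "idempotency key reused with different parameters"
	}
	return true, http.StatusOK, rec.response
}

func main() {
	first := []byte(`{"amount":500}`)
	store := map[string]storedKey{
		"key-1": {bodyHash: bodyHash(first), response: "ok: charged 500"},
	}
	fmt.Println(lookup(store, "key-1", first))                    // replay of the stored response
	fmt.Println(lookup(store, "key-1", []byte(`{"amount":900}`))) // key reuse with a different body
	fmt.Println(lookup(store, "key-2", first))                    // new key: caller does the work
}
```

The key decides whether two requests are the same operation; the hash only catches a client bug where one key is attached to two different requests.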
Exercises
In 06-idempotency-outbox, wrap Store.Charge in a real net/http handler that reads the Idempotency-Key header, returns 409 while a key is in progress and the stored response on replay, and rejects a reused key whose request body differs (store a body fingerprint alongside the key). Add a TTL sweep that expires old keys.

```typescript
async function charge(key: string, amount: number) {
  // check if we've seen this idempotency key
  const existing = await db.query("SELECT response FROM idem WHERE key = $1", [key])
  if (existing) {
    return existing.response // replay
  }
  // not seen -> do the charge
  const resp = await paymentGateway.charge(amount)
  await db.query("INSERT INTO idem (key, response) VALUES ($1, $2)", [key, resp])
  return resp
}
```

1. Your write endpoints will receive duplicate requests: clients must retry on timeout because a network call’s third outcome is “no answer” (Module 1). Design for duplicates as the normal case.
2. Make non-idempotent writes idempotent with a client-supplied idempotency key, and claim the key atomically before the side effect (UNIQUE-constraint insert as the lock), storing the response in the same transaction.
3. Exactly-once delivery is impossible; exactly-once effect is the Effectively-Once Triangle: idempotent producer + transactional outbox + idempotent consumer. Remove any vertex and duplicates leak.
4. The transactional outbox kills the dual-write problem: write the event in the same transaction as the side effect, publish it from there. Delivery stays at-least-once; the deduping consumer makes the effect once.
5. When anyone says “exactly-once,” translate it to “at-least-once delivery + idempotent processing” and find the dedup. If you can’t find it, it isn’t exactly-once.