Module 2 · Lesson 3 · Core · 36 min

Agentic Systems: The Bounded Autonomy Pattern

Agentic systems fail at the boundary, not in the model. The Bounded Autonomy Pattern names the five dimensions where autonomy lives and lets you design the boundary deliberately. Without the pattern, every agent design becomes a debate about ReAct vs Reflexion; with it, the architecture follows from the bounds.

Most agentic-system interviews end up in the same place: the candidate explains ReAct or describes a planner-executor loop, the interviewer asks 'how do you bound it?', and the candidate offers timeouts and retries. The interviewer writes the same note they have written ten times that week — 'understands the architecture, doesn't understand the failure modes.' The failure modes of agentic systems do not live in the model; they live at the boundary between the agent and the world, and 'timeout and retry' is approximately one-tenth of what that boundary requires.

Bounded Autonomy is the five-dimension framework for designing the boundary. The dimensions are action surface, step budget, reversibility tier, confidence threshold, and recovery envelope. Each is a knob you set explicitly; agents that ship safely have all five set deliberately; agents that cause incidents almost always have one or more set implicitly. The framework's main job is to refuse the 'agent that decides' framing and replace it with 'agent that decides within these specific bounds.'

Framework

The Bounded Autonomy Pattern

Agentic systems fail at the boundary, not at the model. The Bounded Autonomy Pattern is the five-dimension lens that converts 'how much should an agent decide on its own?' from a vibes question into a structured design decision. Each dimension is a knob the designer turns explicitly. Production agents that work have all five knobs set deliberately; agents that fail almost always have one or more set implicitly.

Dimension 1 — Action surface
What tools can the agent call, with what arguments? The agent's action surface is the design's most consequential boundary. A read-only retrieval surface is fundamentally safer than a read-write database surface; both are safer than an 'execute arbitrary code' surface. Most production agents that caused incidents had an action surface that included one tool the designer thought was safe and wasn't.
Dimension 2 — Step budget
How many actions can the agent take before the system forces termination? Step budget is the dimension that bounds runaway loops. Production agents with no step budget hit token limits, timeouts, or accidental denial-of-service against their own backends. Step budget should be set per task type, not globally.
Dimension 3 — Reversibility tier
For each action, is the outcome reversible (read a file), recoverable (send an email to internal team), or irreversible (send a payment, post publicly)? Group actions by reversibility tier and require different human-in-the-loop policies per tier. Conflating tiers is how 'autonomous agent makes irreversible mistake' incidents happen.
Dimension 4 — Confidence threshold
What confidence signal does the agent emit, and what's the threshold below which it pauses for human input? Most agentic systems lack a meaningful confidence signal at all; they pretend the model's output is calibrated and act on it. An L7 (Principal-level) design surfaces explicit uncertainty (low retrieval recall, model abstention, tool error) and routes it through a human-in-the-loop policy.
Dimension 5 — Recovery envelope
When something goes wrong mid-task — tool error, model timeout, contradictory result — what's the agent's recovery behavior? Retry, escalate, rollback, abandon. The recovery envelope is the dimension that determines whether the agent is helpful when things fail or just generates more chaos. Design the envelope before the agent ships, not after the first incident.
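
One way to make "set explicitly" enforceable is to refuse to ship a config with any knob left unset. The sketch below is a minimal illustration, not a reference implementation: `BoundedAutonomyConfig`, `Tier`, and every field name are hypothetical names invented here, and the three tiers mirror the reversible / recoverable / irreversible split described under Dimension 3.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    REVERSIBLE = 1
    RECOVERABLE = 2
    IRREVERSIBLE = 3


@dataclass(frozen=True)
class BoundedAutonomyConfig:
    """Hypothetical container for the five knobs. None means 'set implicitly'."""
    # Dimension 1 (and the tier labels for Dimension 3): tool name -> tier.
    action_surface: Optional[dict] = None
    # Dimension 2: maximum actions before forced termination.
    step_budget: Optional[int] = None
    # Dimension 3: tiers that require a human in the loop.
    human_gated_tiers: Optional[frozenset] = None
    # Dimension 4: below this confidence the agent pauses.
    confidence_threshold: Optional[float] = None
    # Dimension 5: error kind -> recovery action.
    recovery_envelope: Optional[dict] = None

    def unset_dimensions(self) -> list:
        """Names of knobs the designer has not set yet, in declaration order."""
        return [name for name, value in vars(self).items() if value is None]


draft = BoundedAutonomyConfig(
    action_surface={"lookup_order": Tier.REVERSIBLE, "issue_refund": Tier.IRREVERSIBLE},
    step_budget=15,
)
print(draft.unset_dimensions())
# ['human_gated_tiers', 'confidence_threshold', 'recovery_envelope']
```

A deploy check that fails whenever `unset_dimensions()` is non-empty turns the "one or more set implicitly" failure into a build error instead of an incident.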

When to use

Apply the pattern to any agentic-system design prompt — research agents, coding agents, customer-service agents, data-pipeline agents. The pattern is also the right opening for 'is this use case appropriate for an agent?' questions, because three of the five dimensions sometimes resolve to 'don't build an agent; build a workflow.'

Worked example

Prompt: 'Design an autonomous coding agent that ships PRs.' Senior answer: 'Use a code LLM with a tool that runs tests; if tests pass, open the PR.' Staff answer: 'Action surface: read repo, write to scratch branch, run tests, open PR — never merge. Step budget: 30 actions per task. Reversibility tier: writing PR (reversible) is autonomous; merging PR (irreversible) is human-gated unconditionally. Confidence: agent must emit a self-confidence score and abstain below 0.8; tests must pass before any PR opens. Recovery: tool errors trigger retry once; contradictory test results escalate to a human. Five knobs, set explicitly. Now we can argue about whether the design is right — but the design is at least a design.'
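
The Staff answer above can be sketched as a single gate that every proposed action passes through. This is an illustration under assumptions: the tool names, the `gate` function, and the `Decision` enum are hypothetical, and applying the 0.8 confidence floor only to `open_pr` is one reading of the answer, not the only one.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ABSTAIN = "abstain"        # agent pauses and asks for input
    HUMAN_GATE = "human_gate"  # a human performs or approves the action
    DENY = "deny"


# Action surface from the worked example (tool names are hypothetical).
AUTONOMOUS_ACTIONS = {"read_repo", "write_scratch_branch", "run_tests", "open_pr"}
HUMAN_GATED_ACTIONS = {"merge_pr"}  # irreversible tier: never autonomous
STEP_BUDGET = 30
CONFIDENCE_FLOOR = 0.8


def gate(action: str, steps_used: int, confidence: float, tests_passed: bool) -> Decision:
    """Decide whether the agent may take `action` right now."""
    if steps_used >= STEP_BUDGET:
        return Decision.DENY            # step budget exhausted
    if action in HUMAN_GATED_ACTIONS:
        return Decision.HUMAN_GATE      # unconditional, regardless of confidence
    if action not in AUTONOMOUS_ACTIONS:
        return Decision.DENY            # outside the action surface
    if action == "open_pr":
        if not tests_passed:
            return Decision.DENY        # tests must pass before any PR opens
        if confidence < CONFIDENCE_FLOOR:
            return Decision.ABSTAIN     # assumption: floor applies to PR opening
    return Decision.ALLOW


print(gate("open_pr", steps_used=5, confidence=0.9, tests_passed=True))   # Decision.ALLOW
print(gate("merge_pr", steps_used=5, confidence=1.0, tests_passed=True))  # Decision.HUMAN_GATE
```

Note that the merge check comes before any confidence logic, so no confidence value can unlock it.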

Calibration ladder

Design a customer-service agent that handles refunds for an e-commerce platform.

Classic agentic design prompt. The interviewer is testing whether you treat 'refund' as a tool the model can call autonomously or as the boundary the design hinges on.

L4 · Mid

An LLM with tools for looking up orders, checking refund policy, and issuing the refund. Wrap it in a chat interface.

Missed: Treated the agent as a chat layer with tools. No design at the boundary. Will ship and cause an incident in week three.

L5 · Senior

Two-tool agent: order lookup is read-only, refund issue is write. Use ReAct-style planning. Add guardrails on refund amount (refunds under $100 are autonomous; anything higher routes to a human).

Missed: Knew about action-surface segmentation but didn't decompose into the five-dimension structure. Will design a credible system that misses one or two dimensions.

L6 · Staff

Bounded autonomy. Action surface: order lookup (read), policy lookup (read), draft refund (write to staging), issue refund (write to production). Step budget: 15 actions per conversation. Reversibility: refund issuance is hard to reverse, so it's a different tier — gate it behind explicit human confirmation in the customer's session, not just on the agent side. Confidence: agent emits a policy-match confidence; below threshold, route to human with the agent's draft response as context. Recovery: tool errors retry once; contradictory order data (e.g., refund-already-issued) escalates.

Missed: Strong bounded-autonomy design. Missing the meta-move: the right answer might be 'don't build an agent for this; build a workflow.'

L7 · Principal

Same bounded-autonomy structure, with three additions. (1) The action surface should be split by reversibility tier explicitly: read tools available to the autonomous agent, write-to-staging available above the confidence threshold, write-to-production always gated by human confirmation. The 'always' is non-negotiable for the first six months of deployment regardless of confidence; building trust in the confidence calibration comes before relaxing the gate. (2) The confidence signal must be calibrated against observed outcomes: the team has to commit to logging confidence-vs-correctness pairs and re-calibrating monthly. Unreviewed confidence scores drift; the calibration is a continuing system. (3) Most importantly: the entire design is wrong if the use case is well-served by a workflow instead of an agent. For refunds, the rule structure is mostly deterministic: eligibility, amount, customer history. A workflow with LLM-assisted classification at the entry point covers 90% of cases at lower cost and higher safety than a planning agent. I'd argue for the workflow-first design and reserve agent autonomy for genuinely open-ended cases (escalations, novel disputes) where planning is needed. The pattern: most 'we need an agent' use cases are actually 'we need a workflow with LLM classification,' and proposing the agent design without that challenge is the canonical Staff-vs-Senior gap.
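
The "log confidence-vs-correctness pairs and re-calibrate" commitment in addition (2) can be made concrete with a small reliability table. The sketch below assumes the team logs `(confidence, was_correct)` tuples; the function names are hypothetical. Expected calibration error is a standard metric chosen here for illustration; the answer above does not name a specific one.

```python
from collections import defaultdict


def calibration_table(pairs, n_bins=5):
    """Group (confidence, was_correct) pairs into equal-width confidence bins.

    Returns {bin_index: (mean_confidence, observed_accuracy, count)}.
    """
    bins = defaultdict(list)
    for confidence, was_correct in pairs:
        # Clamp so that confidence == 1.0 lands in the top bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, was_correct))
    table = {}
    for index, rows in sorted(bins.items()):
        mean_conf = sum(c for c, _ in rows) / len(rows)
        accuracy = sum(1 for _, ok in rows if ok) / len(rows)
        table[index] = (mean_conf, accuracy, len(rows))
    return table


def expected_calibration_error(pairs, n_bins=5):
    """Count-weighted average gap between stated confidence and observed accuracy."""
    total = len(pairs)
    return sum(
        count / total * abs(mean_conf - accuracy)
        for mean_conf, accuracy, count in calibration_table(pairs, n_bins).values()
    )


# Hypothetical log: the agent said 0.9 four times but was right only twice.
log = [(0.9, True), (0.9, True), (0.9, False), (0.9, False), (0.1, False)]
print(calibration_table(log)[4])
print(round(expected_calibration_error(log), 2))
```

A top bin whose observed accuracy sits well below its mean confidence, as in this toy log, is the signal that the threshold cannot yet be trusted to relax the human gate.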

What scored L7

Named all five dimensions and added the three additions: human-gated production writes for an initial trust-building period, calibration as a continuing system, and the workflow-vs-agent reframing. The reframing is the highest-leverage move; most 'agentic system' interview prompts have a Senior-tier candidate answer (the agent design) and a Staff-tier candidate answer (the workflow plus selective agent autonomy). Recognizing which one the prompt actually wants is the Staff signal.

Pattern recognition

When you see

The interviewer asks 'design an agent that does [X].'

→

Think

Before designing the agent, name whether the use case is workflow-shaped (deterministic rules with classification at the entry) or agent-shaped (genuinely open-ended planning required). About 70% of 'agentic' prompts are workflow-shaped; proposing the agent design without challenging the framing is the canonical Senior failure.

Agentic frameworks (LangChain, AutoGPT, custom planner-executor loops) are exciting and have real use cases. They are also vastly over-applied. A workflow with three LLM-assisted classification steps is safer, cheaper, faster, and more debuggable than a planning agent for almost every well-defined business process. The Staff move is to ask 'is the planning loop earning its cost here?' before designing it.
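
To see what "workflow with LLM classification" looks like next to a planning loop, here is a minimal refund sketch in which the only model call is an injected `classify` function and everything after it is plain rules. All names, reason labels, and the 30-day window are hypothetical; the $100 cap echoes the Senior answer in the calibration ladder. `keyword_classifier` is a stand-in so the example runs without a model.

```python
from typing import Callable

KNOWN_REASONS = {"damaged", "not_delivered", "changed_mind"}


def refund_workflow(order: dict, classify: Callable[[str], str]) -> str:
    """Deterministic refund workflow; `classify` is the single LLM-assisted step."""
    reason = classify(order["customer_message"])
    if reason not in KNOWN_REASONS:
        # Open-ended case: the only place an agent or a human is needed.
        return "escalate_to_agent_or_human"
    if order["already_refunded"]:
        return "reject_duplicate"
    if reason == "changed_mind" and order["days_since_delivery"] > 30:
        return "reject_outside_window"  # hypothetical policy window
    if order["amount"] > 100:
        return "route_to_human"
    return "approve_refund"


def keyword_classifier(message: str) -> str:
    """Stand-in for the LLM classifier so the sketch is runnable."""
    return "damaged" if "broken" in message else "other"


order = {
    "customer_message": "It arrived broken",
    "already_refunded": False,
    "days_since_delivery": 3,
    "amount": 40,
}
print(refund_workflow(order, keyword_classifier))                      # approve_refund
print(refund_workflow({**order, "amount": 500}, keyword_classifier))   # route_to_human
```

Every branch here is a line you can unit-test and audit, which is the debuggability argument in concrete form.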

Unspoken rubric

What interviewers grade on agentic system design prompts.

What they score

·Did the candidate name the boundary, or did they describe the planning loop?
·Did they distinguish reversibility tiers, or did they treat all tool calls as equivalent?
·Did they propose a confidence signal, or did they assume the model's outputs are calibrated?
·Did they design the recovery envelope, or did they say 'retry on failure' and move on?
·Did they ask whether the use case justifies the agent at all, or did they default to building one?

Why it's not on the rubric

Agentic systems are the area where the gap between 'reading the latest paper' and 'shipping production' is largest right now. The rubric language is generic ('demonstrates depth in agentic architectures'); the actual scoring is whether you've operated this kind of system or only built proofs of concept. Boundary design is the signal that you've operated.

How to signal it

→Name the action surface in tools-and-arguments form within the first 60 seconds. 'Read order, read policy, write-to-staging refund draft, write-to-production refund issuance' beats 'a refund tool.'
→Group actions by reversibility tier out loud. 'Tier 1 reversible, Tier 2 internal-only, Tier 3 irreversible' is the signal.
→Propose a specific confidence signal (retrieval recall, model self-rating, tool consistency check) and a specific threshold.
→Design recovery as a state machine, not as 'try-catch.' Different errors have different recoveries.
→Once per prompt, challenge whether the agent design is the right shape. If a workflow does 80% of the cases at 20% of the cost, say so.
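
The "state machine, not try-catch" point can be sketched as an explicit transition table. The states, events, and `step` function below are hypothetical names; the policy they encode (retry once, escalate on a second tool error or any contradiction, abandon when the budget runs out) is one possible envelope, not the only valid one.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    RETRYING = "retrying"
    ESCALATED = "escalated"  # terminal: handed to a human
    ABANDONED = "abandoned"  # terminal: task stopped


class Event(Enum):
    SUCCESS = "success"
    TOOL_ERROR = "tool_error"
    CONTRADICTION = "contradiction"
    BUDGET_EXHAUSTED = "budget_exhausted"


# Different errors get different recoveries; each one is a row in this table.
TRANSITIONS = {
    (State.RUNNING, Event.SUCCESS): State.RUNNING,
    (State.RUNNING, Event.TOOL_ERROR): State.RETRYING,      # first failure: retry
    (State.RETRYING, Event.SUCCESS): State.RUNNING,
    (State.RETRYING, Event.TOOL_ERROR): State.ESCALATED,    # second failure: escalate
    (State.RUNNING, Event.CONTRADICTION): State.ESCALATED,
    (State.RETRYING, Event.CONTRADICTION): State.ESCALATED,
    (State.RUNNING, Event.BUDGET_EXHAUSTED): State.ABANDONED,
    (State.RETRYING, Event.BUDGET_EXHAUSTED): State.ABANDONED,
}


def step(state: State, event: Event) -> State:
    """Advance the recovery state machine by one event."""
    if state in (State.ESCALATED, State.ABANDONED):
        return state  # terminal states absorb every event
    # Any unlisted pair escalates rather than guessing.
    return TRANSITIONS.get((state, event), State.ESCALATED)


state = State.RUNNING
for event in (Event.TOOL_ERROR, Event.TOOL_ERROR, Event.SUCCESS):
    state = step(state, event)
print(state)  # State.ESCALATED
```

Because the table is data, reviewers can read the whole recovery envelope in one screen and spot a missing row before launch.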

Drill · 12 minutes

Practice this. Time yourself.

You have 12 minutes. Design a research agent that ingests a domain (e.g., 'electric vehicle battery technology'), searches the web, summarizes findings, and produces a 5-page report. Use the Bounded Autonomy Pattern to define all five dimensions explicitly. Then write a sixth section: 'is this use case actually agent-shaped, or would a multi-step workflow do better?' Defend your answer.

Self-assessment rubric

Dimension	Weak	Passing	Strong	Staff bar
All five dimensions named	Named 1-2 dimensions.	Named all 5 but at least one was hand-wavy ('confidence: TBD').	Named all 5 with specific commitments per dimension.	Named all 5 with specific commitments AND explained how the dimension settings interact (e.g., a tighter confidence threshold forces a larger step budget because the agent will abstain more).
Reversibility tier	Did not distinguish tiers.	Identified that web searches are read-only and report writing is write.	Distinguished read (search), write-to-scratch (drafting), write-to-final (publishing the report) tiers with different policies.	Named that 'cite this source as authoritative' is itself a near-irreversible action because misinformation propagates downstream — and added a fact-grounding requirement before any cited claim makes it to the final report.
Workflow-vs-agent challenge	Did not address.	Said 'agent is appropriate because the topic varies.'	Argued either side with specifics — when the domain is well-mapped, a workflow with retrieval + multi-pass summarization beats a planning agent.	Argued that the right design is a workflow at the outer level (retrieval → cluster sources → multi-pass summarize → fact-check → assemble) with an inner agent only for the 'fact-check' and 'investigate contradiction' sub-tasks. Hybrid is the Staff answer here.

Model solution

Action surface. (1) Web search (read), (2) page fetch (read), (3) source ranking (internal compute), (4) summary draft to scratch (write-to-scratch), (5) fact-grounding query against retrieved sources (internal compute), (6) report assembly to scratch (write-to-scratch), (7) publish report (write-to-final). The 'publish' tier is separated from drafting.

Step budget. 50 actions per research task, but the budget is segmented: up to 20 search + fetch actions, up to 20 summarization/grounding actions, up to 10 assembly actions. Segmented budgeting prevents the agent from exhausting its budget on web searches and reaching the report-writing phase with no remaining steps.

Reversibility tier. Three tiers. Tier 1: web search and source retrieval, fully reversible. Tier 2: draft summary and report assembly to scratch, recoverable (can be discarded). Tier 3: publishing the report and treating any cited source as authoritative, near-irreversible because downstream consumers may propagate any misinformation. Tier 3 actions require explicit fact-grounding (each cited claim must trace to a retrieved passage) and a human approval step for the first six months of deployment.

Confidence threshold. The agent emits two signals: source-quality confidence (based on source reputation and consistency across sources) and synthesis confidence (based on whether the summary is supported by multiple sources). Below 0.7 on source-quality, sources are flagged for review. Below 0.6 on synthesis, the agent inserts an "unverified" tag in the draft and routes for human review before the publish step.

Recovery envelope. Tool errors retry once after a backoff delay. Contradictory sources trigger an investigate-contradiction sub-task with its own step budget (up to 5 steps). If the contradiction is unresolvable, the agent notes both positions in the draft rather than choosing. Web search returning zero relevant results escalates to a query-rewrite sub-task; if still zero, the agent admits the topic has insufficient web coverage and returns a partial report.

Workflow-vs-agent challenge. The right design is hybrid. At the outer level this is a workflow: retrieve → cluster sources → multi-pass summarize → fact-check → assemble → publish. The workflow handles the stable parts of research with no agentic decision-making; that's faster, cheaper, and more debuggable. The agentic component is reserved for two specific sub-tasks: (1) investigating contradictions between sources, where the path is not pre-determined and planning is genuinely needed, and (2) deciding which sub-questions to research more deeply when the initial search reveals gaps. Treating the entire research task as agentic is over-applying autonomy; treating it as pure workflow misses the cases that need real planning. The hybrid pattern is the Staff answer.
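
The segmented step budget from the model solution can be sketched in a few lines. `SegmentedBudget` and the phase names are hypothetical; the 20 / 20 / 10 split is the one given above.

```python
class SegmentedBudget:
    """Per-phase step counters so one phase cannot starve the others."""

    def __init__(self, limits: dict):
        self.limits = dict(limits)
        self.used = {phase: 0 for phase in limits}

    def try_spend(self, phase: str) -> bool:
        """Consume one step from `phase`; return False if that phase is exhausted."""
        if self.used[phase] >= self.limits[phase]:
            return False
        self.used[phase] += 1
        return True

    def remaining(self, phase: str) -> int:
        return self.limits[phase] - self.used[phase]


budget = SegmentedBudget({"search_fetch": 20, "summarize_ground": 20, "assemble": 10})

# A runaway search loop asks for 25 steps but only gets its own 20.
granted = sum(budget.try_spend("search_fetch") for _ in range(25))
print(granted)                       # 20
print(budget.remaining("assemble"))  # 10
```

With a single 50-step counter, the same runaway loop would have left only 25 steps for summarization and assembly combined.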

Common failures

✗Designed a planner-executor agent without naming the action surface in specific tools.
✗Treated all tool calls as the same reversibility tier. Citing a source as authoritative is not reversible the way querying a search engine is.
✗Did not segment the step budget. Single-budget agents commonly run out before reaching the assembly phase.
✗Did not propose a confidence signal. 'The model knows when it's uncertain' is the assumption that produces incidents.
✗Did not challenge the workflow-vs-agent framing. The hybrid is almost always the right answer for research-like tasks.

Artifact · checklist

The Bounded Autonomy Checklist

Before designing the agent — challenge the framing

☐Is the use case well-defined enough that a workflow with LLM classification handles 80%+?
☐Are there specific sub-tasks where planning is genuinely needed, or is the whole thing planning?
☐If we shipped a workflow today, what would be the gap that justifies the agent?
☐If the answer is 'the cases workflow doesn't cover,' is that a 10% or 90% case? Build accordingly.

Dimension 1 — Action surface (write these down)

☐Every tool the agent can call, by name and argument signature.
☐For each tool: is it a read, a write-to-staging, or a write-to-production?
☐What's the worst plausible action sequence given this surface? Is it tolerable?

Dimension 2 — Step budget

☐Total step budget per task.
☐Segmented sub-budgets if the task has phases.
☐What happens when the budget is exhausted — return partial, escalate, terminate?

Dimension 3 — Reversibility tier

☐Tier 1 (reversible): autonomous.
☐Tier 2 (recoverable / internal-only): autonomous with logging.
☐Tier 3 (irreversible / external impact): human-gated for an explicit period regardless of confidence.

Dimension 4 — Confidence threshold

☐What signal? (Retrieval recall, model self-rating, tool consistency, ensemble agreement.)
☐What's the threshold? Per tier.
☐Below threshold: pause, escalate, or insert 'unverified' marker?
☐Is the team committed to calibrating this signal monthly?

Dimension 5 — Recovery envelope

☐Tool errors: retry once, escalate after.
☐Contradictory outputs: investigate, abstain, or note both.
☐Zero-result actions: rewrite query, escalate, or terminate.
☐Recovery is a state machine; design it before launch.

Post-mortem · anonymized

Setup

Mid-stage AI startup, autonomous customer-service agent deployed for refund handling on an e-commerce platform. 80% accurate at launch on test cases, considered production-ready. The team had implemented ReAct-style planning, OpenAI function calling, and 'safety guardrails' (a cap on refund amount).

What happened

Week three of production deployment, the agent issued 47 duplicate refunds to a single customer over a 90-minute period. The customer had a flaky network and resubmitted requests; the agent's step budget allowed it to call the issue_refund tool multiple times per conversation, and the duplicate-detection logic lived in the agent's prompt ('check if refund was already issued') rather than in the tool itself. The model's check passed each time because the agent's context window didn't carry across the customer's resubmissions, and the tool returned success each time because it didn't enforce idempotency.

The moment

The architecture diagram had a box labeled 'issue_refund' as a single tool. The team had not separated the tool's action surface into reversible (draft a refund decision) and irreversible (execute the refund). They had not enforced idempotency at the tool layer because they were trusting the agent's reasoning. The design had four of the five Bounded Autonomy dimensions implicit. The 'confidence' dimension was implicit (model just acts on what it generates), the recovery envelope was implicit (no contradiction-detection beyond model reasoning), the reversibility tier was implicit (all tool calls equal), and the action surface was implicit (one big tool instead of staged ones). Only the step budget was explicit, and it was set to 20 actions, which proved more than enough rope.

What they should have said

At design time: 'The issue_refund tool needs to be split into draft_refund and execute_refund. Draft is reversible — agent can call it autonomously. Execute is irreversible — requires the agent to pass a unique customer-conversation ID, the tool enforces idempotency by that ID, and the agent must obtain explicit customer confirmation in the conversation before execution. The agent's confidence in the policy match must be above 0.85 before execute_refund is even available as a tool; below that, the agent can only draft and route to a human reviewer. Recovery envelope: contradictory results between draft and the customer's claim trigger immediate escalation; the agent does not silently re-draft.' Each of these is a 30-second design decision. Together they prevent the 47-refunds-in-90-minutes incident entirely.
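
The draft/execute split with tool-layer idempotency can be sketched as follows. `RefundTool` is an in-memory stand-in invented for illustration, not a real payment API; the method names follow the `draft_refund` / `execute_refund` split described above, and the idempotency key is the customer-conversation ID as proposed there. In a real system, what the key covers (conversation, order, or both) is itself a design decision.

```python
class RefundTool:
    """In-memory stand-in for a refund service (hypothetical, for illustration)."""

    def __init__(self):
        self._drafts = {}    # conversation_id -> draft
        self._receipts = {}  # conversation_id (idempotency key) -> receipt
        self.payouts = 0     # how many times money actually moved

    def draft_refund(self, conversation_id: str, order_id: str, amount: float) -> dict:
        """Reversible: overwriting or discarding a draft moves no money."""
        draft = {"order_id": order_id, "amount": amount}
        self._drafts[conversation_id] = draft
        return draft

    def execute_refund(self, conversation_id: str, customer_confirmed: bool) -> dict:
        """Irreversible: enforced once per idempotency key, inside the tool."""
        if conversation_id in self._receipts:
            # Repeat call: return the original receipt, move no money.
            return {**self._receipts[conversation_id], "duplicate": True}
        if not customer_confirmed:
            raise PermissionError("customer confirmation required")
        if conversation_id not in self._drafts:
            raise LookupError("no draft for this conversation")
        self.payouts += 1
        receipt = {**self._drafts[conversation_id], "duplicate": False}
        self._receipts[conversation_id] = receipt
        return receipt


def available_tools(policy_confidence: float, threshold: float = 0.85) -> list:
    """Below the threshold, execute_refund is not offered to the agent at all."""
    tools = ["draft_refund"]
    if policy_confidence >= threshold:
        tools.append("execute_refund")
    return tools


tool = RefundTool()
tool.draft_refund("conv-1", "order-9", 25.0)
for _ in range(47):  # replay the incident: 47 execute calls
    tool.execute_refund("conv-1", customer_confirmed=True)
print(tool.payouts)  # 1
```

The duplicate check lives in the tool, so it holds even when the agent's context has forgotten the earlier call.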

Lesson

Agentic systems fail at the boundary, and the boundary lives in five specific dimensions. Implicit boundaries become incidents. Explicit boundaries — written into the action surface, the step budget, the reversibility tiers, the confidence threshold, and the recovery envelope — are what make agents safe. None of this is exotic; all of it requires the framework that forces the design questions. Bounded Autonomy is that framework.