SherlockLiu

Build Your Own Agent Harness: The Practical Blueprint (Part 12)

2026-04-25T00:00:00+01:00

Series: The Agent Harness — Part 12 of 12

Eleven posts. Eleven components. Hundreds of design decisions, naming patterns, anti-patterns, and checklists.

This final post is not a recap. It’s a synthesis — the kind you can actually use to make a decision and start building. We’ll answer three questions in order:

Do you actually need an agent harness?
Should you build one, or use a platform?
If you build, what are the principles worth stealing from Claude Code?

Then we’ll point you at a practical kit to start right.

Question 1: Do You Actually Need a Harness?

Most LLM use cases don’t need one. The wrong answer here costs months.

The decision lives in three questions, applied in order:

Does the agent need to act on intermediate results?
  No  → Simple API call. Stop here.
  Yes ↓

Does it involve side effects (files, commands, network)?
  No  → Simple API call. Stop here.
  Yes ↓

Does it need cost control, security, or multi-turn state?
  No  → Function Calling (single-turn tool use)
  Yes → Agent Harness

The rule of thumb: if your system needs the LLM to perform “observe → think → act → observe again,” you need a harness. If it’s “input → output,” you don’t.

Use case	Right choice
Translation, summarization, classification	Simple API call
Single-turn Q&A with one or two tool calls	Function Calling
Code editing, ops automation, research loops	Agent Harness

If you are building a harness when a simple API call would do, you are not being more sophisticated — you are creating maintenance overhead for no benefit.

Question 2: Build Your Own or Use a Platform?

Assuming you need a harness, the next honest question is whether to build one or adopt a framework like LangGraph, CrewAI, AutoGen, or a hosted platform like Vertex AI Agents or Bedrock Agents.

The common framing is “build vs. buy.” The more useful framing is: at what point does custom beat framework?

Dimension	Build Your Own	Platform / Framework
Initial velocity	Slow (you build everything)	Fast (components exist)
Customization ceiling	None	Framework abstractions
Debug visibility	Total (you wrote it)	Partial (black boxes)
Maintenance burden	Yours alone	Shared with community
Architecture fit	Exact	Approximate
Security control	Full	Depends on the framework
Upgrade path	You decide when to change	Framework release schedule

The honest answer most teams don’t want to hear: frameworks win at the start; custom wins at scale.

Use a framework for:

Proof of concept and early validation
Teams without dedicated infrastructure engineers
Domains where the framework’s built-in tool integrations cover most of your needs

Build your own for:

Production systems where you need full visibility into every permission check
Use cases where the framework’s abstraction layer creates problems faster than it solves them
Teams that have hit the ceiling of a framework and are spending more time working around it than using it

Claude Code is a useful data point here. Anthropic chose to build everything from scratch — no framework dependency, no abstraction tax, no upgrade path to manage. The result is a system where every component is designed exactly for the problem it solves. The tradeoff is that they own every bug and every maintenance burden.

For most teams, the right path is: start with a framework, migrate custom components as you hit the ceiling. The ceiling usually shows up in permission control, context management, or debugging production failures.

Question 3: What to Steal from Claude Code

If you’ve read this series, you’ve spent eleven posts inside Claude Code’s architecture. The most valuable output isn’t the code — it’s the design decisions that kept recurring across unrelated components.

These aren’t Claude Code-specific. They’re the engineering discipline that makes any autonomous agent production-ready.

1. Loops over recursion

Every component in Claude Code that could have been recursive is iterative. The agent’s core is while(true), not a call stack.

Why it matters: you cannot abort a recursive turn mid-flight. State recovery requires unwinding a stack. In-flight inspection becomes frame tracing. The loop gives you a natural checkpoint every iteration — a place to read state, apply compression, check abort signals, and write new state atomically.

Covered in Part 2.

2. Schema-driven, not hard-coded

Validation logic, permission checking, and model documentation all derive from the same Zod schema. One definition — no drift.

The discipline: never maintain separate schemas for validation and documentation. They will diverge. When they do, the model starts hallucinating input formats. The schema is the single source of truth or it’s not the source of truth at all.

Covered in Part 3.

3. Progressive permissions with a clear winner

Four stages, in order. Each stage can short-circuit. And the rule that never bends: deny always wins over allow, regardless of source or order.

Fail-safe, not fail-stop: invalid input routes to “ask the user,” not a crash. The system stays safe even when something unexpected happens.

Covered in Part 4.

4. Layered config with defined merge semantics

Six layers, ascending priority. The nuance that makes it work: arrays concatenate and deduplicate across layers (allow-lists, hook lists, permissions); scalars shadow (model, temperature, timeout).

The failure mode when you skip this: team settings stomp user preferences, or personal machine paths leak into shared config. Both erode trust fast.

Covered in Part 5.

5. Memory is a clue, not a conclusion

Store only what cannot be derived from current project state at runtime. Treat stored memories as signals that warrant verification — not as ground truth.

Trust “why” memories directly (they record decisions). Verify “what” memories against current state (file paths go stale; configurations change). The failure mode of treating memory as fact is an agent that confidently acts on outdated information.

Covered in Part 6.

6. Compress proactively, not reactively

Context management done wrong waits until overflow, then panics. Done right, it compresses at natural milestones — task boundaries, before a new major phase — while there’s still enough working memory to make intelligent compression decisions.

The circuit breaker is non-optional: three consecutive compression failures stop further attempts. Without it, a broken API state generates thousands of wasted calls before the session terminates.

Covered in Part 7.

7. Extension without forking

The hook system is how Claude Code lets operators customize behavior across 26+ lifecycle events without modifying core code. The architecture: events fire at known points; hooks subscribe; hooks output structured JSON; both the JSON and the exit code are read.

Start with Command hooks (shell scripts). Reach for Prompt hooks only when script logic is genuinely insufficient. Never use Prompt hooks for decisions a grep can make.

Covered in Part 8.

8. Minimum necessary context and tools for sub-agents

Every sub-agent gets the minimum context and minimum tool set needed for its task. No sub-agent sees the full conversation history. No sub-agent gets write tools if it only needs to read.

The depth limit (≤3 levels) is enforced in code, not by convention. Depth limits enforced by convention are not enforced.

Covered in Part 9.

9. Streaming first, everywhere

AsyncGenerator from the loop all the way to the UI. Every component is incremental and cancellable.

The failure mode of batching: users see a blank screen for ten seconds, then everything at once. Streaming makes agents feel fast even when they’re doing real work. It also makes them cancellable at any point — which is critical for cost control in production.

Covered in Part 10.

10. Read before you write

Plan Mode is not a UX suggestion. It’s an enforcement mechanism: write tools are denied at the permission pipeline level during the planning phase. The agent literally cannot act until you approve.

The principle extends to anything with significant blast radius. The cost of exploration is nearly zero. The cost of a wrong first move in a multi-file refactor can be hours of recovery work.

Covered in Part 11.

The Smart Way to Start: The Agent Harness Kit

Knowing the principles is one thing. Starting a new project from a blank file and applying them correctly is another.

The agent-harness-kit is a portable spec and skill set built directly from this series. It contains:

SPEC.md — Design rules, anti-patterns, and per-component checklists for all 10 components. Load it into any AI coding assistant before designing or building agent infrastructure.
Skills for Claude Code, Gemini, and Codex — Pre-configured project instructions that load the spec automatically when you describe an agent design problem.

agent-harness-kit/
├── SPEC.md                        Design rules + checklists for 10 components
└── skills/
    ├── claude/
    │   ├── SKILL.md               Claude Code skill (/build-agent)
    │   └── CLAUDE.md              Add to your project's CLAUDE.md
    ├── gemini/
    │   └── GEMINI.md              Drop into project root
    └── codex/
        └── system-prompt.md       Paste as system or project instructions

Two workflows it supports:

Starting a new harness: Describe your agent to the AI — what it does, what tools it needs, what side effects it has. The skill walks you through the six components in dependency order, applying spec rules at each step.

Auditing existing agent code: Say “audit this codebase against the agent harness spec.” The AI produces a gap report: compliant / partial / missing / anti-pattern for each component. Useful before a production launch or after inheriting someone else’s agent infrastructure.

The kit will be available at github.com/sherlockliu/agent-harness-kit.

What This Series Has Actually Been About

Most agent systems fail at the seams — between the loop and the tool system, between the permission check and the context state, between what the model thinks is true and what’s actually on disk.

Claude Code’s architecture doesn’t prevent those failures by being clever. It prevents them by being deliberate: every boundary is defined, every failure mode is named, every component has a clear contract with every other component.

The twelve posts in this series were an attempt to make that deliberateness legible. Not to produce a new framework, but to show the reasoning behind specific decisions — so you can apply the reasoning to your own system, in your own language, with your own constraints.

LLMs are text generators by default. Agent harnesses are what make them autonomous. The difference is engineering.

Your agent is the goal.

References

Series posts

External references

Building Effective Agents — Anthropic Research
Dive into Claude Code: Design Space of AI Agent Systems — arxiv
LLM Powered Autonomous Agents — Lilian Weng

Plan Mode: The Architecture of Thinking Before Acting (Part 11)

2026-04-23T00:00:00+01:00

Series: The Agent Harness — Part 11 of 12

The most expensive agent mistake is acting on a misunderstood requirement. By the time you discover the misunderstanding — modified files, failed tests, broken pipelines — the cost of correction is high. The mistake didn’t happen because the agent was bad at coding. It happened because the agent started coding before understanding the problem.

This is the Premature Action failure mode. It’s common. It’s expensive. And it has an architectural solution.

Plan Mode separates agent behavior into two phases: a read-only exploration phase where the agent understands the problem, and an execution phase where it acts on that understanding. The key insight is that during the planning phase, there are no side effects — no files modified, no commands run. The cost of discovering and correcting a misunderstanding is zero.

Part 10 covered streaming performance. This post covers the planning architecture that shapes when agents act.

The Problem: Premature Action

Without a planning phase, an autonomous agent faces a dilemma on complex tasks: act immediately (high risk of misunderstanding) or re-read the same files in every turn without committing to a direction (inefficient).

The table below shows what happens with and without Plan Mode:

Scenario	Without Plan Mode	With Plan Mode
Misunderstood requirements	Implemented wrong feature; needs rollback	Discovered misunderstanding in read-only phase; zero-cost correction
Ignored existing patterns	Code inconsistent with project style	Explored patterns first; implementation matches
Poor solution choice	Implemented slow approach; needs rewrite	Compared solutions before acting
Missed edge cases	Found post-implementation; rework	Enumerated in plan; incorporated before acting

Every row in that table describes a real kind of agent failure. Plan Mode doesn’t prevent all of them — but it catches the ones that stem from acting without understanding.

The Mode Switch: How Read-Only Becomes Enforced

Plan Mode isn’t a suggestion. It’s an enforced permission mode change.

When the agent enters Plan Mode:

The current permission mode is saved to prePlanMode
The permission context switches to plan mode
In plan mode, Write tools return deny from the permission pipeline (Stage 3: checkPermissions)
The agent receives a clear behavioral instruction set

In plan mode, you should:
Thoroughly explore the codebase to understand existing patterns
Identify similar features and architectural approaches
Consider multiple approaches and their trade-offs
Use AskUserQuestion if you need to clarify the approach
Design a concrete implementation strategy
When ready, use ExitPlanMode to present your plan for approval

The six-step sequence has a structure: steps 1–2 are divergent (broad exploration), steps 3–4 are convergent transition (focused analysis, open to questions), steps 5–6 are fully convergent (concrete plan, ready to present). The instructions encode a cognitive model, not just a to-do list.

The Sub-Agent Constraint

Sub-agents cannot enter Plan Mode. This is an architectural constraint, not a policy choice.

The reason: Plan Mode requires the user to review and approve a plan before execution begins. If a sub-agent enters Plan Mode, it blocks waiting for user approval — but the user may not know the sub-agent exists, and may not be watching for approval requests from nested agents. The entire parent agent’s execution stalls on an invisible approval request.

The constraint is enforced in EnterPlanModeTool.call(): the first check is whether the call is in an agent context. If it is, the tool throws an error immediately. Plan Mode is only for the main conversation.

Exiting Plan Mode: The Approval Gate

The exit is more complex than the entrance. ExitPlanModeV2 handles several scenarios.

Mode restoration with circuit breaker. The saved prePlanMode value is read and restored. But there’s a guard: if prePlanMode was auto, the system checks whether auto mode’s gate is currently open. If it was closed during the planning phase (due to a circuit breaker trigger or policy change), the system falls back to default mode.

Why? There’s a time window between entering and exiting Plan Mode. The state that allowed auto mode at entry may no longer be valid at exit. Restoring to auto mode blindly would bypass security controls that were activated in the interim.

prePlanMode = "auto"
  ↓
Is auto mode gate open?
  Yes → Restore to auto mode
  No  → Fall back to default mode (security takes priority)

The approval UI. After presenting the plan, the user reviews it. If the plan looks correct, they approve — the agent switches back to execution mode and begins implementation. If not, they request changes — the agent remains in Plan Mode for another round of exploration.

This is the human-in-the-loop moment. Not at every tool call (that’s the default permission mode), not never (that’s bypass mode) — but at the right moment: when the full plan is visible and the user can make an informed judgment.

Plan-Execute Workflow in Practice

A concrete example: adding pagination to a REST API.

Exploration phase (Plan Mode):

Tool calls (all read-only):
Glob("src/routes/*.ts")           ← discover route files
Glob("src/models/*.ts")           ← discover model files
Grep("limit|offset|page|cursor")  ← existing pagination patterns
Read("src/routes/users.ts")       ← typical route implementation
Read("src/middleware/validate.ts") ← validation patterns
Grep("interface.*Response")       ← response type definitions

Discoveries: Express + TypeScript, 12 route files, no existing pagination, Zod validation middleware, Prisma ORM.

Analysis phase (still Plan Mode):

The agent compares offset pagination (simple, worse at scale) vs cursor pagination (complex, better at scale), considers the current project’s scale, and selects an approach.

Plan presentation (ExitPlanMode):

## Pagination Implementation Plan

Solution: Offset pagination (project scale doesn't warrant cursor complexity)

Steps:
1. Create src/types/pagination.ts — type definitions
2. Create src/middleware/pagination.ts — parameter parsing
3. Modify src/routes/users.ts — first route implementation
4. Add Zod validation — limit (1–100), offset (>=0)
5. Update ApiResponse type — pagination metadata

Files affected: 2 new, 3 modified
Risk: Low — additive change, no modification to existing functionality

User approves. Execution phase begins.

Zero side effects occurred during the exploration and planning. The agent now acts with full context.

Background Scheduling: Cron and Remote Triggers

Plan Mode handles interactive planning within a session. But agent harnesses also need to schedule tasks that run without user interaction: nightly code reviews, periodic health checks, automated report generation.

Claude Code supports two scheduling mechanisms:

Cron jobs (CronCreate): Session-scoped recurring prompts. Standard five-field cron syntax in local timezone. Jobs fire when the REPL is idle. A 7-day auto-expiry prevents zombie jobs from accumulating.

"Run smoke tests every morning at 9"
→ CronCreate: cron="57 8 * * *", recurring=true

"Remind me to check the deploy in 30 minutes"
→ CronCreate: cron="", recurring=false

Note the off-by-a-minute pattern: 57 8 instead of 0 9. When many users ask for “9am,” all their jobs land at the same API timestamp. Offset by a few minutes reduces thundering herd.

Remote triggers: Long-lived triggers that persist beyond the session. Configured via API, they can fire on external events or remote schedules. Useful for CI/CD integration: trigger an agent run when a PR is opened, a deploy completes, or a monitoring alert fires.

The integration point with Plan Mode: a scheduled agent can be configured to run in Plan Mode, surfacing a plan for human review before any destructive operations execute. This combines autonomous scheduling with mandatory oversight for high-risk operations.

Two User Models: External vs. Internal

An interesting design detail from the source: Plan Mode presents different behavioral guidance to different user types.

For external users, the system encourages Plan Mode: “For implementation tasks, consider using Plan Mode first.” Safety and alignment take priority.

For internal (Anthropic) users, the guidance is more direct: “Start working immediately; clarify through questions when in doubt.” Efficiency and fluency take priority.

This reflects a genuine trade-off. Plan Mode adds overhead — an extra exploration phase, an approval step. For users who deeply trust the agent and work at speed, that overhead isn’t worth it. For users who are still building trust in the agent’s behavior, the overhead is entirely worth it.

The lesson for harness builders: one mode doesn’t fit all users. Build the planning pattern for the use case, then tune the defaults for your audience.

Key Takeaways

Premature Action is the most expensive agent failure mode. It stems from acting before understanding. Plan Mode’s architectural solution is a read-only phase where exploration has no cost.
Mode switch is enforced, not advisory. In plan mode, Write tools return deny from the permission pipeline. The constraint is structural, not just instructional.
Sub-agents cannot enter Plan Mode. A nested plan approval request would block the parent agent invisibly. Plan Mode is main-conversation only.
Exit with circuit breaker. If prePlanMode was auto and the auto-mode gate closed during the planning phase, fall back to default. Don’t bypass controls that activated mid-session.
The six-step planning sequence encodes a cognitive model: diverge (explore) → converge-transition (analyze) → fully converge (present plan).
Cron scheduling provides session-scoped recurring tasks. Remote triggers provide persistent external-event-driven invocations. Both integrate with Plan Mode for human-in-the-loop oversight.

What’s Next

In Part 12: Build Your Own Agent Harness — The Practical Blueprint, we synthesize the series into a practical guide:

The decision flowchart: when to use a simple API call vs. function calling vs. a full harness
Six-step implementation roadmap: dialog loop → tools → permissions → context → memory → hooks
Pseudocode skeleton for the minimal viable harness
Production readiness checklist
Framework comparison: build-your-own vs. LangGraph, CrewAI, AutoGen

References

Planning and workflow patterns

Building Effective Agents — Anthropic Research
Harness Design for Long-Running Applications — Anthropic Engineering
Claude Code Common Workflows — Official docs

Architecture analysis

Dive into Claude Code: Design Space of AI Agent Systems — arxiv
12 Agentic Harness Patterns from Claude Code — Generative Programmer
Inside Claude Code: Architecture Behind Tools, Memory, Hooks, and MCP — Penligent

Streaming Architecture: Building Agents That Feel Fast (Part 10)

2026-04-21T00:00:00+01:00

Series: The Agent Harness — Part 10 of 12

An agent can be architecturally correct — proper permission pipeline, solid memory system, working context compression — and still feel unusably slow. The problem isn’t correctness, it’s latency perception.

The gap between “instant response” and “waiting to load” is usually measured in how early the agent starts showing output, not how quickly the underlying computation finishes. Streaming is what closes that gap. And streaming isn’t just a UI feature you add at the end — it’s an architectural constraint that shapes how every component is built.

Part 9 covered multi-agent orchestration. This post covers the performance architecture those agents run on.

QueryEngine: The Session State Owner

Most harness implementations pass session state through function parameters: the message list, the abort controller, the file cache. This works until it doesn’t — every new state field requires updating all function signatures across the call chain.

Claude Code’s solution is a class: QueryEngine. One session, one instance. State lives as instance properties. Adding a new field requires only updating the constructor, not every function that touches session state.

class QueryEngine {
  private messages: Message[]
  private abortController: AbortController
  private deniedPermissions: Set<string>
  private usage: TokenUsage
  private fileStateCache: Map<string, FileState>
  private discoveredSkills: Set<string>

  async *submitMessage(input: string): AsyncGenerator<StreamEvent> {
    // Each call starts a new turn; state persists between turns
  }
}

Single ownership matters in concurrent scenarios. If multiple components simultaneously read from and write to a shared message list, messages can arrive out of order or get processed twice. The class provides a natural mutual exclusion boundary: all state modifications go through one owner.

submitMessage is an AsyncGenerator — callers consume events one at a time without waiting for the turn to complete. The UI renders each token as it arrives. Tool results surface immediately. The user sees progress.

Streaming vs. Non-Streaming: The Real Performance Difference

The performance argument for streaming isn’t about raw computation time. It’s about when work starts.

Consider a model response that triggers three tool calls over 5 seconds:

Strategy	Second 1	Second 2	Second 3–5	Finish
Non-streaming	Waiting	Waiting	Waiting	All tools start → complete
Streaming	Tool 1 starts	Tool 2 starts	Tool 3 starts	Tools complete during model output

In streaming mode, Tool 1 starts executing at second 1. By the time the model finishes generating at second 5, the tools may already be done. Non-streaming mode waits 5 seconds for the complete response, then begins tool execution.

The latency difference is the model’s entire generation time. For complex multi-tool turns, that’s meaningful.

Streaming also means the user sees partial output immediately. A tool response that takes 2 seconds to stream feels faster than one that dumps 2 seconds of accumulated output at once.

Streaming Processing: Token by Token

The API returns streaming events: message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop.

The system processes each event as it arrives:

message_start → reset usage counters for the new message
content_block_start with type tool_use → immediately prepare tool execution context
content_block_delta → append to incremental buffer, attempt incremental JSON parsing
content_block_stop → hand completed tool call to StreamingToolExecutor
message_delta → accumulate token usage

The key moment is content_block_start with tool_use. The system doesn’t wait for content_block_stop to prepare. It pre-looks up tool definitions and permission contexts as soon as it knows a tool call is coming. By the time the parameters arrive, setup is already done.

Incremental JSON Parsing

Tool parameters are JSON, but they arrive character by character in streaming. Traditional JSON.parse() requires a complete string. The harness maintains an accumulation buffer, appending each delta, and attempts parsing at key boundary events.

Streaming arrives: {"path": "/src/ind
Buffer:            {"path": "/src/ind   ← not valid JSON yet
                   {"path": "/src/index.ts"}  ← valid at content_block_stop

Heavy computation belongs at boundary events (content_block_stop), not on every delta. A delta may contain one or two tokens. Parsing overhead on every delta costs more than it saves.

StreamingToolExecutor: Execute on Arrival

StreamingToolExecutor is the component that executes tools immediately as their parameter blocks complete, rather than waiting for the entire model response.

Each tool tracked by the executor passes through four states:

queued → executing → completed → yielded

When a new tool call completes its parameter block, it enters queued and immediately triggers execution logic. Whether it can execute depends on one rule:

A tool can execute if and only if: no tools are currently executing, OR all currently executing tools AND the new tool are concurrency-safe.

Non-concurrency-safe tools execute exclusively — nothing runs in parallel with them.

Safe vs. Unsafe: The Concurrency Matrix

Tool Class	Concurrency Safe	Reason
Read, Grep, Glob, Search	Yes	Read-only, no side effects
Bash, Edit, Write	No	Side effects; may conflict

Read-only tools parallelize freely. Write tools serialize.

Why not do fine-grained dependency analysis between Bash commands? Theoretically, echo hello and echo world could run in parallel while mkdir foo && echo bar > foo/file.txt has a dependency. But parsing shell semantics reliably is expensive and error-prone. The conservative rule — Bash is always unsafe — is simpler, more maintainable, and the extra second of serialization is rarely noticeable.

Order Guarantee

Results are always emitted in request order, regardless of execution order. A faster tool (completing at state completed) waits until all previous tools have been yielded before its result is forwarded.

This matters for the conversation history: tool results must appear in the same order as the tool calls that produced them. If Bash Tool 3 completes before Read Tool 1, Tool 3 waits in completed state until Tools 1 and 2 have been yielded.

Sibling Abort on Bash Failure

When a Bash command fails, all other parallel tools (siblings) are cancelled. This prevents cascading issues where later steps depend on a failed earlier step. Bash is the primary execution primitive; its failure usually means the overall plan is wrong, not just one step.

Startup Performance: Parallel Prefetch and Lazy Load

Response latency during a conversation is the primary performance metric. But startup latency matters too — a CLI tool that takes 3 seconds to start feels broken.

Claude Code handles both:

Parallel prefetching: Tools, skills, and MCP servers are initialized in parallel at startup. Independent initializations don’t wait for each other. The expensive operations (spawning MCP server processes, loading skill files) happen concurrently.

Lazy loading: The ToolSearchTool (Part 3) allows the agent to discover tools on demand rather than loading all tool schemas upfront. Sending 50 tool definitions to the model costs tokens every turn. Lazy discovery means only the tools currently needed are included in the request.

deferred loading: Some tools are registered but not loaded until first use. The initialization cost is spread across the session rather than frontloaded.

Prompt Cache Strategy

The Anthropic API’s prompt cache is byte-prefix matching — if consecutive requests share the same prefix, the cached prefix is reused, saving input token costs and latency.

Three rules for cache-stable requests:

1. Stable system prompt prefix. The system prompt should not change between turns within a session. Dynamic elements (current time, session ID) should go at the end of the system prompt, not the beginning. A change at byte position N invalidates the cache for everything from position N onward.

2. Consistent tool definitions. Tool schemas included in the API request are part of the cache key. Tools should not appear/disappear between turns unless necessary. This is why the Fork pattern (Part 9) passes exact tool bytes to sub-agents rather than reconstructing.

3. Message history order. The conversation history prefix is part of the cache key. Don’t reorder messages between turns (they shouldn’t be reordered anyway — this is a hygiene note).

Cache hits dramatically reduce turn latency. A 30,000-token system prompt that costs $0.30 at standard input rates costs $0.008 at cache rates. For heavy users, this compounds across dozens of turns per session.

Key Takeaways

Streaming is an architectural choice, not a UI feature. It shapes every component: the loop abstraction, tool execution timing, event types, buffer management.
QueryEngine owns session state as a class. Single ownership prevents concurrent state corruption. submitMessage is an AsyncGenerator — callers consume events immediately.
Execute on arrival: StreamingToolExecutor starts tool execution as soon as parameter blocks complete, not when the entire model response arrives.
Concurrency safety is binary: read-only tools parallelize, write tools serialize. Conservative simplicity over fragile dependency analysis.
Results yield in request order regardless of completion order. Faster tools wait in completed state.
Sibling abort cancels all parallel tools when a Bash command fails — prevents cascading from a broken plan.
Cache stability requires stable system prompt prefix, consistent tool definitions, and stable message history ordering.

What’s Next

In Part 11: Plan Mode — The Architecture of Thinking Before Acting, we cover the planning system:

Why autonomous agents need a “thinking space” before acting
The mode switch mechanism: how read-only becomes the enforced constraint
The six-step planning workflow the model follows
The approval gate: where human-in-the-loop belongs in an autonomous system
Background scheduling for long-running workflows

References

Streaming and performance

Building Effective Agents — Anthropic Research
Master the Claude API for Streaming and Tool Use — n1n.ai
Claude Code Common Workflows — Official docs

Architecture analysis

Sub-Agents, Coordinators, and Skills: Multi-Agent Orchestration (Part 9)

2026-04-19T00:00:00+01:00

Series: The Agent Harness — Part 9 of 12

A single agent hits two kinds of ceilings: capability ceilings (it doesn’t have the right tools for a sub-problem) and context ceilings (the task is too large to fit in one conversation). Multi-agent architectures solve both — but introduce coordination problems that are worse than the original problem if you don’t design them carefully.

This post covers the full multi-agent stack: spawning sub-agents that share context efficiently (Fork pattern), orchestrating specialist workers via a dedicated coordinator (Coordinator pattern), packaging reusable behaviors as skills, and connecting to external tool ecosystems via MCP.

Part 8 covered the hook system. This post covers multi-agent orchestration built on top of it.

Four Built-In Agent Types: Specialist Design

Before discussing orchestration patterns, understand what you’re orchestrating. Claude Code ships four built-in agent types — each a specialist with specific capability constraints.

Explore: Read-Only Code Archaeology

The Explore agent is built for speed and safety. Two design decisions define it:

Dual-lock read-only enforcement: The system prompt prohibits file modifications and the tool list physically excludes Edit, Write, and similar tools. Soft constraint (prompt) plus hard constraint (tool unavailability). Even if the model hallucinates a desire to modify a file, it can’t — the tool doesn’t exist in its tool set.

Token optimization: Explore omits CLAUDE.md. CLAUDE.md typically contains coding conventions, commit message formats, PR templates — completely useless to a search agent. Omitting it reduces token consumption and noise, letting the model focus on search. Estimated savings: 5–15 billion tokens per week across the user base.

Best for: finding where something is defined, tracing call chains, understanding dependencies, mapping project structure.

Plan: Software Architect

Plan reuses Explore’s read-only toolset but plays a different role. Its output is structured: implementation steps in priority order, key files needing modification, risk assessment, dependency mapping between steps.

The architectural insight: Plan omits CLAUDE.md not because it’s irrelevant, but because it shouldn’t influence planning. Planning is about structure; implementation conventions are execution details. Let the planner focus on “what to do,” not “how to name things.”

General Purpose: Default Executor

Full tool access. No preset restrictions. The security boundary is entirely the global permission layer. The design philosophy: “trust by default, push the boundary to the perimeter.” Maximum flexibility for the agent, maximum responsibility for the harness.

Anti-pattern: using General Purpose for read-only tasks. Use Explore instead — cheaper model, no CLAUDE.md noise, no accidental-modification risk.

Verification: Adversarial Tester

The Verification agent is explicitly designed to break the code being verified. Red background in the UI emphasizes its adversarial role. It always runs in the background (doesn’t block the main agent), cannot modify project files, and is prohibited from verbal confirmation — it must actually run tests.

The system prompt explicitly warns against two failure modes:

Verification avoidance: “The code looks correct” without running tests
Surface correctness trap: Passing happy-path tests while missing boundary conditions, concurrency issues, or error paths

This is red team methodology applied to agent verification: don’t confirm it works, try to make it fail.

Why background? Three reasons: users don’t need real-time visibility into the verification process; background mode frees the main thread; isolation prevents verification from being interrupted by user input.

The Fork Pattern: Cache-Safe Parallel Execution

When the main agent needs to delegate multiple independent sub-tasks, the naive approach sends each sub-agent a full copy of the conversation history. At 50,000 tokens of history, three sub-agents cost 150,000 tokens just for the prefix. That’s expensive and slow.

The Fork pattern eliminates this redundancy by leveraging the API’s prompt cache.

The API’s prompt cache is byte-prefix matching. Two requests share a cache when their inputs are identical up to a prefix. The Fork pattern exploits this: all Fork sub-agents share the same conversation history prefix.

The message structure for a forked sub-agent:

[...conversation history]           ← shared prefix (hits cache)
[assistant turn with tool_use blocks]  ← shared (same for all forks)
[user turn with placeholder results]   ← shared fixed string: "Fork started -- processing in background"
[sub-task directive]                ← unique per fork

Only the final directive differs between sub-agents. Everything else is identical byte-for-byte, maximizing cache hits.

The token math:

Traditional sub-agents (no cache):
  3 sub-agents × 62,000 tokens each = 186,000 input tokens

Fork sub-agents (with cache):
  Shared prefix:    62,000 tokens (established by first request)
  3 × directive:    3 × 200 tokens = 600 new tokens
  Total:            62,600 tokens

Savings: ~66%

At scale (dozens of fork calls per session), the savings compound significantly.

The Byte-Level Cache Requirement

Cache matching is byte-precise, not semantic. One extra space invalidates the match. This is why the Fork pattern passes the raw rendered bytes of the parent agent’s system prompt to sub-agents rather than reconstructing it. Reconstruction could produce byte-level differences (whitespace, attribute ordering) that break cache matching even when the content is logically identical.

Five dimensions must match exactly:

System prompt (rendered bytes)
User context (CLAUDE.md content)
System context
Tool definitions + model selection
Conversation history prefix

This also explains why the Fork pattern uses useExactTools — it reuses the parent’s tool pool directly rather than re-resolving, maintaining byte-level tool definition consistency.

Recursive Fork Protection

Fork sub-agents retain the Agent tool to keep tool definitions cache-consistent. This creates a risk: sub-agents forking their own sub-agents, causing exponential resource growth.

Protection is dual-layer:

querySource marker (primary): A runtime marker in the fork context that identifies “I was forked.” It’s outside the conversation history and survives context compression.
Message scanning (fallback): Detects fork directive tags in edge cases where the querySource marker wasn’t preserved.

The fork directive also explicitly states behavioral norms: “You are a Fork worker, not the main agent. You are prohibited from generating sub-agents.”

The Coordinator Pattern: Centralized Orchestration

The Fork pattern is peer parallelism: equal agents sharing context, each running independently. The Coordinator pattern is centralized orchestration: one agent manages all the others.

Think of it as construction: the Fork pattern is a crew where everyone knows the blueprint and works independently. The Coordinator pattern is a project manager who assigns tasks, tracks dependencies, handles blocked workers, and manages shared resources.

The Coordinator’s Tool Set

The coordinator has exactly four tools: Agent (spawn a worker), TaskStop (stop a worker), SendMessage (communicate with a worker), and a structured output tool. It has no Read, Write, Edit, or Bash — it cannot do work itself.

Coordinator tools:   Agent, TaskStop, SendMessage, StructuredOutput
Worker tools:        Read, Write, Edit, Bash, Grep, Glob, WebSearch, Skill, MCP

This separation is strict. The coordinator manages. Workers execute. The coordinator never inspects a worker’s results through another worker (information chain decay) — it receives results directly.

Coordinator vs. Fork: When to Use Each

Dimension	Fork	Coordinator
Structure	Centerless, peer agents	Centralized, hierarchical
Use case	Same context, independent parallel tasks	Complex task decomposition, dependencies
State management	Each fork independent	Coordinator tracks global state
Communication	None between forks	Coordinator mediates all communication
Overhead	Low (lightweight)	Higher (dedicated coordinator process)
Debugging	Simple	More complex

Fork when: you need to run the same type of task against multiple targets in parallel. Coordinator when: you have a complex pipeline where workers have dependencies, shared resources, or require dynamic task reassignment.

Skills: Packaged Reusable Behaviors

Beyond tools (single operations) and agents (full conversations), the harness needs a middle layer: reusable prompt templates that can be invoked like commands. That’s the skill system.

Skills are Markdown files with YAML frontmatter:

---
name: security-audit
description: Analyze security vulnerabilities in code
tools: [Bash, Read, Grep, Glob]
disallowedTools: [Write]
model: haiku
background: true
---

You are a code security audit expert. Analyze the provided code for:
1. Common attack vectors (XSS, SQL injection, CSRF)
2. Insecure dependencies
3. Credential handling issues

The frontmatter declares what tools the skill uses, which model, whether it runs in background, and what lifecycle hooks it attaches. The body is the system prompt.

Four-Level Skill Hierarchy

Skills load from five sources, in priority order:

managedSkillsDir    (enterprise policy — highest priority)
userSkillsDir       (~/.claude/skills/ — personal global)
projectSkillsDirs   (.claude/skills/ — team-shared)
additionalDirs      (--add-dir paths)
legacyCommands      (/commands/ directory)

Deduplication uses realpath to resolve symlinks — the same physical file accessed via different paths is not loaded twice.

Built-in Skills

Claude Code ships core built-in skills compiled into the binary: verify, debug, simplify, remember, batch, stuck, update-config. Feature-gated skills (loop, schedule, claude-api) are only registered when the corresponding feature flag is enabled.

Built-in skills that need reference files (like verify) use a lazy singleton extraction pattern: files are compiled into the binary and extracted to a secure temporary directory on first invocation. File writes use O_NOFOLLOW | O_EXCL flags to prevent symlink attacks, with 0o700 directory permissions and 0o600 file permissions.

MCP: The External Capability Protocol

Skills and agents handle packaged behaviors within the harness. MCP (Model Context Protocol) handles connections to external tool ecosystems: databases, filesystems, APIs, IDE integrations, cloud services.

Why a Standard Matters

Without MCP, every AI application needs custom integrations for every external tool. A database vendor would need separate adapters for Claude, ChatGPT, Cursor, and every other AI tool. MCP is the USB-C standard for AI tool connectivity: implement an MCP server once, work with every MCP-compatible client.

MCP follows three design principles:

Protocol as contract: Servers declare capabilities; clients discover them via standardized requests
Transport agnostic: Same server protocol over stdio, HTTP, WebSocket, or in-process calls
Security by design: Default distrust, permission checks at every layer

Eight Transport Protocols

Protocol	Best For
`stdio`	Local development tools, filesystem access, CLI wrappers — lowest latency, natural process isolation
`sse`	Remote HTTP services, cloud-deployed MCP servers
`http`	Streaming HTTP responses (new MCP spec)
`ws`	Real-time bidirectional communication
`sse-ide` / `ws-ide`	IDE extension integration
`sdk`	In-process calls, near-zero overhead
`claudeai-proxy`	Claude.ai platform

For local tools: stdio. For remote services: sse or http. For IDE extensions: sse-ide or ws-ide. For SDK embedding: sdk.

MCP Tools Are First-Class Citizens

Once an MCP server connects, its tools are mapped to native Claude Code tool objects. They enter the same four-stage permission pipeline, participate in the same concurrency scheduling, and can be intercepted by PreToolUse hooks — identical to built-in tools.

This is the power of the tool abstraction (Part 3): new capabilities can be added without changing the core execution engine. MCP tools are registered, not special-cased.

Seven Configuration Scopes

MCP servers can be configured at seven levels, following the same priority hierarchy as the rest of the configuration system (Part 5): managed policy → local → user → project → command-line → agent-specific → programmatic. Higher scopes override lower ones for the same server name.

The security implication: projectSettings is excluded from write access to the memory path (same as general config), preventing a malicious repository from redirecting MCP operations to sensitive locations.

The Capability Hierarchy

Put it together: four layers of capability, each building on the last.

Tool         → Single operation (Read, Bash, Grep)
Skill        → Reusable prompt template (security-audit, verify)
Agent        → Specialized autonomous sub-agent (Explore, Verify)
MCP Server   → External ecosystem connection (GitHub, databases, cloud services)

The harness architect’s job is to know which layer to use for each capability requirement. Tools for granular operations. Skills for repeatable workflows. Agents for specialized autonomous tasks. MCP for external ecosystem integration.

Key Takeaways

Four built-in agents cover the software engineering workflow: Explore (read-only search), Plan (architecture), General (execution), Verify (adversarial testing). Constraints are enforced at both prompt and tool levels.
Fork pattern shares conversation context via byte-level API cache matching. All forks share a common prefix; only the final directive differs. ~66% token savings at typical conversation lengths.
Cache requires byte consistency. Pass rendered bytes, not reconstructed content. Use useExactTools to maintain tool definition consistency across forks.
Coordinator pattern uses a dedicated orchestrator with only management tools (Agent, TaskStop, SendMessage). Workers have execution tools. The coordinator receives results directly — no worker-inspecting-worker chains.
Skills are reusable Markdown prompt templates, loadable from five sources with priority ordering. Built-ins are compiled into the binary with secure lazy extraction.
MCP is the external capability protocol — implement once, work with all MCP clients. MCP tools are first-class: same permission pipeline, same concurrency scheduling, same hook interception as built-in tools.

What’s Next

In Part 10: Streaming Architecture — Building Agents That Feel Fast, we cover the performance layer:

QueryEngine as the session state owner: why a class beats function parameters
How the StreamingToolExecutor executes tools as parameter tokens arrive
Concurrency safety: the rules governing which tools can run in parallel
Startup performance: parallel prefetching and lazy loading
Prompt caching strategy: how to build requests that reliably hit the cache

References

Multi-agent systems

Building Effective Agents — Anthropic Research
Harness Design for Long-Running Applications — Anthropic Engineering
Claude Code Overview — Official docs

MCP and protocols

Model Context Protocol — MCP Specification

Architecture analysis

Inside Claude Code: Architecture Behind Tools, Memory, Hooks, and MCP — Penligent
Dive into Claude Code: Design Space of AI Agent Systems — arxiv
12 Agentic Harness Patterns from Claude Code — Generative Programmer

The Hook System: Extension Points That Don’t Break the Core (Part 8)

2026-04-17T00:00:00+01:00

Series: The Agent Harness — Part 8 of 12

The permission pipeline (Part 4) answers: can the agent do this? The configuration system (Part 5) answers: how is the agent configured? But neither answers: what should happen immediately before and after every meaningful agent action?

That’s the hook system’s job.

A hook is a piece of custom logic — a shell command, an LLM call, a webhook — that attaches to a lifecycle event and runs without modifying the agent’s core. A team’s security requirements are different from a CI pipeline’s. An enterprise’s audit needs are different from an individual developer’s. The hook system is how you satisfy all of them from the same codebase.

The design pattern is Observer + Chain of Responsibility: each lifecycle event is a signal, multiple hooks can subscribe to it, they fire in priority order, and any hook can block signal propagation.

Part 7 covered context compression. This post covers how to extend agent behavior at lifecycle boundaries.

Five Hook Types: Choosing the Right Execution Engine

Not every hook scenario has the same latency budget or capability requirement. Claude Code defines five hook types, each with a different execution model.

Command Hook: The Default Choice

Shell execution. Runs synchronously by default (blocks until complete). Supports custom timeout, a status message shown to users while running, and an once flag for one-shot initialization tasks.

{
  "hooks": {
    "PreToolUse": [{
      "matcher": "Bash",
      "hooks": [{
        "type": "command",
        "command": "python3 scripts/validate_command.py",
        "timeout": 5000,
        "message": "Validating bash command safety..."
      }]
    }]
  }
}

Use Command hooks when: running safety checks, executing linters, calling CLI tools, checking preconditions before operations.

Prompt Hook: When Rules Can’t Express It

Calls an LLM to evaluate the hook input. The placeholder $ARGUMENTS is replaced with the hook’s input JSON. The model returns a structured decision.

{
  "type": "prompt",
  "prompt": "Analyze this file write. If it modifies src/core/, return {\"decision\": \"block\", \"reason\": \"Core module changes require review\"}. Otherwise return {\"decision\": \"approve\"}. Input: $ARGUMENTS"
}

Use Prompt hooks when: the approval decision requires semantic understanding that a regex or script can’t provide. “Is this code modification safe?” is not a question a shell script can answer reliably.

Agent Hook: Multi-Step Validation

Like Prompt, but designed for validation that requires multiple reasoning steps. A code review that needs to read related tests, run them, check coverage, and only then make a decision — that’s an Agent hook.

Use Agent hooks when: the hook itself needs to perform a mini-investigation before reaching a verdict.

HTTP Hook: External System Integration

POSTs the hook input JSON to a configured URL. Supports custom headers and environment variable interpolation via an allowedEnvVars whitelist.

{
  "type": "http",
  "url": "https://audit.internal.company.com/api/log",
  "headers": { "Authorization": "Bearer $AUDIT_TOKEN" },
  "allowedEnvVars": ["AUDIT_TOKEN"]
}

Use HTTP hooks when: audit trails need to land in a SIEM system, approval flows live in external services, CI/CD systems need notification of agent actions.

Security note: allowedEnvVars should contain only the specific variables you need. Never open the whole environment — in multi-user deployments, that’s a credential leak waiting to happen.

Function Hook: Runtime-Only

TypeScript callbacks registered at runtime. Cannot be persisted to configuration files — they exist only for the session. Used for SDK embedding where deep runtime integration is needed.

The reason Function hooks can’t be persisted is architectural: persisting them would mean serializing executable code references to JSON. That’s the boundary between declarative configuration (Command/Prompt/Agent/HTTP) and imperative code (Function). Mixing both in the same config system creates unpredictable behavior and security risks.

Three Execution Modes for Command Hooks

Beyond hook type, Command hooks have three execution modes:

Synchronous (default): Blocks the agent. The operation doesn’t proceed until the hook completes. Use this for pre-approval flows: “check before acting.”

Asynchronous (async: true): Runs in background. The agent continues immediately. Hook results are not visible to the model. Use this for fire-and-forget logging and notifications.

Async-rewake (asyncRewake: true): Runs in background, but if the hook exits with code 2, it injects an error message that wakes the model to continue. Normal exit (0) doesn’t disturb the agent. Use this for long-running monitors: “don’t interrupt me unless something’s wrong.”

The async-rewake pattern is particularly useful for Stop event hooks: monitor conditions in the background and only intervene when the agent is about to stop without finishing its work.

26 Lifecycle Events: The Agent’s Observable Moments

Claude Code defines 26 lifecycle events organized into six categories.

The Tool Call Sandwich: PreToolUse / PostToolUse / PostToolUseFailure

The most-used events. They form a sandwich around every tool execution.

PreToolUse fires before execution. It’s the primary interception point:

Block the operation (decision: "block")
Modify the tool’s input parameters (updatedInput)
Log for audit purposes

Exit code semantics:

0 — silent pass (nothing shown to model)
2 — block the tool call (stderr shown to model)
Other non-zero — warning but continue (stderr shown to user)

PostToolUse fires after success. Carries both the tool’s input and output. Can override MCP tool output via updatedMCPToolOutput.

Tip: PostToolUse hooks should almost always be async. The tool is done; there’s no reason to block the agent’s next action for an audit log write.

PostToolUseFailure fires on failure. Carries error, error_type, is_interrupt, and is_timeout — enough diagnostic data to route to different recovery strategies or monitoring systems.

UserPromptSubmit: The Translation Layer

Fires after user input arrives, before the model sees it. This is your chance to:

Inject context the user didn’t provide (current git branch, project state)
Block messages that trigger quota limits or content policies
Expand brief questions into more complete prompts

{
  "hooks": {
    "UserPromptSubmit": [{
      "hooks": [{
        "type": "command",
        "command": "echo '{\"additionalContext\": \"Branch: '$(git branch --show-current)'. Recent commits: '$(git log --oneline -3)'\"}'",
        "message": "Attaching git context..."
      }]
    }]
  }
}

The additionalContext field injects information into the model’s context without modifying the user’s original message. The user’s input is preserved; the model gets more to work with.

Stop: The Completion Gate

Fires before the agent ends its response. If exit code 2 is returned, the agent continues — the stderr message is injected and the model picks up from there.

This event exists because LLMs sometimes stop before fully completing a task. A completeness check at Stop can detect unfinished items and force continuation:

{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "python3 scripts/check_task_completion.py",
        "asyncRewake": true
      }]
    }]
  }
}

PreCompact / PostCompact: Customizing Compression

PreCompact fires before context compression. Its stdout is appended as custom instructions to the compression prompt — enabling project-specific guidance on what to preserve.

"Preserve all database schema decisions and migration rationale."
"Keep the security review comments from earlier in the session."

This is the escape hatch for AutoCompact’s one-size-fits-all summary. Different projects define “important” differently; PreCompact lets you encode that definition.

Exit code 2 on PreCompact blocks compression entirely — useful when you’re mid-debugging and don’t want the context reorganized.

SessionStart / SessionEnd: Session Bookending

SessionStart fires when the session opens. Its stdout is shown to the model. Blocking errors are ignored — if hooks could prevent session startup, one misconfigured hook would make the system unusable. Core initialization can’t be hijacked by extension logic.

SessionEnd has a 1,500ms hard timeout. It runs during the shutdown sequence; any operation exceeding the limit is forcibly terminated. Keep it lightweight.

The Full Event Table

Event	Category	Blockable	Primary Use
PreToolUse	Tool	Yes	Intercept / modify tool input
PostToolUse	Tool	No	Audit / post-process output
PostToolUseFailure	Tool	No	Failure diagnosis
UserPromptSubmit	User	Yes	Context injection / filtering
Notification	User	No	External notification routing
SessionStart	Session	No*	Environment initialization
SessionEnd	Session	No	Cleanup / session summary
Stop	Session	Yes	Completeness check / force continue
StopFailure	Session	No	API error reporting
SubagentStart	Sub-agent	No	Sub-agent monitoring
SubagentStop	Sub-agent	Yes	Result validation
PreCompact	Compression	Yes	Custom compression instructions
PostCompact	Compression	No	Compression quality check
PermissionRequest	Permission	Yes	Auto-approve flows
PermissionDenied	Permission	No	Alternative suggestions
ConfigChange	Config	Yes	Change auditing
Setup	Init	No	Environment preparation
FileChanged	Environment	No	Cache invalidation
CwdChanged	Environment	No	Directory change notification
InstructionsLoaded	Instructions	No	Instruction audit

*SessionStart blocking is ignored (graceful degradation).

The Structured Response Protocol

A hook doesn’t just run — it communicates a decision. The output is structured JSON:

{
  "decision": "approve",           // or "block"
  "reason": "...",                  // block reason (when blocking)
  "additionalContext": "...",       // injected into model context
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "updatedInput": { ... },        // modified tool input
    "permissionDecision": "allow"   // override permission decision
  }
}

The stdout channel carries unstructured output (shown to users on non-zero exit). The JSON is the structured control channel.

Default behavior when output isn’t valid JSON: continue execution. A malformed hook output silently passes — this prevents a bad hook from accidentally blocking operations.

Exit Codes and JSON Work Together

Both dimensions jointly determine the outcome:

Exit Code	JSON Decision	Result
0	approve or absent	Pass
0	block	Block (JSON takes priority)
2	any	Block, stderr shown to model
Other non-zero	approve	Warning but continue
Other non-zero	block	Block

Don’t let exit codes and JSON express contradictory intents — that’s confusing to maintain and produces unexpected behavior.

Priority Ordering

When multiple hooks fire for the same event, they execute in priority order:

userSettings    (highest — user's global config)
projectSettings
localSettings
pluginHook
builtinHook
sessionHook     (lowest)

User configuration has highest priority. This is the “user sovereignty” principle: your personal security preferences can override what a project or plugin does.

All matching hooks execute — a block decision by one hook doesn’t skip the rest (they just see the blocked state). But the operation is blocked once any hook returns decision: "block" or exits with code 2.

Three-Layer Security Model

Hook configuration is powerful. A PreToolUse hook can execute arbitrary shell commands. A misconfigured or malicious hook is a serious risk. Claude Code gates hook execution through three layers:

Layer 1: disableAllHooks (policySettings)
  → Emergency kill switch. Disables everything.

Layer 2: allowManagedHooksOnly (policySettings)
  → Only enterprise-administrator-configured hooks run.
  → User/project/local hooks are blocked.

Layer 3: Workspace trust check
  → Hooks from untrusted workspaces are blocked.
  → Defense against supply chain attacks via cloned repos.

The workspace trust check is the most important for everyday use. When you clone an open-source project, its .claude/settings.json may contain hooks. Without workspace trust gating, those hooks execute automatically — potentially exfiltrating environment variables on every tool call. Workspace trust requires explicit user consent before any hook from that workspace runs.

This is the same supply chain attack vector described in Part 5 for projectSettings. The defense is the same: explicit trust, not implicit.

Key Takeaways

Hooks attach custom logic to lifecycle events without touching the agent’s core. The patterns are Observer (subscribe to events) + Chain of Responsibility (priority ordering, any hook can block).
Five hook types: Command (shell), Prompt (LLM evaluation), Agent (multi-step), HTTP (webhook), Function (runtime-only). Choose based on latency tolerance and capability need.
Three execution modes for Command: sync (block), async (fire and forget), async-rewake (background with conditional wake).
26 lifecycle events across six categories. The most important: PreToolUse (intercept before), UserPromptSubmit (modify user input), Stop (force continuation), PreCompact (customize compression).
Hook output is structured JSON (decision, updatedInput, additionalContext) plus exit codes. Both channels matter. Keep them consistent.
Priority: userSettings > projectSettings > localSettings > plugin > builtin > session. User configuration wins.
Three-layer security: global disable → managed-hooks-only → workspace trust. Workspace trust is the defense against supply chain attacks from cloned repositories.

What’s Next

In Part 9: Sub-Agents, Coordinators, and Skills — Multi-Agent Orchestration, we cover multi-agent patterns:

The Fork pattern: how sub-agents share prompt cache without wasting tokens
Built-in agent types: Explore, Plan, General, Verification — and their design constraints
The Coordinator pattern: one agent orchestrating many specialists
Skills and plugins: packaged reusable behaviors beyond tools
MCP: the external capability protocol and why a standard matters

References

Hook systems and extensibility

Claude Code Hooks Reference — Official docs
Building Effective Agents — Anthropic Research
Harness Design for Long-Running Applications — Anthropic Engineering

Architecture analysis

Inside Claude Code: Architecture Behind Tools, Memory, Hooks, and MCP — Penligent
Dive into Claude Code: Design Space of AI Agent Systems — arxiv
12 Agentic Harness Patterns from Claude Code — Generative Programmer

Context Management: The Compression Problem (Part 7)

2026-04-15T00:00:00+01:00

Series: The Agent Harness — Part 7 of 12

The context window is the agent’s working memory. Everything the agent knows — conversation history, tool results, intermediate reasoning — has to fit on it at once. And unlike human working memory, it has a hard ceiling.

For short tasks, this isn’t a problem. For long-running autonomous agents — the kind that read dozens of files, run multiple tool chains, and iterate over hundreds of turns — the ceiling is the central engineering problem.

Most frameworks handle this badly: they truncate the oldest messages when you get close to the limit. That works until it doesn’t. You lose the decision that explained why the current approach was chosen. You lose the error that the agent just recovered from. You lose the context that would have prevented the next mistake.

The right solution isn’t truncation. It’s a cascade: try the cheapest intervention first, escalate only when cheaper options are insufficient, and never compress more information than necessary.

Part 6 covered the memory system for cross-session persistence. This post covers context management within a session.

The Effective Window Formula

Before you can manage a context window, you need to know how much space you actually have.

The naive answer is: “whatever the model’s maximum context is.” That’s wrong. The LLM also needs room to output a response. If you fill the context to capacity and ask for a summary, the summary generation itself can fail — there’s no room to produce output.

Claude Code reserves the lesser of the model’s maximum output tokens and 20,000 tokens as a hard output reservation:

Effective Window = Model Window - Reserved Output Tokens

For a 200K token model with 16K max output:

Reserved = min(16,384, 20,000) = 16,384 tokens
Effective = 200,000 - 16,384 = 183,616 tokens

Those 183,616 tokens are your actual budget for conversation history. Plan around that number, not the headline context size.

The Four Warning Thresholds

Claude Code maintains four progressively tighter thresholds based on the effective window:

Zone	Usage Level	Response
Safe zone	0–85%	Normal operation
Warning	~85%	Show yellow indicator
Danger	~90%	Trigger auto-compression
Blocked	~95%	Reject new requests

These aren’t just UI states — they drive actual system behavior. The warning threshold exists so users see the problem while there’s still time to act. The danger threshold triggers compression while there’s still enough room to generate a quality summary. The blocked threshold is the hard stop: if compression has failed and usage is this high, sending more API calls would fail anyway.

The spacing between thresholds matters. There’s a 5% buffer between each level so that the system doesn’t thrash between states if usage is hovering near a boundary.

The Circuit Breaker

Auto-compression requires an LLM call. If the API is down, the network is flaky, or the conversation structure itself is malformed, compression fails. Without a circuit breaker, the system retries on every subsequent turn — making doomed API calls indefinitely.

Claude Code uses a classic circuit breaker pattern with a threshold of three consecutive failures:

CLOSED (normal) → compression fails → counter increments
counter reaches 3 → OPEN (stop attempting)
New session or manual compression success → CLOSED (reset)

Before the circuit breaker existed, Claude Code observed 1,279 sessions with over 50 consecutive compression failures each — some reaching 3,272 consecutive failures. That’s approximately 250,000 wasted API calls per day. After the circuit breaker, cascading failures dropped to zero.

The lesson: Any system that retries on failure without a counter is vulnerable to this class of avalanche. The fix is two lines of state and a threshold check.

The Four-Level Compression Cascade

Compression is not a single operation. It’s a cascade of four levels, each more aggressive than the last. The system tries the cheapest option first and escalates only when cheaper options have already fired.

Level 1: Snip — Zero LLM Cost

Snip replaces old tool result content with a marker: [Old tool result content cleared]. No LLM call. No information synthesis. Just token clearance.

Why replace rather than delete? Because deleting messages breaks the message chain — subsequent turns may reference earlier tool call IDs. The marker preserves structural integrity while freeing the tokens.

Snip is triggered manually (user marks messages as no longer needed) and is the first method tried. After reading 10 large files to analyze an architecture, those file contents are often no longer needed once the analysis is done. Snip reclaims that space immediately.

Level 2: MicroCompact — Time-Triggered Cache Cleanup

MicroCompact fires when a configured time interval has elapsed since the last assistant message. When that threshold is crossed, the server-side prompt cache has already expired — the full context would need to be resent on the next API call anyway. At that point, old tool results are just wasted payload.

MicroCompact keeps the most recent N tool results (configurable, minimum 1) and replaces everything older with the clearance marker.

The time-trigger is elegant: it converts a natural conversation pause into a compression event, at the moment when clearing costs the least (cache was expired anyway).

Compressible tool types: Read, Bash, Grep, Glob, WebSearch, WebFetch, Edit, Write.

Level 3: Collapse — Proactive Context Restructuring

Collapse shifts the philosophy from “react when full” to “restructure before full.” It activates at 90% context utilization (before the danger threshold) and proactively reorganizes the message structure.

The key distinction from Level 4: Collapse is selective. It restructures groups of messages rather than summarizing everything into one flat summary. More original detail survives. This is why it runs at 90% instead of waiting for 95%.

Level 4: AutoCompact — Full LLM Summary

AutoCompact is the final fallback. It calls the LLM to produce a complete conversation summary, replacing the compressed history with a structured document.

The process:

Fire PreCompact hook (user can inject custom compression instructions)
Select compression prompt template (full history, partial from a point, or partial up to a point)
Stream summary generation via a restricted one-turn agent
If the prompt is too long, truncate the oldest API turn group and retry (up to 3 times)
Rebuild context: boundary marker + summary + re-injected attachments
Fire PostCompact hook

The output is always in a fixed order: boundary marker → summary messages → retained messages → attachments → hook results. Consistent ordering matters for the agent to correctly identify what has and hasn’t been compressed.

The Dual-Phase Prompt: Thinking vs. Output

AutoCompact’s compression prompt asks the model to produce two XML blocks:

  [Chain-of-thought: organize thoughts, identify what matters,
   ensure comprehensive coverage before writing the summary]



  ## Goals and Intent
  ## Key Decisions and Changes
  ## Unresolved Issues
  ## File Change Summary
  ... (9 structured sections total)

The block is a scratchpad — it improves summary quality by giving the model a thinking space before committing to the summary. Then it’s discarded before entering the final context. The thinking didn’t need to be remembered; only the result does.

This is the dual-phase compression principle: thinking is the process, the summary is the result. Don’t put the process in the context window.

Without the analysis phase, the model tends to miss things — it writes the summary too quickly without reasoning through the full history. With it, but keeping it in context, you waste tokens on a scratchpad. The discard step is what makes this work.

The CompactBoundaryMessage

After each compression, a CompactBoundaryMessage is inserted into the message stream. It marks the dividing line between pre-compression and post-compression history and carries metadata:

Trigger type: manual or automatic
Pre-compression token count
Number of messages included in the compression
A logicalParentUuid linking it to the last message before compression

Why does the boundary marker matter? Because subsequent compression operations need to know which messages have already been summarized. Without it, you’d re-summarize already-summarized content — a waste at best, confused history at worst.

Post-Compression Token Budget

After AutoCompact, the system re-injects some content back: recent attachments, hook results, skills. Without a budget, this re-injection can trigger another compression immediately.

Hard limits:

Budget	Value	What it protects
Total budget	50,000 tokens	Total re-injection ceiling
Per file	5,000 tokens	Prevents one large file from consuming all budget
Per skill	5,000 tokens	Same protection for skill definitions
Skills subtotal	25,000 tokens	Prevents skill spam
Max files restored	5	Prevents reopening too many files

These limits ensure the conversation doesn’t immediately re-inflate after compression. The common mistake is re-loading all previously-read files after compression. Don’t. Only reload what’s needed for the current task.

Proactive vs. Reactive: When to Compress

The worst time to compress is when the system forces you to. By then, you’re at 90%+ utilization, the model is under token pressure, and the summary will miss things.

The best time is when you decide to — at a natural milestone, with guidance about what matters.

Reactive compression (don't):
  Turn 80: Context fills. Auto-compress fires.
  Summary loses some context. Maybe the critical decision.

Proactive compression (do):
  Turn 50: Analysis phase complete.
  User triggers /compact with: "preserve all database schema decisions"
  Compression is targeted. Important content explicitly preserved.

Three practical patterns:

Phased work: Research → compress → planning → compress → implementation. Each phase starts with a clean context budget.

Manual Snip: After reading files you no longer need, use Snip to clear them before they drain the budget.

Memory + Compression: Before triggering compression, save key decisions to the memory system (Part 6). After compression, the memory is still there. This combination prevents the “we made that decision three hours ago and the summary lost it” problem.

Key Takeaways

Effective window = model window − reserved output tokens. Reserve space for compression output or compression itself can fail.
Four-level cascade: Snip (zero cost) → MicroCompact (time-triggered) → Collapse (proactive restructuring) → AutoCompact (full LLM summary). Try cheap options first; escalate only when necessary.
Circuit breaker: Three consecutive failures → stop attempting. Without it, one broken API session generates thousands of wasted calls.
Dual-phase prompt: scratchpad improves summary quality; discard it before storing. Thinking is the process; the summary is the result.
CompactBoundaryMessage prevents double-compression of already-summarized content.
Post-compression token budget (50K total, 5K/file) prevents immediate re-inflation after compression.
Proactive beats reactive. Compress at milestones with guidance, not when the system forces you to.

What’s Next

In Part 8: The Hook System — Extension Points That Don’t Break the Core, we cover the lifecycle extension system:

26 lifecycle events and which ones are actually worth hooking
Five hook types: when to use a shell command vs. an LLM call vs. a webhook
The structured JSON response protocol — how hooks communicate decisions, not just output
Three-layer security model: global disable, managed-hooks-only, workspace trust

References

Context management

Building Effective Agents — Anthropic Research
Harness Design for Long-Running Applications — Anthropic Engineering
Claude Code Overview — Official docs

Architecture analysis

Dive into Claude Code: Design Space of AI Agent Systems — arxiv
Inside Claude Code: Architecture Behind Tools, Memory, Hooks, and MCP — Penligent
12 Agentic Harness Patterns from Claude Code — Generative Programmer

The Memory System: How Agents Remember Across Sessions (Part 6)

2026-04-13T00:00:00+01:00

Series: The Agent Harness — Part 6 of 12

Every conversation with a stateless agent starts from zero. It doesn’t know your name, your coding style, your project’s architectural constraints, or the feedback you gave it last Tuesday. You explain the same context every time. It makes the same mistakes it made last week.

This isn’t a model limitation. It’s a harness limitation.

The model could use that information — it just doesn’t have it. Your job as a harness builder is to get it there. That means building a memory system: a mechanism that persists what matters across sessions and loads it back at conversation start.

But “save everything” is worse than saving nothing. It bloats the context window, buries signal under noise, and teaches the agent to treat stale state as current fact. The interesting engineering is in what to save, how to organize it, how to extract it without blocking the main loop, and how to load it without blowing the token budget.

Part 5 covered the configuration system. This post covers the memory system that rides on top of it.

The Core Question: What Is Worth Remembering?

Before designing a memory system, answer one question: what information can’t be derived from the current project state at runtime?

Code patterns, file structure, API route lists, library versions — all of these can be obtained in milliseconds via a tool call (ls, grep, cat package.json). For a human developer, memorizing them saves hours of re-reading. For an agent, the cost to re-acquire them is a few hundred tokens. The value of a memory is proportional to how hard it is to re-acquire. Low re-acquisition cost = low memory value.

What has high re-acquisition cost? Information that lives in people’s minds or external systems:

Who the user is — their expertise, role, how they like to work
Validated practices — what the agent got right, what got corrected
Project decisions — why things are the way they are (the “why” is never in the code)
External system pointers — the Grafana dashboard URL, the Linear project, the Slack channel

These four categories form the closed type system used by Claude Code’s memory architecture. The closed design is intentional.

A Closed Four-Type System (And Why “Closed” Is the Right Call)

Claude Code constrains memory to exactly four types: user, feedback, project, and reference. No custom types allowed.

This seems restrictive. It isn’t. An open type system has a fatal flaw in agent contexts: type explosion. Different users and projects create dozens of types. The agent can’t efficiently determine relevance. Classifications overlap. The index bloats. A closed system trades apparent flexibility for consistent, reliable relevance reasoning.

`user` — Who You’re Working With

Stores role, expertise, and communication preferences. An agent that remembers “ten years of Go, first week of React” frames all frontend explanations in backend analogues. An agent that remembers “junior developer, new to TypeScript” keeps examples concrete and avoids jargon.

When to save: User shares their background, role, or working preferences
How to use:   Calibrate explanation depth, vocabulary, and analogy selection

This is cross-project information — stored in the user’s global directory, it applies everywhere.

`feedback` — Validated Rules

Records both corrections (“don’t mock the database in integration tests”) and confirmations (“yes, the single bundled PR was the right call”). Most systems only save failures. That’s wrong. If you only record what went wrong, the agent grows overly cautious and drifts away from approaches that were already validated.

When to save: User corrects an approach OR confirms a non-obvious choice without pushback
Structure:    The rule itself → Why: (the reason given) → How to apply: (when it triggers)

The Why field is critical. “Don’t mock the database” without context is a rule to blindly follow. “Don’t mock the database — we got burned when mock tests passed but the migration failed” is a rule you can reason about in edge cases.

`project` — Why Things Are the Way They Are

Records decisions, deadlines, and work in progress. The code shows what was built. The memory explains why.

When to save: Who is doing what, why, and by when
Structure:    The fact or decision → Why: (motivation) → How to apply: (what it changes)
Special:      Always convert relative dates to absolute ("next Thursday" → "2026-05-08")

Relative dates decay. A memory saved today that says “launches next week” will say “launches next week” in six months — worse than useless. Absolute dates stay accurate.

`reference` — External System Pointers

Pointers to things that don’t live in the codebase: dashboards, documentation, ticketing projects, Slack channels.

When to save: You learn the location of an external resource and what it's for
How to use:   When the user references an external system or you need to check external state

The Index: MEMORY.md

The memory directory contains two components: individual memory files and an index.

MEMORY.md is automatically loaded at the start of every conversation. It’s not a memory — it’s a table of contents. One line per entry, each line a link and a hook description:

- [pre-commit-lint-requirement](feedback_lint.md) — Run npm run lint before every commit; CI failed for a day over unlinted code
- [user-go-background](user_role.md) — Deep Go expertise, new to React; use backend analogues for frontend explanations
- [auth-rewrite-motivation](project_auth.md) — Auth middleware rewrite driven by legal compliance, not tech debt

The index has hard capacity limits: 200 lines and 25KB, whichever triggers first.

Why two limits? They catch different problems:

Line limit protects comprehension efficiency. Even short lines add cognitive overhead. Over 200 entries, the index is no longer a quick browse — it’s a document to parse.
Byte limit protects the token budget. Long descriptions (approaching the 150-character per-entry limit) on 200 entries could reach 30KB. That’s real cost per conversation.

Line truncation runs first. This means: when entries are few but verbose, the byte limit triggers; when entries are many but terse, the line limit triggers. Either way, there’s a ceiling.

What NOT to Save

The exclusion list is as important as the inclusion list.

Don’t save:

Code patterns, file structure, architecture — derivable by reading the code
Git history — git log is authoritative
Debugging solutions — the fix is in the code; the commit message has the context
Anything already in CLAUDE.md
Ephemeral task details — current session state, in-progress work

The test: “If this memory were deleted, would the agent’s behavior be substantively different?” If not, don’t save it.

When a user asks to save something that fails this test, redirect toward what’s actually worth keeping. If they want to save a PR list, ask what was surprising or non-obvious about it. That’s the part that belongs in memory.

The Background Extraction Problem

There’s a timing problem at the heart of memory systems: the best moment to extract memories from a conversation is after it ends — but that’s also when the user is waiting for the next thing.

Memory extraction requires an LLM call. If you run it synchronously at conversation end, you’re adding latency on every turn. That’s unacceptable.

The solution is background extraction: a forked agent that runs in parallel while the user continues.

The Fork Pattern

Claude Code extracts memories via runForkedAgent — a background agent that’s a near-perfect copy of the main conversation:

Same system prompt
Same tool definitions
Shared prompt cache

The fork triggers at the end of each complete query loop (when the model returns a text response with no tool calls pending).

Direct extraction in main loop:
  + Simple implementation
  - Adds wait time on every turn
  - Consumes main conversation's token budget
  - Extraction failures risk destabilizing the main session

Fork-based background extraction:
  + Zero user-visible latency
  + Independent token budget
  + Failures don't affect main conversation
  + Cache sharing dramatically reduces cost
  - Requires mutex mechanism to prevent duplicate writes

User experience wins. The implementation complexity is worth it.

The Mutex: Preventing Duplicate Extraction

There’s a logical conflict: if the main agent has already written a memory during a conversation, the background agent might independently analyze the same conversation and write the same memory.

The mutex check solves this cleanly: if the main agent has written any memory file during the current session, the background extraction skips entirely. The two are mutually exclusive — one runs, the other doesn’t.

This is eventual consistency rather than strict coordination. No locks, no inter-process communication. Just: “did anyone already handle this? If yes, skip.”

The Tool Permission Allowlist: Least Privilege

The background agent needs enough access to do its job. Not more.

Tool	Permission	Reason
Read / Grep / Glob	Unrestricted	Need to read code to understand conversation context
Bash	Read-only commands only	Can verify state, cannot modify files or execute destructive commands
Write / Edit	Memory directory only	Can write memory files, cannot touch project code
All other tools	Denied	No side effects — no network calls, no external services

The boundary is precise: read everything needed to understand context, write only to the memory directory, touch nothing else. A background agent that silently modifies project code during memory extraction would be dangerous and untraceable.

Throttling and Trailing Extraction

Extraction doesn’t run after every conversation. A counter-based throttle means it fires only every N turns. This is a cost-benefit trade: each extraction is an API call. For frequent short conversations (quick Q&A), extraction cost can exceed extraction value. Throttling improves information density per extraction.

But throttling creates a gap: what if two conversations complete while one extraction is running? The later conversation’s context could be lost.

The trailing extraction mechanism handles this. When an extraction is running and another conversation completes, the new context is staged. After the current extraction finishes, a trailing extraction runs immediately using the staged context — bypassing the throttle counter. Already-completed work shouldn’t be delayed.

Cache-Aware Architecture

The background agent reuses the main conversation’s prompt cache. This is a significant cost optimization that shapes an architectural decision you might not notice until you understand why it’s there.

The numbers:

System prompt + tool definitions ≈ 30,000 tokens
Message history ≈ 50,000 tokens (medium conversation)

Without cache sharing:
  Background agent resends 80,000 tokens → ~$0.24

With cache sharing:
  Background agent reuses cached prefix → ~$0.008

Savings: ~97%

For heavy users (dozens of conversations per day), this difference compounds.

The Hidden Constraint: Tool List Consistency

Cache sharing has a non-obvious requirement: the tool list is part of the API cache key. If the background agent uses a different set of tools than the main conversation, the cache key doesn’t match. No cache hit.

This explains a subtle design choice: instead of giving the background agent a smaller tool list, the background agent uses the same tool list with permissions enforced via a canUseTool callback at execution time. The tool definitions are identical — only the runtime behavior differs.

Approach A (breaks cache):
  Main agent tools:       [Read, Write, Edit, Bash, Grep, Glob, ...]
  Background agent tools: [Read, Grep, Glob, MemoryWrite]
  → Different cache keys → cache miss

Approach B (preserves cache):
  Main agent tools:       [Read, Write, Edit, Bash, Grep, Glob, ...]
  Background agent tools: [Read, Write, Edit, Bash, Grep, Glob, ...]  ← same
  Permission filter:       canUseTool() callback → blocks write outside memory dir
  → Same cache keys → cache hit

Design principle: Consistent interface, variable behavior. Keep cache-sensitive parameters (tool lists, system prompt prefixes) stable. Put differentiation in runtime execution control.

Memory Is a Clue, Not a Conclusion

The most important principle for reading memory is this: memory is a point-in-time snapshot, not current fact.

"Memory says X exists" ≠ "X currently exists."

Code gets refactored. Files move. Dependencies upgrade. A memory saved six months ago about src/auth/handler.ts is a pointer to investigate, not a guarantee the file is still there.

Before acting on a memory:

If it names a file path: check that the file exists
If it names a function or flag: grep for it
If the user is about to act on your recommendation: verify first

The three levels of trust are:

Level	Approach	Problem
Level 0	No trust — re-acquire everything	Memory has no value
Level 1	Trust as clue — verify before acting	← Correct balance
Level 2	Trust as fact — memory is current truth	Stale memories cause wrong actions

Level 1 is the right balance. Memories guide where to look. Current state determines what’s true.

This rule is easiest to follow for “why” memories (architecture decisions, motivation behind choices) — these almost never become stale. It’s most important for “what” memories (file locations, version numbers, team member roles) — these change regularly.

Key Takeaways

The right question for memory design is: what information can’t be re-acquired at runtime? Code, history, and structure can be. User preferences, decision rationale, and external pointers can’t.
A closed four-type system (user, feedback, project, reference) enables reliable relevance reasoning. Flexibility in type taxonomy costs you consistency in retrieval.
The MEMORY.md index has dual capacity protection: 200 lines (comprehension limit) and 25KB (token budget limit), with line truncation applied first.
Background fork extraction — triggered after each completed query loop, running in parallel — gives you memory extraction without user-visible latency.
The mutex between main agent writes and background extraction prevents duplicate memories. When the main agent writes, the background skips.
Cache-aware design: the background agent uses the same tool list as the main agent, with permissions enforced at runtime via callback. Same tool definitions = same cache key = cache hit.
Memory is a clue, not a conclusion. Verify file paths and flags before acting on them. Trust “why” memories directly; verify “what” memories against current state.

What’s Next

In Part 7: Context Management — The Compression Problem, we tackle the finite context window:

The four-level progressive compression cascade: why you try cheap methods first
The circuit breaker pattern: how to stop compression loops from running forever
Prompt cache stability: why the order of your preprocessing pipeline matters
How a compression summary differs from a compression log — and why it matters for agent reasoning

References

Memory and persistence

Building Effective Agents — Anthropic Research
Harness Design for Long-Running Applications — Anthropic Engineering
Claude Code Overview — Official docs

Architecture analysis

Inside Claude Code: Architecture Behind Tools, Memory, Hooks, and MCP — Penligent
Dive into Claude Code: Design Space of AI Agent Systems — arxiv
12 Agentic Harness Patterns from Claude Code — Generative Programmer

Configuration as Architecture: The Multi-Layer Settings Problem (Part 5)

2026-04-11T00:00:00+01:00

Series: The Agent Harness — Part 5 of 12

Every application has settings. But most applications have a single, well-understood set of users who control those settings.

An Agent harness doesn’t. It has to serve:

Individual developers who want personal model preferences and shortcut permissions
Project teams who need shared standards and consistent hooks across all contributors
Enterprise administrators who need to enforce security policies that can’t be overridden
Plugin authors who provide base defaults for their tools
CI/CD pipelines that inject one-time overrides without touching any persistent config

Each of these stakeholders has legitimate, non-overlapping needs. They all configure the same system. The needs conflict constantly. When “user allows npm publish” meets “project denies npm publish” meets “enterprise locks the model list” — who wins?

A flat config file has no good answer. A priority hierarchy does.

Part 4 covered the permission pipeline. This post covers the configuration system that feeds it.

The Six-Layer Priority Hierarchy

Claude Code resolves configuration conflicts through a six-layer priority system. Lower layers provide defaults; higher layers override them.

pluginSettings      (lowest — plugin base defaults)
userSettings        ↑ personal global preferences
projectSettings     ↑ team-shared, committed to git
localSettings       ↑ personal project overrides, gitignored
flagSettings        ↑ CLI-injected, one-time override
policySettings      (highest — enterprise lockdown)

Later layers shadow earlier ones. If userSettings says model: "claude-sonnet-4" and localSettings says model: "claude-opus-4", the effective value is "claude-opus-4".

The geological strata analogy is accurate: each layer of rock was deposited at a different time, and you can read the full history by looking at all layers — but the surface layer is what you see first.

Three Merge Semantics (And Why Each One Exists)

The merge isn’t a simple “later layer overrides earlier.” Different field types use different merge strategies. The choice of strategy isn’t arbitrary — each is designed to prevent a specific class of misconfiguration.

Arrays: Concatenate and Deduplicate

Permission rules, hooks, and allow-lists are arrays. They accumulate from all layers.

// userSettings
{ "permissions": { "allow": ["Bash(npm *)", "Bash(node *)"] } }

// projectSettings
{ "permissions": { "allow": ["Bash(npm run lint)", "Read(*)"] } }

// localSettings
{ "permissions": { "allow": ["Bash(git *)"] } }

// Result: concatenated and deduplicated
{ "permissions": { "allow": ["Bash(npm *)", "Bash(node *)", "Bash(npm run lint)", "Read(*)", "Bash(git *)"] } }

Why concatenate instead of replace? Because each layer should only declare the rules it wants to add. If a higher-priority layer’s array replaced a lower-priority layer’s array, you’d have to repeat every lower-layer rule in every higher layer to avoid accidentally losing coverage. Missing one rule becomes a security hole.

The anti-pattern: You cannot revoke a lower-layer rule by omitting it in a higher layer (arrays concatenate). To revoke, explicitly add a deny rule.

Objects: Deep Merge

Nested objects merge field by field. A higher-priority layer can override specific nested keys without replacing the whole object.

// projectSettings
{ "hooks": { "PreToolUse": [{ ... audit hook ... }] } }

// localSettings — overrides one nested field only
{ "hooks": { "PostToolUse": [{ ... my hook ... }] } }

// Result: both nested fields survive
{ "hooks": { "PreToolUse": [{ ... }], "PostToolUse": [{ ... }] } }

Scalars: Later Wins

Simple values (strings, booleans, numbers) follow straightforward override semantics. model: "claude-opus-4" in localSettings overrides model: "claude-sonnet-4" in userSettings.

Design lesson: Match your merge strategy to the semantic meaning of the field. Permission rules are additive (arrays concatenate). Configuration namespaces are hierarchical (objects deep-merge). Single-value preferences are override-able (scalars replace).

The Security Boundary: Why `projectSettings` Is Treated Differently

Here’s a security fact that most documentation glosses over: projectSettings (.claude/settings.json) lives in your project directory and gets committed to git. That means when you clone a third-party repository, you automatically load their configuration.

Now consider what configuration can do: configure hooks that execute shell commands, set permission modes, configure which model is used. A malicious .claude/settings.json could include a PreToolUse hook that silently exfiltrates environment variables (API_KEY, AWS_SECRET_ACCESS_KEY) on every tool call.

This is a supply chain attack vector unique to agent harnesses.

Claude Code’s defense: systematically exclude projectSettings from all security-sensitive checks.

The functions that determine whether auto mode can bypass permission dialogs, whether the permission prompt can be skipped, whether the classifier can auto-approve — all of them read from userSettings, localSettings, flagSettings, and policySettings. projectSettings is explicitly excluded.

The code comments say it directly: “projectSettings is intentionally excluded — a malicious project could otherwise auto-bypass the dialog (RCE risk).”

The trust levels reflect this:

Source	Trust	Why
`policySettings`	Highest	Enterprise-administered, audited
`flagSettings`	High	User explicitly passed this flag
`localSettings`	High	User wrote this file, on their own filesystem
`userSettings`	High	User’s own global config
`projectSettings`	Low	May come from a cloned third-party repo
`pluginSettings`	Lowest	Plugin ecosystem, requires separate verification

The lesson: not all config sources are equally trusted, and your architecture should make the trust levels explicit rather than treating all config as equivalent.

Enterprise Mode: `allowManagedHooksOnly`

When policySettings sets allowManagedHooksOnly: true, only hooks from policySettings itself are executed. All hooks from user/project/local sources are skipped.

For organizations with compliance requirements (financial institutions, healthcare), this ensures only audited, administrator-approved hooks ever run — regardless of what individual projects or developers configure.

Feature Flags: Compile-Time vs Runtime

Claude Code distinguishes between two types of feature flags:

Compile-Time Flags

The feature() function evaluates at build time. When a feature is disabled, the corresponding code is removed by the bundler’s tree-shaking. The tool doesn’t just fail to register — it doesn’t exist in the binary at all.

This has a security implication: internal tools (debugging tools, REPL tools, experimental features) that are disabled in external builds don’t appear in the distributed artifact. No dead code to reverse-engineer. No feature detection from the outside.

Runtime Flags (GrowthBook)

GrowthBook-based flags are evaluated at runtime. These enable A/B testing and gradual rollouts — enable a new tool for 10% of users, monitor behavior, expand to 50%, then 100%.

For an agent harness, the difference matters:

Compile-time: “This feature is not available in this build.” Zero runtime cost. Clean binaries.
Runtime: “This feature is being rolled out gradually.” Requires server-side configuration. Enables targeted rollouts.

Pattern to steal: Use compile-time flags to gate features that genuinely shouldn’t exist in certain builds (internal tools, experimental APIs). Use runtime flags for gradual rollout control. Don’t conflate the two.

AppState: A Minimalist State Store

Configuration defines what the agent can do. AppState holds what the agent is currently doing.

Claude Code’s AppState contains 50+ state fields covering:

Current settings and permission context
UI state (streaming, rendering)
Session state (messages, tool context)
MCP server connections
Plugin and skill registrations
Communication state (notifications, attachments)

The state store itself is remarkably small — approximately 34 lines. It follows the Zustand pattern:

function createStore<T>(initialState: T) {
  let state = initialState
  const listeners = new Set<() => void>()

  return {
    getState: () => state,
    setState: (updater: (prev: T) => T) => {
      const next = updater(state)
      if (next !== state) {  // Reference equality check
        state = next
        listeners.forEach(fn => fn())
      }
    },
    subscribe: (listener: () => void) => {
      listeners.add(listener)
      return () => listeners.delete(listener)  // Cleanup function
    }
  }
}

Three design decisions worth noting:

Updater function pattern. setState accepts (prev: T) => T rather than the new state value. This ensures every update explicitly derives from the previous state, preventing the “stale state” problem where two concurrent updates each read the same old state and one overwrites the other.

Reference equality check. Notifications only fire when the state object actually changes (next !== state). If an updater returns the same object reference (no-op update), no listeners are notified. This prevents unnecessary re-renders.

Cleanup functions. subscribe returns a function to remove the listener. No unsubscribe(listener) call needed — just call the returned function. This prevents memory leaks and makes cleanup explicit.

For the React/Ink UI layer, AppState integrates with React’s useSyncExternalStore hook — the official React API for subscribing to non-React state stores. This ensures the terminal UI re-renders exactly when state changes, without manual coordination.

Design lesson: A state store for an agent harness doesn’t need to be complex. The Zustand-style minimalist store — get/set/subscribe with updater functions and reference equality — handles most use cases in under 40 lines. Don’t reach for a heavy state management library until you’ve tried the simple version.

The policySettings Exception

There’s one rule that doesn’t follow the normal priority hierarchy: policySettings.

While userSettings through flagSettings use deep merge (each layer adding to the previous), policySettings uses “first non-empty source wins.” The sources it checks, in order:

Remote API settings (highest)
MDM settings (macOS plist / Windows HKLM)
managed-settings.json and managed-settings.d/*.json
HKCU registry (Windows user-level)

Why first-wins instead of merge? Enterprise security policies are typically complete, audited configuration schemes. Merging policies from different sources (a remote API policy + a local managed-settings.json) could create semantic conflicts: one policy restricts the model list, another restricts permissions, but the merged result accidentally allows using a restricted model to bypass permissions.

First-wins ensures policy comes from one authoritative source, not a combination of sources that may not have been designed to work together.

Configuration Patterns in Practice

Three patterns for teams at different scales:

Pattern 1: Personal-Team Separation (Most Common)

~/.claude/settings.json      → personal model, personal shortcuts
.claude/settings.json        → team lint rules, shared hooks, permission baseline
.claude/settings.local.json  → personal debug flags, personal fast paths

Pattern 2: CI/CD Injection

# Inject one-time config without touching persistent files
claude --settings /path/to/ci-settings.json

CI settings are temporary, don’t pollute local environments, and are auditable in the pipeline config.

Pattern 3: Enterprise Layering

policySettings    → model whitelist, mandatory security hooks, allowManagedHooksOnly: true
projectSettings   → team-specific (non-security) hooks, MCP configs
userSettings      → personal UI preferences, verbose mode

Enterprise admins lock security surface. Teams customize within allowed space. Users personalize within team space.

Key Takeaways

Configuration for an agent harness is a multi-stakeholder problem. Design a priority hierarchy — not a flat config — from the start.
Six layers: plugin → user → project → local → flag → policy. Later layers shadow earlier ones.
Three merge semantics: arrays concatenate (additive permission rules), objects deep-merge (namespace isolation), scalars override (single-value preferences).
projectSettings is explicitly excluded from security-sensitive checks — it may come from untrusted repositories. Trust levels are not uniform across config sources.
Feature flags: compile-time gates (dead code elimination, no runtime cost) vs. runtime flags (gradual rollout). Don’t conflate them.
AppState is 34 lines. The updater function pattern, reference equality check, and cleanup functions are the only patterns you need for a harness state store.

What’s Next

In Part 6: The Memory System — How Agents Remember Across Sessions, we cover the memory architecture:

The four memory types every agent harness should support
Why structured memory outperforms raw conversation history
The background extraction problem: writing memory without blocking the main loop
Capacity protection: what happens when memory grows unbounded
The “clue not conclusion” principle for verifiable memory records

References

Configuration architecture

Claude Code Overview — Official docs
Harness Design for Long-Running Applications — Anthropic Engineering
12 Agentic Harness Patterns from Claude Code — Generative Programmer

Security and supply chain

Building Effective Agents — Anthropic Research
Inside Claude Code: Architecture Behind Tools, Memory, Hooks, and MCP — Penligent

State management patterns

Dive into Claude Code: Design Space of AI Agent Systems — arxiv

The Permission Pipeline: Safety That Doesn’t Get in the Way (Part 4)

2026-04-09T00:00:00+01:00

Series: The Agent Harness — Part 4 of 12

Most agent safety discussions focus on the extremes: “ask the user before every action” or “just let it run.” Neither works in production.

Ask before everything, and users quickly learn to click “allow” without reading — the worst of both worlds. Let it run without checks, and one misunderstood instruction becomes an rm -rf on the wrong directory.

The goal is something harder: a permission system that matches the friction level to the actual risk. Read a file? No prompt needed. Delete a directory? Confirm. In CI? Auto-approve everything safe and block the dangerous operations.

Claude Code’s permission pipeline is built around this goal. Understanding it reveals a set of architectural patterns that apply to any agent harness that needs to stay safe without becoming useless.

Part 3 covered the tool system. This post covers what happens before a tool is allowed to run.

The Core Pattern: Fail Fast, Not Fail Safe

The naive permission system is a single check: “is this tool allowed?” The problem is that “allowed” depends on context. rm -rf node_modules in a dev environment is routine maintenance. rm -rf /etc anywhere is catastrophic. The same tool, different parameters, completely different risk level.

A flat allowlist can’t handle this. A pipeline can.

Claude Code’s permission pipeline has four stages that run in sequence. Each stage can short-circuit — if it makes a final decision, later stages don’t run. This is the Fail Fast principle: reject invalid or unauthorized requests as early as possible, at the cheapest checkpoint.

Stage 1: validateInput      → Is the data valid?
Stage 2: Rule matching      → Is there an explicit rule?
Stage 3: checkPermissions   → Does context analysis approve or deny?
Stage 4: Interactive prompt → Should the user or AI classifier decide?

Requests that fail Stage 1 never reach Stage 2. Requests explicitly denied in Stage 2 never reach Stages 3 or 4. Each stage is an independent checkpoint — and a cheaper one than the next.

Stage 1: Input Validation

The first checkpoint isn’t about permissions at all — it’s about data validity. Tool inputs are parsed through the Zod schema defined in the tool interface.

If the LLM passes a malformed parameter (wrong type, missing required field, out-of-range value), validation fails here. No permission check runs. No tool executes.

Note what happens on failure: the system degrades to ask (request user confirmation) rather than crashing. This is intentional — in security systems, errors should be “safe” rather than “correct.” Crashing would interrupt the session. Degrading to user confirmation gives the user a chance to decide whether to proceed with unexpected input.

Stage 2: Rule Matching

This is where explicit permission rules are checked. Three types of rules, in strict priority order:

Deny rules — checked first, always. If a deny rule matches, the operation is rejected immediately. No exceptions. No overrides.
Ask rules — if configured to “always ask,” the pipeline flows to Stage 4.
Allow rules — if an explicit allow rule matches, the operation is permitted.

Rules come from seven sources, prioritized by “proximity” (most specific wins):

session          (highest — most recent, most specific)
command          ↑
cliArg           ↑
policySettings   ↑
flagSettings     ↑
localSettings    ↑
projectSettings  ↑
userSettings     (lowest — most general)

The critical rule: deny always wins over allow, regardless of source. Even if a global user config allows a tool, a project-level deny rule blocks it. This is a security fundamental: the power of explicit denial is greater than explicit permission.

This enables a practical workflow: project settings define broad deny rules for dangerous operations. Local or session settings add temporary allow rules for specific tasks. The deny rules hold firm.

Stage 3: Context Evaluation

Each tool can implement a checkPermissions method for context-aware evaluation. This is where a tool’s own domain knowledge applies.

BashTool, for example, parses the command, inspects subcommands, checks path safety, and matches prefix rules. git status is read-only. git push --force origin main is destructive. Same tool, different parameters, different results.

The stage returns one of four outcomes:

Outcome	Meaning
`allow`	Permit immediately
`deny`	Reject
`ask`	Request confirmation
`passthrough`	No opinion — let Stage 4 decide

passthrough is worth explaining. It doesn’t mean “I don’t care.” It means “I have no specific rule for this — let the general pipeline handle it.” If a subsequent Stage 2 allow rule matches, passthrough is upgraded to allow. If nothing matches, passthrough becomes ask. An explicit ask result at Stage 3 cannot be upgraded to allow by Stage 2.

This subtle distinction: passthrough is “no strong opinion,” ask is “I believe this needs confirmation.”

Stage 4: The Race — Hook, Classifier, User

When the pipeline reaches Stage 4, three decision-makers run simultaneously:

1. Hook script — if a PreToolUse hook is configured, it fires first. Its decision (allow/deny/block) is final. Hook scripts represent system administrator intent and have the highest trust level. (We’ll cover hooks in depth in Part 8.)

2. AI Classifier — in auto mode, an asynchronous classifier evaluates the tool call against conversation context. 2-second timeout. Runs in parallel with the user prompt.

3. User prompt — the interactive confirmation dialog. “Allow / Deny / Allow this time.”

All three run concurrently. First come, first served — whichever resolves first takes effect, via a pattern called ResolveOnce.

The ResolveOnce Pattern

Multiple asynchronous participants racing to resolve the same decision is a classic concurrency problem. The user clicks “allow” at the exact moment the classifier returns “approve.” Which wins?

ResolveOnce solves this with a single atomic flag:

class ResolveOnce {
  private claimed = false

  claim(): boolean {
    if (this.claimed) return false
    this.claimed = true
    return true
  }
}

claim() succeeds once and only once. The first participant to call it wins. All others find claimed = true and their decision is discarded. No locks, no coordination overhead — just a “non-transferable ticket” pattern.

In JavaScript’s single-threaded model, the claimed flag check and set happens atomically within one event loop tick. Race conditions in the traditional sense don’t apply, but this pattern ensures logical consistency across async callbacks.

Design lesson: When multiple asynchronous participants might resolve the same decision (hook + classifier + user), use a one-shot claim pattern. The first resolution wins. Track which participant won for audit purposes.

Trust levels, for reference:

Hook — highest. Represents explicit system administrator rules.
User — medium. Represents the current operator’s intent.
Classifier — lowest. AI judgment, may be wrong. Certain operations are “classifier-immune.”

PermissionContext: Immutability as a Safety Property

ToolPermissionContext — the data structure carrying all permission state — has all fields marked readonly. Every permission update produces a new context object. The old one is unchanged.

Why does immutability matter for permissions?

Consider: Tool A and Tool B begin permission checks simultaneously. Mid-check, Tool A’s user confirmation fires and updates a permission rule (user selected “always allow”). If the context were mutable, Tool B might see a partially-updated rule set — the rules that existed before Tool A’s confirmation, mixed with the rules after. The check would use an inconsistent snapshot.

Immutability prevents this. Each tool reads a deterministic snapshot at the start of its permission check. Subsequent updates produce new snapshots for future checks. No tool sees a context it didn’t start with.

Five Permission Modes: A Spectrum, Not a Switch

The permission mode isn’t a single toggle. Claude Code defines five modes across a spectrum from strictest to most permissive:

Mode	Who approves	When to use
`default`	User confirms every tool call	Daily interactive use, maximum oversight
`plan`	Read tools auto-approved, write tools denied	Code review, exploration before committing to changes
`auto`	AI classifier handles approval; user for edge cases	Trusted tasks where you want speed but not full bypass
`bypassPermissions`	Everything auto-approved (except deny rules + safety checks)	CI/CD, containers, automated testing
`bubble` (internal)	Sub-agent inherits parent’s permission context	Used by AgentTool for sub-agent spawning

`plan` Mode

Write tools (Edit, Write) return deny from Stage 3. Read tools (Read, Grep, Glob, Search) return allow. The agent can explore but not act.

This is “understand before acting” — explore the codebase in read-only mode, propose a plan, then switch to execution mode when you’re ready.

`auto` Mode

The AI classifier replaces manual approval for most operations. Before calling the classifier, the system checks a safe-tool allowlist (Read, Grep, Glob, TodoWrite — inherently low-risk tools that skip classifier checking entirely). The classifier handles the rest.

Auto mode includes a circuit breaker: if the classifier rejects consecutively multiple times, the system falls back to interactive prompting. This prevents the agent from looping uselessly when the classifier is consistently uncertain.

Certain operations are classifier-immune: even in auto mode, operations involving .git/ and .claude/ directories cannot be classifier-approved. These directories contain configuration and state that could compromise the entire system if modified incorrectly.

`bypassPermissions` Mode

Everything auto-approved. But four defenses remain active even in bypass mode:

Stage 2 deny rules (checked before bypass)
requiresUserInteraction flag (operations that inherently need human input)
Content-level ask rules
safetyCheck (hardcoded dangerous operations)

Bypass mode doesn’t disable safety. It removes the friction for operations that don’t need it.

When to use bypass mode: CI/CD pipelines and automated testing environments where the agent runs in containers with filesystem isolation. Never for production deployments or operations involving credentials. Always pair with explicit deny rules for dangerous operations (rm -rf *, npm publish, git push --force origin main).

BashTool: Fine-Grained Command Control

BashTool warrants special treatment because shell commands are composable and expressive in ways other tools aren’t. git status is safe. git push --force origin main is destructive. A tool-level allow rule isn’t granular enough.

BashTool supports three rule formats for command-level control:

Format	Example	Matches	Use case
Exact	`Bash(npm test)`	Only `npm test`	Fixed steps in CI
Prefix	`Bash(npm:*)`	Any `npm ...` command	Whole toolchain family
Wildcard	`Bash(git commit *)`	`git commit` + any args	Command families

These form a spectrum: exact is safest (zero false-approvals), wildcard is most flexible (requires careful pattern design).

Two forms are equivalent: Bash(npm:*) and Bash(npm *) both match any npm command. The colon syntax is more explicit; the space+wildcard syntax is more readable.

For auto mode, the classifier also runs against BashTool commands — but classifier decisions are overridden by hardcoded rules for operations on .git/ and .claude/ directories regardless of what the classifier says.

Two-Phase Permission Persistence

When a user grants a permanent permission (“always allow”), the update propagates in two phases:

Phase 1: Synchronous in-memory update. Immediate. The new permission takes effect for the current session before the function returns.

Phase 2: Async file write. The updated permission is persisted to the appropriate config file in the background.

Separating these phases ensures responsiveness: the user’s choice takes effect immediately, without waiting for disk I/O. The file write happens asynchronously and doesn’t block the agent.

Only three config sources persist: localSettings, userSettings, and projectSettings. Session rules and CLI arguments are intentionally ephemeral — they don’t survive past the current run.

Enterprise Configuration Patterns

For teams deploying Claude Code at scale, a layered config strategy:

projectSettings (committed to git):
  deny: [Bash(rm -rf *), Bash(npm publish), Bash(git push --force *)]
  # Team-wide rules — every developer gets these

localSettings (not committed, per-developer):
  allow: [Bash(npm test), Bash(npm run build)]
  # Personal fast paths — override project settings for common safe operations

session rules (temporary, per-task):
  allow: [Bash(git push origin feature/*)]
  # Task-specific — don't persist, just for this session

The rules compose correctly: project deny rules block dangerous operations for everyone; personal allow rules speed up common operations; session allow rules handle task-specific needs without permanently widening permissions.

Key Takeaways

A permission system for agents should match friction to risk — not be a single allow/deny toggle.
The Fail Fast pipeline (4 stages) rejects requests at the cheapest applicable checkpoint. Invalid data is rejected at Stage 1 before any permission logic runs.
Deny always wins over allow, regardless of which config source each came from.
PermissionContext is immutable: every update produces a new object, preventing concurrent tools from seeing inconsistent rule sets.
Five modes span the spectrum from “confirm everything” (default) to “bypass everything safe” (bypassPermissions). Use plan for exploration, auto for trusted sessions, bypassPermissions in isolated CI environments.
ResolveOnce handles the race between concurrent decision-makers (hook, classifier, user) — first valid resolution wins.
BashTool’s three matching formats (exact, prefix, wildcard) enable fine-grained command-level control without per-command configuration.

What’s Next

In Part 5: Configuration as Architecture — The Multi-Layer Settings Problem, we go inside the configuration system:

Why agent configuration is a multi-stakeholder problem (user prefs vs project rules vs enterprise policy)
The priority pyramid: six layers with clear override semantics
How merge semantics (arrays concatenate, objects deep merge, scalars override) shape behavior
Feature flags: compile-time vs runtime, and why the distinction matters for agent rollout
AppState: 50+ fields managed by a 34-line state store

References

Permission systems and agent safety

Building Effective Agents — Anthropic Research
Harness Design for Long-Running Applications — Anthropic Engineering
Claude Code Hooks Reference — Official docs

Architecture analysis

Inside Claude Code: Architecture Behind Tools, Memory, Hooks, and MCP — Penligent
Dive into Claude Code: Design Space of AI Agent Systems — arxiv
12 Agentic Harness Patterns from Claude Code — Generative Programmer

The Tool System: How Agents Act on the World (Part 3)

2026-04-07T00:00:00+01:00

Series: The Agent Harness — Part 3 of 12

Without tools, an LLM is a very sophisticated text generator. It can reason about code, but it can’t read a file. It can plan a fix, but it can’t apply one. It can write a test, but it can’t run it.

Tools are what close the gap between reasoning and action. But a tool system for an agent isn’t a list of functions. It’s a protocol — one that enforces type safety, permissions, concurrency rules, and UI rendering through the same unified contract.

In this post we’ll examine what a production-grade tool system actually needs, then look at how Claude Code implements it across 45+ tools in 12 categories.

Part 2 covered the dialog loop — the engine. This post covers the tool system — the hands.

The Problem With “Just Add Functions”

The naive approach to tool integration: define a function, give it a name, and tell the model about it. The model calls it with JSON, you execute it, return the result.

This works for demos. In production, you immediately need answers to questions that the naive approach ignores:

Validation: The model will hallucinate parameter names, pass wrong types, omit required fields. Who validates inputs before execution? At what layer?

Permissions: rm -rf node_modules is safe. rm -rf /etc is not. The difference isn’t the tool — it’s the parameters and context. How do you express this?

Concurrency: The model often requests multiple tools at once. Which can run in parallel? Which must serialize? Executing file reads in parallel is fine. Running two bash commands that modify the same file in parallel is a data race.

Progress: Some tools take seconds or minutes. Users need to see what’s happening. How does the tool communicate progress without coupling to a specific UI?

UI rendering: When a tool starts, runs, succeeds, fails, gets rejected, or runs in parallel with others — the terminal needs different displays for each state. How does the tool control its own presentation?

Backward compatibility: Tool names change as the codebase evolves. Old configurations, scripts, and user habits reference the old names. How do you handle renames without breaking things?

A production tool system has to answer all of these. The answer is to model tools not as functions but as contracts.

The Five-Element Tool Protocol

Every tool in Claude Code implements a unified type contract: Tool. This contract defines five elements that every tool must provide.

Element 1: Name and Aliases

Each tool has a unique primary name and optional backward-compatibility aliases. When a tool is renamed, the old name remains valid through an alias.

The principle: renaming in a public API is add-only. Never remove the old name. Add an alias. This is why configurations, scripts, and habits don’t break when the tool system evolves.

Element 2: Zod Schema

Each tool defines its input parameters using a Zod schema. This single definition serves dual purpose:

Runtime validation — before execution, LLM-generated parameters are parsed through the schema. Type mismatches, missing required fields, and out-of-range values are caught and rejected before the tool runs.
API documentation — the same Zod schema is converted to JSON Schema and sent to the model API. The parameter descriptions the model sees come from .describe() calls in the schema.

The key insight: one definition drives both validation and documentation. There’s no chance for them to drift out of sync. This is the “Single Source of Truth” principle applied to tool interfaces.

// This one definition serves as runtime validator AND model documentation
const schema = z.object({
  path: z.string().describe("The file path to read"),
  limit: z.number().optional().describe("Max lines to return (default 2000)"),
})

Element 3: Permission Model

Three methods form a layered permission check inside every tool:

Layer 1: validateInput — runs before permission checks. Rejects malformed inputs. This is a data legitimacy check, independent of permission policy.

Layer 2: hasPermissionsToUseTool + checkPermissions — tool-specific permission logic. A file read tool checks path allowlists. A bash tool parses the command and assesses risk level. A web fetch tool validates the URL. Each tool knows its own danger profile.

Layer 3: isConcurrencySafe — marks whether this tool can run in parallel with others. This affects scheduling, not security. Read-only tools are safe. Tools with side effects are not.

Separating these three concerns — data validity, permission policy, concurrency safety — prevents each from coupling to the others.

Element 4: Execution Logic

The core method: runs the tool, receives parsed input, tool context, and a permission callback. Returns the output and an optional contextModifier.

The contextModifier is how tools influence subsequent behavior. FileWriteTool, after writing a file, uses contextModifier to update the file state cache — so the next FileReadTool call sees the latest content. Without this channel, tools would be isolated, unable to build on each other’s effects.

Element 5: UI Rendering

Tools have six rendering methods covering the complete lifecycle:

Method	When it fires
`renderToolUseMessage`	Tool call starts
`renderToolUseProgressMessage`	Tool is running (progress update)
`renderToolResultMessage`	Tool completed successfully
`renderToolUseRejectedMessage`	Permission denied
`renderToolUseErrorMessage`	Execution error
`renderGroupedToolUse`	Multiple tools running in parallel

Each method returns a React component. The tool controls its own presentation — progress bars, color highlighting, collapsible panels. The harness renders whatever the tool returns.

Why give rendering responsibility to the tool? Because only the tool knows what its output means. A file read displaying src/auth.ts → 247 lines is meaningfully different from a bash tool displaying npm test → exit 0. Generic rendering produces generic output. Tool-specific rendering produces useful output.

Design lesson: When building your own tool system, define a unified interface contract enforced by your type system. Don’t let tools be plain functions — they should declare their schema, permission requirements, concurrency safety, and rendering alongside their logic. The compiler becomes your enforcement mechanism.

The `buildTool` Factory and Safe Defaults

buildTool is the factory function for creating tools. It fills in safe defaults for any fields not provided.

The defaults follow the fail-closed principle: security-related defaults are the most restrictive option. isConcurrencySafe defaults to false — a tool must explicitly declare itself safe to run in parallel. isDestructive defaults to true — a tool must explicitly declare itself non-destructive to get lighter permission treatment.

This is airport security in reverse: default to “needs inspection,” require explicit clearance for the fast track. If a developer forgets to declare concurrency safety, the worst case is slower execution (serialized when it could have parallelized). If the default were true, forgetting the declaration means parallel execution of a tool with side effects — a data race.

Tool Registration and the Filtering Pipeline

getAllBaseTools() is the single source of truth for all available tools. Before the tool list reaches the model, it passes through a four-stage filtering pipeline:

getAllBaseTools()
    → Mode filtering (simple mode: Bash, Read, Edit only)
    → Deny rule filtering (remove blanket-denied tools)
    → Enabled status check
    → Pool assembly (merge built-in + MCP, sort by name, deduplicate)
    → Tool list sent to API

The sort step is worth noting. Tool lists are sorted alphabetically before sending to the model. Why? Because prompt caching uses byte-level comparison. If tools arrive in different orders across calls, the system prompt changes, cache keys change, and you pay for redundant computation. A stable sort makes the prompt stable, maximizing cache hit rates.

Deferred Loading: Don’t Send What Won’t Be Used

When the tool count exceeds a threshold (especially with MCP servers that register dozens of tools), Claude Code switches to deferred discovery. Instead of sending complete schemas for all 50+ tools upfront, it sends only tool names and lets the model request full schemas on demand via ToolSearchTool.

The savings are significant. A tool schema with name, description, and parameter definitions consumes 200–500 tokens. Multiply by 50 tools and you’re paying 10,000–25,000 tokens per API call just for the tool list — before any message content. Deferred discovery reduces this to a small name index.

ToolSearchTool itself is always-loaded (never deferred), as is AgentTool. Everything else can be deferred.

Pattern to steal: If your agent connects to external tool servers (MCP, custom APIs), implement deferred loading from the start. You’ll add tools over time. The token cost of sending full schemas for 100 tools is prohibitive.

Concurrency Partitioning: Safe Parallelism Without Data Races

When the model requests multiple tools in one response, the orchestration engine decides what runs in parallel and what must serialize. The algorithm is concurrency partitioning.

The rule: consecutive concurrency-safe tools form a parallel batch. Any unsafe tool breaks the batch and runs alone.

Example: model requests [Read(a.ts), Read(b.ts), Bash(ls), Read(c.ts)]

Batch 1: Read(a.ts) ‖ Read(b.ts)    [parallel — both safe]
Batch 2: Bash(ls)                    [serial — unsafe]
Batch 3: Read(c.ts)                  [serial — affected by Bash output]

Batch 1 runs in parallel. Batch 2 waits for Batch 1, then runs alone. Batch 3 waits for Batch 2.

The result ordering guarantee: even when tools execute in parallel, results are emitted in the original request order. The model sees [Read(a.ts) result, Read(b.ts) result, Bash(ls) result, Read(c.ts) result] — always in that sequence.

Error propagation in parallel batches: If BashTool fails during execution, all sibling bash tools in the same parallel batch are immediately cancelled. Bash commands often have implicit dependencies — if mkdir fails, subsequent commands that write to that directory are meaningless. Stopping the batch fast prevents cascading failures.

Pattern to steal: Add an isConcurrencySafe: boolean flag to your tool interface from day one. Most tools that read are safe. Most tools that write are not. Use this to drive your scheduler. Retrofitting this after you have 20 tools is painful.

The StreamingToolExecutor: Four-Stage State Machine

Standard tool execution is batch: wait for the model to finish generating all tool calls, then execute them. The streaming executor goes further: start executing a tool as soon as its parameters are complete, before the model finishes generating the rest of its response.

Every tool in the executor passes through four stages:

queued → executing → completed → yielded

queued: Parameters are accumulating from the streaming API. Tool is not yet runnable.
executing: Parameters complete. Execution started immediately.
completed: Execution finished. Result is ready.
yielded: Result emitted to the caller in request order.

The “yielded” stage is separate from “completed” because order must be preserved. Tool 1 might complete after Tool 2. But Tool 1’s result must be emitted before Tool 2’s. The state machine buffers completed-but-not-yet-ordered results until it’s their turn.

One exception: progress messages are emitted immediately regardless of order. Showing the user that Tool 2 is running doesn’t require waiting for Tool 1’s result.

Deep Dive: The Core Tools

Claude Code’s 45+ tools cover 12 categories. A few are worth understanding in detail because they illustrate the design principles in practice.

BashTool: The Most Powerful, Most Constrained

BashTool is the Swiss Army knife — it can do almost anything a shell command can. This is also why it’s the most carefully constrained.

Error propagation: When BashTool fails in a parallel batch, all sibling bash calls are cancelled. Bash commands have implicit dependencies; a failed mkdir makes subsequent writes to that directory meaningless.

Interrupt behavior: Unlike other tools, BashTool can customize its behavior on user interrupt. Long-running commands like test suites can choose to “block” (let the command finish, show current output) rather than cancel (stop immediately). The tool understands user intent better than a generic interrupt handler.

Semantic analysis: BashTool uses AST parsing to classify commands — distinguishing search/read operations from write operations. This drives collapsible display in the UI (read commands can collapse their output; commands with side effects stay expanded for audit).

The File Trio: Read, Edit, Write

Three tools, three scope levels, three permission tiers.

FileReadTool maintains a file state cache. If the same path is read twice in a session, the second read uses cached content. This prevents redundant I/O and, more importantly, keeps the file state consistent: if the agent reads a file at line 100 of a task, it gets the same content at line 200.

FileEditTool uses exact string matching, not line numbers. Why? Line numbers are fragile — another tool might have already shifted the lines between when you read the file and when you edit it. String matching is idempotent: as long as the target fragment exists, the edit lands correctly regardless of other changes.

FileWriteTool overwrites the entire file. It has the strictest permission checks of the three. The principle: prefer the narrowest-scope operation that accomplishes the task. Edit over Write. Read over Bash. Least privilege is both a security principle and an efficiency principle — narrower operations are faster to permission-check.

The Search Duo: Glob and Grep

GlobTool (filename pattern matching, powered by fast-glob) and GrepTool (content search, powered by ripgrep) are both read-only and concurrency-safe.

Why have dedicated search tools when BashTool can run find and grep? Three reasons:

Structured output — search tools return structured result arrays. The model parses JSON reliably. Shell text output requires parsing that the model can get wrong.
Lighter permissions — read-only tools don’t require the same level of permission confirmation as bash commands. More searches get through without interrupting the user.
Predictable performance — dedicated tools apply result limits and optimization strategies (parallel file traversal, skip binary files) that generic shell commands don’t.

What to Take for Your Own Agent

The tool system patterns that transfer to any agent harness:

1. Interface contract over function pointers. Define a typed interface every tool must implement. Let the type system enforce completeness. No tool ships without a schema, permission model, and concurrency declaration.

2. Schema as the single source of truth. Use one schema definition for both runtime validation and API documentation. Zod, Pydantic, JSON Schema — the specific library matters less than the principle.

3. Fail-closed defaults. isConcurrencySafe defaults to false. isDestructive defaults to true. Require explicit opt-in for optimizations and lighter treatment. Forgetting to declare safety produces slow output, not broken output.

4. Concurrency partitioning from day one. Add the isConcurrencySafe flag before you have many tools, not after. Tag your tools as you write them. The scheduler writes itself once the information is there.

5. Deferred loading for large tool sets. If you’ll have more than 20–30 tools (especially from external sources), implement deferred discovery before launch. Token costs accumulate quickly.

Key Takeaways

Tools in a production harness are contracts, not functions. They declare schema, permissions, concurrency safety, and UI rendering alongside their logic.
Zod (or equivalent) serves as a single source of truth for both input validation and API documentation. One definition, two uses.
buildTool factory with fail-closed defaults means forgetting to declare safety produces slower execution, not unsafe execution.
Concurrency partitioning: consecutive safe tools parallelize; any unsafe tool serializes. Results are always emitted in request order regardless of execution order.
The StreamingToolExecutor’s four-stage state machine (queued → executing → completed → yielded) enables starting execution before the model finishes generating — significantly reducing end-to-end latency.
Deferred loading saves 10,000+ tokens per call when tool counts are large.

What’s Next

In Part 4: The Permission Pipeline — Safety That Doesn’t Get in the Way, we go inside the permission system:

The Fail Fast pipeline: four stages that reject requests as early as possible
Why “deny always wins” is not just policy — it’s architecture
Five permission modes from strictest to most permissive, and when each makes sense
The ResolveOnce pattern: atomic race resolution for concurrent approval requests
How Claude Code’s BashTool applies three matching strategies for fine-grained command control

References

Tool system design

Building Effective Agents — Anthropic Research
Claude Code Overview — Official docs
12 Agentic Harness Patterns from Claude Code — Generative Programmer

Concurrency and streaming

Dive into Claude Code: Design Space of AI Agent Systems — arxiv
Complete guide to resolving Claude Code tool use concurrency errors — Apiyi

Tool interface patterns

Anatomy of an Agent Harness — Daily Dose of Data Science
Harness Design for Long-Running Apps — Anthropic Engineering

SherlockLiu

Build Your Own Agent Harness: The Practical Blueprint (Part 12)

Question 1: Do You Actually Need a Harness?

Question 2: Build Your Own or Use a Platform?

Question 3: What to Steal from Claude Code

1. Loops over recursion

2. Schema-driven, not hard-coded

3. Progressive permissions with a clear winner

4. Layered config with defined merge semantics

5. Memory is a clue, not a conclusion

6. Compress proactively, not reactively

7. Extension without forking

8. Minimum necessary context and tools for sub-agents

9. Streaming first, everywhere

10. Read before you write

The Smart Way to Start: The Agent Harness Kit

What This Series Has Actually Been About

References

Plan Mode: The Architecture of Thinking Before Acting (Part 11)

The Problem: Premature Action

The Mode Switch: How Read-Only Becomes Enforced

The Sub-Agent Constraint

Exiting Plan Mode: The Approval Gate

Plan-Execute Workflow in Practice

Background Scheduling: Cron and Remote Triggers

Two User Models: External vs. Internal

Key Takeaways

What’s Next

References

Streaming Architecture: Building Agents That Feel Fast (Part 10)

QueryEngine: The Session State Owner

Streaming vs. Non-Streaming: The Real Performance Difference

Streaming Processing: Token by Token

Incremental JSON Parsing

StreamingToolExecutor: Execute on Arrival

Safe vs. Unsafe: The Concurrency Matrix

Order Guarantee

Sibling Abort on Bash Failure

Startup Performance: Parallel Prefetch and Lazy Load

Prompt Cache Strategy

Key Takeaways

What’s Next

References

Sub-Agents, Coordinators, and Skills: Multi-Agent Orchestration (Part 9)

Four Built-In Agent Types: Specialist Design

Explore: Read-Only Code Archaeology

Plan: Software Architect

General Purpose: Default Executor

Verification: Adversarial Tester

The Fork Pattern: Cache-Safe Parallel Execution

How Cache Sharing Works

The Byte-Level Cache Requirement

Recursive Fork Protection

The Coordinator Pattern: Centralized Orchestration

The Coordinator’s Tool Set

Coordinator vs. Fork: When to Use Each

Skills: Packaged Reusable Behaviors

Four-Level Skill Hierarchy

Built-in Skills

MCP: The External Capability Protocol

Why a Standard Matters

Eight Transport Protocols

MCP Tools Are First-Class Citizens

Seven Configuration Scopes

The Capability Hierarchy

Key Takeaways

What’s Next

References

The Hook System: Extension Points That Don’t Break the Core (Part 8)

Five Hook Types: Choosing the Right Execution Engine

Command Hook: The Default Choice

Prompt Hook: When Rules Can’t Express It

Agent Hook: Multi-Step Validation

HTTP Hook: External System Integration

Function Hook: Runtime-Only

Three Execution Modes for Command Hooks

26 Lifecycle Events: The Agent’s Observable Moments

The Tool Call Sandwich: PreToolUse / PostToolUse / PostToolUseFailure

UserPromptSubmit: The Translation Layer

Stop: The Completion Gate

`user` — Who You’re Working With

`feedback` — Validated Rules

`project` — Why Things Are the Way They Are

`reference` — External System Pointers

The Security Boundary: Why `projectSettings` Is Treated Differently

Enterprise Mode: `allowManagedHooksOnly`