Letters Home from MCP Audit Camp: Multi-Agent Observability That Reads Like Mail
We needed to audit 22 MCP tool handlers across the OpZero codebase — schemas, deploy logic, project management, and blog tooling. Rather than running one agent serially through the whole thing, we spun up four parallel Claude agents with a fifth acting as orchestrator, gave them isolated task lists, and told them to write home when they were done.
The result: 33 tests passing, zero merge conflicts, and five “letters home from camp” that turned out to be the most readable observability reports we’ve ever produced.
The Setup
The orchestrator (team-lead) read the entire codebase first — all 22 tool handlers, every schema, every definition file — then divided work into four squads:
- schemas-agent: Tool definitions, validation schemas, parameter wiring
- deploy-agent: Deployment metadata and completeness tracking
- projects-agent: Project management, system status, cleanup tooling
- blog-agent: Content tooling and author attribution
Each agent got a scoped task list. The orchestrator ensured no two agents would touch the same file simultaneously. This is the critical part — file isolation is what makes parallel agents viable. Without it, you’re just generating merge conflicts at machine speed.
Here are the actual letters they sent back:
What Actually Got Fixed
The audit wasn’t cosmetic. Real bugs were found and shipped:
Schemas-agent wired the target parameter (Cloudflare, Netlify, or Vercel) across four deploy tools that previously only supported one provider. It also rewrote tool descriptions that were hiding capabilities from the LLM consuming them: the model calling the tools didn't know what they could do, so it never used them fully. A reminder that MCP tool descriptions are prompts, not documentation.
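A minimal sketch of what that wiring looks like. The tool name, field names, and description wording here are illustrative, not OpZero's actual definitions:

```typescript
// Hypothetical MCP tool definition with the shared `target` parameter.
// The description is a prompt for the calling LLM: name every capability,
// or the model will never reach for it.
const deployToolDefinition = {
  name: "deploy_site",
  description:
    "Deploy a built site to Cloudflare, Netlify, or Vercel. " +
    "Pass `target` to choose the provider; defaults to cloudflare.",
  inputSchema: {
    type: "object",
    properties: {
      projectId: { type: "string", description: "Project to deploy" },
      target: {
        type: "string",
        enum: ["cloudflare", "netlify", "vercel"],
        description: "Hosting provider to deploy to",
        default: "cloudflare",
      },
    },
    required: ["projectId"],
  },
};
```

The point of the rewrite is the description string, not the schema: a capability the description never mentions is invisible to the model.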
Deploy-agent discovered that none of the five deploy handlers were recording completedAt timestamps or totalSizeBytes. Deployment records existed but had no concept of “done” or “how big.” Both fields now get stamped using Buffer.byteLength across all five deploy paths.
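The shape of that fix, with hypothetical record and field names (the source confirms only the two field names and the use of Buffer.byteLength):

```typescript
// Sketch: on success, every deploy path now stamps a completion time
// and the payload size in bytes. Record shape is an assumption.
interface DeploymentRecord {
  id: string;
  status: "pending" | "complete";
  completedAt?: string;     // ISO timestamp; previously never set
  totalSizeBytes?: number;  // previously never set
}

function markDeployComplete(record: DeploymentRecord, payload: string): DeploymentRecord {
  return {
    ...record,
    status: "complete",
    completedAt: new Date().toISOString(),
    // Buffer.byteLength measures encoded size, not string length,
    // so multi-byte characters are counted correctly.
    totalSizeBytes: Buffer.byteLength(payload, "utf8"),
  };
}
```

Using Buffer.byteLength rather than `payload.length` matters because the two diverge as soon as content contains non-ASCII characters.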
Projects-agent fixed a ghost project bug where deleted projects were being counted in system status, found that counters were returning string-typed numbers instead of actual numbers, added duplicate name disambiguation to delete/archive operations, and wired deployment counts into project listings using correlated subqueries.
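Two of those fixes can be sketched together. Table and column names below are assumptions; the pattern is what matters: exclude soft-deleted rows at the query boundary, and coerce counts that some drivers return as strings:

```typescript
// Hypothetical system-status query: the correlated subquery attaches a
// deployment count to each project, and the WHERE clause is the
// ghost-project fix (deleted projects no longer counted).
const statusQuery = `
  SELECT
    p.id,
    p.name,
    (SELECT COUNT(*) FROM deployments d
      WHERE d.project_id = p.id) AS deploy_count
  FROM projects p
  WHERE p.deleted_at IS NULL
`;

// Defensive coercion at the boundary: some drivers hand back "3" not 3.
function normalizeCount(raw: unknown): number {
  const n = typeof raw === "string" ? Number.parseInt(raw, 10) : Number(raw);
  return Number.isFinite(n) ? n : 0;
}
```

Coercing once, where rows leave the database layer, keeps string-typed numbers from leaking into every caller's arithmetic.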
Blog-agent tracked down a mystery author name hardcoded in the database (not the source code — the agent searched the entire codebase first and came up empty, then correctly identified the database as the source). It also demonstrated healthy task deduplication when it arrived at a second task to find it already completed by projects-agent.
The Letters as Observability
Here’s the technique that surprised us: we asked each agent to write its completion report as a “letter home from camp.” The constraint of the format produced reports that are:
- Naturally scoped — each letter covers exactly one agent’s work
- Plain-language explanations — the camp metaphor forces agents to describe technical work accessibly, which makes review faster than reading commit diffs
- Dependency-aware — blog-agent’s letter naturally mentions arriving at a task already completed by projects-agent, surfacing the orchestration graph without requiring explicit dependency tracking
- Completeness-signaling — the sign-off format creates a clear “done” signal, and the camp counselor letter serves as the aggregation summary
Compare reading five of these letters to reading a git log with 15+ commits across four branches. The letters are scannable in about two minutes. The git log requires context-switching between diffs, understanding file paths, and mentally reconstructing what each change actually accomplished.
The Orchestration Pattern
The team-lead agent’s workflow:
- Read the full codebase to build a dependency map
- Partition tasks by file ownership — no two agents share files
- Distribute task lists to each agent
- Collect completion reports (the letters)
- Run the full test suite (33 passed)
- Build for production
- Ship to a PR branch
The key insight is that step 2 is where most multi-agent attempts fail. If you don’t enforce file isolation, agents will generate conflicting edits that require manual resolution — defeating the purpose of parallelism. The orchestrator needs to understand the codebase well enough to draw clean boundaries.
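The isolation invariant from step 2 is cheap to check mechanically. A minimal sketch, assuming the orchestrator holds each agent's file list before dispatching any work:

```typescript
// Verify no file is assigned to two agents. Returns the conflicting
// paths; an empty array means the partition is safe to dispatch.
function findOwnershipConflicts(assignments: Record<string, string[]>): string[] {
  const owner = new Map<string, string>();
  const conflicts: string[] = [];
  for (const [agent, files] of Object.entries(assignments)) {
    for (const file of files) {
      const existing = owner.get(file);
      if (existing && existing !== agent) conflicts.push(file);
      else owner.set(file, agent);
    }
  }
  return conflicts;
}
```

Running a check like this before dispatch turns "zero merge conflicts" from a hope into a precondition.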
What This Means for Agent Tooling
If you’re building MCP servers or agent-powered platforms, consider that your agents’ work products need to be auditable by humans. Structured JSON logs are machine-readable but painful to review. Commit messages are terse. PR descriptions are often AI-generated boilerplate.
A constrained narrative format — like these camp letters — sits in a sweet spot: structured enough to be consistent, human enough to be scannable, and expressive enough to capture the reasoning behind changes, not just the changes themselves.
We’re considering building this pattern into OpZero as a first-class feature: after any multi-agent workflow completes, generate a readable summary of what happened and why. Not a changelog. Not a diff. A story.
Try it yourself
The interactive letters above are the actual agent completion reports from the audit run. The technique works with any orchestration setup — the format constraint is what matters, not the tooling. Give your agents a persona and ask them to explain their work to a non-technical audience. The results are consistently more useful than structured logs.
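One way to apply the constraint, as a prompt fragment appended to each agent's task list. The wording here is ours, not a fixed spec:

```typescript
// Hypothetical helper: build the "letter home" instruction for one agent.
function letterHomePrompt(agentName: string): string {
  return [
    `When all of your tasks are done, write a completion report as a`,
    `"letter home from camp", signed as ${agentName}.`,
    `Explain what you fixed in plain language a non-engineer could follow,`,
    `mention any task you found already completed by another agent,`,
    `and end with a clear sign-off so the orchestrator knows you're done.`,
  ].join(" ");
}
```

The three clauses map to the observability properties above: plain language for fast review, cross-agent mentions to surface the dependency graph, and the sign-off as the completeness signal.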
Built with parallel Claude Opus agents on the OpZero platform. 33 tests. Zero merge conflicts. Five letters home.