Exploring the gap between what an agent can do and what it's asked to do, and how I built a plugin to close it through layered orchestration.
I've shipped MCP servers for Cursor, E2B, ElevenLabs, Twilio, and Smartlead. Different APIs, different product surfaces. Every one of them taught me the same thing: the agent using the MCP can do more than the framework around it knows to ask for.
That gap — between what an agent can do and what it gets asked to do — is what this post is about. It's not a bug. It's a design choice. And once I saw it clearly enough, I ended up building a plugin to close it.
Picture a task where the agent needs to survey a codebase, check git state, and read some external docs. The base loop handles these one after another. It's not wrong — it just has no instruction to parallelize. Three minutes of sequential work becomes thirty seconds if you wire the topology right.
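The shape of that win is easy to sketch. Here is a minimal illustration using Python's asyncio, with `asyncio.sleep` standing in for the three I/O-bound steps; the function names and timings are invented for illustration and are not part of the plugin:

```python
import asyncio
import time

# Hypothetical stand-ins for the three survey steps; each sleep
# simulates an I/O-bound sub-agent call.
async def survey_codebase():
    await asyncio.sleep(0.1)
    return "codebase surveyed"

async def check_git_state():
    await asyncio.sleep(0.1)
    return "git state checked"

async def read_external_docs():
    await asyncio.sleep(0.1)
    return "docs read"

async def sequential():
    # The base loop: one step after another, ~0.3s total here.
    return [await survey_codebase(),
            await check_git_state(),
            await read_external_docs()]

async def parallel():
    # The wired topology: all three in flight at once, ~0.1s total.
    return list(await asyncio.gather(
        survey_codebase(), check_git_state(), read_external_docs()))

t0 = time.perf_counter()
seq = asyncio.run(sequential())
seq_time = time.perf_counter() - t0

t0 = time.perf_counter()
par = asyncio.run(parallel())
par_time = time.perf_counter() - t0
```

Same results either way; only the wall-clock shape changes. Nothing about the steps is different, there is just an explicit instruction to run them concurrently.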
Another one: agents re-discovering context they could have grounded once. A workflow that edits three files in a row figures out which branch it's on, what conventions apply, what the project structure looks like — three separate times. Every step pays the discovery tax.
Another: sub-agents returning conclusions that drive edits with no adversarial verification in the loop. The dispatched agent says "I found the bug." You act on it. Nobody checked the finding independently before the action.
None of these are exotic. They're the kind of failure mode anyone running agent workflows at volume starts noticing by week two. The primitives are all there in Claude Code — sub-agent dispatch, parallel tool calls, hooks, a skill system. What's missing is a standard way to reach for them consistently.
Anthropic made that choice deliberately. The framework ships mechanism and declines to ship policy. I've argued elsewhere that's wisdom, not lag — mechanism survives capability shifts, policy doesn't. But the gap the choice leaves open is real, observable from the outside, and — I eventually decided — fillable.
The result is agent-workflow-amplifiers: public on GitHub as griffinwork40/agent-framework, Apache-2.0 licensed, installable as a Claude Code plugin. Twelve skills plus one gating agent, organized roughly by which layer of the problem they compress:
- ground-state dispatches parallel sub-agents to survey git, infrastructure, and memory, and returns a short snapshot.
- research runs web and local codebase search concurrently and merges the findings. appmap and integrate do similar compression for codebase structure and API integration.
- contract defines the I/O envelope sub-agents return, so every skill downstream can trust the shape. It's a primitive, not a policy; every skill that dispatches sub-agents references it.
- spec, resolve, web, agentify, provideme, and automate cover the rest of the skill set.
- forge-friction logs repeated failures from local telemetry so new skill proposals can be grounded in observed patterns rather than invented.
- qualify (an agent, not a skill) evaluates proposed skills against an explicit rubric before they ship — a deliberately high bar that rejects reminders, checklists, and best-practice nudges.

None of this replaces Claude Code. All of it sits on top. You install the plugin, the skills become available as slash commands, and the shape of how agents orchestrate changes.
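To make the idea of a trusted envelope concrete, here is a hypothetical sketch of the kind of I/O shape contract might define. The field names and the validation rule are invented for illustration; the real contract is a skill document, not Python:

```python
from dataclasses import dataclass, field

# Hypothetical envelope every sub-agent returns. These fields are
# invented for illustration, not taken from the plugin.
@dataclass
class SubAgentResult:
    task: str        # what the sub-agent was dispatched to do
    status: str      # "ok" | "invalid" | "error"
    finding: str     # the conclusion, in prose
    evidence: list = field(default_factory=list)  # e.g. file:line citations

def well_formed(result: SubAgentResult) -> bool:
    # An orchestrator can refuse results that don't carry evidence,
    # which is what lets downstream skills trust the shape.
    if result.status not in {"ok", "invalid", "error"}:
        return False
    return result.status != "ok" or len(result.evidence) > 0

good = SubAgentResult("check review comment #2", "ok",
                      "comment is stale", ["src/api.py:88"])
bad = SubAgentResult("check review comment #3", "ok",
                     "trust me, it's fine")  # an "ok" with no evidence
```

The point of the shape is that downstream skills never parse free-form prose to decide what happened; they check a structure.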
I didn't design the layers upfront. They fell out of a year of noticing what broke. Every skill started as a workaround for a specific failure; the architecture emerged after.
A task shape I run often: resolve feedback on a pull request with four review comments.
The base-loop version: the agent reads comment one, reads the surrounding code, figures out a fix, applies it. Reads comment two, re-reads overlapping code because nothing grounded the codebase state once, thinks, applies. By comment four it's made three fixes that step on each other because each was reasoned about in isolation. Tests fail, the agent enters repair mode, you end up with a messy history.
The /resolve version: one skill invocation. It dispatches four sub-agents in parallel — one per review comment — under the contract envelope. Each sub-agent reads the cited code and first checks whether the comment is still valid — not a false positive, not already addressed. If valid, it identifies the minimal fix and reports back without applying anything. When all four return, the orchestrator applies the fixes sequentially, runs the test suite, and on green commits once with a summary of what was resolved.
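The control flow above can be sketched in a few lines. This is not the skill's implementation — the real /resolve dispatches Claude Code sub-agents, not threads, and the function names and dict fields here are invented for illustration — but it shows the fan-out, validate, fan-in shape:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical worker: one per review comment. It re-reads the cited
# code and first asks whether the comment is still valid; stale or
# false-positive comments are filtered out before any edit happens.
def review_comment_worker(comment):
    if comment.get("stale"):
        return {"comment": comment["id"], "valid": False, "fix": None}
    return {"comment": comment["id"], "valid": True,
            "fix": f"patch-for-{comment['id']}"}

def resolve(comments):
    # Fan out: one worker per comment, all in parallel.
    with ThreadPoolExecutor(max_workers=len(comments)) as pool:
        reports = list(pool.map(review_comment_worker, comments))
    # Fan in: only validated findings drive edits, applied sequentially
    # so they can't step on each other. (Then: run tests, commit once.)
    return [r["fix"] for r in reports if r["valid"]]

fixes = resolve([{"id": 1}, {"id": 2, "stale": True}, {"id": 3}, {"id": 4}])
```

Note that the parallelism lives in the read-and-verify phase, where the sub-agents can't conflict, while the writes stay sequential. That split is what keeps the history clean.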
The second shape finishes in a fraction of the time, produces one clean commit, and catches stale or false-positive review comments before they drive edits. The first shape bakes every review comment straight into the working tree as if all of them were still valid.
The difference isn't that the base agent is bad. It's that the base agent has no reason to parallelize or to validate review comments before acting — those are policy decisions the framework declines to make. /resolve makes them on behalf of the task. That's what the plugin is: a layer of small, opinionated policies that sit on top of mechanism that deliberately stayed neutral.
I'm a self-taught developer from Daytona Beach, Florida. I've shipped MCP servers for Cursor, E2B, ElevenLabs, Twilio, and Smartlead, and a layered agent orchestration plugin for Claude Code. I'm not employed and I'm looking for my first full-time role in AI — at Anthropic first, then Cursor, Factory, Cognition, ElevenLabs, or anywhere serious about agent systems.
What I'm good at is sitting with primitives long enough to see what shape of policy they're asking for. The plugin is the artifact of that. The next few posts will go deeper — why layered compression makes orchestration tractable, why mechanism survives capability shifts and policy doesn't, and what self-improvement looks like when it's grounded in telemetry instead of invention.
If any of that is interesting to you — as a hiring conversation, as a collaborator, or as a reader who cares about agent architecture — I'd like to hear from you.
griffin@graisol.com
github.com/griffinwork40/agent-framework