Agentic Coding for Engineers Who Already Ship | GRAIsol

Agentic coding · for engineers

Agentic coding for engineers who already ship.

Not “AI will 10x you.” Not vibe coding for people who can't read the diff. This is how to operate AI coding agents at a professional level when you already write software for a living — context engineering, using Claude Code well, AI pair programming, broad prompting, and verifying the work instead of trusting it.

The POV

The model is not the skill. The way you operate it is.

Most of what gets written about coding with AI is aimed at people who can't yet code, or at people selling you the dream that you won't have to. This isn't that. If you already ship software, the interesting question isn't “can AI write code?” — it obviously can — it's how you get reliable work out of a tool that is, by default, an optimistic finisher.

Because that's the catch. An agent will implement the happy path, declare victory, and move on. It will tell you it's done. “I ran the tests and they passed” and “the tests passed” are different sentences, and the whole game is learning to tell which one you're holding. The leverage isn't in the model — everyone has the same models. It's in how you scope the task, engineer the context, push past the first answer, and verify what comes back.

So this page is a practitioner's map, not a hype reel. Six practices, each with my take and a deep-dive if you want to go further. Then the runtime I built to do all of this for real. Read it like a colleague handed it to you, not like a landing page.

The practices

Six things engineers who ship with agents do differently.

01

Context engineering, not prompt cleverness

The first instinct is to write a better prompt. The leverage is in the context — what the model can actually see when it reaches for an answer. The right files, the failing test, the constraint that isn't written down anywhere, the directory it's allowed to touch. Get the context right and a blunt prompt works; get it wrong and the cleverest prompt in the world hallucinates an API that doesn't exist. Treat the context window as a thing you engineer, deliberately, the same way you'd engineer the inputs to any other system.
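One way to make "engineer the context" concrete is to build the context as a data structure before it ever becomes a prompt. The sketch below is a minimal illustration, not any tool's real format: `ContextBundle`, its field names, and the section labels are all hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ContextBundle:
    """Everything the agent gets to see for one task (hypothetical structure)."""

    task: str
    files: list[Path] = field(default_factory=list)
    failing_test_output: str = ""
    constraints: list[str] = field(default_factory=list)
    allowed_dir: Path = Path(".")

    def render(self, max_chars_per_file: int = 4000) -> str:
        """Assemble the context as plain text in a fixed, reviewable order."""
        parts = [f"TASK\n{self.task}"]
        if self.constraints:
            # Unwritten constraints only help once they are written down here.
            parts.append("CONSTRAINTS\n" + "\n".join(f"- {c}" for c in self.constraints))
        parts.append(f"ALLOWED DIRECTORY\n{self.allowed_dir}")
        if self.failing_test_output:
            parts.append("FAILING TEST OUTPUT\n" + self.failing_test_output)
        for path in self.files:
            # Truncate each file so one large file cannot crowd out the rest.
            text = path.read_text(encoding="utf-8")[:max_chars_per_file]
            parts.append(f"FILE {path}\n{text}")
        return "\n\n".join(parts)
```

The point of the structure is that the context becomes something you can diff and review: if the agent invents an API, you can check whether the file that defines the real one was ever in `files`.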

Advanced prompt engineering, with real examples

02

Using Claude Code (and tools like it) well

Knowing how to use Claude Code well is mostly knowing when to let it run and when to stop it. Scope the task so the agent has a clear target and a tight blast radius. Give it the commands to verify its own work — the test, the typecheck, the build — and make running them part of the loop, not a thing you do afterward. The engineers who get the most out of these tools aren't the ones who type the least; they're the ones who set up the problem so the agent can't drift far before it hits a wall it has to climb.
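As a rough sketch of "make running them part of the loop," the helper below runs a named set of verification commands in order and stops at the first failure, so the failing output is what goes back to the agent. It is a generic harness, not part of Claude Code; `CheckResult` and `run_checks` are hypothetical names, and the commands you pass in are whatever your project already uses.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    command: list[str]
    exit_code: int
    output: str

    @property
    def passed(self) -> bool:
        return self.exit_code == 0


def run_checks(checks: dict[str, list[str]], timeout: int = 600) -> list[CheckResult]:
    """Run each check in insertion order and stop at the first failure."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
        result = CheckResult(name, command, proc.returncode, proc.stdout + proc.stderr)
        results.append(result)
        if not result.passed:
            break  # the failing output is the next thing the agent should see
    return results
```

A call might look like `run_checks({"tests": ["pytest", "-q"], "typecheck": ["mypy", "."]})`, assuming those tools are what the project uses. The wall the agent has to climb is the first `CheckResult` whose `passed` is false.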

Cloud coding agents compared: Cursor, Codex, Copilot

03

AI pair programming that earns its name

A good pair pushes back. Most AI pair programming setups don't — the model agrees with you, implements the happy path, and tells you it's done. The fix is to make disagreement part of the workflow: ask it to argue against its own design, to find what breaks under load, to name the edge case it skipped. Pairing with an agent is worth it when you treat its first answer as a draft to interrogate, not a verdict to accept. The reviewer in the loop is still you.

Meta prompting: getting AI to write its own prompts

04

Broad prompting: never accept "I'm done"

Broad prompting is my own frame for it, and it's the single habit that separates working code from production code. Your agent will stop at "it works." You shouldn't. After every implementation, audit it broadly — security, error handling, edge cases, performance, the failure modes nobody asked about — then feed that audit back in as the next prompt. The difference between code that runs on your machine and code that survives production is usually about ten rounds of "what else could go wrong?" That's not a mindset; it's a loop you run on purpose.
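The loop can be sketched in a few lines. This is an illustration under assumptions, not a fixed recipe: `ask_agent` stands in for whatever function sends a prompt to your agent and returns its reply, the prompt wording and the `NO FINDINGS` sentinel are invented for the example, and the default of ten rounds simply mirrors the figure above.

```python
from typing import Callable

# Audit areas taken from the paragraph above.
AUDIT_AREAS = [
    "security",
    "error handling",
    "edge cases",
    "performance",
    "failure modes nobody asked about",
]


def audit_prompt(area: str) -> str:
    """Build one audit prompt (wording is illustrative)."""
    return (
        f"Audit the implementation you just wrote for {area}. "
        "List concrete problems with file and line, or reply exactly NO FINDINGS."
    )


def broad_prompting_loop(
    ask_agent: Callable[[str], str], max_rounds: int = 10
) -> list[tuple[str, str]]:
    """Cycle through the audit areas, feeding each set of findings back in."""
    history = []
    for round_index in range(max_rounds):
        area = AUDIT_AREAS[round_index % len(AUDIT_AREAS)]
        findings = ask_agent(audit_prompt(area))
        history.append((area, findings))
        if findings.strip() != "NO FINDINGS":
            # The audit becomes the next instruction, as described above.
            ask_agent("Fix the problems below, then re-run the checks.\n\n" + findings)
    return history
```

The returned history is worth keeping: it records which areas were audited and what came back, which is more useful in review than a final "done."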

Agentic coding with broad prompting: the iterative loop

05

Verification: finished is not correct

An agent edits eleven files, runs a command, and reports a green checkmark. "I ran the tests and they passed" and "the tests passed" are different sentences, and the gap between them is where the failures live. The agent's account of its own run is a story it tells after the fact, reconstructed from a context window it has probably already compacted. So don't trust the summary — read the receipt. Re-derive the diff, re-run the check yourself, and demand an audit trail you can inspect without trusting the thing that produced it.
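In a git repository, `git diff` is the usual way to re-derive what changed. For a directory that isn't under version control, a minimal sketch of the same idea is to hash every file before and after the run and compare the result with the list of files the agent says it touched. The function names here are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (by relative POSIX path) to a SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> set[str]:
    """Files added, removed, or modified between two snapshots."""
    return {
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    }


def unreported_changes(
    before: dict[str, str], after: dict[str, str], claimed: set[str]
) -> set[str]:
    """Files that changed on disk but are missing from the agent's own summary."""
    return changed_files(before, after) - claimed
```

Take one snapshot before handing over the task and one after. Anything in `unreported_changes` is a place where the agent's story and the disk disagree, and that is where to start reading.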

Don't trust your agent. Read the receipt.

06

The agent-shaped gap

There's a gap between what an agent can do and what it's actually asked to do, and most of the disappointment with these tools lives in that gap. Closing it doesn't take a bigger model. It takes orchestration: scoping, sub-agents, verification gates, the workflow that turns a capable model into a reliable one. That gap is exactly why I started building tooling around the raw API instead of waiting for the next release to fix it for me.

The agent-shaped gap: why orchestration became a plugin

The runtime

I didn't just write about this. I built the runtime.

Agent AFK is an open-source agent runtime I built on Anthropic's raw Messages API: a CLI, a background daemon, and a Telegram bot sharing one session manager and a library of orchestration skills. It exists because the practices on this page are easier to preach than to do by hand, every time, under deadline. So I made the loop something the software runs, not a discipline I have to remember.

The part I care about most is the receipt. Every session writes an append-only trace of every tool call, every sub-agent, and every decision — so when an agent runs unattended and comes back with a green checkmark, you don't have to take its word for it. You read what actually happened, line by line. That's verification built into the runtime instead of bolted on after.
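To show the general shape of an append-only trace, here is a minimal sketch that writes one JSON object per line and only ever opens the file in append mode. It is not Agent AFK's actual trace format or code; the function names, the event fields, and the JSON Lines layout are all assumptions made for illustration.

```python
import json
import time
from pathlib import Path


def append_event(trace_path: Path, kind: str, **detail) -> None:
    """Append one event as a single JSON line; earlier lines are never rewritten."""
    event = {"ts": time.time(), "kind": kind, **detail}
    with trace_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")


def read_trace(trace_path: Path) -> list[dict]:
    """Read the trace back in the order the events were written."""
    with trace_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A session might call `append_event(trace, "tool_call", tool="run_tests", exit_code=1)` each time a tool runs. Because events are recorded as they happen rather than summarized afterward, the trace does not depend on what the agent remembers about its own run.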

It's Apache-2.0, runs against your own key or subscription, and works with Claude, GPT, and local models. The best way to understand the practices above is to see them wired into something real.

FAQ

Straight answers.

What is agentic coding?

Agentic coding is writing software with AI agents that can accept a goal, take actions (read files, run commands, edit code, run tests), and iterate toward a result, instead of just autocompleting a line. For engineers, the work shifts: less typing the implementation, more scoping the task, engineering the context, and verifying the output. It's a way of operating, not a feature you turn on.

Is this for beginners or "vibe coders"?

No. This is for working engineers who already ship — people who can read a diff, write a test, and tell when an agent is confidently wrong. It is not "learn to code," and it is not vibe coding, where you accept whatever the model produces because you can't evaluate it. The whole premise here is that you can evaluate it, and the skill is in doing that well at speed.

What is context engineering?

Context engineering is deliberately controlling what the model can see when it works — the relevant files, the failing test, the constraints, the directory it's allowed to touch. It's higher-leverage than prompt wording: get the context right and a blunt prompt works; get it wrong and no prompt saves you. Think of the context window as an input you design, not a box you dump text into.

Do I need Claude Code to do agentic coding?

No — the practices here (context engineering, broad prompting, verification) apply across tools: Claude Code, Cursor, Codex, your own harness. Claude Code is a strong, accessible place to start. If you haven't set it up yet, there's a step-by-step install guide on the blog. The principles matter more than the specific tool; the tool is just where you apply them.

What is broad prompting?

Broad prompting is my own term for refusing to accept "I'm done" from an agent. After it implements something, you prompt it to audit the work broadly — security, error handling, edge cases, performance, failure modes — and feed that back in as the next instruction. It's a loop, run on purpose, that turns code that works into code that survives production.

The fastest way to get this is to watch it run.

Agent AFK puts these practices into a runtime whose receipts you can read. Open source, your own key, no signup.

Leading a team that needs to adopt this? See the Adoption Sprint →