For three years the most valuable AI skill on a software team was prompt engineering: the craft of phrasing a single instruction so a model would do what you meant on the first try. In June 2026, Addy Osmani put a name to the skill that is quietly replacing it. He called it loop engineering — building on work by Peter Steinberger and Anthropic's Boris Cherny — and the distinction is not cosmetic. Prompt engineering optimizes one instruction you type by hand, one turn at a time. Loop engineering optimizes the autonomous system that decides what to prompt, when to prompt it, and whether the result is acceptable. The first treats the agent as a tool you hold. The second treats it as a long-running process you design.
The Shift: From a Tool You Hold to a Process You Design
The clearest way to understand loop engineering is to look at what each discipline actually optimizes. Prompt engineering is a craft of phrasing. You sit at the keyboard, you describe a task, you read the output, and if it is wrong you rephrase and try again. The unit of work is a turn. The human is in the loop on every iteration, supplying judgment, context, and correction. The model is a stateless function you call repeatedly; the intelligence about what to do next lives in your head.
Loop engineering moves that intelligence out of your head and into a system. Instead of you deciding what to prompt next, a controller decides. Instead of you noticing that a task came in overnight, a scheduled automation triages it. Instead of you eyeballing whether the output is good, a separate checking agent evaluates it against criteria you defined once. The model is no longer a function you call — it is one stage in a process that has memory, scheduling, evaluation, and orchestration around it.
"The leverage is no longer in the prompt. It's in the loop that decides which prompts to run, in what order, and whether to keep going."
This is the same transition software went through when it moved from scripts to services. A script is something you run by hand when you need it. A service is something that runs continuously, handles events as they arrive, persists state, and recovers from failure. Prompt engineering is scripting the model. Loop engineering is turning the model into a service — with all the infrastructure concerns that implies. The teams who internalize this are pulling away from the teams still optimizing individual prompts, in the same way that the engineers who orchestrate ten agents are pulling away from the ones who write the cleverest single request.
Prompt engineering: optimizing the instruction
- Unit of work is a single turn
- Human supplies judgment every iteration
- Model is a stateless function you call
- Context lives in your head and the prompt
- Quality control is you re-reading the output
Loop engineering: optimizing the system
- Unit of work is a long-running process
- A controller decides what to prompt next
- Model is one stage in a stateful pipeline
- Context lives in skills and a memory store
- Quality control is a separate checker agent
The Anatomy of a Loop: Five Components Plus Memory
A loop is not a single clever prompt and it is not a single agent running in a terminal. It is an assembly of five distinct components, each solving a problem that prompt engineering simply cannot reach, organized around a shared memory store that gives the whole system continuity across runs. The genius of the framing is that it names the parts. Once you can name them, you can build them, test them, and replace them independently.
1. Scheduled Automations for Discovery and Triage
The first component is the thing that decides the loop should run at all. In prompt engineering, you are the trigger — nothing happens until you type. In a loop, scheduled automations do discovery and triage on their own. A cron job kicks off an agent every morning to scan new issues, label them, identify the ones that match a known-fixable pattern, and queue them for work. Another runs hourly to watch a CI dashboard and open a draft fix the moment a flaky test trips. The human never initiated any of it; the loop noticed the work and started it.
This is the component that converts an agent from reactive to proactive. It is also where most teams start, because it delivers value without ceding much trust — a triage agent that only labels and queues, but does not act, is low-risk and immediately useful.
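A label-and-queue triage pass can be very small. The sketch below is one possible shape, not a prescribed implementation: the `Issue` type, the `KNOWN_FIXABLE` patterns, and the `triage` function are all hypothetical names invented for illustration, and fetching issues from a real tracker is left out.

```python
import re
from dataclasses import dataclass, field

# Hypothetical "known-fixable" patterns; a real team would maintain its own list.
KNOWN_FIXABLE = {
    "flaky-test": re.compile(r"\bflaky\b|\bintermittent\b", re.IGNORECASE),
    "dependency-bump": re.compile(r"\bupgrade\b|\bbump\b", re.IGNORECASE),
}


@dataclass
class Issue:
    number: int
    title: str
    labels: list[str] = field(default_factory=list)


def triage(issues: list[Issue]) -> list[Issue]:
    """Label each issue in place and return the ones queued for agent work.

    This pass only reads and labels; it never modifies code or closes issues.
    """
    queue = []
    for issue in issues:
        matched = [name for name, pattern in KNOWN_FIXABLE.items()
                   if pattern.search(issue.title)]
        issue.labels.extend(matched)
        if matched:
            queue.append(issue)
    return queue
```

Scheduling is deliberately outside the function: a cron entry such as `0 7 * * *` (every day at 07:00) pointing at a script that calls `triage` would give the "every morning" behavior described above.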
2. Git Worktrees So Parallel Agents Don't Collide
The moment you run more than one agent at a time, you have a concurrency problem. Two agents editing the same working directory will clobber each other's changes, corrupt each other's builds, and produce a merge nightmare. Git worktrees solve this by giving each agent its own checked-out copy of the repository, sharing the same underlying object store but with isolated working trees. Agent A works on the auth refactor in one worktree; Agent B fixes the pagination bug in another; neither can see or break the other's in-progress state.
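As a rough sketch of that isolation, the helper below gives each agent its own branch and directory through `git worktree add`. The `agent/<name>` branch naming, the directory layout, and the function names are assumptions made for the example; only the git command itself is standard.

```python
import subprocess
from pathlib import Path


def worktree_add_command(worktree_root: Path, agent: str, base: str = "main") -> list[str]:
    """Build the `git worktree add` command for one agent.

    `-b` creates a new branch (hypothetical naming: agent/<name>) starting at
    `base`, checked out in its own directory under `worktree_root`.
    """
    path = worktree_root / agent
    branch = f"agent/{agent}"
    return ["git", "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, worktree_root: Path, agent: str, base: str = "main") -> Path:
    """Run the command inside `repo`; raises CalledProcessError if git refuses."""
    subprocess.run(worktree_add_command(worktree_root, agent, base), cwd=repo, check=True)
    return worktree_root / agent
```

Calling `create_worktree(repo, Path("/tmp/wt"), "auth-refactor")` and then `create_worktree(repo, Path("/tmp/wt"), "pagination-fix")` would leave two independent working trees backed by one object store. When an agent finishes, `git worktree remove <path>` cleans up its directory.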
3. Skills That Capture Project Knowledge
A model arrives at your codebase knowing nothing about your conventions, your deployment quirks, or the three undocumented rules everyone on the team learned the hard way. Prompt engineering handles this by stuffing context into the prompt every single time — which is fragile, expensive, and forgotten the moment the session ends. Skills are the durable alternative: reusable, named bundles of project knowledge the agent loads when relevant. "How we write migrations." "Our PR review checklist." "The exact steps to run the integration suite." Captured once, applied automatically, versioned in the repo alongside the code they describe.
Skills are what let a loop accumulate institutional memory rather than relearning the same context on every run. They are the difference between an agent that is competent today and one that gets more competent as your team teaches it.
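To make "loads when relevant" concrete, here is a minimal sketch of a skill registry. Real agent tooling has its own skill formats and matching rules; the `Skill` type, the trigger-word matching, and the `build_context` helper are simplifying assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    triggers: frozenset[str]  # words that make the skill relevant (assumed matching rule)
    body: str                 # the instructions handed to the agent


def relevant_skills(task: str, skills: list[Skill]) -> list[Skill]:
    """Return the skills whose trigger words appear in the task description."""
    words = set(task.lower().split())
    return [s for s in skills if s.triggers & words]


def build_context(task: str, skills: list[Skill]) -> str:
    """Put the relevant skill bodies ahead of the task, and nothing else."""
    sections = [f"## {s.name}\n{s.body}" for s in relevant_skills(task, skills)]
    return "\n\n".join(sections + [f"## Task\n{task}"])
```

With a "How we write migrations" skill triggered by the word `migration`, a task such as "Add a migration for the users table" picks up the migration rules automatically, while the PR checklist stays out of the context until a review task arrives.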
4. Plugins and MCP Connectors That Wire the Agent Into Real Tools
A loop that can only generate text is a glorified autocomplete. A loop that can read your issue tracker, query your database, open a pull request, and post to your incident channel is an operator. Plugins and Model Context Protocol (MCP) connectors are the wiring that gives the agent hands. MCP in particular has become the de facto standard for this: its SDKs have been reported at roughly 97 million monthly downloads, so the connector you need probably already exists and does not have to be built from scratch.
This component is where loops earn their keep and where they get dangerous. Every tool you wire in expands what the agent can accomplish and what it can break. The connector layer is therefore also the permission layer: which actions require confirmation, which are irreversible, which are sandboxed. Designing it well is a core loop-engineering responsibility, not an afterthought.
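One way to picture the connector layer as a permission layer is a small policy table consulted before any tool call. The action names and the three-way allow/confirm/deny split below are hypothetical, and this is not part of MCP itself; it is a sketch of a gate you might place in front of whatever connectors you use.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"      # safe to run unattended
    CONFIRM = "confirm"  # needs a human yes first
    DENY = "deny"        # never available to the loop


# Hypothetical action names; map your own tools here.
POLICY = {
    "issues.read": Decision.ALLOW,
    "issues.label": Decision.ALLOW,
    "pull_request.open": Decision.CONFIRM,
    "database.write": Decision.DENY,
}


def authorize(action: str, human_approved: bool = False) -> bool:
    """Return True if the agent may perform `action` right now.

    Actions missing from the policy are denied, so wiring in a new tool
    cannot silently widen what the loop is allowed to do.
    """
    decision = POLICY.get(action, Decision.DENY)
    if decision is Decision.ALLOW:
        return True
    if decision is Decision.CONFIRM:
        return human_approved
    return False
```

The default-deny lookup is the design choice that matters: every expansion of the agent's reach has to be an explicit line in the policy.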
5. Sub-Agents That Split the Maker From the Checker
The final component is the one that makes a loop trustworthy. A single agent that writes code and also judges whether the code is correct is grading its own homework, and a model reviewing its own output tends to share the blind spots that produced the mistake in the first place. Splitting the loop into a maker sub-agent that produces work and a separate checker sub-agent that verifies it against explicit criteria introduces the adversarial pressure that catches errors. The checker has a different prompt, a different role, and ideally a different incentive: its job is to find what is wrong, not to confirm what looks right.
This maker/checker pattern is the structural answer to the trust problem that defines agentic work, and it deserves its own treatment; we cover it in depth in our guide to maker-checker agents and verification loops. For now, the point is architectural: the checker is not an optional polish step. It is the component that lets you delegate with confidence instead of supervising every output by hand.
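The control flow of a maker/checker split fits in a few lines. In the sketch below the maker and checker are plain callables standing in for two separately prompted sub-agents; `Verdict`, `run_loop`, and the retry limit are illustrative assumptions rather than any particular framework's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    passed: bool
    feedback: str = ""


Maker = Callable[[str, str], str]        # (task, feedback) -> draft
Checker = Callable[[str, str], Verdict]  # (task, draft) -> verdict


def run_loop(task: str, maker: Maker, checker: Checker,
             max_attempts: int = 3) -> tuple[Optional[str], list[Verdict]]:
    """Alternate maker and checker until the checker passes a draft.

    Returns (accepted_draft, verdict_history). The draft is None when the
    attempts run out, which is the signal to escalate to a human.
    """
    history: list[Verdict] = []
    feedback = ""
    for _ in range(max_attempts):
        draft = maker(task, feedback)
        verdict = checker(task, draft)
        history.append(verdict)
        if verdict.passed:
            return draft, history
        feedback = verdict.feedback  # the next attempt sees what was wrong
    return None, history
```

The checker never edits the draft; it only judges and explains. Keeping the two roles in separate callables is what lets you give them different prompts, and the bounded retry count keeps a stubborn disagreement from running forever.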
Plus: The Memory Store
Underneath all five components sits a memory store — the thing that gives the loop continuity. It holds the state of in-flight work, the outcomes of past runs, the corrections a human supplied last week so the loop does not repeat the mistake this week. Without memory, every run starts cold and the loop never improves. With it, the loop compounds: each run leaves the next one a little smarter. Memory is what turns a collection of components into a system that learns.
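A memory store does not need to be exotic to be useful. As a minimal sketch, assuming a single SQLite table is enough for a first loop, the class below records run outcomes and returns past human corrections for a task; the schema and the `MemoryStore` name are invented for this example.

```python
import sqlite3
from typing import Optional


class MemoryStore:
    """Run history in SQLite (hypothetical schema, one row per run)."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the example self-contained; pass a file path to persist.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS runs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "task TEXT NOT NULL, outcome TEXT NOT NULL, correction TEXT)"
        )

    def record(self, task: str, outcome: str, correction: Optional[str] = None) -> None:
        """Store one run, with the human correction if there was one."""
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO runs (task, outcome, correction) VALUES (?, ?, ?)",
                (task, outcome, correction),
            )

    def corrections_for(self, task: str) -> list[str]:
        """Return past corrections for a task, oldest first."""
        rows = self.db.execute(
            "SELECT correction FROM runs "
            "WHERE task = ? AND correction IS NOT NULL ORDER BY id",
            (task,),
        )
        return [row[0] for row in rows]
```

Feeding `corrections_for(task)` into the maker's context before each run is the simplest version of "the loop does not repeat the mistake this week".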
Where the Leverage Moves — and Who Wins
The most important consequence of loop engineering is not technical; it is about where value accrues on a team. In a prompt-engineering world, the high-leverage person is the one who writes the best prompts — a skill, but a personal and largely non-transferable one. In a loop-engineering world, the high-leverage person is the one who designs the system that writes the prompts. That system is an asset. It runs while you sleep, it serves the whole team, and it gets better every time someone improves a skill or tightens a checker.
This is the shift in plain terms: you stop being the person who prompts the agent and start being the person who designs the system that prompts it. That sounds like a small rewording. It is actually a relocation of where your effort produces compounding returns. A great prompt helps you once. A great loop helps your team indefinitely.
Who loses? The pure prompt specialist whose entire value was phrasing. As models get better at understanding intent, the premium on perfect phrasing shrinks — a well-built loop tolerates an imperfect prompt because the checker catches the failure and the memory store records the correction. The skill that does not commoditize is system design: deciding what to schedule, where to isolate, which knowledge to capture as skills, which tools to wire in, and how to structure the maker/checker split. That is engineering judgment, and it is exactly the thing models are slowest to replace.
"Prompt engineering made you faster at asking. Loop engineering makes the system stop needing you to ask. That is a different — and durable — kind of leverage."
How to Start: One Loop, Built Incrementally
The mistake teams make is trying to stand up all five components at once, autonomous and unattended, and then losing trust the first time the loop does something dumb in production. The right path is incremental, and it tracks the trust you can justify. Start with a single scheduled triage automation that only reads and labels — discovery without action. It is useful immediately and it can break nothing.
Next, add a checker sub-agent to one workflow where you already run an agent by hand. Let the maker produce, let the checker grade, and watch how often the checker catches something you would have missed. That single addition is usually what converts a skeptic, because it makes the quality of delegated work visible rather than a leap of faith. Once you trust the checker, add git worktrees so you can run two makers in parallel, then start capturing the context you keep re-typing as skills, and finally wire in the MCP connectors that let the loop act on the world instead of just proposing.
Notice that memory threads through every step — each stage records its outcomes so the next stage starts smarter. By the time you have all five components running, you have not built five things. You have built one system that happens to have five surfaces, and the judgment encoded in how those surfaces fit together is the part no model wrote for you.
Conclusion: The Skill Worth Building
Prompt engineering will not disappear — phrasing still matters at the leaf of every loop. But it is being subsumed. The instruction you type is now one input to a larger machine, and the value has migrated to whoever designs that machine. Loop engineering is the name for that design work: scheduling discovery, isolating parallel execution, capturing knowledge as skills, wiring in real tools, and splitting the maker from the checker, all around a memory store that lets the system compound.
The teams that win in 2026 are not the ones with the cleverest prompts. They are the ones whose loops run overnight, catch their own mistakes, and arrive each morning a little better than they were the day before. That is not a prompt you can write. It is a system you have to engineer.