Two years ago, developers wrote code by hand. Last year, they prompted agents to write it for them. Now, according to Claude Code creator Boris Cherny, the next leap is already happening: agents that prompt other agents, running in loops that never stop improving your codebase. At Meta’s @Scale conference last Friday, Cherny called this shift “just as big as the step from source code to agents,” and the first audience question of the day was whether loops were real or just the next hype cycle. His answer: emphatically real.
What Agentic Loops Actually Are
At the @Scale talk, Cherny described two loops he keeps running in his own work: one agent continuously hunts for ways to improve code architecture, while another scans for duplicated abstractions that can be unified. Both submit pull requests like any human contributor, and because the codebase keeps changing, they never stop running. That departs from how most teams approach agentic AI today, where the standard playbook is to set clear goals, monitor discrete units of progress, and rein the agent in before it drifts.
Why it matters: the loop pattern flips the engagement model from “managed delegation” to “continuous background labor.” Instead of asking an agent to do one job, you grant a swarm of agents standing authority to keep working indefinitely. If you’re a startup with a small platform team buried under refactor debt, this could mean a persistent background process opening cleanup PRs every night without anyone scheduling the work. As the creator of Claude Code, Cherny is among the most credible voices in agentic coding right now, and when he calls something a step change, teams building developer tooling are likely to follow.
The Ralph Loop and Why Simple Beats Clever
The Ralph Loop (named, delightfully, for Ralph Wiggum) shows how simple agentic loops can be. On each pass, the loop summarizes everything the model has done so far and asks whether the goal has been reached. If not, it sends the model back to keep working. It’s a non-deterministic take on the recursion every CS undergrad learns in week three, except the stopping condition is decided by a subagent rather than a clean boolean check.
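The summarize-then-check cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not Cherny's or anyone else's actual implementation: `ralph_loop`, `work_step`, `summarize`, `is_done`, and the `fake_*` stand-ins are all hypothetical names, and the iteration cap is an added assumption so the sketch always terminates. In a real setup the three callables would presumably wrap model or subagent calls.

```python
from typing import Callable, List


def ralph_loop(
    goal: str,
    work_step: Callable[[str, str], str],
    summarize: Callable[[List[str]], str],
    is_done: Callable[[str, str], bool],
    max_iterations: int = 20,
) -> List[str]:
    """Repeat work_step until the judge reports the goal met or the cap is hit."""
    history: List[str] = []
    summary = ""
    for _ in range(max_iterations):
        # Do one unit of work, seeing only the goal and the running summary.
        history.append(work_step(goal, summary))
        # Compress everything done so far into a fresh summary.
        summary = summarize(history)
        # Judgment-based stop check; in practice this would be a subagent call.
        if is_done(goal, summary):
            break
    return history


# Hypothetical stand-ins so the sketch runs without any model API.
def fake_work(goal: str, summary: str) -> str:
    return "edit"


def fake_summarize(history: List[str]) -> str:
    return "; ".join(history)


def fake_is_done(goal: str, summary: str) -> bool:
    # Pretend the judge is satisfied once three edits have landed.
    return len(summary.split("; ")) >= 3


steps = ralph_loop("unify duplicated helpers", fake_work, fake_summarize, fake_is_done)
print(len(steps))  # prints 3
```

Passing only the summary into each step, rather than the full transcript, is what keeps the working context small from one pass to the next.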
Why it matters: developers have been trained to think of AI orchestration as a complex graph problem requiring frameworks, state machines, and elaborate planners. The Ralph Loop suggests the opposite: a tight feedback loop with a summarizer and an “are we done?” check can outperform fancier scaffolding because it sidesteps the context-rot problem that plagues long-running agents. If you’re a team evaluating whether to build a custom orchestration layer or buy into a heavyweight framework, this favors staying lean. The teams that win at agentic coding over the next year will be the ones who treat orchestration as a 50-line script, not a platform.
Loops as Test-Time Compute, Weaponized
Loops also connect to a broader argument from OpenAI researcher Noam Brown, who observed earlier this month that contemporary models can solve nearly any problem if you throw enough compute at them. Loops are a way to operationalize that observation: keep spending tokens until the problem is solved, or in Cherny’s framing, keep making incremental improvements for as long as compute is available. The approach fits hill-climbing problems like code quality especially well, where there’s always another duplication to unify or another architectural seam to smooth.
Why it matters: this reframes how teams should budget for AI. Token spend stops being a per-task line item and becomes more like an infrastructure cost — a continuous burn rate tied to how aggressively you want the system to improve itself. If you’re running a 50-engineer org, you could realistically allocate a fixed monthly token budget to background refactor agents the same way you’d allocate compute to CI runners. The line between AI agents and broader automation strategy is going to blur fast once loops become standard, because a loop is an agent dressed up as an automation pipeline.
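To make the infrastructure-style budgeting concrete, here is a rough sketch of turning a fixed monthly spend into a nightly token allowance. The function name, the dollar figure, and the per-token price are all made up for illustration and do not reflect any vendor's real pricing.

```python
def nightly_token_allowance(
    monthly_budget_usd: float,
    usd_per_million_tokens: float,
    nights_per_month: int = 30,
) -> int:
    """Convert a fixed monthly budget into a per-night token allowance."""
    tokens_per_month = monthly_budget_usd / usd_per_million_tokens * 1_000_000
    return int(tokens_per_month // nights_per_month)


# Illustrative numbers only: $3,000 per month at a blended $10 per million tokens.
print(nightly_token_allowance(3000, 10))  # prints 10000000
```

Under those assumed numbers, the background agents would get roughly ten million tokens per night, and the loop would simply stop once the allowance is used up.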
The Cost Problem Nobody Has Solved Yet
The economics are blunt: loops burn tokens far faster than chatbots or even single-shot agents, and because the entire point is to keep running indefinitely, there’s no natural ceiling on spend. That’s great news for Anthropic, which is ultimately in the token-selling business. For everyone else, it’s a budgeting nightmare without strong guardrails around oversight, drift detection, and token caps.
Why it matters: the first wave of production loops will require a control plane that doesn’t really exist yet — something that watches token spend in real time, kills loops that aren’t making measurable progress, and surfaces drift before an autonomous agent merges something nobody asked for. Engineering leaders should scrutinize this layer before greenlighting always-on agents. Expect a wave of startups in 2026 selling “loop observability” the same way Datadog sold APM a decade ago — and expect the build-versus-buy debate around custom AI infrastructure to get sharper as the bills come in.
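As a rough sketch of the smallest piece of such a control plane, the guard below tracks token spend and stalled iterations and reports when a loop should be killed. `LoopGuard`, its fields, and its thresholds are hypothetical, and how "progress" gets measured (tests passing, PRs merged, lint warnings removed) is left to the caller as an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopGuard:
    token_cap: int                   # hard ceiling on total tokens for this loop
    max_stalled_iterations: int = 3  # consecutive no-progress passes tolerated
    tokens_spent: int = 0
    stalled_iterations: int = 0

    def record(self, tokens_used: int, made_progress: bool) -> None:
        """Log one loop iteration's spend and whether it moved the metric."""
        self.tokens_spent += tokens_used
        self.stalled_iterations = 0 if made_progress else self.stalled_iterations + 1

    def stop_reason(self) -> Optional[str]:
        """Return why the loop should be killed, or None to keep going."""
        if self.tokens_spent >= self.token_cap:
            return "token cap reached"
        if self.stalled_iterations >= self.max_stalled_iterations:
            return "no measurable progress"
        return None


guard = LoopGuard(token_cap=1_000)
guard.record(tokens_used=400, made_progress=True)
print(guard.stop_reason())  # prints None
guard.record(tokens_used=700, made_progress=True)
print(guard.stop_reason())  # prints token cap reached
```

A loop runner would call `record` after every pass and exit as soon as `stop_reason` returns something other than `None`. Drift detection, the harder part, is not covered by this sketch.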
FAQ
Q: What is an agentic loop in AI coding? A: An agentic loop is a setup where an AI agent runs continuously, repeatedly invoking itself or other agents to work on an open-ended goal like improving a codebase. Unlike a one-shot prompt, the loop only stops when a subagent decides the goal has been met — or when a human or budget cap intervenes.
Q: How is the Ralph Loop different from a normal recursive function? A: A normal recursive function has a deterministic stopping condition — a clear boolean check. The Ralph Loop, named after Ralph Wiggum, instead summarizes the work done so far and asks the model whether the goal has been accomplished, making the stop condition non-deterministic and judgment-based.
Q: Are AI loops practical for small teams? A: They can be, but only with strict token budgets and clear oversight. Because loops run continuously, costs scale with time rather than tasks, so small teams should start with narrowly scoped loops — like nightly dependency cleanup — before authorizing always-on agents on critical code paths.
Key Takeaways
- Teams still treating agents as one-shot tools will fall behind teams that build persistent background loops for hill-climbing problems like refactoring and deduplication.
- The winning orchestration patterns will likely be embarrassingly simple — closer to the Ralph Loop than to heavyweight planner frameworks.
- Token spend is about to shift from a per-task cost to a continuous burn rate, and engineering leaders should start budgeting accordingly before sticker shock hits.
- Expect a new category of tooling for agent observability and continuous AI workflows to emerge in 2026, focused on watching drift, spend, and progress across always-on loops.
- The teams that treat loops as infrastructure — with monitoring, caps, and kill switches — will extract real value; the ones that treat them as magic will burn cash and ship regressions.