Monday, March 23, 2026

The Mythical Agent-Month – O’Reilly

The following article originally appeared on Wes McKinney’s blog and is being republished here with the author’s permission.

Like a lot of people, I’ve found that AI is terrible for my sleep schedule. In the past I’d wake up briefly at 4:00 or 4:30 in the morning to have a sip of water or use the bathroom; now I have trouble going back to sleep. I could be doing things. Before, I would get a solid 7–8 hours a night; now I’m lucky when I get 6. I’ve largely stopped fighting it: Now when I’m rolling around restlessly in bed at 5:07am with ideas to feed my AI coding agents, I just get up and start my day.

Among my inner circle of engineering and data science friends, there is a lot of discussion about how long our competitive edge as humans will last. Will having good ideas (and lots of them) still matter as the agents begin having better ideas themselves? The human-expert-in-the-loop feels essential now to get good results from the agents, but how long will that last until our wildest ideas can be turned into working, tasteful software while we sleep? Will it be a gentle obsolescence where we happily hand off the reins or something else?

For now, I feel needed. I don’t describe the way I work now as “vibe coding,” as this sounds like a pejorative “prompt and chill” way of building AI slop software projects. I’ve been building tools like roborev to bring rigor and continuous supervision to my parallel agent sessions, and to heavily scrutinize the work that my agents are doing. With this radical new way of working it’s hard not to be contemplative about the future of software engineering.

Probably the book I’ve referenced the most in my career is The Mythical Man-Month by Fred Brooks, whose now-famous Brooks’s law argues that “adding manpower to a late software project makes it later.” Lately I find myself asking whether the lessons from this book are applicable in this new era of agentic development. Will a talented developer orchestrating a swarm of AI agents be able to build complex software faster and better, and will the short-term productivity gains lead to long-term project success? Or will we run into the same bottlenecks (scope creep, architectural drift, and coordination overhead) that have plagued software teams for decades?

Revisiting The Mythical Man-Month (TMMM)

One of Brooks’s central arguments is that small teams of elite people outperform large teams of average ones, with one “chief surgeon” supported by specialists. This leads to a high degree of conceptual integrity about the system design, as if “one mind designed it, even if many people built it.”

Agentic engineering appears to amplify these problems, since the quality of the software being built is now only as good as the humans in the loop curating and refining specs, saying yes or no to features, and taming unnecessary code and architectural complexity. One of the metaphors in TMMM is the “tar pit”: “Everyone can see the beasts struggling in it, and it looks like any one of them could easily free itself, but the tar holds them all together.” Now we have a new “agentic tar pit” where our parallel Claude Code sessions and git worktrees are engaged in combat with the code bloat and incidental complexity generated by their virtual colleagues. You can systematically refactor, but invariably an agentic codebase will end up larger and more overwrought than anything built by human hand. This is technical debt on an unprecedented scale, accrued at machine speed.

In TMMM, Brooks observed that a working program is maybe 1/9th the way to a programming product, one that has the necessary testing, documentation, and hardening against edge cases and is maintainable by someone other than its author. Agents are now making the “working program” (or “appears-to-work” program, more accurately) a great deal more accessible, though many newly minted AI vibe coders clearly underestimate the work involved in going from prototype to production.

These problems compound when considering the closely related Conway’s law, which asserts that the architecture of software systems tends to resemble the organization’s team or communication structure. What does that look like when applied to a virtual “team” of agents with no persistent memory and no shared understanding of the system they are building?

Another “big idea” from TMMM that has stuck with people is the n(n-1)/2 coordination problem as teams scale. With agentic engineering, there are fewer humans involved, so the coordination problem doesn’t disappear but rather changes shape. Different agent sessions may produce contradictory plans that humans have to reconcile. I’ll leave this agent orchestration question for another post.
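As a rough illustration of the n(n-1)/2 formula, the short Python sketch below counts the distinct pairs of people who may need to coordinate on a team of size n. The function name `communication_channels` and the sample team sizes are hypothetical, chosen only for this example; they do not come from TMMM.

```python
def communication_channels(n: int) -> int:
    """Return the number of distinct pairs among n team members: n(n-1)/2."""
    if n < 0:
        raise ValueError("team size must be non-negative")
    # n * (n - 1) is always even, so integer division is exact.
    return n * (n - 1) // 2


# Illustrative team sizes (assumed for this example).
for size in (2, 5, 10, 50):
    print(f"{size:>3} people -> {communication_channels(size):>5} channels")
```

Doubling a team from 5 to 10 people more than quadruples the pairwise channels (10 to 45), which is why coordination cost grows faster than headcount.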

No silver bullet

“There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity.”
—“No Silver Bullet” (1986)

Brooks wrote a follow-up essay to TMMM to look at software design through the lens of essential complexity and accidental complexity. Essential complexity is fundamental to achieving your goal: If you made the system any simpler, it would fall short of its problem statement. Accidental complexity is everything else imposed by our tools and processes: programming languages, tools, and the layer of design and documentation needed to make the system understandable by engineers.

Coding agents are probably the most powerful tool ever created to tackle accidental complexity. To think: I basically don’t write code anymore, and now write tons of code in a language (Go) I’ve never written by hand. There is a lot of discussion about whether IDEs are still going to be relevant in a year or two, when maybe all we need is a text editor to review diffs. The productivity gains are enormous, and I say this as someone burning north of 10 billion tokens a month across Claude, Codex, and Gemini.

But Brooks’s “No Silver Bullet” argument predicts exactly the problem I’m experiencing in my agentic engineering: The accidental complexity is no problem at all anymore, but what’s left is the essential complexity, which was always the hard part. Agents can’t reliably tell the difference. LLMs are extraordinary pattern matchers trained on the entirety of humanity’s open source software, so while they are good at dealing with accidental complexity (refactor this code, write these tests, clean up this mess), they struggle with the more subtle essential design problems, which often have no precedent to pattern match against. They also tend to introduce unnecessary complexity, generating large amounts of defensive boilerplate that is rarely needed in real-world use.

Put another way, agents are so good at attacking accidental complexity that they generate new accidental complexity that can get in the way of the essential structure that you are trying to build. With a couple of my new projects, roborev and msgvault, I’m already dealing with this problem as I begin to reach the 100 KLOC mark and watch the agents begin to chase their own tails and contextually choke on the bloated codebases they have generated. At some point beyond that (the next 100 KLOC, or 200 KLOC) things start to fall apart: Every new change has to hack through the code jungle created by prior agents. Call it a “brownfield barrier.” At Posit we have seen agents struggle much more in 1 million-plus-line codebases such as Positron, a VS Code fork. This seems to support Brooks’s complexity scaling argument.

I would hesitate to place a bet on whether the present is a ceiling or a plateau. The models are clearly getting better fast, and the problems I’m describing here may look charmingly quaint in two years. But Brooks’s essential/accidental distinction gives me some confidence that this isn’t just about the current limitations of the technology. Figuring out what to build was the hard part long before we had LLMs, and I don’t see how a flawless coding agent changes that.

Agentic scope creep

When generating code is free, knowing when to say “no” is your last defense.

With the cost of generating code now converging to zero, there is practically nothing stopping agents and their human taskmasters from pursuing all avenues that would previously have been cost or time prohibitive. The temptation to spend your day prompting “and now can you just…?” is overwhelming. But any newly generated feature or subsystem, while cheap to create, is not costless to maintain, test, debug, and reason about in the future. What seems free now carries a contextual burden for future agent sessions, and each new bell or whistle becomes a new vector of brittleness or bugs that can harm users.

From this perspective, building great software projects maybe never was about how fast you can type the code. We can “type” 10x, maybe 100x faster with agents than we could before. But we still have to make good design decisions, say no to most product ideas, maintain conceptual integrity, and know when something is “done.” Agents are accelerating the “easy part” while paradoxically making the “hard part” potentially even more difficult.

Agentic scope creep also seems to be actively destroying the open source software world. Now that the bar is lower than ever for contributors to jump in and offer help, projects are drowning in torrents of 3,000-line “helpful” PRs that add new features. As developers become increasingly hands-off and disengaged from the design and planning process, the agents’ runaway scope creep can get out of control quickly. When the person submitting a pull request didn’t write or fully read the code in it, there’s likely no one involved who’s truly accountable for the design decisions.

I have seen in my own work on roborev and msgvault that agents will propose overwrought solutions to problems when a simple solution would do just fine. It takes judgment to know when to intervene and how to keep the agent in check.

Design and taste as our last foothold

Brooks’s argument is that design talent and good taste are the scarcest resources, and now with agents doing all of the coding labor, I argue that these skills matter more than ever. The bottleneck was never hands on keyboards. Now with the new “Mythical Agent-Month,” we can reasonably conclude that design, product scoping, and taste remain the practical constraints on delivering high-quality software. The developers who thrive in this new agentic era won’t be the ones who run the most parallel sessions or burn the most tokens. They’ll be the ones who are able to hold their projects’ conceptual models in their minds, who are shrewd about what to build and what to leave out, and who exercise taste over the enormous volume of output.

The Mythical Man-Month was published in 1975, more than 50 years ago. In that time, a lot has happened: tremendous progress in hardware performance, programming languages, development environments, cloud computing, and now large language models. The tools have changed, but the constraints are still the same.

Maybe I’m trying to justify my own continued relevance, but the reality is more complex than that. Not all software is created equal: CRUD business productivity apps aren’t the same as databases and other critical systems software. I think the median software consulting shop is completely toast. But my thesis is more about development work in the 1% tail of the distribution: problems inaccessible to most engineers. This will continue to require expert humans in the loop, even if they aren’t doing much or any manual coding. As one recent adjacent example, my friend Alex Lupsasca at OpenAI and his world-class physicist collaborators were able to create a formulation of a difficult physics problem and arrive at a solution with AI’s help. Without such experts in the loop, it’s much more dubious whether LLMs would be able to both pose the questions and come up with the solutions.

For now, I’ll probably still be getting out of bed at 5am to feed and tame my agents for the foreseeable future. The coding is easier now, and honestly more fun, and I can spend my time thinking about what to build rather than wrestling with the tools and systems around the engineering process.

Thanks to Martin Blais, Josh Bloom, Phillip Cloud, Jacques Nadeau, and Dan Shapiro for giving feedback on drafts of this post.