I was on a call with a friend this morning. Engineering leader at a company you have heard of. He was building an agent while we talked — writing code, shipping, the kind of thing that is just normal now for people who get it. I was writing this article while writing a different article. Two things happening at once. That is also normal now.
He told me the same story I hear every week.
“We rolled out Copilot. Cursor. Claude. Whatever. Adoption is through the roof. Engineers love it. PRs are up forty percent.”
Then I ask one question: “What happened to your review queue?”
Silence.
The Fastest Way to Destroy Your Codebase
Here is what you did. You took a team that was shipping at a certain velocity — maybe not fast enough, maybe frustratingly slow — and you handed every engineer a tool that generates code at ten times the speed they used to type it. You did not change your review process. You did not update your quality gates. You did not rethink your governance model. You did not redefine what “done” means when an agent can produce a thousand lines before lunch.
You just added AI.
And for about three weeks, it felt like magic. PRs were flying. Tickets were closing. Your Jira board looked healthier than it had in years. Someone on your leadership team probably sent a Slack message that said something like “This is what transformation looks like.”
Then your bottlenecks showed up.
The Bottleneck Did Not Disappear. It Moved.
You thought you had a generation problem. Code was slow to write. Engineers spent hours on boilerplate, on scaffolding, on the mechanical parts of building software. AI solved that problem overnight.
But you never had a generation problem. You had a throughput problem. And throughput is not just how fast code gets written. It is how fast code gets reviewed, tested, validated, merged, deployed, and monitored in production. Your SDLC is a pipeline. You accelerated one stage of that pipeline by an order of magnitude and left every other stage untouched.
You know what happens when you pour ten times more water into a pipe that is the same diameter downstream? It does not flow faster. It backs up. It floods. It breaks things.
That is your codebase right now.
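The arithmetic behind that pipe is just a queue: when arrivals outpace service capacity, the backlog grows linearly and never recovers. A minimal sketch, with illustrative numbers (the rates are assumptions, not measurements from any real team):

```python
# Illustrative queue model: review capacity unchanged while generation speeds up.
# All numbers are made up for illustration.

def backlog_after(weeks, prs_per_week, reviews_per_week, start=0):
    """Track an unreviewed-PR backlog week by week."""
    backlog = start
    history = []
    for _ in range(weeks):
        backlog = max(0, backlog + prs_per_week - reviews_per_week)
        history.append(backlog)
    return history

# Before AI: 20 PRs/week in, 20 reviewed -> backlog stays flat.
print(backlog_after(12, prs_per_week=20, reviews_per_week=20))

# After AI: 28 PRs/week (the "forty percent" spike), same review capacity.
# Backlog grows by 8 every week -- the bottleneck moved, it did not vanish.
print(backlog_after(12, prs_per_week=28, reviews_per_week=20))
```

The second run never stabilizes. Eight unreviewed PRs a week, every week, is the flood.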
Your review queue is drowning. Your senior engineers — the ones who actually understand the system well enough to review — are buried under a mountain of machine-generated pull requests. Each one looks plausible. Each one compiles. Each one passes the linter. And each one requires a human who understands context, architecture, and intent to determine whether it actually belongs in your system.
You did not save your senior engineers any time. You tripled their workload and called it progress.
Slop Has a Shape
Let me tell you what AI slop looks like in a codebase. It is not obviously broken. That is the danger.
Slop compiles. Slop passes tests — especially when the tests were also generated by AI and validate the wrong things. Slop follows patterns, but the wrong patterns. Slop introduces subtle coupling because the model does not understand your domain boundaries. Slop duplicates logic that already exists three directories away because the agent’s context window did not include it. Slop names things almost right but not quite — close enough to merge, different enough to confuse the next engineer who touches it six months later.
Slop is not a bug. Slop is a thousand tiny decisions that no one with real context made.
Your PR count is up forty percent. Your defect rate is up too — you just have not measured it yet. Or you have, and you are telling yourself it is a temporary adjustment period. It is not. This is the new steady state unless you change something structural.
The Governance Model You Needed Yesterday
Here is the part nobody wants to hear. When you adopted AI tooling, you needed to simultaneously redesign your governance model. Not “think about it.” Not “put it on the roadmap.” Simultaneously.
Your old governance model was designed for a world where the bottleneck was generation. Where writing code was slow enough that by the time something reached review, a human had spent hours or days thinking about it. That enforced a natural quality gate — the speed of human thought.
That gate is gone.
Now you need to build intentional gates to replace the ones that human speed used to provide for free. And those gates need to be different from what you had before.
Review is not optional and it is not the same. When a human writes code over two days, another human can review it in thirty minutes. When an agent generates code in ten minutes, the review might take longer than the writing did. Your process needs to account for that inversion. Review allocation is now your most critical capacity planning problem. Not sprint planning. Not backlog grooming. Review capacity.
Architecture review moves upstream. In the old model, you could catch architectural mistakes in PR review because PRs were small enough and infrequent enough to scrutinize. Now you are getting ten PRs where you used to get one. You cannot do deep architectural review on all of them. Which means architectural decisions need to happen before the agent starts writing — in design docs, in interface contracts, in guardrails that constrain what the agent can generate. You are reviewing the blueprint, not every brick.
Testing standards must be ruthless. AI-generated tests validating AI-generated code are a closed loop that proves nothing. Your testing standards need to specify what gets tested, at what level, with what coverage expectations — and humans need to own the test strategy even if agents write the test code. The question is not “do we have tests.” The question is “do the tests validate the right behavior from the user’s perspective.”
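To make the closed loop concrete, here is a hedged sketch; the function and both tests are invented for illustration, not taken from any real codebase. The first test mirrors the implementation and stays green no matter how wrong it is. The second pins down behavior a human actually specified:

```python
# Hypothetical example: a discount function and two ways to test it.

def apply_discount(price_cents, percent):
    """Discount a price in cents; reject nonsense input instead of guessing."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents - (price_cents * percent) // 100

# The closed loop: a generated test that re-states the implementation.
# It passes even if the rounding or validation logic is wrong for users.
def test_tautology():
    assert apply_discount(999, 10) == 999 - (999 * 10) // 100

# Behavior the humans agreed on up front: boundaries and failure modes.
def test_behavior():
    assert apply_discount(1000, 0) == 1000   # no discount changes nothing
    assert apply_discount(1000, 100) == 0    # full discount bottoms out at zero
    assert apply_discount(1, 50) == 1        # sub-cent discounts floor to zero
    try:
        apply_discount(1000, 150)
        raised = False
    except ValueError:
        raised = True
    assert raised                            # out-of-range input is rejected

test_tautology()
test_behavior()
```

Both tests pass here, but only one of them would catch a regression that matters to a customer.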
Merge criteria need teeth. If your merge criteria are “two approvals and green CI,” congratulations — you have automated the rubber stamp. Merge criteria in an AI-augmented world need to include architectural conformance, duplication analysis, domain boundary checks, and explicit verification that the change was reviewed by someone who understands the subsystem. Not just someone who clicked approve.
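As one sketch of criteria with teeth, consider a gate that requires approval from an owner of every subsystem the change touches, not any two approvers. The ownership map, names, and paths below are hypothetical:

```python
# Sketch of a merge gate: every subsystem touched by the change must be
# approved by someone who owns that subsystem. All names are hypothetical.

SUBSYSTEM_OWNERS = {
    "billing/": {"alice", "bob"},
    "auth/": {"carol"},
    "search/": {"dave", "erin"},
}

def merge_allowed(changed_files, approvers):
    """Return (ok, missing): subsystems touched but lacking an owner's approval."""
    missing = []
    for prefix, owners in SUBSYSTEM_OWNERS.items():
        touched = any(f.startswith(prefix) for f in changed_files)
        if touched and not (owners & set(approvers)):
            missing.append(prefix)
    return (not missing, missing)

ok, missing = merge_allowed(
    changed_files=["billing/invoice.py", "auth/session.py"],
    approvers={"alice", "frank"},  # frank clicked approve; owns nothing here
)
print(ok, missing)  # False ['auth/'] -- two approvals, still not mergeable
```

The same shape extends to duplication and conformance checks: encode the rule as data, run it in CI, and let the gate say no before a human has to.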
The Intentionality Problem
You know what separates the organizations actually getting value from AI from the ones running slop factories? One word: intentionality.
The winners did not just “adopt AI.” They redesigned their SDLC with AI as a first-class participant. They asked hard questions before the first line of generated code hit a branch:
How do we maintain architectural coherence when generation is cheap?
Who reviews what, and how do we fund that time?
What does “senior engineer” mean when the junior can generate code at the same speed?
How do we prevent duplication across a codebase when agents do not have full-system context?
What are our quality signals now that “it compiles and passes tests” is basically free?
The losers — and I say that with empathy because most of the industry is in this camp right now — did none of that. They bought licenses. They tracked adoption metrics. They celebrated the PR velocity spike. And they are now six months into a codebase that is thirty percent larger, twenty percent more coupled, and meaningfully harder to reason about than it was before they “adopted AI.”
That is not adoption. That is a slop factory with enterprise pricing.
The Teams You Actually Need
Here is the uncomfortable math. You do not need a bigger organization. You need smaller teams working on intentional things.
Teams of engineers who are excellent at software engineering and excellent at working with AI agents. Not one or the other. Both. The engineers who produce value in this model are not the ones who can prompt an agent into generating code. Anyone can do that. The engineers who matter are the ones who can constrain an agent — who can build the guardrails, the interface contracts, the architectural boundaries, the coding standards, and encode those constraints directly into the agent’s workflow so that what comes out the other end is not slop. It is functioning code that meets your standards on the first pass. Not after review. Not after QA catches it. On generation.
That is a fundamentally different skill than what you have been hiring for.
What you need is fewer handoffs. Less manual bureaucracy. More automated guardrails and guidelines. The agent does not need a standup. The agent does not need a ticket updated in Jira. The agent needs constraints, context, and clear architectural intent. Everything else is overhead that slows down the one thing that actually got faster.
Sound familiar? It should. We read this story fifteen years ago. DevOps told you to break down the wall between dev and ops. Agile told you to shrink your iterations and ship faster. Lean told you to eliminate waste and reduce handoffs. Every one of those movements said the same thing: smaller teams, less process, more trust, faster feedback. And most organizations paid lip service to the principles while keeping the bureaucracy intact.
This time it is different. Not because the philosophy changed — because the economics did.
A startup called OpenClaw just got acquired for a billion dollars. Team of one. One engineer. One person with agents, constraints, and the skill to direct them — building a product valuable enough to attract a ten-figure acquisition. That is not a cute anecdote. That is the market telling you what the new unit economics of software look like.
Big tech figured out their version of it already. The major platforms are reporting thirty to thirty-five percent of their shipped code is now AI-generated. Not AI-assisted. AI-generated and released to production. But they did not get there by handing Copilot to a thousand engineers and hoping for the best. They got there by building the constraints into the system — the linting, the architecture enforcement, the automated quality gates — so that agents produce code that is releasable by design.
Smaller teams. Fewer handoffs. Automated guardrails instead of manual bureaucracy. Engineers who understand the system deeply enough to tell the agent what not to do. That is the model that works. Not your current model with AI bolted on top.
Your Workflow Is Backwards
Look at what you are actually doing right now. Be honest about it.
Your developers generate code with an agent. Then your QA team uses a different agent to write unit tests after the fact. The tests validate what was generated — not what should have been generated. You have built a circular system where AI checks AI’s homework and everyone pretends that counts.
And that is only when your product manager can actually tell the team what to build. Half the time the requirements are vague enough that the agent is guessing — and an agent that guesses at scale produces slop at scale. Garbage in, garbage out. That was true before AI and it is aggressively true now.
Meanwhile, there are agentic systems — right now, in production, in R&D environments you can go look at — that build entire products end to end. Humans guiding. Agents driving. Not generating code for a human to review and a QA team to test and a product manager to re-explain. The agent builds, tests, validates, and iterates — with a human providing direction, constraints, and judgment at the decision points that matter.
That is the direction this is heading. Your workflow where a developer prompts, QA tests, and a reviewer rubber-stamps is not a step on the way to that future. It is a dead end. You automated the wrong thing. You need to rethink the entire flow — not just who types the code.
Your PRs Are Not a Productivity Metric
Let me say the quiet part out loud. Your PR count going up is not a sign of productivity. It might be a sign of waste.
In lean manufacturing, producing more units faster when your downstream processes cannot absorb them is not productivity. It is overproduction. It is the most dangerous form of waste because it looks like success.
Your engineers are generating more PRs. Your review queue is growing. Your merge-to-deploy cycle time is increasing. Your production incidents are creeping up. And you are reporting to your board that AI adoption is going great because the activity metrics are up.
Activity is not output. Output is not outcome.
The metric that matters is not how many PRs your team opens. It is how much validated, production-ready software ships per unit of time with quality that does not degrade. If that number has not improved — or has gotten worse — then you have not adopted AI. You have adopted chaos that compiles.
Retrofitting Governance Is Harder Than Doing It Right
Here is the worst part. If you are reading this and recognizing your organization, you already know the next question: how do I fix it?
The answer is harder than it should be because you now have a cultural problem on top of a process problem. Your engineers have spent six months in the habit of generating first, thinking second. Your review culture has eroded because your reviewers are overwhelmed and have started rubber-stamping to keep the queue moving. Your codebase has accumulated six months of architectural drift that nobody caught because nobody had time to catch it.
Retrofitting governance onto a team that has already internalized bad AI habits is significantly harder than building the right governance from day one. Not impossible. But harder. And you need to name that honestly instead of pretending a new policy document will fix it.
You need to slow down to speed up. That is the hardest sentence to say to a leadership team that just spent a quarter celebrating velocity numbers. But the velocity was an illusion. The speedometer was climbing while the engine was burning oil.
What Intentional AI Adoption Actually Looks Like
Stop counting PRs. Start measuring cycle time from commit to production with quality held constant.
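That swap can be made concrete. A sketch of the replacement metric, computed from illustrative commit and deploy timestamps (the records are invented; wire it to your own deploy log):

```python
from datetime import datetime
from statistics import median

# Illustrative records: when a change was committed vs. when it reached prod.
changes = [
    {"commit": "2025-03-01T09:00", "deployed": "2025-03-03T15:00"},
    {"commit": "2025-03-02T10:00", "deployed": "2025-03-02T16:00"},
    {"commit": "2025-03-02T11:00", "deployed": "2025-03-06T11:00"},
]

def cycle_time_hours(rec):
    """Hours from commit to production for one change."""
    fmt = "%Y-%m-%dT%H:%M"
    delta = datetime.strptime(rec["deployed"], fmt) - datetime.strptime(rec["commit"], fmt)
    return delta.total_seconds() / 3600

times = sorted(cycle_time_hours(r) for r in changes)
print(f"median commit-to-prod: {median(times):.0f}h")  # prints "median commit-to-prod: 54h"
```

Track that median weekly, alongside defect rate. If PR count rises while this number rises too, you are measuring overproduction, not productivity.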
Shrink your teams. Not as a cost-cutting exercise — as an engineering design decision. Smaller teams of engineers who deeply understand your system and know how to constrain agents will outship larger teams of engineers who use agents as fancy autocomplete. Every time. The bottleneck should be on the decisions that matter — architecture, domain boundaries, what to build and why — not on how many lines of code get generated per sprint.
Build the standards into the agent, not into the review. If your quality gates only exist in the PR review step, you are already too late. The engineers who are getting this right encode their architectural constraints, their coding standards, their testing expectations directly into the agent’s context. The agent produces code that conforms — not because a reviewer caught the deviation, but because the agent was never allowed to deviate.
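One hedged sketch of what encoding constraints into the agent’s context can look like: keep the rules as data and render them into the agent’s system prompt before generation ever starts. The rule names and modules below are hypothetical, not any real tool’s API:

```python
# Sketch: constraints as data, enforced at generation time, not review time.
# Rules, module names, and boundaries are all hypothetical.

CONSTRAINTS = {
    "forbidden_imports": {
        "payments": ["search.internal"],
        "search": ["payments.db"],
    },
    "max_function_lines": 60,
}

def render_system_prompt(module):
    """Turn the constraint data into instructions prepended to the agent's context."""
    banned = ", ".join(CONSTRAINTS["forbidden_imports"].get(module, [])) or "none"
    return (
        f"You are generating code for the '{module}' module.\n"
        f"Never import: {banned}.\n"
        f"Keep every function under {CONSTRAINTS['max_function_lines']} lines.\n"
        "Every public function needs a docstring stating its contract."
    )

print(render_system_prompt("payments"))
```

The same data can drive a post-generation lint pass, so a violation fails before a PR exists. The reviewer verifies; the constraint enforces.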
Kill the backwards QA loop. Tests written after generation by a separate team using a separate agent are theater. Testing strategy needs to be part of the generation process — the agent builds the feature and the tests together, constrained by a human-defined test strategy that specifies what matters. Not coverage percentages. Behavior. Edge cases. Failure modes. The things a product person and an engineer agree on before a single line gets generated.
Fix your upstream problem. If your product manager cannot clearly define what needs to be built, AI does not help. AI makes it worse. A vague requirement that used to produce one confused pull request now produces ten confused pull requests in a fraction of the time. Requirements clarity is not a nice-to-have in an AI-native workflow. It is the single highest-leverage investment you can make. An agent with clear constraints and clear requirements produces working software. An agent with neither produces slop.
Fund review capacity like you fund development capacity. If you still have engineers generating code that needs human review, you need explicit budget for that review load. But the endgame is not more reviewers. The endgame is agents that are constrained well enough that review becomes verification, not discovery.
Have the governance conversation now. Not next quarter. Not when you finish the current sprint. Now. Because every week you wait, the codebase drifts further, the habits calcify deeper, and the cost of correction increases.
The Factory Metaphor Is Not an Accident
A slop factory is still a factory. It has inputs. It has outputs. It has processes and capacity and throughput. The problem is not the machinery. The problem is that nobody designed the quality system for what the machinery now produces.
You would never install a manufacturing line that produces ten times the volume and keep the same two inspectors at the end of the belt. That is obviously insane. But that is exactly what you did with your SDLC.
AI is the most powerful tool your engineering organization has ever been handed. It is genuinely transformative. But transformation without intention is just faster entropy.
You do not have an AI problem. You have a governance problem that AI made visible. The codebase was always going to reflect the quality of your process. You just never generated enough code fast enough for the gaps to matter.
Now you do.
Fix the process. Redesign the governance. Make the intentional choices you skipped when the demo looked impressive and the adoption metrics made the quarterly slide look good.
Or keep running the slop factory. Your competitors who figured this out are not waiting.