The Silent Enemy: A First-Principles Look at the Cost of Rework
Rework, not slow developers, is what kills engineering momentum. A first-principles look at why it scales with AI-generated code and how to attack it at the source.
Rework is a tax you pay in compounding interest
Treat a defect as a financial instrument and the economics become clear. The price of a flaw scales with how far it travels before discovery. Caught in the author's own change, it costs minutes. Caught in code review, it costs a colleague's context-switch. Caught in staging, it costs a coordinated fix. Caught in production, it costs an incident, a postmortem, a customer-trust hit, and a backlog of follow-up work that crowds out the roadmap.
That escalation is not linear. With each boundary a defect crosses, the cost multiplies, because more people, more dependent code, and more assumptions have been built on top of it. By the time a production incident pulls four engineers into a war room, you are not paying for one bug. You are paying interest on a flaw that should have been retired at the source.
This is why the macro number is so large. Industry research puts the cost of poor software quality near $2.41 trillion. That figure is not mostly the original defects. It is the rework, the rediscovery, the remediation, and the second-order disruption those defects create as they propagate. Rework is the principal and the interest combined.
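The compounding described above can be sketched as a toy model. The stage names come from the text; the base cost and the per-boundary multiplier are assumed values chosen only to show the shape of the curve, not measured data.

```python
# Illustrative model only. The 15-minute base cost and the 8x multiplier are
# assumptions made for this sketch, not figures from any study.
STAGES = ["author's change", "code review", "staging", "production"]


def cost_at_stage(stage_index: int,
                  base_cost_minutes: float = 15.0,
                  multiplier: float = 8.0) -> float:
    """Cost of a defect caught after it has crossed `stage_index` boundaries."""
    return base_cost_minutes * multiplier ** stage_index


def escalation_table(base_cost_minutes: float = 15.0,
                     multiplier: float = 8.0) -> list:
    """Return (stage name, cost in minutes) for every stage, in order."""
    return [(stage, cost_at_stage(i, base_cost_minutes, multiplier))
            for i, stage in enumerate(STAGES)]


if __name__ == "__main__":
    for stage, minutes in escalation_table():
        print(f"{stage:16s} {minutes / 60:8.1f} engineer-hours")
```

Whatever multiplier your own incident data suggests, the point of the exercise is the exponent: the cost depends far more on how many boundaries were crossed than on how hard the original bug was.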
Why the enemy is winning right now
The economics of rework have always been brutal. What changed is the volume of changes entering the system and the rate at which those changes are defective.
Roughly 41% of code is now AI-generated. That is not a future projection; it is the current intake rate. And industry research puts the share of AI coding tasks that introduce critical flaws or security issues near 45%. Read those two numbers together. A large and growing fraction of your codebase is arriving through a pipe that is roughly a coin-flip on whether any given task introduces a serious problem.
The instinct is to slow the pipe down. That instinct is wrong, and the business will not allow it anyway. The generation rate is not going back down. So the question for an engineering manager is not "how do we write less code" or "how do we move faster." It is "how do we keep the rework tax from scaling linearly with our intake volume." If every increase in change velocity produces a proportional increase in defects-that-escape, your team runs faster on a treadmill that tilts steeper every quarter.
There is a human cost layered on top. When defects routinely escape, engineers stop trusting the pipeline. They add manual checks, they hedge, they review defensively, and the ones who can will route around the slowest controls. Industry research puts policy bypass at roughly 80% of developers, which is less an indictment of discipline than a signal that the guardrails were advisory and the deadline was real. Advisory controls lose every race they enter.
Speed and quality are not a tradeoff. They are the same variable.
The most expensive belief in engineering is that you choose between shipping fast and shipping safe. That framing made sense when the only way to catch defects earlier was to add more human gates, which genuinely did slow delivery. But the tradeoff is an artifact of the tooling, not a law of nature.
Consider where the time actually goes. A team that ships fast and reworks constantly has a high gross velocity and a low net velocity. A team that catches defects at the source spends less time in war rooms and more time on the roadmap, so its net velocity is higher even if its gross commit rate looks identical. The lever that improves both is the same: shrink the distance between a defect's creation and its detection. Do that, and you get faster delivery and fewer escapes from the same action. The tradeoff dissolves.
This is the first-principles case for treating reliability as infrastructure rather than a phase. Reliability should be the default, not the exception you bolt on before a big release. You do not attack rework by working harder downstream. You attack it by closing the gap upstream, structurally, so the cheap-to-fix window is the one defects are actually caught in.
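The gross-versus-net distinction can be made concrete with a back-of-the-envelope calculation. This is a minimal sketch: the `Team` fields and the sample numbers are hypothetical, and the model assumes each escaped defect consumes a fixed number of follow-up changes.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    changes_per_sprint: int           # gross velocity
    escape_rate: float                # fraction of changes whose defects escape
    rework_changes_per_escape: float  # follow-up changes each escape consumes (assumed)

    def rework(self) -> float:
        """Changes per sprint spent fixing escaped defects."""
        return self.changes_per_sprint * self.escape_rate * self.rework_changes_per_escape

    def net_velocity(self) -> float:
        """Changes per sprint left for the roadmap after rework is paid."""
        return max(self.changes_per_sprint - self.rework(), 0.0)


if __name__ == "__main__":
    # Same gross commit rate, different distance between creation and detection.
    late = Team("catches late", 100, escape_rate=0.20, rework_changes_per_escape=2.0)
    early = Team("catches early", 100, escape_rate=0.05, rework_changes_per_escape=2.0)
    for team in (late, early):
        print(f"{team.name}: gross={team.changes_per_sprint}, net={team.net_velocity():.0f}")
```

Both teams look identical on a commit dashboard. Only the net figure shows which one is actually moving the roadmap.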
Attacking rework at the source
If rework is created at the moment a defect escapes detection, the intervention is to make detection change-aware, continuous, and governed. That is what reliability infrastructure does, and it maps cleanly to a closed loop: understand the system, test what changed, reproduce the failure, remediate under policy, and verify the fix.
A few mechanisms matter most for an engineering manager trying to bend the rework curve:
- Know what a change actually touches. Most escaped defects are integration problems, not local ones. A live System Graph of services, dependencies, and CI/CD lets validation target the real blast radius of a change instead of re-running a static suite that has no idea what moved.
- Validate continuously, not once. Static test scripts rot as the system evolves; they pass while the world changes underneath them. Testing Fleets plan, execute, and maintain validation as the system changes, so coverage tracks reality.
- Prioritize by what is reachable. Not every flaw is exploitable, and triaging a flat list of findings is its own form of rework. Reachability-based prioritization can mean 70 to 90 percent less exploitable exposure, because you spend remediation effort on what is actually reachable in the live system.
- Govern the fix. Remediation is the hardest and most consequential part of the loop, which is exactly why it cannot be unsupervised. Remediation Fleets propose scoped fixes; Governance decides whether and how they execute. Agents propose, humans authorize, and every step is auditable.
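"Agents propose, humans authorize, and every step is auditable" can be sketched as a small state machine. Everything here is hypothetical: the class names, the `max_files` scope policy, and the audit record format are assumptions made for illustration, not the API of any product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProposedFix:
    fix_id: str
    description: str
    files_touched: list
    status: str = "draft"


@dataclass
class GovernedRemediation:
    max_files: int = 3  # assumed scope policy: wider fixes are rejected outright
    audit_log: list = field(default_factory=list)

    def _record(self, event: str, fix: ProposedFix, actor: str) -> None:
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "fix_id": fix.fix_id,
            "actor": actor,
        })

    def propose(self, fix: ProposedFix, agent: str) -> bool:
        """An agent submits a fix; policy checks that it is narrowly scoped."""
        in_scope = len(fix.files_touched) <= self.max_files
        fix.status = "proposed" if in_scope else "rejected"
        self._record(fix.status, fix, agent)
        return in_scope

    def authorize(self, fix: ProposedFix, human: str) -> None:
        """A named human approves a fix that passed the scope policy."""
        if fix.status != "proposed":
            raise PermissionError(f"{fix.fix_id} is not awaiting authorization")
        fix.status = "authorized"
        self._record("authorized", fix, human)

    def execute(self, fix: ProposedFix) -> None:
        """Execution is refused, and logged, unless the fix was authorized."""
        if fix.status != "authorized":
            self._record("blocked", fix, "policy")
            raise PermissionError(f"{fix.fix_id} has not been authorized")
        fix.status = "executed"
        self._record("executed", fix, "system")
```

The design choice worth noticing is that the refusal is itself an audit event. A blocked attempt is as much a part of the record as a successful one.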
The governance point is not a hedge. A serious enterprise does not want more autonomous AI making unsupervised changes to production. It wants control over what that automation is allowed to do. The engineering is in the policy and the audit trail, not in removing the human from decisions that warrant one.
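Returning to the first mechanism in the list, "know what a change actually touches" is at heart a graph traversal. The sketch below assumes a hand-written dependency map with invented service names; a live System Graph would be discovered from the running system rather than typed in.

```python
from collections import deque

# Hypothetical graph: each service maps to the services it depends on.
DEPENDS_ON = {
    "checkout": ["payments", "cart"],
    "cart": ["catalog"],
    "payments": ["ledger"],
    "search": ["catalog"],
    "catalog": [],
    "ledger": [],
}


def blast_radius(changed: str, depends_on: dict) -> set:
    """Return the changed service plus everything that transitively depends on it."""
    # Invert the edges so we can walk from a service to its dependents.
    dependents = {service: set() for service in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk over the inverted graph.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for upstream in dependents.get(current, ()):
            if upstream not in seen:
                seen.add(upstream)
                queue.append(upstream)
    return seen


if __name__ == "__main__":
    print(sorted(blast_radius("catalog", DEPENDS_ON)))
```

In this toy graph, a change to `catalog` reaches `cart`, `search`, and `checkout`, while `payments` and `ledger` are untouched. Validation can then target the affected services instead of re-running everything.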
What to do Monday morning
You do not need a platform migration to start measuring the enemy.
- Quantify your rework rate. For the last sprint, tag every task as new work or rework. Most teams have never looked, and the ratio is sobering.
- Find your detection boundary. For your last five incidents, mark where the defect was created and where it was finally caught. The distance between those two points is your tax.
- Make one advisory check enforceable. Pick a guardrail that lives in a wiki or a non-blocking warning and turn it into a gate that travels with the change.
- Pick one class of fix to govern, not just automate. Choose remediations you would trust as a proposal but want to authorize. That is the shape of governed autonomy.
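The first two measurements fit in a few lines of code or a spreadsheet. This is a minimal sketch: the stage names, the task tags, and the sample data are assumptions for illustration, so substitute whatever your tracker and incident reviews actually record.

```python
# Ordered delivery stages; an assumed taxonomy, so adjust it to your pipeline.
STAGES = ["author", "review", "staging", "production"]


def rework_rate(tasks: list) -> float:
    """Share of tasks tagged 'rework'. `tasks` is a list of (task_id, kind) pairs."""
    if not tasks:
        return 0.0
    return sum(1 for _, kind in tasks if kind == "rework") / len(tasks)


def detection_distance(created_at: str, caught_at: str) -> int:
    """Number of stage boundaries a defect crossed before it was caught."""
    return STAGES.index(caught_at) - STAGES.index(created_at)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sprint = [("T-1", "new"), ("T-2", "rework"), ("T-3", "new"),
              ("T-4", "rework"), ("T-5", "new")]
    incidents = [("author", "production"), ("author", "staging"),
                 ("review", "production")]

    print(f"rework rate: {rework_rate(sprint):.0%}")
    distances = [detection_distance(c, d) for c, d in incidents]
    print(f"mean detection distance: {sum(distances) / len(distances):.1f} boundaries")
```

Track both numbers sprint over sprint. The trend matters more than any single reading.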
For the longer argument on why AI-generated code raises the stakes, the AI code testing imperative makes the case, and how it works shows the loop end to end.
