AI Won't Fix a Broken Governance Model

Google's DORA team concluded that AI will not fix broken engineering systems. Faros AI measured what the breakage looks like: quality and stability collapsing as velocity climbs, a condition it named Acceleration Whiplash. Two studies, two methods, one diagnosis. And it is a diagnosis, not a cure. Naming the wound tells you where it hurts. It does not tell you what to build.

Two independent research efforts converged this year on the same uncomfortable finding, from opposite directions. Google's DORA team, working from surveys of nearly five thousand technology professionals and more than a hundred hours of interviews, concluded that AI amplifies whatever engineering system it lands in, that it will not fix a broken one, and will often make a broken one worse faster. Faros AI, working from delivery telemetry across twenty-two thousand developers rather than surveys, measured the mechanism in hard numbers and gave it a name: Acceleration Whiplash. Real throughput gains at the top of the funnel, compounding quality and stability costs at every stage below. The survey and the telemetry tell the same story. The thing both are describing is not a tooling problem. It is a governance problem wearing a tooling costume.

The Faros numbers are worth stating plainly, because they quantify exactly where the whiplash lands. Median time in pull request review is up 441 percent against the prior dataset. Thirty-one percent more pull requests are now merging with no review at all, because the queue became impossible to clear. Bugs per developer are up 54 percent. Incidents per pull request are up 242.7 percent, meaning the rate of production incidents per merged change has more than tripled. Throughput climbed exactly as promised. Everything downstream of throughput buckled.
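The arithmetic behind "more than tripled" is easy to misread, since "up X percent" means a multiple of 1 + X/100, not X/100. A minimal sketch of the conversion, using only the percentages quoted above (the function name and the dictionary labels are illustrative, not Faros terminology):

```python
def multiplier(percent_increase: float) -> float:
    """Convert 'up X percent' into a multiple of the baseline value."""
    return 1 + percent_increase / 100


# Percent increases as quoted in the paragraph above.
reported = {
    "median time in PR review": 441.0,
    "PRs merged with no review": 31.0,
    "bugs per developer": 54.0,
    "incidents per PR": 242.7,
}

for metric, pct in reported.items():
    print(f"{metric}: up {pct}% = {multiplier(pct):.2f}x baseline")
```

Read this way, incidents per pull request sit at roughly 3.4 times the baseline, and median review time at roughly 5.4 times.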

This is the same failure, on a new surface

Enterprise governance has lost this exact fight before, and it is worth being precise about how, because the shape is identical. For thirty years the answer to "how do we control our data" was mastery: build the canonical record, stand up the committee, reconcile everything to one golden source. It failed every time, for one structural reason that had nothing to do with effort or tooling. Mastery is a snapshot, and the business is a stream. By the time the master record was authoritative, the world it described had already moved, and the record was governing a past that no longer existed.

Code review is mastery applied to merges. The pull request queue, the manual sign-off, the post-incident retrospective: these are the governance committee in another costume. They worked for a long time, because the rate of change matched the rate of review: humans wrote code at human speed, and humans reviewed it at human speed, and the two stayed roughly in balance. AI broke the balance on one side only. The volume of decisions exploded; the decision mechanism did not. Acceleration Whiplash is precisely what a thirty-year-old governance model looks like when the input rate triples and the control rate stays flat. It is the master-data failure, replayed on the merge.
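The imbalance can be made concrete with a deliberately crude queue model. The sketch below assumes constant daily rates and ignores variability, prioritization, and reviewer fatigue; the figures of 10 and 30 pull requests per day are hypothetical, chosen only to represent a tripling against flat review capacity.

```python
def backlog_over_time(arrivals_per_day: float,
                      reviews_per_day: float,
                      days: int) -> list[float]:
    """Unreviewed pull requests at the end of each day, for constant rates."""
    backlog = 0.0
    history = []
    for _ in range(days):
        # The backlog grows by the gap between arrivals and review capacity,
        # and can never drop below zero.
        backlog = max(0.0, backlog + arrivals_per_day - reviews_per_day)
        history.append(backlog)
    return history


# Hypothetical team: reviews 10 PRs a day, before and after the input triples.
balanced = backlog_over_time(arrivals_per_day=10, reviews_per_day=10, days=20)
tripled = backlog_over_time(arrivals_per_day=30, reviews_per_day=10, days=20)

print(balanced[-1])  # 0.0
print(tripled[-1])   # 400.0
```

Under these assumptions the backlog does not settle at a higher level; it grows without bound, by the full gap between the two rates every day.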

The reflex is to slow the input down: add more reviewers, more required approvals, more gates. That is mastery doubling down on itself, and the Faros data shows exactly where it leads: review time up 441 percent, and 31 percent more pull requests slipping through with no review because the queue became impossible to clear. You cannot out-review a tripling. The throughput is real, the business has already priced it in, and it will not be given back. A governance model that can only function by slowing the business to the speed of the committee is a model the business will route around, and the 31 percent rise in no-review merges is the business routing around it in real time.

The change rate tripled. The control rate stayed flat. Acceleration Whiplash is the gap between them, and you cannot close it by reviewing harder.

What actually changed

Two things changed at once, and only one of them is widely discussed. The first is volume, which is the subject of every AI-coding study and needs no further argument. The second is more important and almost never stated: the merge stopped being a human action. When a developer accepts AI-generated code and ships it, the meaningful decision was made by the model, and the human increasingly functions as a stamp on a decision they did not substantively make. Stanford's research puts a hard edge on this: developers using AI assistance rated their own insecure output as secure more often than developers writing unaided, which means the stamp is not just thin but actively miscalibrated. It carries false confidence.

That reframes the whole problem. A merge used to be a person's deliberate choice that occasionally needed oversight. It has become a cross-system, AI-originated action that lands in production, which is exactly the class of event that needs a governing decision before it executes, not an audit after it breaks. The merge has quietly become an agentic transaction. The industry is still governing it like a human one, with human-speed instruments, and Acceleration Whiplash is the measurable cost of that mismatch.

The principle: govern the decision, not the diff

Every agentic action needs an answer to one question before it proceeds: should this execute? A code merge is now squarely inside the scope of that question, and the answer cannot come from the instrument the industry keeps reaching for: a better scanner. A scanner answers "is this code free of known-bad patterns," which is a snapshot of known patterns checked against a candidate. That is mastery again, in miniature: a golden list of bad things, reconciled against the diff. It is useful, and it is not governance.

The governing question is different in kind. Not "is this code bad" but "should this specific change, of this origin, against this blast radius, proceed right now," scored at the moment of action, logged as a decision, resolved before the change lands rather than discovered after it breaks. This is the federation move applied to code, the same move N° 004 made for data: you do not master the codebase into a golden state and review against it; you resolve the risk of each action at the instant it wants to happen, with confidence scoring, and you let the routine ninety percent through untouched so that scarce human judgment lands only on the changes that actually carry consequence. Done this way, the control rate finally scales with the change rate, because the control is one decision per action, not one queue per team. The queue was the bottleneck. A per-action decision has no queue.
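One way to picture a per-action decision is the sketch below. It is an illustration under stated assumptions, not a reference design: the risk factors, weights, and the 0.5 escalation threshold are all hypothetical, and a real system would derive them from its own incident history. What it shows is the shape: each change is scored on origin and blast radius at the moment it wants to merge, the verdict and its basis are logged, and only the changes above the threshold reach a human.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"    # routine change, merges without human review
    ESCALATE = "escalate"  # routed to a human before it can merge


@dataclass(frozen=True)
class Change:
    change_id: str
    ai_originated: bool
    files_touched: int
    touches_regulated_data: bool
    customer_facing: bool


@dataclass(frozen=True)
class Decision:
    change_id: str
    risk: float
    verdict: Verdict
    basis: tuple[str, ...]  # the factors that contributed to the score
    decided_at: str


def decide(change: Change, threshold: float = 0.5) -> Decision:
    """Score one change and return a logged-ready decision.

    Weights and threshold are hypothetical placeholders.
    """
    risk = 0.0
    basis = []
    if change.ai_originated:
        risk += 0.2
        basis.append("AI-originated")
    if change.files_touched > 20:
        risk += 0.2
        basis.append("wide diff")
    if change.touches_regulated_data:
        risk += 0.4
        basis.append("regulated data")
    if change.customer_facing:
        risk += 0.3
        basis.append("customer-facing")
    risk = round(min(risk, 1.0), 2)
    verdict = Verdict.ESCALATE if risk >= threshold else Verdict.PROCEED
    decided_at = datetime.now(timezone.utc).isoformat()
    return Decision(change.change_id, risk, verdict, tuple(basis), decided_at)


# Every decision is appended to a log, whether or not a human is involved.
decision_log: list[Decision] = []
routine = Change("chg-1", True, 3, False, False)
risky = Change("chg-2", True, 4, True, True)
for change in (routine, risky):
    decision_log.append(decide(change))

for d in decision_log:
    print(d.change_id, d.risk, d.verdict.value, d.basis)
```

In this example the routine AI-originated change scores 0.2 and proceeds, while the change touching regulated data and a customer-facing system scores 0.9 and is escalated. Both leave a record of what was decided and on what basis.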

What builders and buyers should do differently

For builders, the instruction is to stop treating AI code safety as a scanner problem. The scanner answers "is this code bad," a known-pattern question on a snapshot, and the Faros numbers show that pouring more of that into the pipeline does not close the gap; it lengthens the queue. The unanswered question is "should this action execute," which is a governance question, not a security one. The teams that come through the next phase intact will be the ones who move the decision upstream of the merge and make it a scored, logged, in-flight adjudication, rather than another gate the queue eventually defeats.

For buyers, the accountable line on the org chart is shifting. When the merge is an agentic transaction touching regulated data and customer-facing systems, "did CI pass" is not an answer a CFO, a general counsel, or a regulator will accept. The question they will ask is "who decided this should execute, and on what basis," and it is the same question that governs every other agentic action in the enterprise. Code is not a special case. It is simply the first case most engineering organizations will feel directly, because it is the surface where the change rate tripled first.

DORA named the condition: AI will not fix a broken engineering system. Faros measured the wound and called it Acceleration Whiplash. Both are diagnoses, and a diagnosis is not a cure. The cure is not slower code, and it is not a heavier review queue. Both of those are the thirty-year-old governance model insisting it can win a fight it has already lost. The cure is to stop governing the merge like a human action and start governing it like the agentic transaction it has become: a decision before the merge, scored in-flight, accountable after.

The change rate tripled. The control rate stayed flat. The fix is not slower code. It is a decision that happens before the merge. Govern the decision, not the diff.

End N° 014