Agents Propose, Humans Authorize: A Reference Architecture for Governed Autonomy
A reference architecture for letting agents act on production safely: the four control surfaces (policy, approval, evidence, and attribution) and how they wire into the loop.
Why this needs an architecture, not a feature
The pressure is structural, not hypothetical. Roughly 41% of codebases are now AI-generated, and industry research suggests around 45% of AI coding tasks introduce a critical flaw or security issue. The cost of poor software quality sits near $2.41 trillion. You are not deciding whether to admit fast, occasionally-wrong systems into your change pipeline. They are already there, and they generate change faster than human review can absorb.
The common response is to reach for a better model or sharper prompts. That is the wrong layer. A better model still produces a non-trivial defect rate, and you cannot audit a probability distribution. What an enterprise needs is not more intelligence inside the agent but a control layer around it. As we have argued in the control-layer thesis, AI is missing a control layer, not more models.
A control layer is not one feature. It is a small number of surfaces that every autonomous action must pass through, each answering a distinct question:
- Policy: *is this action allowed at all?*
- Approval: *who authorizes it, and on what evidence?*
- Evidence: *what proves the change is safe?*
- Attribution: *who or what did this, and can we prove it later?*
Miss any one and the others degrade. Approval without evidence is a rubber stamp. Evidence without attribution is unprovable under audit. Policy without approval is advisory, and advisory governance gets bypassed: about 80% of developers admit to routing around guardrails when those guardrails slow them down.
Surface 1: Policy, the boundary of allowed action
Policy is the first gate because it is the cheapest. It answers a yes/no question before any work happens: is an agent permitted to touch this surface, in this environment, under these conditions? Everything downstream assumes the action was admissible to begin with.
The failure mode here is policy-as-prose. When the rules live in a wiki, a runbook, or a quarterly change-advisory meeting, they are unenforceable at machine speed and they get skipped at exactly the moment they matter. Policy has to be code, evaluated inline, with no path around it.
Two design choices make policy real. First, express authority along axes that map to risk (blast radius, data sensitivity, environment, and reachability), not to surface signals like lines changed or file count. A three-line change to a shared authentication library is more dangerous than a 600-line change to an isolated internal tool, and only a policy that reasons about the dependency graph knows that. This is why policy cannot be evaluated from the diff alone: it needs the System Graph, a live map of services, dependencies, and CI/CD that tells the policy engine what a change actually touches. Second, default agents to propose-only. They may plan, generate, and stage a change, but moving it into a protected environment is a separate, governed event. That separation is the architecture's backbone.
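A minimal sketch of both choices, assuming a toy dependency map standing in for the System Graph. The service names, the `Action` fields, the protected-environment set, and the blast-radius threshold are all hypothetical, chosen only to illustrate the idea. The decision uses blast radius and data sensitivity, never `lines_changed`, and an agent asking to promote into a protected environment is denied outright.

```python
from dataclasses import dataclass

# Hypothetical System Graph slice: service -> services that directly depend on it.
DEPENDENTS: dict[str, list[str]] = {
    "auth-lib": ["checkout", "accounts", "admin-api"],
    "checkout": ["storefront"],
    "internal-tool": [],
}

PROTECTED_ENVIRONMENTS = {"production"}  # assumption: which environments are protected
STRICT_BLAST_RADIUS = 3                  # assumption: illustrative threshold only


def blast_radius(service: str, graph: dict[str, list[str]]) -> int:
    """Count the services that depend on `service`, directly or transitively."""
    seen: set[str] = set()
    stack = list(graph.get(service, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return len(seen)


@dataclass(frozen=True)
class Action:
    actor_is_agent: bool
    kind: str            # "propose" or "promote"
    service: str
    environment: str
    sensitive_data: bool
    lines_changed: int   # recorded, but deliberately not used for the decision


def evaluate(action: Action, graph: dict[str, list[str]]) -> str:
    """Return "deny", "strict_review", or "standard_review"."""
    # Agents are propose-only: promotion into a protected environment
    # is a separate, governed event and is never granted here.
    if (action.actor_is_agent and action.kind == "promote"
            and action.environment in PROTECTED_ENVIRONMENTS):
        return "deny"
    if action.sensitive_data or blast_radius(action.service, graph) >= STRICT_BLAST_RADIUS:
        return "strict_review"
    return "standard_review"


small_but_risky = Action(True, "propose", "auth-lib", "production", False, 3)
large_but_isolated = Action(True, "propose", "internal-tool", "production", False, 600)
agent_promotion = Action(True, "promote", "internal-tool", "production", False, 600)
```

Under these assumptions the three-line change to `auth-lib` lands in the strict tier because four services sit downstream of it, while the 600-line change to the isolated tool gets standard review.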
Surface 3: Evidence, what makes approval defensible
An approval is only as trustworthy as the evidence behind it. The reason most teams cannot safely auto-merge is that they have no defensible proof a change is good, so they fall back to a human eyeball as a weak substitute for a test that should have run.
Invert that. Every proposal should arrive at the gate already carrying the evidence the gate needs: which paths were exercised, what regressed, whether the original failing behavior was reproduced before the fix, and what reachability analysis says about exposure. That last point sharpens security gates specifically. Reachability-based prioritization, which asks whether a flaw sits on a path actually reachable in your deployed system, can cut the exploitable exposure you have to triage by 70 to 90%. An unreachable flaw need not block a release; a reachable one routes straight to the strictest tier.
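One way to picture a proposal that carries its own evidence is a small record the gate can route on. This is a sketch under stated assumptions: the field names, the tier labels, and the rule that a regression or an unreproduced failure sends the proposal back are hypothetical choices, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    identifier: str
    reachable: bool  # outcome of reachability analysis on the deployed system


@dataclass(frozen=True)
class EvidenceBundle:
    exercised_paths: tuple[str, ...]
    regressions: tuple[str, ...]
    failure_reproduced_before_fix: bool
    findings: tuple[Finding, ...]


def route(bundle: EvidenceBundle) -> str:
    """Pick a gate outcome from the evidence the proposal arrived with."""
    # Assumption: a regression, or a fix whose original failure was never
    # reproduced, goes back to the agent instead of reaching an approver.
    if bundle.regressions or not bundle.failure_reproduced_before_fix:
        return "return_to_agent"
    # A reachable flaw routes straight to the strictest tier.
    if any(finding.reachable for finding in bundle.findings):
        return "strictest_tier"
    # Unreachable findings are recorded but do not block the release.
    return "standard_tier"


unreachable_only = EvidenceBundle(
    exercised_paths=("billing/refund.py",),
    regressions=(),
    failure_reproduced_before_fix=True,
    findings=(Finding("FINDING-A", reachable=False),),
)
```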
Evidence is not a static suite passing. It is validation that knows what changed and what depends on it. That is the job of coordinated Testing Fleets: agents that plan, execute, and maintain change-aware validation as the system evolves, rather than scripts that ignore the dependency graph and quietly rot. Watch for coverage laundering: a change that shows "tests passed" while the validation never exercised the changed path. The gate must read coverage *of the change*, not aggregate green.
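A rough sketch of a gate that reads coverage of the change instead of aggregate green. It assumes the changed lines and the executed lines are already available as per-file sets of line numbers; how they are collected, the function names, and the default threshold are assumptions for illustration.

```python
def change_coverage(changed: dict[str, set[int]],
                    executed: dict[str, set[int]]) -> float:
    """Fraction of changed lines that the validation run actually executed."""
    total = sum(len(lines) for lines in changed.values())
    if total == 0:
        return 1.0  # nothing changed, so nothing is left uncovered
    hit = sum(len(lines & executed.get(path, set()))
              for path, lines in changed.items())
    return hit / total


def gate(suite_passed: bool,
         changed: dict[str, set[int]],
         executed: dict[str, set[int]],
         threshold: float = 1.0) -> bool:
    """Pass only if the suite is green AND the changed lines were exercised."""
    return suite_passed and change_coverage(changed, executed) >= threshold


changed_lines = {"auth.py": {10, 11, 12}}
laundered_run = {"util.py": {1, 2, 3}}        # green, but never touched auth.py
honest_run = {"auth.py": {9, 10, 11, 12, 13}}
```

With these inputs, `gate(True, changed_lines, laundered_run)` fails even though the suite reported green, which is the coverage-laundering case described above.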
Surface 4: Attribution, proving who did what
Attribution is the surface most teams forget, and it is the one that decides whether the other three survive an audit. It answers the question an examiner actually asks: can you prove that *this specific change* was authorized by someone permitted to authorize it, on evidence that existed *before* approval, and that no control was bypassed?
"Do you have logs" is the wrong test. Logs can be edited, and they decouple the proposal from the evidence and the evidence from the approval. Attribution requires those to be a single immutable, linked artifact: the proposal, the validation evidence, the System Graph context at the moment of decision, and the authorization (or rejection), bound together. The trail should be a byproduct of how the system runs, not a reconstruction project that starts when the examiner arrives.
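A minimal sketch of binding those four pieces into one linked record, using a SHA-256 hash chain. This is a swapped-in illustrative technique, not a description of any particular product: a hash chain makes tampering detectable rather than impossible, and real immutability would also need write-once storage. The field names and the integer timestamps are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumption: placeholder hash for the first record


def digest(body: dict) -> str:
    """Hash a canonical JSON form so equal content always hashes the same."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal(proposal: dict, evidence: dict, graph_context: dict,
         authorization: dict, prev_hash: str = GENESIS) -> dict:
    """Bind proposal, evidence, graph context, and decision into one record."""
    body = {
        "proposal": proposal,
        "evidence": evidence,
        "graph_context": graph_context,
        "authorization": authorization,
        "prev_hash": prev_hash,
    }
    return {**body, "record_hash": digest(body)}


def verify(chain: list[dict]) -> bool:
    """Check linkage, content integrity, and that evidence predates approval."""
    prev = GENESIS
    for record in chain:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        if body.get("prev_hash") != prev or digest(body) != record.get("record_hash"):
            return False
        if body["evidence"]["recorded_at"] > body["authorization"]["decided_at"]:
            return False
        prev = record["record_hash"]
    return True


first = seal({"id": "chg-1"}, {"recorded_at": 100, "result": "pass"},
             {"service": "auth-lib"},
             {"by": "alice", "decision": "approve", "decided_at": 120})
second = seal({"id": "chg-2"}, {"recorded_at": 200, "result": "pass"},
              {"service": "checkout"},
              {"by": "auto-merge", "decision": "approve", "decided_at": 210},
              prev_hash=first["record_hash"])
```

Editing any field of `first` after the fact changes its digest, so `verify` fails for that record and for everything chained after it. The same check rejects a record whose evidence was recorded after the decision it supposedly supported.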
Two constraints raise the bar. Auto-merged changes, the ones no human watched, need the *same* evidence record as reviewed ones. The absence of a human in the path strengthens the attribution requirement rather than relaxing it. And when data cannot leave your boundary, attribution still has to hold: Edge Runners execute as signed capsules inside a secure enclave and emit audit-ready evidence outward, so the proof comes to you while the data stays put.
How the four wire into the loop
These surfaces are not a checklist run once. They map onto the closed loop (Understand, Test, Reproduce, Remediate, Verify), and each pass through the loop writes to every surface. *Understand* is the System Graph feeding policy. *Test* and *Reproduce* generate evidence. *Remediate* is the propose step that approval gates. Remediation is the hardest, most critical step to govern, which is exactly why letting agents fix code unsupervised is reckless and why the approval machinery is the real engineering. *Verify* confirms the change held and closes the attribution record. Remove any surface and the loop leaks: changes act without admissibility, get approved without proof, or land without a trail.
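The stage-to-surface mapping described above can be written down as data and checked. This is only an illustrative sketch: the dictionary mirrors the pairing in the text, and `uncovered_surfaces` is a hypothetical helper that reports which surfaces receive nothing when a loop stage is skipped.

```python
SURFACES = ("policy", "approval", "evidence", "attribution")

# Which surface each loop stage primarily feeds, as described in the text.
STAGE_FEEDS = {
    "Understand": "policy",
    "Test": "evidence",
    "Reproduce": "evidence",
    "Remediate": "approval",
    "Verify": "attribution",
}


def uncovered_surfaces(stages_run: list[str]) -> set[str]:
    """Return the surfaces that no executed stage wrote to."""
    covered = {STAGE_FEEDS[stage] for stage in stages_run}
    return {surface for surface in SURFACES if surface not in covered}


full_loop = ["Understand", "Test", "Reproduce", "Remediate", "Verify"]
```

A pass that skips *Verify* leaves attribution uncovered, which is the "land without a trail" leak.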
The bottom line
