Autonomous Reliability

Agents Propose, Humans Authorize: How to Encode Authority Into Autonomous Systems

A practical guide for fintech risk officers on encoding policy, approval, and audit into autonomous agents so they act without ceding control.


Zof Reliability Team · Engineering & Product

February 24, 2026 · 7 min read · Updated February 24, 2026

Overview

Autonomous agents are already writing, testing, and proposing changes to systems that move regulated money. The open question for a risk officer is no longer whether agents will act inside your stack, but who authorizes each action, on what evidence, and whether you can prove that authority held under audit. That is an engineering problem before it is a policy one, and treating it as an afterthought is how control quietly leaks out of the building. The principle is simple to state and hard to implement: agents propose, humans authorize. This guide covers the patterns that make that principle real in a fintech environment, where a single unauthorized change can become a regulatory event.


Why "trust the model" is not a control

The industry data should reset everyone's risk appetite. Roughly 41% of codebases are now AI-generated, and around 45% of AI coding tasks introduce a critical flaw or security issue. Meanwhile, the cost of poor software quality sits near $2.41 trillion. You are not deciding whether to let capable, occasionally-wrong systems into your change pipeline. They are already there.

The instinct of many teams is to manage this with more model quality or sharper prompts. That is the wrong layer. A better model still produces a non-trivial defect rate, and you cannot audit a probability. What a regulated institution needs is not more intelligence in the agent but a control layer around it: a place where every proposed action is mapped, validated, gated, and recorded before anything reaches production.

The other uncomfortable number is behavioral. Roughly 80% of developers bypass policy or guardrails. This is rarely malice. It is friction. When governance lives in a wiki, a checklist, or a Friday change-advisory-board meeting, people route around it to ship. Any authority model that depends on humans remembering to follow a document will be bypassed at exactly the moment it matters. Authority has to be encoded into the system itself, where it cannot be skipped.

Encode authority as policy-as-code, not a meeting

Start by separating the two acts that ungoverned automation collapses together: proposing a change and authorizing it. An agent that both writes a fix and applies it has merged the maker and the checker, which is precisely the separation of duties your auditors expect to see preserved.

Concretely, give your agents a propose-only default. They can plan, generate, test, and stage a change, producing a complete proposal with the supporting evidence attached. They cannot move it into a protected environment. The transition from proposed to authorized is a separate, policy-governed event.
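The propose-only default described above can be sketched as a small state machine. This is a minimal illustration, not Zof's implementation: the `Actor`, `Proposal`, and `State` names and the `"agent"`/`"human"` labels are hypothetical. The only way out of `PROPOSED` is an explicit `authorize` call, and that call refuses agents, refuses the original proposer, and refuses proposals with no evidence attached.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class State(Enum):
    PROPOSED = "proposed"
    AUTHORIZED = "authorized"


@dataclass(frozen=True)
class Actor:
    name: str
    kind: str  # "agent" or "human" (hypothetical labels)


@dataclass
class Proposal:
    change_id: str
    proposer: Actor
    evidence: List[str] = field(default_factory=list)
    state: State = State.PROPOSED
    authorizer: Optional[Actor] = None

    def authorize(self, actor: Actor) -> None:
        """Move PROPOSED -> AUTHORIZED as a separate, checked event."""
        if self.state is not State.PROPOSED:
            raise ValueError(f"cannot authorize from state {self.state.value}")
        if actor.kind != "human":
            raise PermissionError("agents are propose-only")
        if actor == self.proposer:
            raise PermissionError("proposer cannot authorize their own change")
        if not self.evidence:
            raise ValueError("proposal has no supporting evidence")
        self.authorizer = actor
        self.state = State.AUTHORIZED
```

Because the agent never holds a code path that sets `state` to `AUTHORIZED`, the maker and the checker stay separate by construction rather than by convention.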

That policy belongs in code, not prose. A workable policy layer expresses authority along a few axes:

Blast radius. Which services, data stores, and customer boundaries does this change touch? A change to a logging label is not a change to the payments ledger.
Confidence and evidence. What validation passed, against what version of the system, and was the failing behavior reproduced before the fix?
Risk tier. Is this an externally reachable, regulated, or revenue-critical path?
Authorizer. Who, by role, is allowed to approve this class of change, and is that person distinct from anyone who proposed it?

When these are explicit rules rather than tribal knowledge, you get something a checklist never gives you: the same decision every time, on every change, with no quiet exceptions at 2 a.m. This is the role Zof's Governance surface plays, and the change-awareness that makes it possible comes from the System Graph, a live map of services, dependencies, and CI/CD that tells the policy engine what a given change actually touches.
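As a rough sketch of what those four axes might look like as rules, the following checks a single approval attempt. Everything here is an assumption made for illustration: the `ChangeRequest` and `Policy` types, the `"low"` and `"regulated"` tier names, and the role names are hypothetical, and the set of touched services is supplied by hand where a real system would derive it from a dependency map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    touched_services: frozenset  # blast radius
    validation_passed: bool      # confidence and evidence
    failure_reproduced: bool     # was the failing behavior reproduced first?
    risk_tier: str               # "low" or "regulated" (hypothetical tiers)
    proposer: str


@dataclass
class Policy:
    protected_services: frozenset
    approver_roles: dict  # risk tier -> set of roles allowed to approve

    def may_authorize(self, change, approver, approver_role):
        """Return (allowed, reason) for one approval attempt."""
        if approver == change.proposer:
            return False, "approver must be distinct from proposer"
        if not (change.validation_passed and change.failure_reproduced):
            return False, "evidence incomplete"
        tier = change.risk_tier
        # Touching any protected service raises the tier regardless of the label.
        if change.touched_services & self.protected_services:
            tier = "regulated"
        if approver_role not in self.approver_roles.get(tier, set()):
            return False, f"role {approver_role!r} cannot approve tier {tier!r}"
        return True, "ok"
```

Returning a reason alongside the decision is a deliberate choice: the refusal text can go straight into the audit record, so a rejection is as explainable as an approval.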

Tier your approvals so the gate doesn't become the bottleneck

The fastest way to get governance bypassed is to make every change wait for the same heavyweight review. If a typo fix and a settlement-logic change sit in the same approval queue, your reviewers will rubber-stamp both, and the rubber stamp is where control dies.

Risk-tier the gates instead. A defensible pattern for a fintech team:

Auto-apply with rollback for low-blast-radius, high-confidence changes on non-regulated paths, with the System Graph confirming no reachable critical dependency. Every action is still logged; the human authority is the policy that permitted it.
Propose-and-pause for anything touching reachable, regulated, or revenue paths. A named, role-appropriate human authorizes, with the full evidence package in front of them.
Escalate when confidence is low, the change crosses a customer boundary, or the validation could not reproduce the original behavior. These go to a senior reviewer, never to auto-apply.
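The three tiers above reduce to a small routing function. The sketch below is illustrative only: the `Change` fields, the `Gate` names, and the `MIN_CONFIDENCE` threshold of 0.9 are assumptions, and `reaches_critical_dependency` stands in for whatever a dependency map such as the System Graph would report.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    AUTO_APPLY = "auto-apply with rollback"
    PROPOSE_AND_PAUSE = "propose-and-pause"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Change:
    confidence: float                  # 0.0 to 1.0, hypothetical score
    reproduced: bool                   # validation reproduced the original behavior
    crosses_customer_boundary: bool
    on_sensitive_path: bool            # reachable, regulated, or revenue path
    reaches_critical_dependency: bool  # as reported by a dependency map


MIN_CONFIDENCE = 0.9  # assumed threshold; tune per organization


def route(change: Change) -> Gate:
    # Escalation is checked first so these cases can never fall through to auto-apply.
    if (
        change.confidence < MIN_CONFIDENCE
        or change.crosses_customer_boundary
        or not change.reproduced
    ):
        return Gate.ESCALATE
    if change.on_sensitive_path or change.reaches_critical_dependency:
        return Gate.PROPOSE_AND_PAUSE
    return Gate.AUTO_APPLY
```

Ordering the checks from most to least restrictive means a change has to clear every stricter condition before it becomes eligible for auto-apply, which keeps the default failure mode on the side of human review.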

The point is not to slow everything down. It is to spend human attention on the genuinely risky minority and let governed automation handle the rest without ceding the decision. Reachability-based prioritization is what makes this honest: focusing on what is actually exploitable rather than every theoretical issue can mean 70 to 90% less exploitable exposure, which is also what lets you safely automate the low-risk long tail. This is governed autonomy. Reliability becomes the default, not the exception, without anyone pretending oversight has been removed.

Make the audit trail a byproduct, not a project

For a compliance officer, the deciding question is usually the last one: when an examiner asks why a change went live, can you answer in minutes with evidence, or in weeks with a reconstruction?

The trail should be a byproduct of how the system runs, not a separate logging effort bolted on afterward. Every proposal, every piece of validation evidence, every approval (and rejection), and the System Graph context at the moment of decision should be linked into one immutable record. An auditor's real test is not "do you have logs?" but "can you prove that this specific change was authorized by someone permitted to authorize it, on evidence that existed before approval, and that the control was not bypassed?" That requires the proposal, the evidence, and the authorization to be a single linked artifact.
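One common way to approximate a single linked, tamper-evident artifact is a hash chain, sketched below. This is a swapped-in general technique, not a description of how Zof stores evidence. The field names (`captured_at`, `approved_at`) are hypothetical, and timestamps are assumed to be ISO 8601 UTC strings in one fixed format so that string comparison matches time order.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first record's prev_hash


def digest(obj) -> str:
    """Stable SHA-256 over a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def make_record(proposal, evidence, authorization, prev_hash):
    """Bind proposal, evidence, and authorization into one chained entry."""
    # Evidence must have existed before the approval it supports.
    if evidence["captured_at"] >= authorization["approved_at"]:
        raise ValueError("evidence must predate approval")
    body = {
        "proposal": proposal,
        "evidence": evidence,
        "authorization": authorization,
        "prev_hash": prev_hash,
    }
    return {**body, "hash": digest(body)}


def verify_chain(records, genesis=GENESIS) -> bool:
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev = genesis
    for record in records:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True
```

A chain like this makes tampering detectable rather than impossible; in practice it would be paired with write-once storage or external anchoring of the latest hash.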

The closed loop Zof operates (Understand, Test, Reproduce, Remediate, Verify) produces this evidence at each stage, because validation is performed by Testing Fleets and fixes by governed Remediation Fleets, not by static scripts that leave no defensible record. Remediation is the hardest and most critical part to govern. Letting agents fix code unsupervised is reckless, which is exactly why the approval and audit machinery is the core engineering work, not a feature you add later.

When the data cannot leave your boundary

Fintech rarely lets a vendor exfiltrate production data or run change machinery in someone else's cloud. The pattern that resolves this is to run the agents inside your boundary while keeping the authority model intact. Zof's Edge Runners are signed capsules that execute inside a secure enclave or your own perimeter and emit audit-ready evidence outward. The data stays put; the proof comes to you. Authority and residency stop being a tradeoff.
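The general shape of "verify the capsule, run it locally, emit only proof" can be sketched as follows. This is not Zof's capsule format or Edge Runner API, neither of which is described here. It uses a shared-key HMAC from the Python standard library as a stand-in for a signature, where a production system would more likely use asymmetric signatures and hardware attestation; `run_capsule` and its return fields are hypothetical.

```python
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> str:
    """HMAC-SHA256 stand-in for a real capsule signature (assumption)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def run_capsule(payload: bytes, signature: str, key: bytes, execute):
    """Run the capsule only if its signature verifies; return evidence, not data."""
    if not hmac.compare_digest(sign(payload, key), signature):
        raise PermissionError("capsule signature invalid; refusing to execute")
    result = execute(payload)  # runs inside the boundary; result is bytes
    # Only digests leave the boundary. The raw result stays local.
    return {
        "capsule_sha256": hashlib.sha256(payload).hexdigest(),
        "result_sha256": hashlib.sha256(result).hexdigest(),
    }
```

The evidence dictionary carries digests only, so an outside party can confirm which capsule ran and that a retained result is unchanged without ever receiving the underlying data.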

What to do Monday morning

You do not need a year-long program to start encoding authority. A practical sequence:

Inventory where agents already act. Find every place an automated tool can write to a protected environment today. You will likely find more than you expected.
Split propose from authorize. Set agents to propose-only by default and require an explicit, role-checked authorization for protected paths.
Write three policies, not thirty. Define your auto-apply, propose-and-pause, and escalate tiers in code. Tune from there.
Link the evidence. Ensure each authorization is bound to the validation evidence and the system context it was based on, in one record an examiner can pull.

The bottom line

Enterprise AI · AI Governance · System Graph · Testing Fleets · Remediation Fleets
