Autonomous Reliability

Agents Propose, Humans Authorize: How to Encode Authority Into Autonomous Systems

A practical guide for fintech risk officers on encoding policy, approval, and audit into autonomous agents so they act without ceding control.

Zof Reliability Team · Engineering & Product

February 24, 2026 · 7 min read · Updated February 24, 2026

01

Why "trust the model" is not a control

The industry data should reset everyone's risk appetite. Roughly 41% of codebases are now AI-generated, and around 45% of AI coding tasks introduce a critical flaw or security issue. Meanwhile, the cost of poor software quality sits near $2.41 trillion. You are not deciding whether to let capable, occasionally-wrong systems into your change pipeline. They are already there.

The instinct of many teams is to manage this with more model quality or sharper prompts. That is the wrong layer. A better model still produces a non-trivial defect rate, and you cannot audit a probability. What a regulated institution needs is not more intelligence in the agent but a control layer around it: a place where every proposed action is mapped, validated, gated, and recorded before anything reaches production.

The other uncomfortable number is behavioral. Roughly 80% of developers bypass policy or guardrails. This is rarely malice. It is friction. When governance lives in a wiki, a checklist, or a Friday change-advisory-board meeting, people route around it to ship. Any authority model that depends on humans remembering to follow a document will be bypassed at exactly the moment it matters. Authority has to be encoded into the system itself, where it cannot be skipped.

02

Encode authority as policy-as-code, not a meeting

Start by separating the two acts that ungoverned automation collapses together: proposing a change and authorizing it. An agent that both writes a fix and applies it has merged the maker and the checker, erasing precisely the separation of duties your auditors expect to see preserved.

Concretely, give your agents a propose-only default. They can plan, generate, test, and stage a change, producing a complete proposal with the supporting evidence attached. They cannot move it into a protected environment. The transition from proposed to authorized is a separate, policy-governed event.
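A minimal sketch of that separation, assuming a simple in-memory model. The names `Proposal`, `authorize`, and `apply_to_protected` are hypothetical and are not taken from any Zof API. The point it illustrates is that moving a change into a protected environment is impossible until a separate authorization event has happened, and that the proposer cannot be the authorizer.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class State(Enum):
    PROPOSED = "proposed"
    AUTHORIZED = "authorized"
    APPLIED = "applied"


class AuthorityError(Exception):
    """Raised when a transition would bypass the authorization step."""


@dataclass
class Proposal:
    # All field names are illustrative assumptions.
    change_id: str
    proposed_by: str
    evidence: List[str] = field(default_factory=list)
    state: State = State.PROPOSED          # propose-only is the default
    authorized_by: Optional[str] = None


def authorize(proposal: Proposal, approver: str) -> None:
    """The separate, policy-governed transition from proposed to authorized."""
    if proposal.state is not State.PROPOSED:
        raise AuthorityError("only a proposed change can be authorized")
    if approver == proposal.proposed_by:
        raise AuthorityError("proposer cannot authorize their own change")
    if not proposal.evidence:
        raise AuthorityError("no evidence attached to the proposal")
    proposal.state = State.AUTHORIZED
    proposal.authorized_by = approver


def apply_to_protected(proposal: Proposal) -> str:
    """Refuses any change that has not passed through authorize()."""
    if proposal.state is not State.AUTHORIZED:
        raise AuthorityError("change has not been authorized")
    proposal.state = State.APPLIED
    return f"applied {proposal.change_id}"
```

In this sketch an agent can build a `Proposal` freely, but calling `apply_to_protected` on it raises `AuthorityError` until a distinct identity has called `authorize`.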

That policy belongs in code, not prose. A workable policy layer expresses authority along a few axes:

  • Blast radius. Which services, data stores, and customer boundaries does this change touch? A change to a logging label is not a change to the payments ledger.
  • Confidence and evidence. What validation passed, against what version of the system, and was the failing behavior reproduced before the fix?
  • Risk tier. Is this an externally reachable, regulated, or revenue-critical path?
  • Authorizer. Who, by role, is allowed to approve this class of change, and is that person distinct from anyone who proposed it?
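The four axes above can be sketched as declarative rules. This is an illustration under assumptions, not a real policy engine: the tier names, the role names, the service-count limits, and the `required_approver` function are all hypothetical. What it shows is that once the axes are data, the question of who may approve a change has one deterministic answer.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Assumed ordering of risk tiers; a real deployment would define its own.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class ChangeContext:
    touched_services: FrozenSet[str]   # blast radius
    validation_passed: bool            # confidence and evidence
    failure_reproduced: bool
    risk_tier: str                     # "low", "medium", or "high"


@dataclass(frozen=True)
class Rule:
    name: str
    max_risk_tier: str
    max_services: int
    approver_role: Optional[str]       # None means the policy itself authorizes


# Hypothetical rule set, evaluated top to bottom; first match wins.
RULES: Tuple[Rule, ...] = (
    Rule("cosmetic", "low", 1, None),
    Rule("standard", "medium", 3, "service-owner"),
    Rule("regulated", "high", 10, "risk-officer"),
)


def required_approver(ctx: ChangeContext, rules: Tuple[Rule, ...] = RULES) -> str:
    """Return the role that must authorize this change, or "policy" for none."""
    # Missing or failed evidence never matches an ordinary rule.
    if not (ctx.validation_passed and ctx.failure_reproduced):
        return "senior-reviewer"
    for rule in rules:
        tier_ok = RISK_ORDER[ctx.risk_tier] <= RISK_ORDER[rule.max_risk_tier]
        radius_ok = len(ctx.touched_services) <= rule.max_services
        if tier_ok and radius_ok:
            return rule.approver_role or "policy"
    # Nothing matched: fall back to the most restrictive path.
    return "senior-reviewer"
```

Because the rules are plain data, they can be reviewed, versioned, and diffed like any other change, which is what makes the decision repeatable.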

When these are explicit rules rather than tribal knowledge, you get something a checklist never gives you: the same decision every time, on every change, with no quiet exceptions at 2 a.m. This is the role Zof's Governance surface plays, and the change-awareness that makes it possible comes from the System Graph, a live map of services, dependencies, and CI/CD that tells the policy engine what a given change actually touches.

03

Tier your approvals so the gate doesn't become the bottleneck

The fastest way to get governance bypassed is to make every change wait for the same heavyweight review. If a typo fix and a settlement-logic change sit in the same approval queue, your reviewers will rubber-stamp both, and the rubber stamp is where control dies.

Risk-tier the gates instead. A defensible pattern for a fintech team:

  • Auto-apply with rollback for low-blast-radius, high-confidence changes on non-regulated paths, with the System Graph confirming no reachable critical dependency. Every action is still logged; the human authority is the policy that permitted it.
  • Propose-and-pause for anything touching reachable, regulated, or revenue paths. A named, role-appropriate human authorizes, with the full evidence package in front of them.
  • Escalate when confidence is low, the change crosses a customer boundary, or the validation could not reproduce the original behavior. These go to a senior reviewer, never to auto-apply.
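The three tiers above can be sketched as a single routing function. The thresholds and field names here are assumptions chosen for illustration; in particular the numeric confidence score and its cutoffs are hypothetical. The design choice worth noting is the order of checks: escalation conditions are tested first, and anything that does not clearly qualify for auto-apply falls back to a human gate.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    AUTO_APPLY = "auto-apply with rollback"
    PROPOSE_AND_PAUSE = "propose-and-pause"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Change:
    confidence: float                    # 0.0 to 1.0, hypothetical score
    reproduced: bool                     # original behavior was reproduced
    crosses_customer_boundary: bool
    on_regulated_or_revenue_path: bool
    reaches_critical_dependency: bool    # e.g. as reported by a dependency map
    blast_radius: int                    # number of services touched


# Assumed thresholds; tune these to your own risk appetite.
CONFIDENCE_FLOOR = 0.6
AUTO_APPLY_CONFIDENCE = 0.9
AUTO_APPLY_MAX_BLAST = 1


def choose_gate(c: Change) -> Gate:
    # 1. Escalate first, so a risky change can never fall through to auto-apply.
    if (c.confidence < CONFIDENCE_FLOOR
            or c.crosses_customer_boundary
            or not c.reproduced):
        return Gate.ESCALATE
    # 2. Reachable, regulated, or revenue paths always wait for a named human.
    if c.on_regulated_or_revenue_path or c.reaches_critical_dependency:
        return Gate.PROPOSE_AND_PAUSE
    # 3. Only small, high-confidence changes qualify for auto-apply.
    if (c.confidence >= AUTO_APPLY_CONFIDENCE
            and c.blast_radius <= AUTO_APPLY_MAX_BLAST):
        return Gate.AUTO_APPLY
    # 4. Default to the human gate when nothing clearly qualifies.
    return Gate.PROPOSE_AND_PAUSE
```

A typo fix on a non-regulated path with high confidence routes to `Gate.AUTO_APPLY`; the same fix on the settlement path routes to `Gate.PROPOSE_AND_PAUSE`.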

The point is not to slow everything down. It is to spend human attention on the genuinely risky minority and let governed automation handle the rest without ceding the decision. Reachability-based prioritization is what makes this honest: focusing on what is actually exploitable rather than every theoretical issue can mean 70 to 90% less exploitable exposure, which is also what lets you safely automate the low-risk long tail. This is governed autonomy. Reliability becomes the default, not the exception, without anyone pretending oversight has been removed.

04

Make the audit trail a byproduct, not a project

For a compliance officer, the deciding question is usually the last one: when an examiner asks why a change went live, can you answer in minutes with evidence, or in weeks with a reconstruction?

The trail should be a byproduct of how the system runs, not a separate logging effort bolted on afterward. Every proposal, every piece of validation evidence, every approval (and rejection), and the System Graph context at the moment of decision should be linked into one immutable record. An auditor's real test is not "do you have logs" but "can you prove that this specific change was authorized by someone permitted to authorize it, on evidence that existed before approval, and that the control was not bypassed?" That requires the proposal, the evidence, and the authorization to be a single linked artifact.
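One way to make proposal, evidence, and authorization a single linked artifact is a hash chain, sketched below with Python's standard `hashlib` and `json` modules. This is an assumption about technique, not a description of how Zof stores its records, and the `AuditLog` class and entry fields are hypothetical. Note that a hash chain makes tampering detectable rather than impossible; true immutability also needs write-once storage.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(payload: Dict[str, Any]) -> str:
    # Canonical JSON so the same content always hashes the same way.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log where each entry commits to the one before it."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, Any]] = []

    def append(self, kind: str, body: Dict[str, Any]) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry_hash = _digest({"kind": kind, "body": body, "prev": prev})
        self._entries.append(
            {"kind": kind, "body": body, "prev": prev, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited entry breaks the chain."""
        prev = GENESIS
        for e in self._entries:
            expected = _digest({"kind": e["kind"], "body": e["body"], "prev": prev})
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True


# Usage: the authorization entry cites the hashes of what it was based on.
log = AuditLog()
proposal_hash = log.append("proposal", {"change_id": "chg-1", "by": "agent-7"})
evidence_hash = log.append("evidence", {"for": proposal_hash, "result": "pass"})
log.append("authorization", {
    "by": "alice",
    "role": "risk-officer",
    "proposal": proposal_hash,
    "evidence": evidence_hash,
})
```

Because the authorization entry embeds the hashes of the proposal and the evidence, it can only have been written after both existed, which speaks directly to the "evidence that existed before approval" test.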

The closed loop Zof operates (Understand, Test, Reproduce, Remediate, Verify) produces this evidence at each stage, because validation is performed by Testing Fleets and fixes by governed Remediation Fleets, not by static scripts that leave no defensible record. Remediation is the hardest and most critical part to govern. Letting agents fix code unsupervised is reckless, which is exactly why the approval and audit machinery is the engineering, not a feature you add later.

### When the data cannot leave your boundary

Fintech rarely lets a vendor exfiltrate production data or run change machinery in someone else's cloud. The pattern that resolves this is to run the agents inside your boundary while keeping the authority model intact. Zof's Edge Runners are signed capsules that execute inside a secure enclave or your own perimeter and emit audit-ready evidence outward. The data stays put; the proof comes to you. Authority and residency stop being a tradeoff.
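The idea that "the data stays put; the proof comes to you" can be illustrated generically. The sketch below is not how Edge Runners work; it is a hypothetical example in which a job running inside the boundary emits only digests and check results, authenticated with an HMAC from Python's standard library. The HMAC stands in for a real signature scheme, and `build_evidence`, `verify_evidence`, and the shared key are assumptions.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List


def build_evidence(raw_records: List[bytes], checks: Dict[str, bool],
                   key: bytes) -> Dict[str, Any]:
    """Runs inside the boundary: summarizes the data without exporting it."""
    h = hashlib.sha256()
    for record in raw_records:
        # Hash each record separately so record boundaries are unambiguous.
        h.update(hashlib.sha256(record).digest())
    summary = {
        "record_count": len(raw_records),
        "data_digest": h.hexdigest(),   # a fingerprint, not the data itself
        "checks": checks,
    }
    payload = json.dumps(summary, sort_keys=True).encode("utf-8")
    mac = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"summary": summary, "mac": mac}


def verify_evidence(evidence: Dict[str, Any], key: bytes) -> bool:
    """Runs outside the boundary: confirms the summary was not altered."""
    payload = json.dumps(evidence["summary"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, evidence["mac"])
```

Only the dictionary returned by `build_evidence` leaves the perimeter. A production design would more likely use asymmetric signatures so the verifier never holds a key that could forge evidence.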

05

What to do Monday morning

You do not need a year-long program to start encoding authority. A practical sequence:

  1. Inventory where agents already act. Find every place an automated tool can write to a protected environment today. You will likely find more than you expected.
  2. Split propose from authorize. Set agents to propose-only by default and require an explicit, role-checked authorization for protected paths.
  3. Write three policies, not thirty. Define your auto-apply, propose-and-pause, and escalate tiers in code. Tune from there.
  4. Link the evidence. Ensure each authorization is bound to the validation evidence and the system context it was based on, in one record an examiner can pull.

06

The bottom line
