Deployment architecture

The Signed Capsule: How Immutable, Customer-Controlled Test Execution Actually Works

A technical deep-dive on Zof Edge Runner capsules: how signing, provenance, immutability, and chain-of-custody make test execution evidence you can defend.

Zof Reliability Team · Engineering and Product

May 6, 2025 · 8 min read · Updated May 6, 2025

01

Why ephemeral test execution fails audit

Start with the reachability problem an SRE actually faces. Roughly 41% of codebases are now AI-generated, and close to 45% of AI coding tasks introduce a critical flaw or security issue. The change rate is up and the per-change defect rate is up at the same time. So you want more validation, more often, against more surface area.

The naive way to do that with agents is to let a model decide what to test at runtime and execute it on the spot. For an identity or security workload, that is exactly backwards. The system that decides what to run is now an unbounded actor inside your trust boundary. There is no stable artifact to review before it acts, no signature to verify, and no way to prove afterward that the execution stayed inside its intended scope. When the audit comes, you have logs of effects but no attestable description of intent.

The capsule inverts the order of operations. Work is assembled and reviewed *before* it can run, frozen, signed, and only then admitted for execution. Intent becomes an artifact. Execution becomes the controlled replay of an approved artifact, not an open-ended session.

02

Anatomy of a signed capsule

A Zof capsule is an immutable, versioned package, not an ad hoc script. Concretely, it carries:

  • A constrained manifest that declares exactly what may run: target endpoints, allowed actions, fixtures, environment scope, and resource limits. Nothing outside the manifest executes.
  • Content hashes over every included artifact, so the bytes that run are provably the bytes that were reviewed.
  • Approval records binding the capsule to the humans and policies that authorized it.
  • A version identity that places this capsule in a lineage you can walk.
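
As a rough illustration of "nothing outside the manifest executes", the check below denies any step that is not explicitly declared. The `Manifest` fields, the host name, and the action names are hypothetical; the actual Zof manifest schema is not described in this article.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Manifest:
    """Hypothetical manifest shape; field names are illustrative only."""
    target_hosts: frozenset
    allowed_actions: frozenset
    environment: str
    max_requests: int


def is_permitted(manifest, action, url, environment, requests_so_far):
    """Deny by default: a step runs only if every declared constraint admits it."""
    host = urlsplit(url).hostname
    return (
        action in manifest.allowed_actions
        and host in manifest.target_hosts
        and environment == manifest.environment
        and requests_so_far < manifest.max_requests
    )


# Assumed example values, not taken from a real capsule.
manifest = Manifest(
    target_hosts=frozenset({"auth.staging.internal"}),
    allowed_actions=frozenset({"http_get", "http_post"}),
    environment="staging",
    max_requests=100,
)
```

The design choice worth noting is the allowlist: the function never enumerates what is forbidden, so an action nobody thought about is rejected rather than silently allowed.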

Three properties matter, and they are worth separating because teams tend to collapse them.

Immutability means the capsule cannot change after it is sealed. The manifest plus the content hashes form a fixed object; mutate any byte and the hashes no longer match. There is no "quick edit before the run." If you need a different test, you produce a new, separately approved capsule. That constraint is the point: it removes the entire class of "the script drifted from what we reviewed" failures.
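
A minimal sketch of the hash check behind that guarantee, assuming SHA-256 digests recorded per artifact at sealing time. The function names and the sample file path are illustrative, not Zof's API.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 digest of an artifact's bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def seal(artifacts: dict) -> dict:
    """Record a digest for every artifact at sealing time."""
    return {name: digest(data) for name, data in artifacts.items()}


def matches_seal(artifacts: dict, sealed: dict) -> bool:
    """True only if the artifact set and every byte are unchanged."""
    return artifacts.keys() == sealed.keys() and all(
        digest(data) == sealed[name] for name, data in artifacts.items()
    )


# Hypothetical capsule content: the bytes that were reviewed.
reviewed = {"tests/token_flow.py": b"assert issue_token().ok\n"}
sealed = seal(reviewed)

# A "quick edit before the run" changes the bytes, so the check fails.
tampered = {"tests/token_flow.py": b"assert True\n"}
```

Comparing the key sets as well as the digests matters: it catches an artifact that was added or dropped after sealing, not only one that was modified.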

Provenance means the capsule knows where it came from. It records the planning inputs, the System Graph context that scoped it, the policy version in force, and the approvers. You can answer "why does this capsule test this path" without reconstructing it from memory.

Signature means the capsule is cryptographically attested. Before anything runs, the receiving side verifies the signature against your trusted keys. A capsule that is unsigned, signed by an untrusted key, or altered after signing simply does not execute. The signature is the difference between "we think this is what we approved" and "we can prove it."
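
The three rejection cases (unsigned, untrusted key, altered after signing) can be sketched as follows. This is a stand-in under a loud assumption: it uses HMAC-SHA256 from the Python standard library purely to stay self-contained, whereas a real capsule pipeline would presumably use asymmetric signatures so the runner holds only public keys. The function names and the manifest shape are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional


def canonical_bytes(manifest: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()


def sign(manifest: dict, key: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 over the canonical manifest bytes."""
    return hmac.new(key, canonical_bytes(manifest), hashlib.sha256).hexdigest()


def verify(manifest: dict, signature: Optional[str], trusted_keys: list) -> bool:
    """Accept only a signature produced by one of the trusted keys."""
    if not signature:
        return False  # unsigned capsules never run
    return any(
        hmac.compare_digest(sign(manifest, key), signature)
        for key in trusted_keys
    )


# Assumed example values.
trusted_key = b"customer-held-key"
manifest = {"actions": ["http_get"], "environment": "staging"}
signature = sign(manifest, trusted_key)
```

Canonical serialization is the detail that is easy to miss: if signer and verifier serialize the same manifest with different key order or whitespace, a valid capsule fails verification.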

03

Splitting intelligence from execution

The capsule only makes sense alongside the deployment split it rides on. Zof separates *thinking* from *doing* across planes with deliberately different trust requirements, described in the secure-enclave deployment model.

Planning and generation, where the powerful models live, happen on the intelligence side: System Graph modeling, risk prioritization, test generation, capsule assembly. That side never executes against your protected systems. Governance sits in between: human approval, role-based controls, signing, policy checks, capsule versioning and promotion. Execution happens entirely inside your boundary on a customer-deployed runner that makes no outbound model calls at runtime.

The architectural payoff is specific. The plane that needs the most capable models touches the least sensitive data. The plane that touches identity-adjacent systems runs no models and is driven only by signed, frozen capsules. You get coordinated agent planning without putting inference on the critical path inside your most sensitive segment.

04

Crossing the boundary without weakening it

A signed capsule still has to enter the boundary, and an SRE's reflex here is correct: any inbound path is attack surface. So the Edge Runner deployment is built to require none.

The runner pulls. There is no listening port for an external orchestrator to reach in, no inbound rule to justify to your network team. The runner reaches out, fetches an approved capsule, and verifies its signature locally before doing anything else. Verification is the gate. If the signature does not validate against your trusted keys, or the content hashes do not match the manifest, the capsule is rejected at the edge and never runs.

That sequence (pull, verify, then execute) is what lets the control plane "reach into" the boundary without ever crossing it inbound. The boundary stays one-directional. The runner integrates with your existing segmentation and zero-trust posture instead of asking you to carve an exception, and execution stays on the private network with no inbound required.
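
The ordering of pull, verify, then execute can be sketched as a single runner cycle. The `fetch`, `verify`, and `execute` callables are hypothetical injection points, not the Edge Runner's real interface; the sketch only shows that execution is unreachable unless verification passes.

```python
from typing import Callable, Optional


def run_once(
    fetch: Callable[[], Optional[dict]],
    verify: Callable[[dict], bool],
    execute: Callable[[dict], dict],
) -> dict:
    """One runner cycle: outbound pull, local verification, then execution."""
    capsule = fetch()  # outbound request initiated by the runner
    if capsule is None:
        return {"status": "idle"}
    if not verify(capsule):  # the gate: nothing runs before this passes
        return {"status": "rejected", "capsule": capsule.get("id")}
    return {
        "status": "executed",
        "capsule": capsule.get("id"),
        "result": execute(capsule),
    }


# Record which capsules actually ran, to show the gate holds.
executed_ids = []


def record_and_run(capsule: dict) -> dict:
    executed_ids.append(capsule["id"])
    return {"ok": True}


rejected = run_once(lambda: {"id": "c1"}, lambda c: False, record_and_run)
accepted = run_once(lambda: {"id": "c2"}, lambda c: True, record_and_run)
```

Because the runner only ever calls `fetch`, there is no code path in which an external party pushes work in; the control plane can offer capsules but cannot start them.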

This is also where reachability earns its keep. Because the System Graph makes validation change-aware, capsules can be scoped to what is genuinely reachable and exploitable rather than the whole surface. Reachability-based prioritization can cut exploitable exposure by 70 to 90%. In practice that means fewer, sharper capsules to approve and run, not a firehose of low-value executions waiting on human sign-off.

05

Chain of custody for evidence

Now the part auditors actually care about. A capsule executes and produces evidence: results, screenshots, logs, telemetry. The chain of custody is the unbroken link from *approved intent* to *recorded outcome*, and the capsule is what makes that link verifiable end to end.

Walk it backward from a finding:

  1. A result references the run that produced it.
  2. The run references the exact capsule version that executed, by hash.
  3. The capsule references its manifest, its approval records, and the policy version in force.
  4. The approval references the human and role that authorized it.

Every hop is attestable. No step rests on "trust the orchestrator." For an identity or security system, this answers the regulator's real question (what ran, who approved it, can you prove it did nothing else) with cryptographic links rather than narrative.

Evidence handling is yours to govern. The runner produces a complete, redactable bundle locally, and how it leaves the boundary is a policy decision routed to Reliability Analytics: local-only for the highest sensitivity, sanitized egress with field masking, or metadata-only correlation IDs for dashboards. The default is not "exfiltrate everything." The default is "you decide what leaves," which keeps validation from creating a new data-residency problem on identity data.
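
One way to picture the three egress modes is a function that decides what, if anything, leaves the boundary. The mode names, the masking token, and the bundle fields are hypothetical; how Zof actually expresses this policy is not specified in the article.

```python
from enum import Enum
from typing import Optional


class EgressMode(Enum):
    LOCAL_ONLY = "local_only"
    SANITIZED = "sanitized"
    METADATA_ONLY = "metadata_only"


def prepare_egress(
    bundle: dict, mode: EgressMode, masked_fields: frozenset = frozenset()
) -> Optional[dict]:
    """Return the payload allowed to leave the boundary, or None for nothing."""
    if mode is EgressMode.LOCAL_ONLY:
        return None  # evidence stays on the runner
    if mode is EgressMode.METADATA_ONLY:
        return {"correlation_id": bundle["correlation_id"]}
    # SANITIZED: keep the structure, mask the configured fields.
    return {
        key: ("***" if key in masked_fields else value)
        for key, value in bundle.items()
    }


# Assumed example bundle.
bundle = {"correlation_id": "run-42", "status": "pass", "subject": "alice@example.com"}
sanitized = prepare_egress(bundle, EgressMode.SANITIZED, frozenset({"subject"}))
```

The original bundle is never mutated, so the complete local copy remains available for an auditor regardless of what was sent out.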

06

Governance is the engineering

The temptation with autonomy is to let the system close the loop on its own, including the fix. For identity and security infrastructure that is reckless, and the capsule model is what makes the disciplined alternative practical. Agents propose; humans authorize. Capsule promotion and any governed remediation pass through human approval and role-based governance before production impact, with separation of duties on production changes.
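
Separation of duties reduces to a small rule: whoever (or whatever) proposed the capsule cannot be the one who approves its promotion. A minimal sketch, with hypothetical role and identity names:

```python
def may_promote(author: str, approvals: list, required_role: str) -> bool:
    """Require an approval from someone other than the author, holding the required role."""
    return any(
        approval["approver"] != author and approval["role"] == required_role
        for approval in approvals
    )


# Assumed example: an agent proposed the capsule, a human release approver signs off.
approvals = [{"approver": "j.doe", "role": "release-approver"}]
```

Here `may_promote("planner-agent", approvals, "release-approver")` holds, while the same approvals with `author="j.doe"` do not, because self-approval is excluded.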

The capsule is what makes that governance enforceable rather than aspirational. Because the artifact is immutable and signed, "approved" means a specific set of bytes, not a vague intent that drifts between review and execution. Around 80% of developers admit to bypassing guardrails that slow them down; a signed-capsule pipeline is hard to route around precisely because the runner refuses anything that is not signed and in scope. That is autonomy a serious enterprise can defend. It also targets a meaningful slice of the roughly $2.41 trillion annual cost of poor software quality: the part that comes from changes nobody could vouch for.

07

What to do Monday morning

  • Pick one privileged path (a token issuance or auth flow) and model it in the System Graph so capsules are scoped to what changed.
  • Run the first capsule in local-only evidence mode. Prove the chain of custody with zero egress before deciding what should leave.
  • Verify the verification. Confirm that an unsigned or altered capsule is actually rejected at the edge. The guarantee is only real if the failure path works.
  • Wire approvals to existing change control rather than building a parallel process, and start with validation before granting any governed remediation.

08

The bottom line
