Deployment architecture

The Signed Capsule: How Immutable, Customer-Controlled Test Execution Actually Works

A technical deep-dive on Zof Edge Runner capsules: how signing, provenance, immutability, and chain-of-custody make test execution evidence you can defend.

Zof Reliability Team · Engineering and Product

May 6, 2025 · 8 min read · Updated May 6, 2025

Summary

When you run validation against an identity provider, a token service, or a key-management path, the test is itself a privileged action. It touches systems where "what executed, on whose authority, and can you prove it didn't do anything else" is not a nice-to-have. It is the audit. Most AI testing tools cannot answer that question, because the thing that ran was synthesized at runtime and is gone by the time anyone asks. This piece is about the artifact that fixes that: the signed capsule. It is the unit of governed work that Zof's Edge Runners admit into your boundary. If you operate reliability for identity or security infrastructure, the capsule is the part of the architecture worth understanding in detail, because it is what turns autonomous validation into evidence you can hand to an auditor without flinching.

Why ephemeral test execution fails audit

Start with the reachability problem an SRE actually faces. Roughly 41% of code is now AI-generated, and close to 45% of AI coding tasks introduce a critical flaw or security issue. The change rate and the per-change defect rate are rising at the same time. So you want more validation, more often, against more surface area.

The naive way to do that with agents is to let a model decide what to test at runtime and execute it on the spot. For an identity or security workload, that is exactly backwards. The system that decides what to run is now an unbounded actor inside your trust boundary. There is no stable artifact to review before it acts, no signature to verify, and no way to prove afterward that the execution stayed inside its intended scope. When the audit comes, you have logs of effects but no attestable description of intent.

The capsule inverts the order of operations. Work is assembled, reviewed, frozen, and signed *before* it can run, and only then admitted for execution. Intent becomes an artifact. Execution becomes the controlled replay of an approved artifact, not an open-ended session.

Anatomy of a signed capsule

A Zof capsule is an immutable, versioned package, not an ad hoc script. Concretely, it carries:

A constrained manifest that declares exactly what may run: target endpoints, allowed actions, fixtures, environment scope, and resource limits. Nothing outside the manifest executes.
Content hashes over every included artifact, so the bytes that run are provably the bytes that were reviewed.
Approval records binding the capsule to the humans and policies that authorized it.
A version identity that places this capsule in a lineage you can walk.
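
The four parts listed above can be pictured as a small data structure. This is a minimal sketch, not Zof's actual capsule schema: every class name, field name, and value here (`Manifest`, `Capsule`, `max_requests`, the example endpoint) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    """Hypothetical manifest: declares everything a capsule may touch."""
    endpoints: frozenset[str]
    actions: frozenset[str]
    fixtures: tuple[str, ...]
    environment: str
    max_requests: int  # stand-in for "resource limits"

    def permits(self, endpoint: str, action: str) -> bool:
        # Deny by default: only declared endpoints and actions may run.
        return endpoint in self.endpoints and action in self.actions


@dataclass(frozen=True)
class Capsule:
    version: str                    # position in the capsule lineage
    manifest: Manifest
    content_hashes: dict[str, str]  # artifact name -> SHA-256 hex digest
    approvals: tuple[str, ...]      # identities of the approvers


manifest = Manifest(
    endpoints=frozenset({"https://idp.internal.example/token"}),
    actions=frozenset({"issue_token", "introspect_token"}),
    fixtures=("test-client-credentials",),
    environment="staging",
    max_requests=50,
)
capsule = Capsule(
    version="2025.05.1",
    manifest=manifest,
    content_hashes={},
    approvals=("security-approver",),
)

print(manifest.permits("https://idp.internal.example/token", "issue_token"))
print(manifest.permits("https://idp.internal.example/token", "delete_user"))
```

The `frozen=True` flag only blocks attribute reassignment in Python; the stronger guarantee described in this article comes from the content hashes and the signature, not from the language.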

Three properties matter, and they are worth separating because teams tend to collapse them.

Immutability means the capsule cannot change after it is sealed. The manifest plus the content hashes form a fixed object; mutate any byte and the hashes no longer match. There is no "quick edit before the run." If you need a different test, you produce a new, separately approved capsule. That constraint is the point: it removes the entire class of "the script drifted from what we reviewed" failures.
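
To make "mutate any byte and the hashes no longer match" concrete, here is a small sketch using SHA-256 from Python's standard library. The choice of SHA-256, the function names `seal` and `mismatches`, and the artifact names are assumptions for illustration; the article does not specify which hash the capsule format uses.

```python
import hashlib


def seal(artifacts: dict[str, bytes]) -> dict[str, str]:
    """Record a SHA-256 digest for every artifact at sealing time."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in artifacts.items()}


def mismatches(artifacts: dict[str, bytes], sealed: dict[str, str]) -> list[str]:
    """Return artifact names that are missing, extra, or differ from the sealed digests."""
    current = seal(artifacts)
    names = sorted(set(current) | set(sealed))
    return [name for name in names if current.get(name) != sealed.get(name)]


# Hypothetical capsule contents as they were reviewed.
reviewed = {
    "tests/token_flow.py": b"assert issue_token().ok",
    "fixtures/client.json": b'{"id": "test"}',
}
sealed = seal(reviewed)

# The "quick edit before the run" that immutability is meant to rule out.
tampered = dict(reviewed)
tampered["tests/token_flow.py"] = b"assert issue_token().ok  # quick edit"

print(mismatches(reviewed, sealed))
print(mismatches(tampered, sealed))
```

Comparing the full set of names, not just the sealed ones, means an added file is flagged as well as a modified one.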

Provenance means the capsule knows where it came from. It records the planning inputs, the System Graph context that scoped it, the policy version in force, and the approvers. You can answer "why does this capsule test this path" without reconstructing it from memory.

Signature means the capsule is cryptographically attested. Before anything runs, the receiving side verifies the signature against your trusted keys. A capsule that is unsigned, signed by an untrusted key, or altered after signing simply does not execute. The signature is the difference between "we think this is what we approved" and "we can prove it."
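
The three rejection cases (unsigned, untrusted key, altered after signing) can be sketched as a single check. One loud assumption: a real deployment would presumably use asymmetric signatures so the runner holds only public keys, but Python's standard library has no asymmetric signing, so this sketch substitutes HMAC-SHA256 with a shared secret as a stand-in. The function names, key IDs, and capsule fields are all hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional


def canonical_bytes(capsule: dict) -> bytes:
    # Deterministic serialization so signer and verifier see identical bytes.
    return json.dumps(capsule, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(capsule: dict, key: bytes) -> str:
    # HMAC-SHA256 stands in for a real asymmetric signature scheme.
    return hmac.new(key, canonical_bytes(capsule), hashlib.sha256).hexdigest()


def verify(capsule: dict, signature: Optional[str], key_id: str,
           trusted_keys: dict[str, bytes]) -> bool:
    key = trusted_keys.get(key_id)
    if key is None or not signature:
        return False  # untrusted key or unsigned capsule
    expected = sign(capsule, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), signature.encode("utf-8"))


trusted_keys = {"release-key-1": b"demo-secret-not-for-production"}
capsule = {"version": "v7", "manifest": {"actions": ["issue_token"]}}
signature = sign(capsule, trusted_keys["release-key-1"])
altered = {**capsule, "manifest": {"actions": ["issue_token", "delete_user"]}}

print(verify(capsule, signature, "release-key-1", trusted_keys))
print(verify(altered, signature, "release-key-1", trusted_keys))
```

All three failure modes return the same `False`, so the caller has exactly one branch to handle: do not execute.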

Splitting intelligence from execution

The capsule only makes sense alongside the deployment split it rides on. Zof separates *thinking* from *doing* across planes with deliberately different trust requirements, described in the secure-enclave deployment model.

Planning and generation, where the powerful models live, happen on the intelligence side: System Graph modeling, risk prioritization, test generation, capsule assembly. That side never executes against your protected systems. Governance sits in between: human approval, role-based controls, signing, policy checks, capsule versioning and promotion. Execution happens entirely inside your boundary on a customer-deployed runner that makes no outbound model calls at runtime.

The architectural payoff is specific. The plane that needs the most capable models touches the least sensitive data. The plane that touches identity-adjacent systems runs no models and is driven only by signed, frozen capsules. You get coordinated agent planning without putting inference on the critical path inside your most sensitive segment.

Crossing the boundary without weakening it

A signed capsule still has to enter the boundary, and an SRE's reflex here is correct: any inbound path is attack surface. So the Edge Runner deployment is built to require none.

The runner pulls. There is no listening port for an external orchestrator to reach in, no inbound rule to justify to your network team. The runner reaches out, fetches an approved capsule, and verifies its signature locally before doing anything else. Verification is the gate. If the signature does not validate against your trusted keys, or the content hashes do not match the manifest, the capsule is rejected at the edge and never runs.

That sequence (pull, verify, then execute) is what lets the control plane "reach into" the boundary without ever crossing it inbound. The boundary stays one-directional. The runner integrates with your existing segmentation and zero-trust posture instead of asking you to carve an exception, and execution stays on the private network with no inbound required.
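
The ordering matters more than any single check, so it is worth sketching as control flow. This is an illustrative outline only: `run_once`, `Outcome`, and the injected callables are hypothetical names, and the real Edge Runner interface is not described in this article.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    executed: bool
    reason: str


def run_once(
    pull: Callable[[], Optional[dict]],
    signature_ok: Callable[[dict], bool],
    hashes_ok: Callable[[dict], bool],
    execute: Callable[[dict], None],
) -> Outcome:
    """One runner cycle: pull, verify, and only then execute."""
    capsule = pull()  # outbound fetch initiated by the runner, never an inbound push
    if capsule is None:
        return Outcome(False, "nothing approved to run")
    if not signature_ok(capsule):
        return Outcome(False, "rejected: signature")
    if not hashes_ok(capsule):
        return Outcome(False, "rejected: content hashes")
    execute(capsule)  # reachable only after both gates pass
    return Outcome(True, "executed")


executed: list[dict] = []
rejected = run_once(lambda: {"version": "v1"}, lambda c: False, lambda c: True, executed.append)
accepted = run_once(lambda: {"version": "v2"}, lambda c: True, lambda c: True, executed.append)

print(rejected.reason, accepted.reason, executed)
```

Passing the checks in as callables keeps the gate order testable on its own, independent of how signatures or hashes are actually computed.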

This is also where reachability earns its keep. Because the System Graph makes validation change-aware, capsules can be scoped to what is genuinely reachable and exploitable rather than the whole surface. Reachability-based prioritization can cut exploitable exposure by 70 to 90%, which leaves fewer, sharper capsules to approve and run, not a firehose of low-value executions waiting on human sign-off.

Chain of custody for evidence

Now the part auditors actually care about. A capsule executes and produces evidence: results, screenshots, logs, telemetry. The chain of custody is the unbroken link from *approved intent* to *recorded outcome*, and the capsule is what makes that link verifiable end to end.

Walk it backward from a finding:

A result references the run that produced it.
The run references the exact capsule version that executed, by hash.
The capsule references its manifest, its approval records, and the policy version in force.
The approval references the human and role that authorized it.

Every hop is attestable. No step rests on "trust the orchestrator." For an identity or security system, this answers the regulator's real question (what ran, who approved it, can you prove it did nothing else) with cryptographic links rather than narrative.
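
The backward walk above can be sketched as a chain of lookups in which any missing link is an error, not a gap to explain away. The record shapes, IDs, and the shortened hash are hypothetical; this shows the traversal, not Zof's evidence schema.

```python
def walk_custody(result_id: str, results: dict, runs: dict,
                 capsules: dict, approvals: dict) -> list[str]:
    """Walk result -> run -> capsule -> approval; raise KeyError on a broken link."""
    run_id = results[result_id]["run_id"]
    capsule_hash = runs[run_id]["capsule_hash"]
    capsule = capsules[capsule_hash]
    approval = approvals[capsule["approval_id"]]
    return [
        f"result {result_id}",
        f"run {run_id}",
        f"capsule {capsule['version']} ({capsule_hash})",
        f"policy {capsule['policy_version']}",
        f"approved by {approval['approver']} as {approval['role']}",
    ]


# Hypothetical records; the hash is shortened for readability.
results = {"finding-17": {"run_id": "run-204"}}
runs = {"run-204": {"capsule_hash": "sha256:9f2c"}}
capsules = {
    "sha256:9f2c": {"version": "v7", "policy_version": "policy-12", "approval_id": "appr-88"},
}
approvals = {"appr-88": {"approver": "j.doe", "role": "security-approver"}}

trail = walk_custody("finding-17", results, runs, capsules, approvals)
for hop in trail:
    print(hop)
```

Keying capsules by hash instead of by name is what ties the run to specific bytes: a different capsule body would sit under a different key.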

Evidence handling is yours to govern. The runner produces a complete, redactable bundle locally, and how it leaves the boundary is a policy decision routed to Reliability Analytics: local-only for the highest sensitivity, sanitized egress with field masking, or metadata-only correlation IDs for dashboards. The default is not "exfiltrate everything." The default is "you decide what leaves," which keeps validation from creating a new data-residency problem on identity data.
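
The three egress options can be read as one policy function over the evidence bundle. This is a minimal sketch under stated assumptions: the mode strings, the `egress_view` name, the `MASK` placeholder, and the flat bundle shape are invented for illustration, and a real bundle would likely be nested and need recursive masking.

```python
from typing import Optional

MASK = "***"


def egress_view(bundle: dict, mode: str,
                masked_fields: frozenset[str] = frozenset()) -> Optional[dict]:
    """Return what may leave the boundary under the given mode, or None for nothing."""
    if mode == "local-only":
        return None  # evidence never leaves the boundary
    if mode == "metadata-only":
        return {"correlation_id": bundle["correlation_id"]}
    if mode == "sanitized":
        return {k: (MASK if k in masked_fields else v) for k, v in bundle.items()}
    # Fail closed: an unrecognized mode must not fall through to full egress.
    raise ValueError(f"unknown egress mode: {mode}")


bundle = {"correlation_id": "run-204", "subject": "alice@example.com", "status": "pass"}

print(egress_view(bundle, "local-only"))
print(egress_view(bundle, "metadata-only"))
print(egress_view(bundle, "sanitized", frozenset({"subject"})))
```

Raising on an unknown mode mirrors the article's default: nothing leaves unless a policy explicitly says it may.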

Governance is the engineering

The temptation with autonomy is to let the system close the loop on its own, including the fix. For identity and security infrastructure that is reckless, and the capsule model is what makes the disciplined alternative practical. Agents propose; humans authorize. Capsule promotion and any governed remediation pass through human approval and role-based governance before production impact, with separation of duties on production changes.

The capsule is what makes that governance enforceable rather than aspirational. Because the artifact is immutable and signed, "approved" means a specific set of bytes, not a vague intent that drifts between review and execution. Around 80% of developers admit to bypassing guardrails that slow them down; a signed-capsule pipeline is hard to route around precisely because the runner refuses anything that is not signed and in scope. That is autonomy a serious enterprise can defend, and it targets a meaningful slice of the roughly $2.41 trillion annual cost of poor software quality: the part that comes from changes nobody could vouch for.

What to do Monday morning

Pick one privileged path (a token issuance or auth flow) and model it in the System Graph so capsules are scoped to what changed.
Run the first capsule in local-only evidence mode. Prove the chain of custody with zero egress before deciding what should leave.
Verify the verification. Confirm that an unsigned or altered capsule is actually rejected at the edge. The guarantee is only real if the failure path works.
Wire approvals to existing change control rather than building a parallel process, and start with validation before granting any governed remediation.

The bottom line
