Does any of our production-like test data leave our network?

By default, no. Test and remediation execution run on Edge Runners inside your enclave, private cloud, or on-prem footprint, and evidence such as screenshots, traces, and logs stays in customer-controlled storage. Only telemetry you explicitly permit egresses, and it is minimized and scrubbed on the runner side before it crosses the boundary. Egress is default-deny: every transfer is enforced and logged against the policy that allowed it.

What can the vendor control plane actually see or do inside our perimeter?

The control plane plans work from the System Graph and sanitized telemetry, then emits signed capsules with scoped commands, allowed endpoints, timeouts, and data classification labels. It cannot push work the enclave has not agreed to accept, because runners reject anything unsigned or out of policy. It never receives raw customer payloads, and it cannot reach internal systems directly; only the runner inside your network executes, using short-lived credentials from your PAM and secret vaults.

How do we prove to auditors what an agent did?

Every action traces back to a signed, scoped capsule rather than an opaque model decision. You can answer who published each capsule and under whose identity, what executed in which environment and against which endpoints, what evidence was produced and where it resides, and what egress occurred under which policy. Because evidence stays in your storage, retention and access remain inside controls your auditors have already approved. Zof maintains SOC 2 Type II and GDPR controls mapped to this data flow.

Which deployment model fits a strict data-residency or air-gapped requirement?

For regulated hybrid environments, a SaaS control plane with enclave execution keeps intelligence assessable while execution stays inside your boundary; the tradeoff is operating the runners. For strict residency, a private cloud control plane gives you more ownership at higher infrastructure cost. For air-gapped or sovereign requirements, a full on-prem deployment removes the dependency entirely at the cost of a longer rollout. The separation of intelligence from execution holds in all three.

Deployment architecture

Bringing autonomous reliability to secure enclaves

Brain-outside, execution-inside architectures for regulated enterprises.


Zof Reliability Team · Engineering and product

May 9, 2026 · 12 min read · Updated May 19, 2026

Why banks and regulated enterprises cannot use normal SaaS testing tools

Security review starts with three questions: where does test data live, who can reach the execution environment, and what leaves the network. Multi-tenant SaaS that ingests production-like data answers all three wrong, even when the vendor is reputable and well-funded.

Autonomy raises the stakes. A passive tool stores data; an agent observes, decides, and acts inside your systems. Without boundary-aware design, autonomous reliability stops being an asset and becomes an unbounded liability your CISO has to underwrite. This is why the secure enclave is a deployment model, not a configuration flag.

The architecture principle: brain outside, execution inside

Intelligence and orchestration run in a control plane your security team can assess. Test and remediation execution run inside your enclave, private cloud, or on-prem footprint, where data never crosses an unapproved boundary.

The split is deliberate. The model that plans work and the runtime that touches sensitive systems are different trust zones with different controls. The System Graph and the governance layer live outside; the workload stays inside.

Secure enclave pattern

Control plane (policy, graph, orchestration)
        | signed work packages only
        v
Customer enclave: Edge Runners + local evidence
        | sanitized egress
        v
Aggregated telemetry (no raw customer data)

Signed test capsules

Work sent to enclave runners arrives as signed capsules: scoped commands, timeouts, allowed endpoints, and data classification labels. Runners reject anything unsigned or out of policy, so the control plane cannot push work the enclave has not agreed to accept.

The capsule is the contract. It is also the audit record: every action an agent takes inside your perimeter traces back to a signed, scoped instruction rather than an opaque model decision.
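The accept-or-reject decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Zof's implementation: the `Capsule` fields, the `ENCLAVE_POLICY` values, and the internal hostnames are hypothetical, and HMAC-SHA256 with a shared key stands in for whatever signature scheme a real deployment uses (an asymmetric scheme would be the more likely choice, so that the enclave holds only a public key).

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Capsule:
    # Hypothetical fields mirroring the prose; not an actual capsule schema.
    commands: Tuple[str, ...]
    allowed_endpoints: Tuple[str, ...]
    timeout_seconds: int
    data_label: str


def canonical_bytes(capsule: Capsule) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    body = {
        "commands": list(capsule.commands),
        "allowed_endpoints": list(capsule.allowed_endpoints),
        "timeout_seconds": capsule.timeout_seconds,
        "data_label": capsule.data_label,
    }
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(capsule: Capsule, key: bytes) -> str:
    # Assumption: HMAC-SHA256 stands in for the real signature scheme.
    return hmac.new(key, canonical_bytes(capsule), hashlib.sha256).hexdigest()


# Assumed enclave-side policy: what this runner has agreed to accept.
ENCLAVE_POLICY = {
    "endpoints": {"https://ledger.internal/api", "https://payments.internal/health"},
    "labels": {"synthetic", "masked"},
    "max_timeout_seconds": 300,
}


def accept(capsule: Capsule, signature: Optional[str], key: bytes) -> Tuple[bool, str]:
    """Return (accepted, reason); unsigned or out-of-policy work is rejected."""
    if not signature:
        return False, "unsigned"
    if not hmac.compare_digest(signature, sign(capsule, key)):
        return False, "bad signature"
    if not set(capsule.allowed_endpoints) <= ENCLAVE_POLICY["endpoints"]:
        return False, "endpoint out of policy"
    if capsule.data_label not in ENCLAVE_POLICY["labels"]:
        return False, "data label out of policy"
    if capsule.timeout_seconds > ENCLAVE_POLICY["max_timeout_seconds"]:
        return False, "timeout out of policy"
    return True, "accepted"
```

The signature check runs before any policy check, so a capsule whose contents were altered in transit is rejected without its contents being interpreted at all.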

Local edge runners

Edge Runners execute capsules against internal URLs, desktop clients, and private APIs that never resolve outside your network. They stream artifacts to local evidence stores, not to arbitrary vendor buckets.

Runners are the only component with reach into sensitive systems, which makes them the right place to concentrate isolation, least-privilege identity, and network policy.

Customer-controlled transfer boundary

Customers define what may egress: pass/fail summaries, redacted traces, content hashes, or nothing at all. Transfer policy is enforced at the boundary and recorded, so what left and under which rule is always answerable.

Default-deny is the correct posture. The enclave should treat egress as an exception that policy explicitly permits, not a convenience that operations quietly enables.
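A default-deny boundary with a decision log might look like the sketch below. The `EgressGate` class, the artifact kinds, and the rule identifiers are hypothetical names chosen for illustration. The point is the shape: no rule means no transfer, and every decision is recorded with the rule that governed it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class EgressGate:
    # Maps an artifact kind to the ID of the customer rule that permits it.
    allow_rules: Dict[str, str] = field(default_factory=dict)
    log: List[dict] = field(default_factory=list)

    def request(self, kind: str, size_bytes: int) -> bool:
        """Allow a transfer only if a rule names this kind; log either way."""
        rule_id = self.allow_rules.get(kind)  # None means no rule, so deny
        allowed = rule_id is not None
        self.log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "kind": kind,
            "size_bytes": size_bytes,
            "decision": "allow" if allowed else "deny",
            "rule": rule_id,
        })
        return allowed


# Usage: the customer permits pass/fail summaries and nothing else.
gate = EgressGate(allow_rules={"pass_fail_summary": "rule-001"})
summary_allowed = gate.request("pass_fail_summary", 120)
har_allowed = gate.request("raw_har", 48_000)
```

An empty `allow_rules` mapping is the "nothing at all" case: every request is denied and still logged.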

Local evidence stores

Screenshots, HAR files, traces, and logs remain in customer-controlled storage by default. Reviewers reach evidence through the security tooling they already operate, not a vendor console outside the perimeter.

Evidence stays where the data does. That keeps retention, access, and discovery inside the controls your auditors have already approved.
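As a rough sketch of what "evidence stays local" can mean in practice, the function below writes an artifact to a local root and returns a receipt that holds only a content hash and a size. The function name, directory layout, and receipt fields are assumptions for illustration; a temporary directory stands in for customer-controlled storage.

```python
import hashlib
import tempfile
from pathlib import Path


def store_evidence(root: Path, run_id: str, name: str, data: bytes) -> dict:
    """Write an artifact under the local evidence root and return a receipt."""
    target = root / run_id / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    # The receipt carries a hash and a size, never the artifact bytes.
    return {
        "run_id": run_id,
        "name": name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
    }


# Usage: a temporary directory stands in for customer-controlled storage.
with tempfile.TemporaryDirectory() as tmp:
    receipt = store_evidence(Path(tmp), "run-001", "trace.log", b"GET /internal 200")
    stored = (Path(tmp) / "run-001" / "trace.log").read_bytes()
```

A receipt of this kind lets a reviewer later confirm that a stored artifact is the one a run produced, without the artifact itself ever leaving the store.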

Sanitized egress

When telemetry leaves the enclave, it is minimized and scrubbed before it crosses the boundary. The goal is operational visibility (fleet health, run status, aggregate pass rates) without exfiltrating sensitive payloads.

Sanitization is performed inside your perimeter, on the runner side, so the control plane never sees raw customer data even momentarily.
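Minimizing and scrubbing can be pictured as two passes over each telemetry event: keep only allow-listed fields, then mask anything sensitive in the values that remain. The sketch below is illustrative only. The field names in `ALLOWED_FIELDS` and the single email pattern are assumptions, and a real scrubber would cover many more data types.

```python
import re

# Assumed allow-list: the only fields permitted to leave the enclave.
ALLOWED_FIELDS = {"run_id", "status", "duration_ms", "tests_passed", "tests_total"}

# A deliberately simple email pattern, used here only as an example of scrubbing.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def sanitize(event: dict) -> dict:
    """Minimize by dropping unlisted fields, then scrub emails from string values."""
    cleaned = {}
    for key, value in event.items():
        if key not in ALLOWED_FIELDS:
            continue  # minimization: unlisted fields never leave
        if isinstance(value, str):
            value = EMAIL.sub("[redacted]", value)  # scrubbing
        cleaned[key] = value
    return cleaned


# Usage: the raw response body is dropped and the email in the status is masked.
raw_event = {
    "run_id": "run-001",
    "status": "failed for alice@example.com",
    "tests_passed": 41,
    "tests_total": 42,
    "response_body": "{\"account\": \"12345678\"}",
}
outbound = sanitize(raw_event)
```

An allow-list fails closed: a field nobody anticipated is dropped by default rather than leaked by default.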

PAM and secrets

Runners integrate with privileged access management and secret vaults using short-lived credentials, with no long-lived keys held in vendor SaaS. Secrets are injected at execution time and never appear in agent prompts, capsule payloads, or external logs.

This matters because, per our analysis, roughly 80% of developers bypass security policy under delivery pressure. Governed runners remove the path of least resistance: there is no convenient place to paste a credential where it can leak.
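The short-lived credential pattern can be sketched as a lease that refuses to hand out its secret after expiry. Everything here is hypothetical: `issue_lease` stands in for a call to your vault or PAM system, and the role name and 60-second TTL are placeholders.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lease:
    role: str
    secret: str
    expires_at: float

    def value(self, now: Optional[float] = None) -> str:
        """Return the secret only while the lease is still valid."""
        current = time.monotonic() if now is None else now
        if current >= self.expires_at:
            raise PermissionError(f"lease for {self.role} expired")
        return self.secret


def issue_lease(role: str, ttl_seconds: float, now: Optional[float] = None) -> Lease:
    # Hypothetical stand-in for a call to the customer's vault or PAM system.
    start = time.monotonic() if now is None else now
    return Lease(role=role, secret=secrets.token_urlsafe(16), expires_at=start + ttl_seconds)


def execute(command: str, role: str) -> dict:
    """Fetch a credential at execution time and return a secret-free run record."""
    lease = issue_lease(role, ttl_seconds=60)
    token = lease.value()  # injected at execution time, held only in memory
    authenticated = len(token) > 0  # a real runner would pass the token to an internal client
    # The run record names the role, never the secret itself.
    return {"command": command, "role": role, "authenticated": authenticated}
```

Because the record returned by `execute` carries only the role name, a leaked log reveals which identity was used but nothing that can be replayed.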

What one governed run looks like inside the perimeter

Concretely, a single run proceeds in stages, each crossing a boundary on purpose. The control plane plans the work from the System Graph and emits a signed capsule. An Edge Runner inside your enclave validates the signature, pulls short-lived credentials from your vault, and executes against internal endpoints. Artifacts land in your local evidence store; only a sanitized summary egresses.

One run, boundary by boundary

Plan: control plane selects affected services from the System Graph and scopes the work
Sign: work is packaged as a signed capsule with endpoints, timeouts, and data labels
Verify: the Edge Runner rejects the capsule unless signature and policy match
Execute: the runner authenticates via PAM with short-lived credentials, inside your network
Record: screenshots, traces, and logs are written to your local evidence store
Egress: a minimized, scrubbed summary leaves; raw payloads never do
Authorize: any proposed fix routes to Remediation Fleets for staging and human approval before a PR
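The stages above form a strict sequence in which each stage gates the next. A minimal sketch of that gating, assuming hypothetical stage handlers that return `True` on success, is shown below. A missing handler is treated as a failure, in keeping with the default-deny posture.

```python
from typing import Callable, Dict, List

# Stage names taken from the list above, in order.
STAGES = ["plan", "sign", "verify", "execute", "record", "egress", "authorize"]


def run(handlers: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run stages in order and stop at the first one that fails or is missing."""
    completed = []
    for stage in STAGES:
        handler = handlers.get(stage)
        if handler is None or not handler():
            break  # nothing downstream of a failed stage runs
        completed.append(stage)
    return completed


# Usage: a capsule that fails verification never reaches execution.
all_pass = {stage: (lambda: True) for stage in STAGES}
rejected = dict(all_pass, verify=lambda: False)
happy_path = run(all_pass)
rejected_path = run(rejected)
```

With these handlers, `happy_path` contains all seven stages and `rejected_path` stops after `plan` and `sign`.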

For a deeper, end-to-end view of how a fleet plans, executes, and produces evidence, see "Inside a Zof run." The enclave changes where each step happens, not what governed autonomy does.

Auditability

Every claim above has to be answerable after the fact. A defensible enclave produces a contiguous record from capsule to evidence to egress, with no gap a reviewer has to take on trust.

Who published each capsule and under whose identity
What executed in which environment, and against which endpoints
What evidence was produced and where it resides
What egress occurred and under which policy
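One common way to make such a record contiguous and tamper-evident is a hash chain, in which each entry commits to the one before it. The source does not say that Zof uses this technique; the sketch below is an assumption chosen to illustrate the idea, and the entry fields are hypothetical.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, entry: dict) -> str:
    body = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: List[dict], entry: dict) -> None:
    """Add an entry whose hash covers both its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"entry": entry, "prev": prev_hash, "hash": _digest(prev_hash, entry)})


def verify(chain: List[dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["entry"]):
            return False
        prev_hash = record["hash"]
    return True


# Usage: one run's trail from capsule to evidence to egress.
trail: List[dict] = []
append(trail, {"type": "capsule", "publisher": "planner@example", "capsule_id": "cap-001"})
append(trail, {"type": "evidence", "capsule_id": "cap-001", "location": "evidence/run-001"})
append(trail, {"type": "egress", "capsule_id": "cap-001", "rule": "rule-001"})
intact = verify(trail)
```

Each of the four questions above maps to a field a reviewer can read directly from the trail, and `verify` shows whether anything was changed after the fact.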

The objection: does the boundary cripple autonomy?

A fair challenge from engineering leaders is that a hard execution boundary will starve the agents of context and slow everything down. In practice it does not, because the boundary separates intelligence from data, not intelligence from outcomes. The control plane plans from the System Graph and from sanitized telemetry; it does not need raw payloads to decide what to validate next.

The enclave does not make autonomy weaker. It makes autonomy something a regulated buyer can actually authorize.

Deployment models

The right boundary depends on your residency and sovereignty posture. The control plane stays assessable in every model; what moves is where execution and evidence live.

Model	Best for	Tradeoff
SaaS control + enclave execution	Regulated hybrid	Requires runner ops
Private cloud control plane	Strict data residency	Higher infra ownership
Full on-prem	Air-gapped or sovereign	Longer rollout
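Read as a decision rule, the table reduces to two questions asked in order. The function below is a simplified sketch of that reading. The parameter names are assumptions, and a real selection would weigh more factors than these two flags.

```python
def choose_model(air_gapped_or_sovereign: bool, strict_residency: bool) -> str:
    """Pick the deployment model from the table, strictest requirement first."""
    if air_gapped_or_sovereign:
        return "Full on-prem"
    if strict_residency:
        return "Private cloud control plane"
    return "SaaS control + enclave execution"


# Usage: a regulated hybrid environment with neither strict requirement.
hybrid_choice = choose_model(air_gapped_or_sovereign=False, strict_residency=False)
```

The air-gapped check comes first because it is the stricter constraint: an environment that is both air-gapped and residency-bound still needs the full on-prem model.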

How to evaluate vendors

Ask for reference architectures, data-flow diagrams, and named failure modes, not marketing claims. Then validate runner isolation, capsule signing, egress policy, and evidence retention in your own environment before you trust any of it. Reliability patterns differ by sector, and the right baseline for a bank is not the right baseline for healthcare; our guide to reliability patterns by industry is a useful frame for that conversation.

Procurement checklist: verify, do not trust

Show the execution boundary in one diagram, including where raw data can and cannot travel
Demonstrate a runner rejecting an unsigned or out-of-policy capsule
Prove egress is default-deny and every transfer is logged with its governing policy
Confirm evidence stays in customer-controlled storage by default, with your retention rules
Confirm PAM integration with short-lived credentials and no long-lived vendor-held keys
Provide SOC 2 Type II coverage and GDPR controls mapped to the data flow above

Final takeaway

Autonomous reliability can run inside secure enclaves when the architecture respects the separation of intelligence and execution. Regulated buyers, especially in financial services, should demand this by default and verify it in their own environment, not accept it as a bespoke project.

If the boundary is real, the agents propose and your people authorize, with every action signed, scoped, and recorded. That is the version of autonomy a security review can sign.

