Bringing autonomous reliability to secure enclaves
Brain-outside, execution-inside architectures for regulated enterprises.
Why banks and regulated enterprises cannot use normal SaaS testing tools
Security review starts with three questions: where does test data live, who can reach the execution environment, and what leaves the network. Multi-tenant SaaS that ingests production-like data answers all three wrong, even when the vendor is reputable and well-funded.
Autonomy raises the stakes. A passive tool stores data; an agent observes, decides, and acts inside your systems. Without boundary-aware design, autonomous reliability stops being an asset and becomes an unbounded liability your CISO has to underwrite. This is why the secure enclave is a deployment model, not a configuration flag.
The architecture principle: brain outside, execution inside
Intelligence and orchestration run in a control plane your security team can assess. Test and remediation execution run inside your enclave, private cloud, or on-prem footprint, where data never crosses an unapproved boundary.
The split is deliberate. The model that plans work and the runtime that touches sensitive systems are different trust zones with different controls. The System Graph and the governance layer live outside; the workload stays inside.
Secure enclave pattern
Control plane (policy, graph, orchestration)
| signed work packages only
v
Customer enclave: Edge Runners + local evidence
| sanitized egress
v
Aggregated telemetry (no raw customer data)
Signed test capsules
Work sent to enclave runners arrives as signed capsules: scoped commands, timeouts, allowed endpoints, and data classification labels. Runners reject anything unsigned or out of policy, so the control plane cannot push work the enclave has not agreed to accept.
The capsule is the contract. It is also the audit record: every action an agent takes inside your perimeter traces back to a signed, scoped instruction rather than an opaque model decision.
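As a rough illustration of the verify-then-check-policy step a runner might perform, here is a minimal sketch. Everything in it is an assumption made for illustration: the capsule field names (`endpoints`, `timeout_seconds`, `data_label`), the policy constants, and the use of a shared-key HMAC. A production design would more likely use asymmetric signatures so the enclave holds only a public verification key.

```python
import hashlib
import hmac
import json
from typing import Tuple
from urllib.parse import urlparse

# Hypothetical enclave-side policy; real values would come from the customer's configuration.
ALLOWED_HOSTS = {"payments.internal.example", "ledger.internal.example"}
ALLOWED_LABELS = {"public", "internal", "confidential"}
MAX_TIMEOUT_SECONDS = 900


def canonical_bytes(capsule: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    return json.dumps(capsule, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_capsule(capsule: dict, key: bytes) -> str:
    """Control-plane side: HMAC-SHA256 stands in for a real signature scheme."""
    return hmac.new(key, canonical_bytes(capsule), hashlib.sha256).hexdigest()


def verify_capsule(capsule: dict, signature: str, key: bytes) -> Tuple[bool, str]:
    """Runner side: reject anything unsigned, tampered with, or out of policy."""
    if not hmac.compare_digest(sign_capsule(capsule, key), signature):
        return (False, "bad signature")
    timeout = capsule.get("timeout_seconds")
    if not isinstance(timeout, int) or not 0 < timeout <= MAX_TIMEOUT_SECONDS:
        return (False, "timeout out of policy")
    if capsule.get("data_label") not in ALLOWED_LABELS:
        return (False, "data label out of policy")
    endpoints = capsule.get("endpoints") or []
    if not endpoints:
        return (False, "no endpoints declared")
    for url in endpoints:
        if urlparse(url).hostname not in ALLOWED_HOSTS:
            return (False, f"endpoint not allowed: {url}")
    return (True, "accepted")
```

The signature is checked before any policy field is read, so a tampered capsule is rejected without its contents being interpreted. The returned reason string is the kind of value that could be written to the audit record.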
Local edge runners
Edge Runners execute capsules against internal URLs, desktop clients, and private APIs that never resolve outside your network. They stream artifacts to local evidence stores, not to arbitrary vendor buckets.
Runners are the only component with reach into sensitive systems, which makes them the right place to concentrate isolation, least-privilege identity, and network policy.
Customer-controlled transfer boundary
Customers define what may egress: pass/fail summaries, redacted traces, content hashes, or nothing at all. Transfer policy is enforced at the boundary and recorded, so what left and under which rule is always answerable.
Default-deny is the correct posture. The enclave should treat egress as an exception that policy explicitly permits, not a convenience that operations quietly enables.
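A default-deny boundary that records every decision can be sketched in a few lines. The artifact kinds, rule identifiers, and the `EgressPolicy` class below are hypothetical names chosen for illustration, not part of any product API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class EgressPolicy:
    # Hypothetical rule table: artifact kind -> id of the rule that permits it.
    allow_rules: Dict[str, str] = field(default_factory=dict)
    log: List[dict] = field(default_factory=list)

    def decide(self, kind: str, destination: str) -> bool:
        """Allow only kinds with an explicit rule; record every decision either way."""
        rule = self.allow_rules.get(kind)
        allowed = rule is not None
        self.log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "kind": kind,
            "destination": destination,
            "allowed": allowed,
            "rule": rule if allowed else "default-deny",
        })
        return allowed
```

Because the denial path is the fallthrough rather than a listed rule, a new artifact type cannot leave the enclave until someone adds a rule for it, and the log answers "what left and under which rule" for both outcomes.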
Local evidence stores
Screenshots, HAR files, traces, and logs remain in customer-controlled storage by default. Reviewers reach evidence through the security tooling they already operate, not a vendor console outside the perimeter.
Evidence stays where the data does. That keeps retention, access, and discovery inside the controls your auditors have already approved.
Sanitized egress
When telemetry leaves the enclave, it is minimized and scrubbed before it crosses the boundary. The goal is operational visibility (fleet health, run status, aggregate pass rates) without exfiltrating sensitive payloads.
Sanitization is performed inside your perimeter, on the runner side, so the control plane never sees raw customer data even momentarily.
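The sketch below shows one way runner-side minimization could look: raw results are reduced to counts, a redacted and truncated failure message, and a content hash of the evidence. The result fields (`test`, `status`, `message`, `payload`) and the two regular expressions are assumptions for illustration; pattern-based redaction like this is far from a complete data-loss-prevention control.

```python
import hashlib
import re
from typing import List

# Illustrative patterns only; real redaction rules would be defined by the customer.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LONG_DIGITS = re.compile(r"\b\d{12,19}\b")  # card-like or account-like numbers


def redact(text: str) -> str:
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return LONG_DIGITS.sub("[REDACTED_NUMBER]", text)


def summarize_run(results: List[dict]) -> dict:
    """Build the only object allowed to egress: aggregates, redacted text, hashes."""
    total = len(results)
    passed = sum(1 for r in results if r["status"] == "pass")
    failures = [
        {
            "test": r["test"],
            "message": redact(r.get("message", ""))[:200],
            # The hash lets the control plane reference evidence without seeing it.
            "evidence_sha256": hashlib.sha256(r.get("payload", b"")).hexdigest(),
        }
        for r in results
        if r["status"] != "pass"
    ]
    return {
        "total": total,
        "passed": passed,
        "pass_rate": round(passed / total, 3) if total else None,
        "failures": failures,
    }
```

The raw `payload` bytes are hashed and then dropped, so the summary carries a verifiable pointer to local evidence rather than the evidence itself.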
PAM and secrets
Runners integrate with privileged access management and secret vaults using short-lived credentials, with no long-lived keys held in vendor SaaS. Secrets are injected at execution time and never appear in agent prompts, capsule payloads, or external logs.
This matters because ~80% of developers bypass security policy under delivery pressure, per our analysis. Governed runners remove the path of least resistance: there is no convenient place to paste a credential where it can leak.
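To make "short-lived and never logged" concrete, here is a small sketch of a credential lease that expires and keeps its secret out of its own string representation. The `Lease` class and `run_with_lease` helper are hypothetical; a real runner would obtain the lease from the customer's PAM or vault product through that product's own client.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Lease:
    # repr=False keeps the secret out of logs and tracebacks that print the object.
    secret: str = field(repr=False)
    expires_at: float

    def reveal(self, now: Optional[float] = None) -> str:
        """Return the secret only while the lease is still valid."""
        current = time.monotonic() if now is None else now
        if current >= self.expires_at:
            raise PermissionError("credential lease expired")
        return self.secret


def run_with_lease(lease: Lease, action: Callable[[str], str]) -> str:
    """Inject the secret at execution time; it is never stored in the capsule."""
    return action(lease.reveal())
```

The secret exists only as an argument to the action at the moment of execution, which is the property the paragraph above describes.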
What one governed run looks like inside the perimeter
Concretely, a single run proceeds in stages, each crossing a boundary on purpose. The control plane plans the work from the System Graph and emits a signed capsule. An Edge Runner inside your enclave validates the signature, pulls short-lived credentials from your vault, and executes against internal endpoints. Artifacts land in your local evidence store; only a sanitized summary egresses.
One run, boundary by boundary
- Plan: control plane selects affected services from the System Graph and scopes the work
- Sign: work is packaged as a signed capsule with endpoints, timeouts, and data labels
- Verify: the Edge Runner rejects the capsule unless signature and policy match
- Execute: the runner authenticates via PAM with short-lived credentials, inside your network
- Record: screenshots, traces, and logs are written to your local evidence store
- Egress: a minimized, scrubbed summary leaves; raw payloads never do
- Authorize: any proposed fix routes to Remediation Fleets for staging and human approval before a PR
For a deeper, end-to-end view of how a fleet plans, executes, and produces evidence, see "Inside a Zof run". The enclave changes where each step happens, not what governed autonomy does.
Auditability
Every claim above has to be answerable after the fact. A defensible enclave produces a contiguous record from capsule to evidence to egress, with no gap a reviewer has to take on trust.
- Who published each capsule and under whose identity
- What executed in which environment, and against which endpoints
- What evidence was produced and where it resides
- What egress occurred and under which policy
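One common way to make such a record tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below assumes that technique for illustration; the source does not state how the record is implemented, and the event fields are hypothetical.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev: str, event: dict) -> str:
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: List[dict], event: dict) -> dict:
    """Append an event that commits to the hash of the previous record."""
    prev = chain[-1]["hash"] if chain else GENESIS
    record = {"prev": prev, "event": event, "hash": _digest(prev, event)}
    chain.append(record)
    return record


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered record breaks it."""
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev:
            return False
        if record["hash"] != _digest(record["prev"], record["event"]):
            return False
        prev = record["hash"]
    return True
```

Appending one event per stage (capsule published, capsule executed, evidence written, egress decided) would give a reviewer a single sequence to verify instead of four logs to reconcile.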
The objection: does the boundary cripple autonomy?
A fair challenge from engineering leaders is that a hard execution boundary will starve the agents of context and slow everything down. In practice it does not, because the boundary separates intelligence from data, not intelligence from outcomes. The control plane plans from the System Graph and from sanitized telemetry; it does not need raw payloads to decide what to validate next.
The enclave does not make autonomy weaker. It makes autonomy something a regulated buyer can actually authorize.
Deployment models
The right boundary depends on your residency and sovereignty posture. The control plane stays assessable in every model; what moves is where execution and evidence live.
| Model | Best for | Tradeoff |
|---|---|---|
| SaaS control + enclave execution | Regulated hybrid | Requires runner ops |
| Private cloud control plane | Strict data residency | Higher infra ownership |
| Full on-prem | Air-gapped or sovereign | Longer rollout |
How to evaluate vendors
Ask for reference architectures, data-flow diagrams, and named failure modes, not marketing claims. Then validate runner isolation, capsule signing, egress policy, and evidence retention in your own environment before you trust any of it. The reliability patterns differ by sector, and the right baseline for a bank is not the right baseline for healthcare; "reliability patterns by industry" is a useful frame for that conversation.
Procurement checklist: verify, do not trust
- Show the execution boundary in one diagram, including where raw data can and cannot travel
- Demonstrate a runner rejecting an unsigned or out-of-policy capsule
- Prove egress is default-deny and every transfer is logged with its governing policy
- Confirm evidence stays in customer-controlled storage by default, with your retention rules
- Confirm PAM integration with short-lived credentials and no long-lived vendor-held keys
- Provide SOC 2 Type II coverage and GDPR controls mapped to the data flow above
Final takeaway
Autonomous reliability can run inside secure enclaves when the architecture respects the separation of intelligence and execution. Regulated buyers, especially in financial services, should demand this by default and verify it in their own environment, not accept it as a bespoke project.
If the boundary is real, the agents propose and your people authorize, with every action signed, scoped, and recorded. That is the version of autonomy a security review can sign.
Frequently asked questions
- By default, no. Test and remediation execution run on Edge Runners inside your enclave, private cloud, or on-prem footprint, and evidence such as screenshots, traces, and logs stays in customer-controlled storage. Only telemetry you explicitly permit egresses, and it is minimized and scrubbed on the runner side before it crosses the boundary. Egress is default-deny: every transfer is enforced and logged against the policy that allowed it.
