Remediation
Governed remediation with policy-driven approval and audit trails.
Overview
Remediation in the Zof Console proposes and executes fixes for detected reliability issues within enterprise-defined policy. The workflow spans detection from runs and Test Health, fix proposal generation, human authorization, controlled apply, and post-fix verification, all captured in audit trails.
Governed remediation bridges validation and engineering action without sacrificing control. Autonomous proposal generation accelerates triage; explicit approval gates ensure no fix reaches production systems without authorized human consent.
Remediation integrates with source control and issue tracking when configured, so fix proposals align with existing developer workflows and change management practices.
Who should read this
- SREs, senior engineers, QA leads, and release managers participating in fix approval and verification.
Prerequisites
- A failure or health signal that makes the issue eligible for remediation, typically from runs or Test Health
- An organization remediation policy configured with approver roles and scope limits
- Integrations with source control or ticketing systems, if automated apply is enabled
When to use this workflow
- Onboarding new team members to Zof terminology and workflows
- Authoring internal runbooks aligned with Console labels
- Designing CI/CD or webhook integrations against documented behavior
Step-by-step procedure
Detect and qualify issues
Identify failures or clusters in Test Health linked to actionable root causes.
Confirm the issue is eligible for remediation under your organization policy. Some categories may require manual engineering only.
Associate the issue with affected applications, services, and owning teams.
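The qualification step can be pictured as a simple policy check. The sketch below is illustrative only: `RemediationPolicy`, `Issue`, and every field name are hypothetical and do not reflect the actual Zof policy schema. It assumes a policy that lists manual-only categories and the environments where remediation is permitted.

```python
from dataclasses import dataclass, field


@dataclass
class RemediationPolicy:
    # Hypothetical policy shape, not the real Zof schema.
    manual_only_categories: set = field(default_factory=set)
    allowed_environments: set = field(default_factory=set)


@dataclass
class Issue:
    # Hypothetical issue record carrying the ownership context from this step.
    category: str
    environment: str
    application: str
    owning_team: str


def is_eligible(issue: Issue, policy: RemediationPolicy) -> bool:
    """Return True when the issue may enter the remediation workflow."""
    if issue.category in policy.manual_only_categories:
        return False  # policy reserves this category for manual engineering
    return issue.environment in policy.allowed_environments


policy = RemediationPolicy(
    manual_only_categories={"data-migration"},
    allowed_environments={"staging"},
)
flaky = Issue("flaky-test", "staging", "checkout", "payments-team")
migration = Issue("data-migration", "staging", "ledger", "finance-team")

print(is_eligible(flaky, policy))      # True
print(is_eligible(migration, policy))  # False
```

Keeping the application and owning team on the issue record means later routing and audit steps do not need to look them up again.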
Generate remediation proposal
Initiate remediation from the failure context or Remediation area in Governance.
Review the proposed fix description, affected files or configuration, and expected outcome.
Validate that the proposal scope matches the diagnosed issue without unrelated changes.
Route for human authorization
Submit the proposal to the approvers designated by policy, such as team leads, release managers, or security reviewers.
Approvers evaluate blast radius, test evidence, and alignment with change windows.
Reject or request revision when proposals are incomplete, overly broad, or lack verification plans.
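As a rough illustration of the routing and rejection criteria above, the sketch below pre-screens a proposal and excludes its initiator from the approver pool. The `Proposal` type, the `APPROVERS` mapping, and the `MAX_FILES` limit are all assumptions made for the example; the Console's actual routing logic is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    # Hypothetical proposal record.
    proposal_id: str
    initiator: str
    files_changed: int
    has_verification_plan: bool


# Assumed role-to-user mapping and scope limit, for illustration only.
APPROVERS = {"release-manager": ["dana"], "team-lead": ["arjun", "mei"]}
MAX_FILES = 5


def review_blockers(p: Proposal) -> list:
    """List reasons an approver would reject or request revision."""
    blockers = []
    if p.files_changed > MAX_FILES:
        blockers.append("scope too broad")
    if not p.has_verification_plan:
        blockers.append("missing verification plan")
    return blockers


def eligible_approvers(p: Proposal, role: str) -> list:
    """Exclude the initiator so nobody approves their own proposal."""
    return [user for user in APPROVERS.get(role, []) if user != p.initiator]


proposal = Proposal("rem-1", "arjun", files_changed=2, has_verification_plan=True)
print(eligible_approvers(proposal, "team-lead"))  # ['mei']
print(review_blockers(proposal))                  # []
```

Running the blocker check before routing keeps obviously incomplete proposals out of approvers' queues.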
Apply under policy constraints
Upon approval, execute the apply step through configured integrations or documented manual procedures.
Monitor apply progress and capture errors for rollback consideration.
Ensure apply actions are logged with approver identity and timestamp in audit records.
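When the apply step runs through a documented manual procedure, teams may want to keep their own record alongside the Console audit trail. The following is a minimal sketch of such a record, assuming a JSON Lines file; the field names and the `audit_record` and `append_audit` helpers are hypothetical and are not a Zof export format.

```python
import json
from datetime import datetime, timezone


def audit_record(proposal_id, approver, action, outcome, now=None):
    """Build one audit entry with approver identity and a UTC timestamp."""
    ts = now or datetime.now(timezone.utc)
    return {
        "proposal_id": proposal_id,
        "approver": approver,
        "action": action,
        "outcome": outcome,
        "timestamp": ts.isoformat(),
    }


def append_audit(path, record):
    """Append the entry as one JSON line; earlier entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


record = audit_record(
    "rem-1", "mei", "apply", "succeeded",
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(record["timestamp"])  # 2024-01-01T00:00:00+00:00
```

An append-only file is a deliberate choice here: it preserves the order of apply attempts, including failed ones that inform rollback decisions.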
Verify fix effectiveness
Re-run targeted suites that originally failed to confirm resolution.
Compare results against pre-remediation baselines in run detail and Test Health.
Mark remediation complete only after verification passes. If it does not, document the residual risk and obtain approver acknowledgment before closing.
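The verification logic can be sketched as a comparison of per-test status before and after the fix. This example assumes results are available as plain dictionaries mapping test name to `"passed"` or `"failed"`; that shape and the function names are assumptions for illustration, not the format of run detail or Test Health.

```python
def verify_fix(baseline: dict, rerun: dict) -> dict:
    """Compare per-test status before and after remediation."""
    originally_failed = {t for t, s in baseline.items() if s == "failed"}
    # A test that failed before and is missing from the re-run counts as unresolved.
    still_failing = sorted(t for t in originally_failed if rerun.get(t) != "passed")
    # A test that passed before and fails now is a regression introduced by the fix.
    regressions = sorted(
        t for t, s in baseline.items() if s == "passed" and rerun.get(t) == "failed"
    )
    return {"still_failing": still_failing, "regressions": regressions}


def can_close(result: dict, residual_risk_acknowledged: bool = False) -> bool:
    """Allow closing on a clean result, or when an approver accepted the residual risk."""
    clean = not result["still_failing"] and not result["regressions"]
    return clean or residual_risk_acknowledged


baseline = {"test_login": "failed", "test_cart": "passed"}
fixed = verify_fix(baseline, {"test_login": "passed", "test_cart": "passed"})
unfixed = verify_fix(baseline, {"test_login": "failed", "test_cart": "passed"})

print(can_close(fixed))    # True
print(can_close(unfixed))  # False
```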
Close loop and document
Link remediation records to change tickets and release notes for stakeholder visibility.
Update runbooks if the failure mode reveals gaps in detection or response procedures.
Export audit evidence when remediation supports compliance or customer incident reports.
Key concepts
- Organization scope: All Zof Console and API operations are isolated to your authenticated tenant.
- Governed execution: Agent output and remediation follow policy packs with human approval when configured.
Best practices
- Separate approver roles from proposal initiators to enforce four-eyes principles.
- Limit automated apply scope to non-production environments until confidence and policy mature.
- Always require verification runs before closing remediation items affecting release gates.
- Review rejected proposals for patterns indicating test or generation quality issues upstream.
- Include remediation metrics (time to propose, time to approve, and verification pass rate) in reliability reviews.
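The three metrics named in the last practice can be computed from a handful of timestamps per remediation item. The sketch below assumes each record carries `detected`, `proposed`, and `approved` times plus a `verified` flag; these field names are hypothetical, and the sample data is invented purely to show the arithmetic.

```python
from datetime import datetime
from statistics import median


def hours_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def remediation_metrics(records: list) -> dict:
    """Summarize time to propose, time to approve, and verification pass rate."""
    return {
        "median_hours_to_propose": median(
            hours_between(r["detected"], r["proposed"]) for r in records
        ),
        "median_hours_to_approve": median(
            hours_between(r["proposed"], r["approved"]) for r in records
        ),
        "verification_pass_rate": sum(r["verified"] for r in records) / len(records),
    }


# Invented sample records, for illustration only.
records = [
    {
        "detected": datetime(2024, 1, 1, 9),
        "proposed": datetime(2024, 1, 1, 10),
        "approved": datetime(2024, 1, 1, 12),
        "verified": True,
    },
    {
        "detected": datetime(2024, 1, 2, 9),
        "proposed": datetime(2024, 1, 2, 12),
        "approved": datetime(2024, 1, 2, 13),
        "verified": False,
    },
]

metrics = remediation_metrics(records)
print(metrics["median_hours_to_propose"])  # 2.0
print(metrics["median_hours_to_approve"])  # 1.5
print(metrics["verification_pass_rate"])   # 0.5
```

Medians are used rather than means so that a single proposal stuck in review does not dominate the summary.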
Common issues
- Proposal scope too broad: Reject and request a focused fix. Broad proposals increase review burden and rollback risk.
- Apply succeeds but verification fails: Treat as incomplete remediation. Roll back if policy requires, re-diagnose root cause, and avoid closing the item prematurely.
- Approvers unavailable during release window: Define backup approvers in policy and document escalation paths in on-call runbooks before critical releases.