Tutorial

Manage autonomous reliability in Console

SRE and EM workflow spanning runs, test health, releases, and remediation.

Overview

This tutorial walks engineering managers (EMs) and SREs through one reliability workflow across four Console areas: Home and Reports, Test Health, Releases, and Remediation.

Tutorial details

Audience: Engineering manager / SRE
Duration: 45 min
Prerequisites: Existing project with run history

Tutorial steps

Review reliability posture

Start with the Home metrics and the Reports overview.

Navigation: Console Home quick actions or the project wizard for new initiatives. Use ⌘K / Ctrl+K to jump to any surface.

Verification: Confirm organization and team context in the Console header before making changes.

Analyze Test Health

Identify flakiness and failure clusters.

Navigation: Console Home quick actions or the project wizard for new initiatives. Use ⌘K / Ctrl+K to jump to any surface.

Verification: Note project, run, or agent IDs if you may need support escalation.
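
To make "flakiness" and "failure clusters" concrete, the following is a minimal Python sketch that works on exported run history. The record shape (`test`, `commit`, `passed`, `error`) and both function names are assumptions for illustration, not a Zof API, and Console may define flakiness differently. Here a test counts as flaky if it both passed and failed on the same commit, and failures are clustered by error type.

```python
from collections import Counter, defaultdict

# Hypothetical run-history records; field names are assumptions,
# not a documented Zof export format.
HISTORY = [
    {"test": "checkout_total", "commit": "a1", "passed": True, "error": None},
    {"test": "checkout_total", "commit": "a1", "passed": False, "error": "TimeoutError"},
    {"test": "login_sso", "commit": "a1", "passed": False, "error": "AssertionError"},
    {"test": "login_sso", "commit": "a2", "passed": False, "error": "AssertionError"},
    {"test": "cart_add", "commit": "a2", "passed": True, "error": None},
]


def flaky_tests(history):
    """Return sorted names of tests that both passed and failed on one commit."""
    outcomes = defaultdict(set)
    for rec in history:
        outcomes[(rec["test"], rec["commit"])].add(rec["passed"])
    return sorted({test for (test, _), seen in outcomes.items() if seen == {True, False}})


def failure_clusters(history):
    """Count failures grouped by error type, most frequent first."""
    counts = Counter(rec["error"] for rec in history if not rec["passed"])
    return counts.most_common()


if __name__ == "__main__":
    print(flaky_tests(HISTORY))
    print(failure_clusters(HISTORY))
```

In this sample, `login_sso` fails consistently, so it is a real failure rather than a flaky test. Only `checkout_total` shows mixed outcomes on an unchanged commit.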

Evaluate release gates

Open Releases and check gate status before you ship.

Navigation: Console Home quick actions or the project wizard for new initiatives. Use ⌘K / Ctrl+K to jump to any surface.

Verification: Confirm UI state matches your runbook. Retry once on transient errors before opening a ticket.
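
As a rough mental model of what a release gate checks, here is a minimal Python sketch. The gate names, the `GateInput` type, and all three thresholds are hypothetical; actual gate policies are whatever your project configures in Console. The sample values mirror the Home view figures quoted on this page (94% pass, 84% coverage, 3 open critical issues).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInput:
    pass_rate: float    # fraction of passing tests, 0.0 to 1.0
    coverage: float     # coverage fraction, 0.0 to 1.0
    open_critical: int  # number of open critical issues


# Hypothetical thresholds, chosen only for illustration.
MIN_PASS_RATE = 0.90
MIN_COVERAGE = 0.80
MAX_OPEN_CRITICAL = 0


def evaluate_gates(m: GateInput) -> dict:
    """Return a pass/fail flag for each gate."""
    return {
        "pass_rate": m.pass_rate >= MIN_PASS_RATE,
        "coverage": m.coverage >= MIN_COVERAGE,
        "critical_issues": m.open_critical <= MAX_OPEN_CRITICAL,
    }


def can_ship(m: GateInput) -> bool:
    """A release may ship only if every gate passes."""
    return all(evaluate_gates(m).values())


if __name__ == "__main__":
    staging = GateInput(pass_rate=0.94, coverage=0.84, open_critical=3)
    print(evaluate_gates(staging))
    print(can_ship(staging))
```

Under these assumed thresholds the pass-rate and coverage gates pass, but the three open critical issues block the release. Reporting each gate separately shows which condition failed instead of a bare yes or no.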

Review remediation queue

Open Remediation and review pending approvals, if approval policies are enabled.

Navigation: Console Home quick actions or the project wizard for new initiatives. Use ⌘K / Ctrl+K to jump to any surface.

Verification: Confirm UI state matches your runbook. Retry once on transient errors before opening a ticket.
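
One way to picture the approval queue is as a filter over proposed remediations. The sketch below is an assumption-heavy illustration: the `Remediation` type, its `severity` and `status` values, and the `min_severity` cutoff are all hypothetical and do not describe how Zof policies work internally. It only shows the idea that nothing waits for approval unless policies are enabled.

```python
from dataclasses import dataclass

# Hypothetical severity ordering used for filtering and sorting.
RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Remediation:
    id: str
    severity: str  # "low", "medium", or "high"
    status: str    # "proposed", "approved", or "applied"


def awaiting_approval(items, policies_enabled, min_severity="medium"):
    """Return proposed items that need human sign-off, highest severity first."""
    if not policies_enabled:
        return []  # no approval policy, so nothing is queued for approval
    pending = [
        item for item in items
        if item.status == "proposed" and RANK[item.severity] >= RANK[min_severity]
    ]
    return sorted(pending, key=lambda item: RANK[item.severity], reverse=True)


if __name__ == "__main__":
    queue = [
        Remediation("r1", "low", "proposed"),
        Remediation("r2", "high", "proposed"),
        Remediation("r3", "medium", "approved"),
        Remediation("r4", "medium", "proposed"),
    ]
    for item in awaiting_approval(queue, policies_enabled=True):
        print(item.id, item.severity)
```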

Expected outcome

You understand the operational reliability workflow across the Console areas covered above: Home and Reports, Test Health, Releases, and Remediation.

After completing this tutorial

  • Capture run IDs and screenshots for your team runbook
  • Share learnings with QA, SRE, or platform stakeholders
  • Proceed to related how-to guides for operational hardening


01 · The operational surface

One surface for posture, operations, and what needs attention next.

The Zof home is not a marketing dashboard. It is the operational surface that engineering, QA, and SRE teams use every day: quality posture, in-flight runs, coverage by module, and the actions a leader should look at next.

OPERATIONAL KPIs

  • Runs
  • Coverage
  • Risk

Live across every environment you ship to.

WORK SPINE

  • Specs
  • Tests
  • Schedules

From specification to scheduled regression.

GUARDRAILS

  • RBAC
  • SSO
  • Audit

Every action attributable to a named human.

Figure: Zof AI home command center (Staging, live, /home) showing 12 runs at 94% pass, 3 open critical issues, 84% coverage, four module traceability bars, the specification pipeline, upcoming schedules, and recommended next actions with an active-runs sidebar.
Home view · Checkout Service · Staging · captured live from the product.
  • 01 · RUNS · 24H: 94% pass (12 runs across staging)
  • 02 · COVERAGE: 84% (across four modules)
  • 03 · ACTIVE RUNS: 3 running (live on this branch)
  • 04 · NEXT ACTIONS: Recommended (triage gaps, new spec)
