Overview

Agent Console is the operational surface inside the Zof Console for managing cloud agents, endpoint agents, registered applications, active executions, and fleet telemetry. It provides the unified view SRE and QA platform teams need when validation runs at enterprise scale.

From Agent Console, operators monitor fleet health, diagnose connectivity issues, launch targeted executions, and correlate agent behavior with run outcomes. Telemetry includes heartbeat status, queue depth, resource utilization, and execution timelines.

Agent Console is not the same thing as Zof Console. Zof Console is the full product control plane; Agent Console is a dedicated area within Automation focused on execution infrastructure.

Who should read this

Platform engineers, QA operations leads, and SREs managing validation fleet health and execution routing.

Prerequisites

Role with permission to access Automation → Agent Console
At least one cloud agent pool or endpoint agent registered in your organization
Applications registered with environment and authentication metadata

When to use this workflow

Onboarding new team members to Zof terminology and workflows
Authoring internal runbooks aligned with Console labels
Designing CI/CD or webhook integrations against documented behavior

Step-by-step procedure

Open Agent Console overview

Navigate to Automation → Agent Console in the Zof Console.

Review fleet summary metrics: online agents, queued jobs, recent failures, and regional distribution.

Identify agents requiring attention based on offline status or elevated error rates.
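The triage rule in this step (offline status or elevated error rate) can be sketched in a few lines of Python. This is only an illustration: the `AgentSummary` record, its field names, and the 5% default limit are assumptions made for the example, not a documented Zof schema or recommended threshold.

```python
from dataclasses import dataclass


@dataclass
class AgentSummary:
    # Hypothetical record shape; field names are illustrative, not a Zof schema.
    name: str
    online: bool
    jobs_total: int
    jobs_failed: int


def needs_attention(agent: AgentSummary, max_error_rate: float = 0.05) -> bool:
    """Flag an agent that is offline or whose failure ratio exceeds the limit."""
    if not agent.online:
        return True
    if agent.jobs_total == 0:
        return False  # no jobs yet, so there is no error rate to evaluate
    return agent.jobs_failed / agent.jobs_total > max_error_rate


# Sample data invented for the example.
fleet = [
    AgentSummary("cloud-a", True, 200, 4),      # 2% failures
    AgentSummary("cloud-b", True, 100, 12),     # 12% failures
    AgentSummary("endpoint-c", False, 50, 0),   # offline
]
flagged = [agent.name for agent in fleet if needs_attention(agent)]
```

With the sample data, `flagged` contains `cloud-b` (error rate above the limit) and `endpoint-c` (offline). Replace the default limit with whatever your runbook defines as elevated.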

Inspect agent inventory

Open the Agents tab to list cloud and endpoint agents with status, labels, and capabilities.

Filter by environment, region, or label to isolate staging versus production pools.

Drill into individual agent records for version, last heartbeat, and assigned workloads.
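If you keep a local copy of the inventory (for example, in a runbook script), the same environment, region, and label filters can be approximated as below. The row layout and key names are assumptions chosen to mirror the columns described above; they are not an official export format.

```python
# Hypothetical inventory rows; keys are illustrative, not an official export format.
AGENTS = [
    {"name": "cloud-eu-1", "environment": "staging", "region": "eu-west",
     "labels": {"browser", "linux"}},
    {"name": "cloud-us-1", "environment": "production", "region": "us-east",
     "labels": {"browser"}},
    {"name": "endpoint-7", "environment": "production", "region": "eu-west",
     "labels": {"windows"}},
]


def filter_agents(agents, environment=None, region=None, label=None):
    """Return agents matching every filter that was supplied (None means 'any')."""
    matches = []
    for agent in agents:
        if environment is not None and agent["environment"] != environment:
            continue
        if region is not None and agent["region"] != region:
            continue
        if label is not None and label not in agent["labels"]:
            continue
        matches.append(agent)
    return matches


production = filter_agents(AGENTS, environment="production")
eu_browser = filter_agents(AGENTS, region="eu-west", label="browser")
```

Filters combine with AND semantics, so each added filter narrows the result. This is useful when isolating a single production pool.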

Manage applications and endpoints

Review registered applications and endpoint applications linked to execution targets.

Confirm authentication configuration and environment URLs match current deployment state.

Update metadata when applications migrate regions or change access patterns.
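One way to confirm that registered environment URLs still match the deployment state is to compare them after light normalization. The sketch below assumes you have both sets of URLs available as plain dictionaries keyed by environment name; the dictionaries, the helper names, and the example hosts are all hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> tuple:
    """Reduce a URL to the parts that matter for comparison."""
    parts = urlsplit(url)
    # hostname is lowercased by urlsplit; trailing slashes are ignored.
    return (parts.scheme, parts.hostname, parts.port, parts.path.rstrip("/"))


def find_url_drift(registered: dict, deployed: dict) -> dict:
    """Map each drifted environment to its (registered, deployed) URL pair."""
    drift = {}
    for env, registered_url in registered.items():
        deployed_url = deployed.get(env)
        if deployed_url is None or normalize(registered_url) != normalize(deployed_url):
            drift[env] = (registered_url, deployed_url)
    return drift


# Example values invented for illustration.
registered = {
    "staging": "https://staging.example.com/",
    "production": "https://app.example.com",
}
deployed = {
    "staging": "https://staging.example.com",
    "production": "https://app-eu.example.com",
}
drift = find_url_drift(registered, deployed)
```

Here only `production` is reported, because the staging URLs differ by a trailing slash alone. An environment missing from the deployed set is also treated as drift.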

Launch or monitor executions

Start executions from Agent Console when operators need direct fleet control outside standard runs.

Track active executions for progress, agent assignment, and intermediate telemetry events.

Cross-reference execution IDs with Runs for unified result analysis and stakeholder reporting.
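Cross-referencing executions with Runs amounts to a join on the run identifier. The following sketch assumes each execution record carries an optional `run_id` field and that run records are available in a dictionary keyed by run ID; both shapes and all identifiers are made up for the example.

```python
# Hypothetical records; the field names are assumptions for illustration.
executions = [
    {"execution_id": "exec-101", "run_id": "run-9"},
    {"execution_id": "exec-102", "run_id": None},       # ad hoc launch, no run context
    {"execution_id": "exec-103", "run_id": "run-404"},  # run ID not present in Runs
]
runs = {"run-9": {"status": "passed"}}


def correlate(executions, runs):
    """Split executions into those linked to a known run and those that are not."""
    linked, unlinked = {}, []
    for execution in executions:
        run = runs.get(execution["run_id"])
        if run is None:
            unlinked.append(execution["execution_id"])
        else:
            linked[execution["execution_id"]] = run["status"]
    return linked, unlinked


linked, unlinked = correlate(executions, runs)
```

The `unlinked` list is the set of executions that would need manual correlation before they can appear in stakeholder reports.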

Analyze telemetry

Open the Telemetry area for time-series views of agent health, queue latency, and error categories.

Set baseline thresholds aligned with your on-call runbooks and SLO definitions.

Export or screenshot telemetry snapshots for post-incident reviews when required by governance.
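Baseline thresholds are easier to reason about when written down as data. The sketch below checks exported telemetry samples against a threshold table and reports a metric only when its most recent samples all exceed the limit, which avoids alerting on a single spike. The metric names, limits, and three-sample window are assumptions, not Zof defaults.

```python
# Hypothetical thresholds; real values should come from your SLO definitions.
THRESHOLDS = {"queue_latency_seconds": 30.0, "error_rate": 0.05}


def sustained_breaches(samples, thresholds, window=3):
    """Return metrics whose most recent `window` samples all exceed the threshold."""
    breached = []
    for metric, limit in thresholds.items():
        recent = samples.get(metric, [])[-window:]
        if len(recent) == window and all(value > limit for value in recent):
            breached.append(metric)
    return breached


# Sample series invented for the example, oldest value first.
samples = {
    "queue_latency_seconds": [12.0, 41.0, 38.5, 44.2],
    "error_rate": [0.01, 0.09, 0.02, 0.01],
}
alerts = sustained_breaches(samples, THRESHOLDS)
```

In this data only `queue_latency_seconds` is reported: the error rate spiked once but recovered within the window.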

Respond to fleet incidents

For offline agents, verify network policy, host status, and agent service health before re-registration.

For stuck queues, identify policy bottlenecks or exhausted pools and escalate per capacity runbooks.

Document remediation actions in your incident tracker and link associated run IDs for traceability.

Key concepts

Organization scope: All Zof Console and API operations are isolated to your authenticated tenant.
Governed execution: Agent output and remediation follow policy packs, and require human approval where that is configured.

Best practices

Maintain separate Agent Console bookmarks or dashboards for staging and production fleets.
Integrate Agent Console telemetry checks into pre-release runbooks for critical deployments.
Assign fleet ownership to a named team with on-call rotation during release windows.
Avoid manual execution overrides that bypass policy; use documented break-glass procedures only.
Review endpoint agent host inventories quarterly for decommissioned machines still registered.

Common issues

Heartbeat gaps without execution failures: Transient network blips may cause brief offline states. Investigate sustained gaps exceeding your SLO threshold; single-minute gaps may self-resolve.
Executions complete in Agent Console but are missing from Runs: Verify the execution was linked to a project run context. Ad hoc launches may require manual correlation using execution and run identifiers.
Telemetry charts empty for new agents: Allow one full execution cycle for metrics to populate. Confirm the agent completed at least one job after registration.
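For the heartbeat-gap issue above, the distinction between a transient blip and a sustained gap can be made explicit with a small check over heartbeat timestamps. This is a sketch under assumptions: it presumes you can obtain heartbeat times as `datetime` values, and the five-minute threshold is a placeholder for your own SLO.

```python
from datetime import datetime, timedelta


def sustained_gaps(heartbeats, threshold):
    """Return (earlier, later) pairs where consecutive heartbeats exceed threshold."""
    ordered = sorted(heartbeats)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later))
    return gaps


# Timestamps invented for the example: minutes past 12:00.
start = datetime(2024, 1, 1, 12, 0)
beats = [start + timedelta(minutes=m) for m in (0, 1, 2, 4, 5, 15, 16)]
gaps = sustained_gaps(beats, timedelta(minutes=5))
```

The two-minute gap between minutes 2 and 4 is ignored, while the ten-minute gap between minutes 5 and 15 is reported for investigation.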