Incidents

Incident list, detail, and execution detail.

Incidents is the most heavily used section of the dashboard. It has three routes: the incident list, the incident detail, and the per-execution detail with its live log.

List

Route: /incidents
Role gating: none.

Searchable, filterable table of every incident in the tenant.

Columns

Column	Notes
Severity	`critical` / `high` / `medium` / `low` / `info`
Type	`service_down`, `disk_full`, `cpu_high`, `memory_high`, `port_unavailable`, `custom`, etc.
Server	Hostname; links to the server detail
Status	`open` → `classifying` → `recipe_proposed` → `awaiting_approval` → `executing` → `resolved` (or `failed` / `escalated`)
Occurrences	Dedup counter for repeat alerts
Assigned	User or agent name
Source	`daemon`, `webhook`, `manual`, `proactive`
Created	Absolute timestamp
Resolution timer	Live ticker; turns red on SLA breach
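
The Status lifecycle in the table can be read as a small state machine. The following is a minimal sketch, not the dashboard's implementation: the happy-path order comes from the Status column, while treating `resolved`, `failed`, and `escalated` as terminal and allowing `failed` / `escalated` from any earlier stage are assumptions. The name `allowed_next` is hypothetical.

```python
# Happy-path order as listed in the Status column.
HAPPY_PATH = [
    "open", "classifying", "recipe_proposed",
    "awaiting_approval", "executing", "resolved",
]
# Assumption: these three statuses are terminal.
TERMINAL = {"resolved", "failed", "escalated"}


def allowed_next(status: str) -> set[str]:
    """Return the statuses an incident may move to from `status`."""
    if status in TERMINAL:
        return set()
    if status not in HAPPY_PATH:
        raise ValueError(f"unknown status: {status}")
    following = HAPPY_PATH[HAPPY_PATH.index(status) + 1]
    # Assumption: any non-terminal stage may also end in failed or escalated.
    return {following, "failed", "escalated"}
```

For example, `allowed_next("awaiting_approval")` yields `executing` plus the two assumed exit statuses.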

Filters

Status, severity, source.
Free-text search across hostname, type, evidence.
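
To make the filter semantics concrete, here is a rough sketch of how the three filters and the free-text search could combine. It assumes exact matching for status, severity, and source, and a case-insensitive substring match for the search; the field names (`hostname`, `type`, `evidence`) and the function `matches` are illustrative, not the dashboard's actual schema or API.

```python
import json


def matches(incident: dict, *, status=None, severity=None,
            source=None, query=None) -> bool:
    """Return True if the incident passes every filter that is set."""
    for field, wanted in (("status", status),
                          ("severity", severity),
                          ("source", source)):
        if wanted is not None and incident.get(field) != wanted:
            return False
    if query:
        # Search across hostname, type, and the serialized evidence.
        haystack = " ".join([
            str(incident.get("hostname", "")),
            str(incident.get("type", "")),
            json.dumps(incident.get("evidence", {})),
        ]).lower()
        return query.lower() in haystack
    return True
```

Filters left as `None` are ignored, so an empty filter set matches every incident.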

Actions

Create an incident manually. A modal asks for a server, type, severity, and initial evidence. Setting the type to `custom` triggers an informational query: the agent runs the requested check fresh and resolves the incident with the output.
Delete an incident.
Click a row → incident detail.
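
The manual-create modal can be pictured as building a small record from its four fields. This is only a sketch under stated assumptions: the severity values come from the Columns table, but the key names, the `source` value of `manual`, and the `informational_query` flag are hypothetical stand-ins for whatever the dashboard actually sends.

```python
SEVERITIES = {"critical", "high", "medium", "low", "info"}


def build_manual_incident(server: str, incident_type: str,
                          severity: str, evidence: str) -> dict:
    """Validate the modal fields and return an illustrative incident record."""
    if not server or not incident_type:
        raise ValueError("server and type are required")
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    return {
        "server": server,
        "type": incident_type,
        "severity": severity,
        "evidence": evidence,
        # Assumption: manually created incidents carry the `manual` source.
        "source": "manual",
        # Hypothetical flag marking the `custom` informational-query path.
        "informational_query": incident_type == "custom",
    }
```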

Incident detail

Route: /incidents/{id}
Role gating: none for read; approving or rejecting an execution requires admin.

Header

Severity and status badges.
Live SLA timer.
Assignment control. The incident may be assigned to a human user or to an AI agent; the Assign Agent button starts the agent pipeline immediately.

Tabs

Timeline. The agent's full reasoning trace. Every stage handoff (triage → diagnose → execute → review), every tool call with its arguments and output, every event recorded by the agent. This is the audit trail for what the agent did and why.
Evidence. JSON dump of all evidence collected on the incident (monitor output, daemon report, alert payload, anything the agent observed).
Executions. List of recipe executions tied to this incident. Pending executions show Approve and Reject buttons; approval starts the playbook immediately. Each row links to the execution detail.
Report. Post-mortem RCA generated by the review agent on resolved incidents.
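
The approval gate on the Executions tab combines two conditions: the execution must be pending and the user must be an admin. A minimal sketch of that check follows; the status strings `pending`, `running`, and `rejected` and the function `decide` are hypothetical names chosen for illustration.

```python
def decide(role: str, execution_status: str, approve: bool) -> str:
    """Apply an approve/reject decision and return the new execution status."""
    if role != "admin":
        raise PermissionError("approve / reject requires admin")
    if execution_status != "pending":
        raise ValueError("only pending executions can be approved or rejected")
    # Approval starts the playbook immediately, so the execution moves
    # straight to a running state (status names are assumptions).
    return "running" if approve else "rejected"
```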

Review section

Resolved incidents show a Review Agent Performance panel that lets operators score the diagnosis and remediation quality. Reviews feed into agent learning over time.

Execution detail

Route: /executions/{id}
Role gating: none for read; rollback requires admin.

Reached from the Executions tab on an incident.

Sections

Metadata. Recipe, server, parent incident, who approved it, timestamps.
Live Output. WebSocket-fed stdout from the running playbook (/ws/executions/{id}).
Playbook Output. Per-Ansible-task summary with status (ok / changed / failed), expandable stdout/stderr, return code.
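
A client that wants to follow Live Output outside the dashboard first needs the WebSocket URL. The helper below is a sketch that derives it from the dashboard's base URL, assuming the socket is served from the same host with no path prefix and that `https` maps to `wss`; only the `/ws/executions/{id}` path comes from this page, and `execution_ws_url` is a hypothetical name.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def execution_ws_url(base_url: str, execution_id: str) -> str:
    """Build the live-output WebSocket URL for one execution."""
    parts = urlsplit(base_url)
    # Assumption: TLS dashboards expose the socket over wss, others over ws.
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = f"/ws/executions/{quote(str(execution_id), safe='')}"
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

Any authentication the socket requires is not covered on this page and is left out of the sketch.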

Actions

Rollback. Only enabled when the execution succeeded and the recipe defines a rollback playbook. Runs the rollback playbook against the same target.
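
The conditions for enabling Rollback can be written as a single predicate. This sketch combines the two conditions stated above with the admin requirement from the role gating; the `Execution` fields and the `succeeded` status string are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Execution:
    status: str                       # e.g. "succeeded" (assumed value)
    rollback_playbook: Optional[str]  # None when the recipe defines no rollback


def can_rollback(execution: Execution, role: str) -> bool:
    """Return True when the Rollback action should be enabled."""
    return (
        role == "admin"
        and execution.status == "succeeded"
        and bool(execution.rollback_playbook)
    )
```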

Related routes

servers — incidents are scoped to a server
recipes — executions reference a recipe
agents — assignments reference an agent
sla — the resolution timer compares against an SLA policy