System Status — Reference

The System Status surface is the internal-operator dashboard for platform health. It surfaces uptime indicators, per-service health checks, recent incidents, and the bridge to the externally published status page. Anything that answers the "is this part of SGEN healthy right now?" question for an operator traces back to a view managed here.

This page is a reference for platform engineers and integrators who need to understand the surface before extending it, scripting against it, or wiring it into an internal dashboard. Customer-facing how-tos live in the customer docs set; this page describes the shape of the surface, not the steps to drive it.


Overview

System Status lives under the System Status module in SG-Admin. The module renders three primary views: the overview dashboard (per-service traffic-light grid), the incident timeline (recent and active incidents with timestamps and operator notes), and the configuration page (status-page sync settings, notification routes). It also exposes a small set of write operations: open an incident, update an incident, close an incident, mute a service, and unmute a service.

The module reads from two upstream sources. Service health is consumed from the platform's internal health-check signal — a periodic probe of each service that returns a structured pass / degraded / fail verdict. Recent incidents are consumed from the incident store, which is appended to by operators during recovery work and by the automated incident-detection layer.

A second surface — status-page sync — bridges System Status to the externally published status page. When sync is enabled, incident lifecycle events (open, update, close) are mirrored to the public status page, with operator-supplied summaries softened to customer-facing language before publication.

Where it lives in SG-Admin:

  • Sidebar: SG-Admin → System Status
  • URL prefix: /sg-admin/system-status/
  • View templates: application/views/Admin/SystemStatus/

The module is operator-facing. Customer-facing status updates flow through the published status page, not through this surface directly.

┌──────────────────────────────────────────────────────────────────────┐
│ SG-Admin → System Status                          Last refresh: 0:32 │
├──────────────────────────────────────────────────────────────────────┤
│ Service               Status      Last check   Uptime (30d)          │
│ ───────────────────   ─────────   ───────────  ────────────────      │
│ Admin (SG-Admin)      ● healthy   0:31 ago     99.98%                │
│ Public site renderer  ● healthy   0:30 ago     99.99%                │
│ Builder (SG-Builder)  ● healthy   0:32 ago     99.95%                │
│ Media pipeline        ◐ degraded  0:29 ago     99.71%   ← active     │
│ Backups worker        ● healthy   0:30 ago     99.99%                │
│ Email delivery        ● healthy   0:31 ago     99.92%                │
│ Activity Log ingest   ● healthy   0:32 ago     99.99%                │
│                                                                      │
│ Active incident: Media pipeline — increased latency    [Open ticket] │
└──────────────────────────────────────────────────────────────────────┘

Actions

The System Status surface exposes the following operations. Each is described by what it does to the data, not by its internal method name.

Read overview

Returns the current health verdict for every monitored service, the last-check timestamp per service, and the rolling uptime percentages for 24-hour and 30-day windows. Health verdicts are consumed from the upstream health-check signal and are typically less than two minutes stale.
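
As a rough illustration of what the overview returns, the sketch below derives an uptime percentage from a window of verdicts and flags a stale check. The function names, the two-minute constant, and the rule that only healthy checks count as "up" are assumptions made for the example; the platform's actual uptime formula is not described here.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a verdict older than two minutes is treated as stale, based on
# the "typically less than two minutes stale" description above.
STALE_AFTER = timedelta(minutes=2)


def uptime_percent(verdicts: list[str]) -> float:
    """Percentage of checks in the window whose verdict was 'healthy'.

    Assumption: 'degraded' and 'failed' both count against uptime.
    """
    if not verdicts:
        return 0.0
    healthy = sum(1 for v in verdicts if v == "healthy")
    return round(100.0 * healthy / len(verdicts), 2)


def is_stale(checked_at: datetime, now: datetime) -> bool:
    """True when the last check is older than the staleness threshold."""
    return now - checked_at > STALE_AFTER


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
window = ["healthy"] * 99 + ["degraded"]
print(uptime_percent(window))
print(is_stale(now - timedelta(seconds=31), now))
```

A dashboard built on this surface could use a check like `is_stale` to grey out a row rather than show an outdated verdict as current.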

Read incident timeline

Returns the recent incident records, ordered newest first, with open / closed status, affected services, summary, and operator notes. Supports filtering by status (active only, closed only, all) and by service.
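
The filtering and ordering described above can be sketched as follows. The `Incident` class is a trimmed stand-in for the incident record, the service slugs are made up, and sorting on `opened_at` is an assumption about what "newest first" means.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Incident:
    id: int
    service: str   # service slug (values below are hypothetical)
    status: str    # "active" or "closed"
    opened_at: datetime


def timeline(incidents, status: str = "all", service: Optional[str] = None):
    """Filter by status and service, then order newest first by opened_at."""
    if status not in ("all", "active", "closed"):
        raise ValueError(f"unknown status filter: {status}")
    rows = [
        i for i in incidents
        if (status == "all" or i.status == status)
        and (service is None or i.service == service)
    ]
    return sorted(rows, key=lambda i: i.opened_at, reverse=True)


incidents = [
    Incident(1, "media-pipeline", "closed", datetime(2024, 1, 1)),
    Incident(2, "email-delivery", "active", datetime(2024, 1, 2)),
    Incident(3, "media-pipeline", "active", datetime(2024, 1, 3)),
]
active_media = timeline(incidents, status="active", service="media-pipeline")
```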

Open an incident

Creates a new incident record. Required at minimum: affected service, summary, severity (sev1, sev2, sev3, sev4). Optional: detail body, expected resolution time, mirror-to-status-page flag. On submit, the record is written and — if mirroring is enabled — the status page is updated.
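
A caller scripting against this operation may want to check the required fields before submitting. The sketch below is one way to do that; the payload keys mirror the field names in the data model, while the function name and the error strings are hypothetical.

```python
SEVERITIES = ("sev1", "sev2", "sev3", "sev4")
REQUIRED_FIELDS = ("service", "summary", "severity")


def validate_open_request(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the request looks acceptable."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not payload.get(name):
            problems.append(f"missing required field: {name}")
    severity = payload.get("severity")
    if severity and severity not in SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems


ok = validate_open_request(
    {"service": "media-pipeline", "summary": "Increased latency", "severity": "sev2"}
)
bad = validate_open_request({"service": "media-pipeline", "severity": "sev9"})
```

Optional fields (detail body, expected resolution time, mirror flag) are deliberately left unchecked here, since the text places no constraints on them.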

Update an incident

Appends a new note to an existing incident. Notes carry their own timestamp and operator identifier, so the incident reads as a timeline of operator actions. If the incident is mirrored to the public status page, the note is softened to customer-facing language and posted as an update.

Close an incident

Marks the incident closed and records the resolution time. The summary may be edited at close to reflect the final understanding of what happened. Mirrored incidents post a final update to the status page on close.
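
The update and close operations above can be pictured as an append to a note list followed by a one-way status change. This is a minimal in-memory sketch: the class and function names are illustrative, and rejecting a second close is an assumption rather than documented behavior.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Incident:
    summary: str
    status: str = "active"
    closed_at: Optional[datetime] = None
    notes: list = field(default_factory=list)


def append_note(incident: Incident, body: str, operator_id: int, now: datetime) -> None:
    """Add a note with its own timestamp and operator id; earlier notes are left untouched."""
    incident.notes.append({"note": body, "posted_at": now, "posted_by": operator_id})


def close_incident(incident: Incident, now: datetime,
                   final_summary: Optional[str] = None) -> None:
    """Mark the incident closed, record the close time, optionally revise the summary."""
    if incident.status == "closed":
        # Assumption: closing an already closed incident is rejected.
        raise ValueError("incident is already closed")
    if final_summary is not None:
        incident.summary = final_summary
    incident.status = "closed"
    incident.closed_at = now


inc = Incident(summary="Increased latency")
append_note(inc, "Restarted workers", operator_id=7, now=datetime(2024, 1, 1, 12, 5))
close_incident(inc, datetime(2024, 1, 1, 12, 30), final_summary="Latency from queue backlog")
```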

Mute a service

Suppresses notifications for a single service for a configured duration. Used during planned maintenance to avoid noise from expected failures. The service's health verdict continues to be recorded; only the operator notifications are paused.

Unmute

Reverses a mute. Notifications resume with the next health verdict for that service.
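
The key property of mute and unmute is that verdicts keep being recorded while only notifications are paused. A minimal sketch of that separation, using a hypothetical `MuteRegistry` helper that is not a platform class:

```python
from datetime import datetime, timedelta


class MuteRegistry:
    """Tracks per-service mutes (hypothetical helper for illustration)."""

    def __init__(self):
        self._muted_until = {}  # service slug -> time at which the mute expires

    def mute(self, service: str, now: datetime, duration: timedelta) -> None:
        self._muted_until[service] = now + duration

    def unmute(self, service: str) -> None:
        self._muted_until.pop(service, None)

    def should_notify(self, service: str, now: datetime) -> bool:
        until = self._muted_until.get(service)
        return until is None or now >= until


def handle_verdict(registry, recorded, service, verdict, now) -> bool:
    """Always record the verdict; return whether operators should be notified."""
    recorded.append((service, verdict, now))
    return registry.should_notify(service, now)


registry = MuteRegistry()
log = []
t0 = datetime(2024, 1, 1, 12, 0)
registry.mute("media-pipeline", t0, timedelta(hours=1))
notified = handle_verdict(registry, log, "media-pipeline", "failed", t0 + timedelta(minutes=5))
```

In this sketch a mute also expires on its own once the configured duration has passed, which matches "for a configured duration" but is otherwise an assumption.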

Set status-page sync

Stores the configuration for the status-page bridge: target status-page URL, authentication token, mirror-enabled flag, and the severity threshold for auto-mirroring (sev1 always mirrors; for sev2 through sev4, mirroring is the operator's choice per incident).

Set notification routes

Stores where status-change alerts are sent: email distribution list, internal chat channel, on-call rotation reference. Routes are evaluated per incident based on affected service and severity.
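
How routes are matched against an incident is not spelled out here, so the following is only one plausible reading: each route optionally names a service and carries a severity threshold, and it fires when the incident is at least that severe. The `Route` class, its fields, and the target strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Lower rank means more severe.
SEVERITY_RANK = {"sev1": 1, "sev2": 2, "sev3": 3, "sev4": 4}


@dataclass
class Route:
    target: str                    # e.g. an email list, chat channel, or on-call reference
    service: Optional[str] = None  # None matches every service
    threshold: str = "sev4"        # least severe level this route still fires for


def matching_routes(routes, service: str, severity: str) -> list[str]:
    """Return the targets of every route that applies to this service and severity."""
    return [
        r.target for r in routes
        if (r.service is None or r.service == service)
        and SEVERITY_RANK[severity] <= SEVERITY_RANK[r.threshold]
    ]


routes = [
    Route("email:ops-list"),                              # everything
    Route("chat:media-team", service="media-pipeline"),   # one service, any severity
    Route("oncall:primary", threshold="sev2"),            # sev1 and sev2 only
]
targets = matching_routes(routes, "media-pipeline", "sev3")
```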


Data model

An incident record carries the following fields. Field names below are the conceptual shape — the on-disk column names match closely but are not contractually stable across releases.

Field                  Type       Notes
─────────────────────  ─────────  ──────────────────────────────────────────────────
id                     integer    Primary key. Stable across edits.
service                string     Service slug. One of the monitored service identifiers.
severity               enum       One of sev1, sev2, sev3, sev4.
status                 enum       One of active, closed.
summary                string     Short label, customer-readable when softened.
detail                 string     Longer operator body. Not surfaced on the public status page.
opened_at              timestamp  Set on initial open.
closed_at              timestamp  Set on close, NULL while active.
opened_by              integer    User identifier.
mirror_to_status_page  boolean    When true, this incident's lifecycle posts to the public status page.

Incident note (append-only stream):

Field                  Type       Notes
─────────────────────  ─────────  ──────────────────────────────────────────────────
incident_id            integer    Foreign key.
note                   string     Body.
posted_at              timestamp  When the note was added.
posted_by              integer    User identifier.
mirrored               boolean    Whether this note was posted to the public status page.

Service health record (read-only stream):

Field                  Type       Notes
─────────────────────  ─────────  ──────────────────────────────────────────────────
service                string     Service slug.
verdict                enum       One of healthy, degraded, failed.
checked_at             timestamp  Probe time.
latency_ms             integer    Optional, when the check measures response time.
error_code             string     Optional, when the verdict is degraded or failed.

INCIDENT RECORD
├── id                      integer     primary key
├── service                 string      slug
├── severity                enum        sev1 | sev2 | sev3 | sev4
├── status                  enum        active | closed
├── summary                 string      softened for public mirror
├── detail                  string      operator body, not mirrored
├── opened_at / closed_at   timestamp
├── opened_by               integer
└── mirror_to_status_page   boolean

INCIDENT NOTES (append-only)
├── incident_id (FK)
├── note · posted_at · posted_by
└── mirrored (boolean)

SERVICE HEALTH (read-only stream)
├── service · verdict (healthy | degraded | failed)
├── checked_at · latency_ms
└── error_code (optional)
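
For integrators who want typed containers when scripting against this surface, the three record shapes above map naturally onto dataclasses. The class names and default values (for example, a new incident starting as active with mirroring off) are assumptions; only the field names and enum values come from the tables.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Severity(str, Enum):
    SEV1 = "sev1"
    SEV2 = "sev2"
    SEV3 = "sev3"
    SEV4 = "sev4"


class Status(str, Enum):
    ACTIVE = "active"
    CLOSED = "closed"


class Verdict(str, Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    FAILED = "failed"


@dataclass
class Incident:
    id: int
    service: str
    severity: Severity
    summary: str
    opened_at: datetime
    opened_by: int
    status: Status = Status.ACTIVE           # assumed default
    detail: str = ""
    closed_at: Optional[datetime] = None     # None while active
    mirror_to_status_page: bool = False      # assumed default


@dataclass
class IncidentNote:
    incident_id: int
    note: str
    posted_at: datetime
    posted_by: int
    mirrored: bool = False


@dataclass
class ServiceHealth:
    service: str
    verdict: Verdict
    checked_at: datetime
    latency_ms: Optional[int] = None   # only when the check measures response time
    error_code: Optional[str] = None   # only for degraded or failed verdicts


incident = Incident(1, "media-pipeline", Severity("sev2"), "Increased latency",
                    datetime(2024, 1, 1, 12, 0), opened_by=7)
```

Because the on-disk column names are not contractually stable, a mapping layer like this is best kept in one place so a rename touches a single file.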

Permissions

Access to the System Status surface is gated at two layers.

Layer 1 — admin gate. Every action under SG-Admin passes through the platform's standard admin access check at request entry. An unauthenticated request never reaches the System Status surface.

Layer 2 — per-action capability. Within SG-Admin, each System Status action checks a capability associated with the calling operator's role. The default role configuration ships with three roles — Administrator, Editor, Viewer — and the capability map is:

Capability                 Administrator   Editor   Viewer
Read overview
Read incident timeline
Open an incident
Update an incident
Close an incident
Mute a service
Unmute
Set status-page sync
Set notification routes
Custom roles defined under Settings → Roles override the default map. The capability slugs are stable; the role slugs are configurable.
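
The two-layer gate can be sketched as a single function that fails closed. The role and capability slugs in the example map are placeholders invented for illustration and do not reflect the default Administrator, Editor, or Viewer assignments.

```python
def is_allowed(authenticated_admin: bool, role: str, capability: str,
               capability_map: dict[str, set[str]]) -> bool:
    """Layer 1: admin gate. Layer 2: per-action capability for the caller's role."""
    if not authenticated_admin:
        return False  # the request never reaches the System Status surface
    # Unknown roles get an empty capability set, so the check fails closed.
    return capability in capability_map.get(role, set())


# Placeholder map with made-up slugs, for illustration only.
example_map = {
    "example-operator": {"example.read_overview", "example.open_incident"},
    "example-observer": {"example.read_overview"},
}

can_open = is_allowed(True, "example-operator", "example.open_incident", example_map)
```

Passing the map in as an argument mirrors the point above that custom roles override the defaults: the check itself does not change when the role configuration does.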

Severity rules. sev1 incidents always mirror to the public status page regardless of the per-incident mirror flag — the platform contract requires customer transparency for the highest severity. The mirror flag controls behavior for sev2 through sev4 only.
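
Expressed as code, the severity rule reduces to a short decision function. The function name is hypothetical, and the treatment of the global sync setting (nothing mirrors when sync is disabled, including sev1) is an assumption, since the text does not say how the two interact.

```python
def should_mirror(severity: str, mirror_flag: bool, sync_enabled: bool = True) -> bool:
    """Decide whether an incident's lifecycle posts to the public status page."""
    if not sync_enabled:
        # Assumption: with the bridge disabled there is nowhere to mirror to.
        return False
    if severity == "sev1":
        return True  # sev1 ignores the per-incident flag
    return mirror_flag  # sev2 through sev4 follow the operator's choice


decisions = {
    ("sev1", False): should_mirror("sev1", False),
    ("sev2", False): should_mirror("sev2", False),
    ("sev3", True): should_mirror("sev3", True),
}
```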

Audit. Every write — open, update, close, mute, unmute, sync configuration change, notification-route change — emits an Activity Log entry. The log records the acting operator, the affected service, and the change shape. The incident note stream is its own audit trail and is preserved alongside the Activity Log.

        HEALTH SIGNAL
              │
              ▼
┌───────────────────────────┐
│ Verdict = degraded/failed │  auto-incident OR operator-opened
└─────────────┬─────────────┘
              ▼
┌───────────────────────────┐
│ Incident OPEN             │  severity assigned · routes notified
│                           │  mirror flag evaluated
└─────────────┬─────────────┘
              ▼
       ┌──────────────┐
       │ sev1 OR      │
       │ mirror=true? │
       └──────┬───────┘
          no  │  yes
       ▼             ▼
    private     PUBLIC STATUS PAGE
    incident    posts initial update
              │
              ▼
    operator notes appended
              │
              ▼
       incident CLOSED
              │
              ▼
     final mirror update
        (if mirrored)

Related references

  • Activity Log — Reference. Records every System Status write. Notification routes and sync configuration changes show up there.
  • Notifications — Reference. Owns the message-delivery surface that System Status calls into. Notification route slugs resolve against the Notifications module.
  • Settings — Reference. Hosts the role definitions and the global on-call rotation reference. Changes there reshape the System Status permission map.
  • Tools — Reference. Internal operator tooling that pairs with System Status during incident response — log search, diagnostic snapshots, escalation routing.
  • Cron Jobs — Reference. The health-check pipeline runs as a cron job. Schedule changes there change the freshness of the verdict stream.
  • API Keys — Reference. The status-page sync token is stored alongside other outbound integration credentials.
For the corresponding customer-facing surface — the public status page, incident archive, subscriber notification opt-in — see the published status page itself; there is no separate /docs/system-status customer doc, because the customer surface is the status page.