2026-05-15: Sidecars topology animation is now event-driven
Motivation
The "Control Plane Sidecars" dashboard card rendered four particle dots
that animated infinitely on a CSS keyframe loop, gated only by each
endpoint reporting `health === "ok"`. The card's own legend said
"animated dot = live traffic", but the dots had no relationship to
real traffic: they fired forever, even when the system was idle.
This violated the dashboard's contract: every number on the page should come from a real Azure / Kubernetes / sidecar reading. A purely decorative animation that lies about activity is worse than no animation.
User-facing change
Each row of the topology now spawns particles per real event drained from the previous 5-second snapshot tick:
| Row | Event source | Counter incremented when |
|---|---|---|
| row1 Browser → frontend → api | `RequestIdMiddleware` in `api/main.py` | every non-`/api/health`, non-`/api/monitor/sidecars*` HTTP request lands on the api sidecar |
| row2 api → redis → worker | Celery `before_task_publish` signal in `api/celery_app.py` (when `SIDECAR_NAME != "beat"`) | api enqueues a Celery task |
| row3 beat → redis | same signal, but only when `SIDECAR_NAME == "beat"` | beat publishes a periodic task |
| row4 api ↔ terminal | `RequestIdMiddleware` for any path starting with `/api/terminal/` | terminal WebSocket / exec proxy used |
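
The routing in the table can be sketched as two small pure functions. This is a minimal illustration rather than the code in `api/main.py` or `api/celery_app.py`: the function names, the string values `"row1"` to `"row4"` for the `ROW_*` constants, and the use of plain prefix matching for the excluded paths are all assumptions.

```python
from typing import Optional

# Assumed values: the source names the constants but not their contents.
ROW_HTTP, ROW_ASYNC, ROW_SCHED, ROW_TERM = "row1", "row2", "row3", "row4"

# Paths that must not count as traffic, because the dashboard's own
# polling / SSE would otherwise inflate row1 (assumed prefix match).
EXCLUDED_PREFIXES = ("/api/health", "/api/monitor/sidecars")


def row_for_request(path: str) -> Optional[str]:
    """Return the row an HTTP request should increment, or None to skip it."""
    if path.startswith("/api/terminal/"):
        return ROW_TERM
    if path.startswith(EXCLUDED_PREFIXES):
        return None
    return ROW_HTTP


def row_for_publish(sidecar_name: Optional[str]) -> str:
    """Return the row a Celery publish should increment.

    Only the beat sidecar counts as scheduled; everything else is async.
    """
    return ROW_SCHED if sidecar_name == "beat" else ROW_ASYNC
```

In the real service the second function would presumably be called with the value of the `SIDECAR_NAME` environment variable from inside the `before_task_publish` handler.
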
A small numeric badge next to each row label shows the count for the
most recent tick, capped at `6+`: a sudden burst doesn't render a wall
of dots, and the badge reads `6+` to make the cap visible. When the
count is zero the badge is hidden, so an idle system shows zero motion
and zero badges. The legend now reads "● dot = real event since last tick".
API / IaC diff summary
Backend (api/)
- New `api/services/event_emitter.py`: cross-process counter at the natural shared point, the Redis hash `sidecar:events`.
  - `emit(row, count=1)` does a best-effort `HINCRBY` with a cached `redis.Redis` client (`OPS_REDIS_URL`) and never raises; decorative telemetry must never affect a request.
  - The request-path timeout defaults are intentionally tiny (`EVENT_EMIT_CONNECT_TIMEOUT_SECONDS=0.05`, `EVENT_EMIT_SOCKET_TIMEOUT_SECONDS=0.05`) and a Redis failure opens a short circuit breaker (`EVENT_EMIT_FAILURE_COOLDOWN_SECONDS=5`) so a degraded broker cannot add repeated half-second stalls to API traffic.
  - Counts are clamped by `EVENT_EMIT_MAX_COUNT=1000` to keep a corrupt or bursty counter from inflating the SSE payload.
  - `drain(client)` does a pipelined `HGETALL` + `DEL` so each snapshot tick atomically reads-and-resets the four row counters.
- `api/services/sidecar_metrics.py`: `collect_snapshot` now drains the counters and includes `events: { row1, row2, row3, row4 }` in the payload (the `_all_down_snapshot` fallback also returns zeros so the SPA can rely on a stable shape).
- `api/main.py`: `RequestIdMiddleware.dispatch` emits `ROW_TERM` for `/api/terminal/*` paths, `ROW_HTTP` otherwise, excluding `/api/health` and the sidecar monitor endpoints (those fire from the polling/SSE itself and would self-pollute).
- `api/celery_app.py` registers a `before_task_publish` signal handler that emits `ROW_SCHED` from the `beat` sidecar (env `SIDECAR_NAME=beat` set in `scripts/dev/docker-compose.full.yml`) and `ROW_ASYNC` from everywhere else.
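
The emit / drain contract described above can be sketched as follows. This is a hedged illustration, not the contents of `api/services/event_emitter.py`: it takes the client as a parameter instead of caching one from `OPS_REDIS_URL`, hard-codes the tuning values instead of reading the `EVENT_EMIT_*` env vars, and assumes the clamp applies both to each increment and to each drained value. The client is expected to behave like `redis.Redis` (`hincrby`, and `pipeline()` with its default transactional mode).

```python
import time
from typing import Any, Dict

EVENTS_KEY = "sidecar:events"
ROWS = ("row1", "row2", "row3", "row4")   # assumed field names
MAX_COUNT = 1000                          # mirrors EVENT_EMIT_MAX_COUNT
FAILURE_COOLDOWN_SECONDS = 5.0            # mirrors EVENT_EMIT_FAILURE_COOLDOWN_SECONDS

_skip_until = 0.0  # monotonic deadline; emits are skipped while the breaker is open


def emit(client: Any, row: str, count: int = 1) -> None:
    """Best-effort increment. Never raises; unknown rows and bad counts are ignored."""
    global _skip_until
    if row not in ROWS or count <= 0:
        return
    if time.monotonic() < _skip_until:
        return  # breaker open: do not touch Redis at all
    try:
        client.hincrby(EVENTS_KEY, row, min(count, MAX_COUNT))
    except Exception:
        # Open the breaker so a degraded broker is not retried on every request.
        _skip_until = time.monotonic() + FAILURE_COOLDOWN_SECONDS


def drain(client: Any) -> Dict[str, int]:
    """Read and reset all row counters; always returns all four rows."""
    counts = {row: 0 for row in ROWS}
    try:
        pipe = client.pipeline()  # redis-py pipelines run as MULTI/EXEC by default
        pipe.hgetall(EVENTS_KEY)
        pipe.delete(EVENTS_KEY)
        raw, _deleted = pipe.execute()
    except Exception:
        return counts  # Redis error: report zeros rather than fail the snapshot
    for field, value in raw.items():
        name = field.decode() if isinstance(field, bytes) else field
        if name not in counts:
            continue  # tolerate unknown fields
        try:
            counts[name] = min(max(int(value), 0), MAX_COUNT)
        except (TypeError, ValueError):
            pass  # corrupt value: leave the row at zero
    return counts
```

Returning a full zero-filled dict on every path is what lets a caller such as `collect_snapshot` expose a stable `events` shape even when Redis is unavailable.
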
Frontend (web/)
- `web/src/hooks/useSidecarMetrics.ts`: `SidecarsSnapshot` gets the optional `events?: { row1?: number; … }` field with a docstring naming each row.
- `web/src/components/cards/SidecarsCard.tsx`:
  - `RowParticle` accepts `onEnd`, forces `animationIterationCount: "1"` and `animationFillMode: "forwards"`, and the `.topo-row-particle` CSS class no longer specifies `infinite`.
  - New `useEventParticles(data)` hook keeps a `ParticleEvent[]` queue, de-dupes on `snapshot.ts` (so a re-render or fallback poll reading the same snapshot doesn't double-fire), spawns up to `PARTICLES_PER_TICK_CAP = 6` per row per tick with a 0.18 s stagger, and exposes `lastCounts` for the row badge.
  - Each row drops the old health-gated single particle and renders `particles.filter(p => p.row === N).map(...)`. Row 3 keeps `durationSec={0.9}` and the special `endRight="calc((100% - 458px) / 2 + 250px)"` so the dot stops at the broker label.
  - Legend updated to "● dot = real event since last tick".
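
The spawn rule behind `useEventParticles` (de-dupe on `snapshot.ts`, cap per row, fixed stagger) can be modelled outside React. The sketch below is written in Python for consistency with the backend sketches; it is not the TypeScript hook. The class name, the `delay_sec` field, the string row keys, and the choice to ignore snapshots without a `ts` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

PARTICLES_PER_TICK_CAP = 6
STAGGER_SECONDS = 0.18
ROWS = ("row1", "row2", "row3", "row4")


@dataclass
class ParticleEvent:
    row: str
    delay_sec: float  # hypothetical field: start offset within the tick


class EventParticles:
    """Language-neutral model of the hook's state: last seen ts and last counts."""

    def __init__(self) -> None:
        self.last_ts: Any = None
        self.last_counts: Dict[str, int] = {row: 0 for row in ROWS}

    def on_snapshot(self, snapshot: Dict[str, Any]) -> List[ParticleEvent]:
        ts = snapshot.get("ts")
        if ts is None or ts == self.last_ts:
            return []  # same snapshot seen again: do not double-fire
        self.last_ts = ts
        events = snapshot.get("events") or {}  # `events` is optional
        spawned: List[ParticleEvent] = []
        for row in ROWS:
            count = max(int(events.get(row) or 0), 0)
            self.last_counts[row] = count  # the badge uses the uncapped count
            for i in range(min(count, PARTICLES_PER_TICK_CAP)):
                spawned.append(ParticleEvent(row, round(i * STAGGER_SECONDS, 2)))
        return spawned
```

Keeping the uncapped count separate from the spawned particles is what allows the badge to signal that a burst exceeded the six dots actually drawn.
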
Infra
None. The Redis sidecar already exists with OPS_REDIS_URL shared by
all sidecars (db 2 in
scripts/dev/docker-compose.full.yml).
Validation evidence
- `uv run pytest -q api/tests`: 110 passed, including the new `api/tests/test_event_emitter.py` covering `emit` increment, swallowed unknown rows / non-positive counts, `drain` empty / non-empty / unknown-fields / Redis-error paths, plus a `collect_snapshot` integration that proves the snapshot drains the hash atomically.
- `uv run pytest -q api/tests/test_event_emitter.py`: 11 passed after hardening, covering count clamping, Redis-error cooldown, and tolerant fallback when optional `EVENT_EMIT_*` tuning env vars are malformed.
- `cd web && npm run build`: clean (`tsc -b && vite build` ✓).
- Compose `docker compose -f scripts/dev/docker-compose.full.yml up -d --no-deps --force-recreate api worker beat frontend`: all 6/6 healthy.
- Backend smoke (no SSE drain in between):
- Browser screenshot of dashboard after a 5×enqueue + 4×`/api/me` burst: Browser badge shows `6+`, Async badge shows `5`, Scheduled / ws-exec show no badge (no activity → no dots). With the previous infinite CSS animation those last two rows would have been firing dots regardless.