2026-05-15 - Fix local compose browser terminal reachability
Motivation
The browser terminal did not work in the full local Docker Compose stack even though the terminal service reported healthy. There were three root causes.
First, the local networking model was wrong. Production Container Apps sidecars share loopback, but Docker Compose services do not.
ttyd was always bound to 127.0.0.1:7681 inside the terminal container, while the api container was configured to proxy to http://terminal:7681. Requests from the api container therefore failed with Connection refused. The exec server on :7682 worked because compose already overrode EXEC_HOST=0.0.0.0, so the terminal sidecar healthcheck, which only covered the exec server, kept passing and missed the broken interactive shell path.
Second, the terminal image created an azureuser account but ended with USER 1000:1000. On Ubuntu 24.04 that UID/GID belongs to the base image's ubuntu user, while azureuser was created as UID 1001. As a result the container ran as ubuntu with HOME=/home/ubuntu, while the intended working directory /home/azureuser was owned by azureuser. After the bind fix, WebSocket upgrade succeeded but ttyd closed immediately with uv_write: ESRCH (no such process) because the shell/pty process died at session startup.
Third, the React terminal client used text frames ("0" + data, "1" + resizeJson) from an incorrect reading of ttyd's protocol. The ttyd web client actually sends the first message as raw JSON bytes ({columns, rows}) and then sends binary frames where byte 0 is the command ("0" input, "1" resize). Sending the text-frame resize as the first message made ttyd close the session before any shell output was produced.
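The framing described above can be sketched in a few lines of Python, for example as helpers for a direct test client. This is a minimal illustration based only on the description in this entry, not the code in remoteTerminalProtocol.ts; the function names are hypothetical.

```python
import json


def encode_initial_size(columns: int, rows: int) -> bytes:
    """First message after the socket opens: raw JSON bytes, no command prefix."""
    return json.dumps({"columns": columns, "rows": rows}).encode("utf-8")


def encode_input(data: str) -> bytes:
    """Keyboard input: binary frame whose byte 0 is the ASCII command "0"."""
    return b"0" + data.encode("utf-8")


def encode_resize(columns: int, rows: int) -> bytes:
    """Resize: binary frame whose byte 0 is the ASCII command "1", then JSON."""
    payload = json.dumps({"columns": columns, "rows": rows}).encode("utf-8")
    return b"1" + payload
```

All three results are meant to be sent as binary WebSocket messages. Sending the old text-frame resize as the first message is what caused ttyd to close the session.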
User-facing change
In the full local compose stack, opening /terminal through http://127.0.0.1:18080/ can now reach the ttyd browser shell through the api WebSocket proxy.
Production posture is unchanged: ttyd still binds to 127.0.0.1 by default. Only scripts/dev/docker-compose.full.yml sets TTYD_HOST=0.0.0.0 because compose containers need service-network reachability.
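The default-to-loopback behaviour can be illustrated with a small Python sketch. The real logic lives in the shell script terminal/entrypoint.sh, so this is only an approximation: the helper name is hypothetical, and it assumes the port stays fixed at 7681 and is passed with ttyd's -p option.

```python
import os
from typing import Mapping


def ttyd_listen_options(env: Mapping[str, str] = os.environ) -> list[str]:
    """Build ttyd's listen options, falling back to loopback when TTYD_HOST is unset."""
    # An unset or empty TTYD_HOST keeps the production default of 127.0.0.1.
    host = env.get("TTYD_HOST") or "127.0.0.1"
    return ["-i", host, "-p", "7681"]
```

With an empty environment this yields the loopback bind, while the compose override TTYD_HOST=0.0.0.0 yields a bind that other services on the compose network can reach.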
The terminal container also now runs as the intended azureuser account with HOME=/home/azureuser, so shell startup, Azure CLI profile state, and kubeconfig paths agree with the documented browser-terminal contract.
The React terminal now speaks ttyd's binary framing protocol, so input and resize events are accepted by ttyd instead of closing the session.
The terminal page is also more defensive around degraded conditions: the ticket request has an 8 second abort, terminal dimensions are clamped before they reach ttyd, late WebSocket/timer callbacks are ignored after unmount, and input/resize/output framing is isolated in test-covered helpers.
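Clamping terminal dimensions before they reach ttyd might look like the following Python sketch. The bounds are hypothetical placeholders, since this entry does not state the range used by the React helper, and the function names are illustrative.

```python
import math

# Hypothetical bounds; the actual range used by the client is not given here.
MIN_COLUMNS, MAX_COLUMNS = 20, 500
MIN_ROWS, MAX_ROWS = 5, 200


def clamp_dimension(value: float, low: int, high: int) -> int:
    """Coerce a measured dimension into [low, high], treating NaN/inf as low."""
    if not math.isfinite(value):
        return low
    return max(low, min(high, int(value)))


def clamp_terminal_size(columns: float, rows: float) -> tuple[int, int]:
    """Return a (columns, rows) pair that is safe to send in a resize frame."""
    return (
        clamp_dimension(columns, MIN_COLUMNS, MAX_COLUMNS),
        clamp_dimension(rows, MIN_ROWS, MAX_ROWS),
    )
```

Clamping at this boundary means a zero-sized or not-yet-laid-out terminal element cannot produce a resize message with degenerate dimensions.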
The api WebSocket proxy now verifies the upstream ttyd socket before accepting the browser WebSocket. If ttyd is unavailable, the browser does not see a false connected state. Proxy forwarding also tears down the paired forwarding task as soon as either side closes, so tab closes or upstream disconnects do not leave a dangling half-session.
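The ordering and teardown described above can be sketched with plain asyncio. This is not the code in api/routes/terminal_ws.py: the callables (connect_upstream, accept_browser, reject_browser and the two forwarding coroutines) are hypothetical stand-ins for the real WebSocket operations, and the sketch assumes a failed upstream connection surfaces as an OSError.

```python
import asyncio
from typing import Awaitable, Callable

AsyncStep = Callable[[], Awaitable[None]]


async def forward_until_either_closes(
    browser_to_ttyd: AsyncStep, ttyd_to_browser: AsyncStep
) -> None:
    """Run both forwarding directions and cancel the survivor when one ends."""
    tasks = [
        asyncio.create_task(browser_to_ttyd()),
        asyncio.create_task(ttyd_to_browser()),
    ]
    _done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Wait for the cancelled task to unwind so no half-session is left behind.
    await asyncio.gather(*pending, return_exceptions=True)


async def proxy_session(
    connect_upstream: AsyncStep,
    accept_browser: AsyncStep,
    reject_browser: AsyncStep,
    browser_to_ttyd: AsyncStep,
    ttyd_to_browser: AsyncStep,
) -> bool:
    """Accept the browser socket only after the upstream connection succeeds."""
    try:
        await connect_upstream()
    except OSError:  # includes ConnectionRefusedError
        await reject_browser()
        return False
    await accept_browser()
    await forward_until_either_closes(browser_to_ttyd, ttyd_to_browser)
    return True
```

Connecting upstream first means the browser never observes an open socket that has nothing behind it, and cancelling the pending task keeps a closed tab from leaving the upstream forwarder running.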
API / IaC diff summary
- terminal/entrypoint.sh now reads TTYD_HOST, defaulting to 127.0.0.1, and passes it to ttyd -i.
- terminal/entrypoint.sh now forces the operator home to /home/azureuser unless TERMINAL_HOME is explicitly set.
- terminal/Dockerfile sets HOME, USER, and SHELL, and switches to USER azureuser:azureuser instead of numeric 1000:1000.
- scripts/dev/docker-compose.full.yml sets TTYD_HOST=0.0.0.0 for the terminal service.
- The compose terminal healthcheck now probes both http://127.0.0.1:7681/ and http://127.0.0.1:7682/healthz, so an interactive-shell outage is no longer hidden by a healthy exec server.
- web/src/pages/RemoteTerminal.tsx now sends ttyd's initial JSON bytes and command-prefixed binary frames for input and resize.
- web/src/pages/remoteTerminalProtocol.ts centralises ttyd frame encoding/decoding and clamps terminal size to a stable range.
- web/src/pages/RemoteTerminal.tsx now aborts slow ticket requests, avoids state updates after unmount, disposes the xterm input listener, clears timers, and closes the WebSocket during cleanup.
- api/routes/terminal_ws.py now connects to ttyd before accepting the browser WebSocket and cancels the opposite forwarding task when either side closes.
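The dual-probe healthcheck logic can be approximated in Python as follows. The actual compose healthcheck is defined in scripts/dev/docker-compose.full.yml and is not reproduced here; the function names and the 2 second timeout are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable

# Both endpoints named in the healthcheck: the ttyd shell and the exec server.
HEALTH_URLS = (
    "http://127.0.0.1:7681/",
    "http://127.0.0.1:7682/healthz",
)


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True only if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def terminal_healthy(probe_fn: Callable[[str], bool] = probe) -> bool:
    """Healthy only when every endpoint passes, so one cannot mask the other."""
    return all(probe_fn(url) for url in HEALTH_URLS)
```

Requiring both probes to pass is the point of the change: a healthy exec server on :7682 no longer hides a ttyd listener that is unreachable on :7681.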
Validation evidence
- Before the fix, from inside the api container:
  - http://terminal:7681/ failed with Connection refused.
  - http://terminal:7682/healthz returned 200.
- After the bind fix but before the user fix, direct WebSocket to ws://terminal:7681/ws negotiated subprotocol tty but timed out without shell output; terminal logs showed uv_write: ESRCH (no such process). Runtime inspection showed uid=1000(ubuntu), HOME=/home/ubuntu, and /home/azureuser owned by azureuser.
- After the user fix, direct WebSocket still closed when using the old text-frame protocol. Inspecting ttyd's served JS showed onSocketOpen() sends raw JSON bytes first, while sendData() and resize send Uint8Array frames prefixed with ASCII command bytes. Retesting with binary framing returned DIRECT_READY from the shell.
- bash -n terminal/entrypoint.sh passed.
- docker compose -p elb-control-local -f scripts/dev/docker-compose.full.yml config passed.
- docker compose -p elb-control-local -f scripts/dev/docker-compose.full.yml up -d --build terminal api rebuilt the terminal image and recreated the local terminal / api services.
- Runtime user validation inside the terminal container returned uid=1001(azureuser), HOME=/home/azureuser, SHELL=/bin/bash, USER=azureuser.
- From inside the api container:
  - http://terminal:7681/ returned 200 with ttyd HTML.
  - http://terminal:7682/healthz returned 200.
  - http://127.0.0.1:8080/api/terminal/health returned { "status": "ok", "upstream_status": 200 }.
- Ticketed WebSocket round-trip through ws://127.0.0.1:18080/api/terminal/ws?ticket=... negotiated subprotocol tty and returned shell output containing ELB_TERMINAL_READY_....
- cd web && npm run build passed after the React protocol fix.
- cd web && npm run test -- remoteTerminalProtocol.test.ts passed with coverage for terminal-size clamping, initial-size encoding, command-prefixed frames, and output-frame decoding.
- uv run ruff check api/routes/terminal_ws.py passed after the upstream-before-accept and forwarding cleanup changes.
- uv run pytest -q api/tests/test_smoke.py api/tests/test_terminal_exec.py passed (31 passed).
- uv run pytest -q api/tests passed (120 passed).
- Runtime ticket edge checks through http://127.0.0.1:18080 confirmed reused, invalid, and missing tickets are rejected with WebSocket HTTP 403.
- Runtime WebSocket round-trip through ws://127.0.0.1:18080/api/terminal/ws?ticket=... returned shell output containing ELB_HARDEN_... after the proxy cleanup fix.