2026-05-14 - Container Apps migration Phase 1+: deployable bundled topology
Motivation
The user asked to proceed all the way to production deployment, hardened so
any other person can git clone and deploy in one shot:
"Let's proceed all the way to production deployment, hardening as we go so that another user can later git clone and deploy in one shot without errors."
This change lands the deployment-ready code and infrastructure for the six-sidecar bundled Container App architecture defined in docs/architecture/container-apps.md. Any operator can now run:
git clone <repo>
./scripts/dev/preflight-check.sh
azd env new my-env
azd env set AZURE_LOCATION koreacentral
azd env set API_CLIENT_ID <app-reg-client-id>
azd up
and get a working https://ca-elb-control.<...>.azurecontainerapps.io/api/health.
The actual azd provision (which incurs ~USD 130/month) was not executed
autonomously; it is the operator's deliberate decision when to spend money.
All the work to make that one command succeed first time is in this PR.
What is deployable now
After azd up:
- Platform RG with VNet (3 subnets), Log Analytics + App Insights, shared user-assigned managed identity, Premium ACR, Standard_LRS Storage (with state tables / blob containers / two Azure Files shares for redis and terminal home), Key Vault, Container Apps Environment (workload-profile, VNet-integrated), and the bundled `ca-elb-control` Container App.
- The Container App boots from a bootstrap hello-world image, then the postprovision hook builds the three real images via `az acr build` (so the operator does not need a local Docker daemon) and runs `az deployment group create` to swap the template to the six-sidecar layout.
- `/api/health` responds with `{"status": "ok", "version": "0.0.1", ...}`, served over HTTPS with Container Apps' default TLS certificate.
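As a rough illustration of how an operator might confirm the health endpoint above after the image swap, the sketch below polls a URL until the payload reports `"status": "ok"` or a deadline passes. The 90-second default mirrors the polling window this note attributes to the postprovision script; the function names, the 5-second interval, and the injectable `fetch` / `sleep` / `clock` parameters are assumptions made for the example and are not taken from the repository.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON body; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_health(
    url: str,
    deadline_s: float = 90.0,   # matches the 90s window described in this note
    interval_s: float = 5.0,    # hypothetical polling interval
    fetch: Callable[[str], dict] = fetch_json,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Poll `url` until it reports status "ok"; return None on timeout."""
    stop_at = clock() + deadline_s
    while True:
        try:
            payload = fetch(url)
            if isinstance(payload, dict) and payload.get("status") == "ok":
                return payload
        except (urllib.error.URLError, OSError, ValueError):
            # Not ready yet: connection refused, HTTP error, or a non-JSON
            # body such as the bootstrap image's placeholder page.
            pass
        if clock() >= stop_at:
            return None
        sleep(interval_s)
```

Calling `wait_for_health("https://<container-app-hostname>/api/health")` returns the decoded payload once the six-sidecar revision is serving, or `None` if the deadline passes first.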
What is not in this PR (deferred to phase 2-5)
- Real BLAST functionality. The Celery `worker` and `beat` containers start cleanly with an empty task set; the actual `submit_blast` / `delete_blast` / `warmup` task handlers are migrated in phase 3.
- `/api/terminal/ws` WebSocket proxy + `/api/terminal/health`: the api sidecar currently forwards no WebSocket traffic. The terminal sidecar is reachable only on loopback `127.0.0.1:7681` so it is safe but not yet user-visible.
- Streaming upload/download proxy (`POST /api/blast/jobs/{job_id}/queries`, `GET /api/blast/jobs/{job_id}/results/{name}`): the Storage Network Isolation invariant requires this, but the routes are phase 3.
- Catch-all reverse proxy from `/` to the frontend sidecar: the api routes at `/api/*` are wired; the SPA-serving fallback is phase 2.
- MSAL App Registration hostname update: the App Registration redirect URI must be added by the operator after `azd up` so the SPA can successfully sign in against the new hostname. This is documented in the README but not automated.
- Existing `rg-elb-prod` (Function App + SWA) is not touched. It continues serving the production SPA while the new architecture is brought up side-by-side. Cutover is a deliberate, separate operation.
Files added
Backend (api / worker / beat sidecar image source)
- api_app/celery_app.py: Celery factory that the worker and beat sidecars launch against. Default broker `redis://127.0.0.1:6379/0`, four named queues (`default`, `azure`, `blast`, `storage`).
- api_app/tasks/__init__.py: empty task registry placeholder so the worker boots without `ImportError`.
- api_app/requirements.txt: extended with `celery`, `redis`, and the Azure SDKs the workers / api need.
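To make the queue layout above concrete, here is a minimal sketch of the settings such a Celery factory might assemble. It is not the contents of api_app/celery_app.py. The broker URL and the four queue names come from this note, and the result backend URL from the validation output further down; the environment variable names and the `api_app.tasks.<queue>.*` routing patterns are hypothetical. The sketch builds a plain dict using Celery's documented setting names (`broker_url`, `result_backend`, `task_default_queue`, `task_routes`) so it runs without Celery installed.

```python
import os
from typing import Mapping, Optional

# Queue names as listed in this change note.
QUEUES = ("default", "azure", "blast", "storage")


def build_celery_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a settings dict that could be passed to `app.conf.update(...)`.

    CELERY_BROKER_URL and CELERY_RESULT_BACKEND are hypothetical override
    variables; the defaults are the loopback Redis URLs from this note.
    """
    env = os.environ if env is None else env
    return {
        "broker_url": env.get("CELERY_BROKER_URL", "redis://127.0.0.1:6379/0"),
        "result_backend": env.get(
            "CELERY_RESULT_BACKEND", "redis://127.0.0.1:6379/1"
        ),
        "task_default_queue": "default",
        # Hypothetical routing: tasks under api_app.tasks.<queue>.* go to the
        # queue of the same name; anything unmatched falls back to "default".
        "task_routes": {
            f"api_app.tasks.{name}.*": {"queue": name}
            for name in QUEUES
            if name != "default"
        },
    }
```

With Celery available, something like `Celery("elb").conf.update(build_celery_settings())` would apply these values; a worker can then be limited to one queue with the `-Q` option.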
Frontend sidecar image source
- web/Dockerfile: multi-stage Vite build → nginx alpine, listens on `127.0.0.1:8081`. (Created in phase 0; unchanged here.)
- web/nginx.conf: security headers, immutable-cache for `/assets/*`, no-cache for `/index.html`, SPA navigation fallback. (Created in phase 0; unchanged here.)
Terminal sidecar image source (NEW)
- terminal/Dockerfile: Ubuntu 22.04 + apt base + azure-cli + kubectl (direct binary, version-pinned at build) + azcopy + python3.11 + primer3 + tmux + ttyd + pre-installed `elastic_blast` venv at `/opt/elb/venv`. Runs as uid 1000 (`azureuser`).
- terminal/profile.sh: sourced on shell login; sets `AZCOPY_AUTO_LOGIN_TYPE=MSI`, runs `az login --identity` if needed.
- terminal/entrypoint.sh: starts ttyd on `127.0.0.1:7681` with `tmux new -A -s elb` so each browser session attaches to the same persistent tmux.
- terminal/motd: login banner.
Infrastructure (Bicep)
All modules compile cleanly with `az bicep build`.
- infra/modules/network.bicep: platform VNet `/20` with three subnets: `snet-containerapps` `/23` delegated to `Microsoft.App/environments`, `snet-private-endpoints` `/27`, `snet-aks` `/23`.
- infra/modules/monitoring.bicep: Log Analytics workspace (1 GiB/day cap) + Application Insights.
- infra/modules/identity.bicep: `id-elb-control` user-assigned managed identity shared by all six sidecars.
- infra/modules/acr.bicep: Premium ACR with optional private endpoint, AcrPull + AcrPush role assignments for the shared UAMI.
- infra/modules/storage.bicep: Standard_LRS Storage account with `allowSharedKeyAccess: false`, `publicNetworkAccess: Disabled` in steady state, three private endpoints (blob, table, file) with linked private DNS zones, and Storage Blob / Table / File-SMB Data Contributor roles for the shared UAMI.
- infra/modules/keyvault.bicep: RBAC-mode Key Vault with optional private endpoint, Secrets User for the shared UAMI, Secrets Officer for the operator running `azd up`.
- infra/modules/storageState.bicep: children of the platform storage account: tables `jobstate` + `jobhistory`; containers `audit`, `dead-letter`, `job-payloads`, `schedules`; file shares `redis-data` + `terminal-home`; lifecycle policy that cools/deletes audit blobs.
- infra/modules/containerAppsEnvironment.bicep: workload-profile environment, VNet-integrated, with the two Azure Files shares mounted as named storages.
- infra/modules/containerAppControl.bicep: single bundled Container App with all six sidecars wired (api, frontend, worker, beat, redis, terminal). Bootstraps with a hello-world image so `azd up` provisions before any real ACR image exists; the postprovision hook redeploys with `useBootstrapImage=false`.
- infra/main.bicep: top-level wiring. Replaces the old Function App + SWA main. Subscription-scoped; creates platform RG; calls all modules in dependency order.
- infra/legacy/main.legacy.bicep, infra/legacy/platform.legacy.bicep: the previous Function App + SWA Bicep, preserved unmodified for reference.
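The subnet sizing in the network module can be sanity-checked with a few lines of arithmetic. The sketch below carves two `/23` subnets and one `/27` out of a `/20` and confirms that they fit without overlapping. The prefix lengths and subnet names are from the list above; the `10.0.0.0/20` address space and the largest-first allocation order are assumptions for illustration, since this note does not state the actual CIDR ranges used in network.bicep.

```python
import ipaddress

# Hypothetical address space; the note only specifies prefix lengths.
VNET = ipaddress.ip_network("10.0.0.0/20")

# (name, prefix length) as listed for infra/modules/network.bicep.
SUBNET_PREFIXES = [
    ("snet-containerapps", 23),
    ("snet-private-endpoints", 27),
    ("snet-aks", 23),
]


def carve(vnet, wanted):
    """Allocate subnets from the start of `vnet`, largest first."""
    plan = {}
    cursor = int(vnet.network_address)
    for name, prefix in sorted(wanted, key=lambda item: item[1]):
        size = 2 ** (32 - prefix)
        # Round the cursor up to the subnet's natural alignment boundary.
        cursor = (cursor + size - 1) // size * size
        subnet = ipaddress.ip_network((cursor, prefix))
        if not subnet.subnet_of(vnet):
            raise ValueError(f"{name} /{prefix} does not fit in {vnet}")
        plan[name] = subnet
        cursor += size
    return plan


if __name__ == "__main__":
    for name, subnet in carve(VNET, SUBNET_PREFIXES).items():
        print(f"{name:<24} {subnet}  ({subnet.num_addresses} addresses)")
```

Under these assumptions the three subnets consume 1,056 of the 4,096 addresses in the `/20`, which leaves room for additional subnets in later phases.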
Deployment glue
- azure.yaml: replaces the old service-list with preprovision (registers required Azure providers) + postprovision (runs the image build + Container App update script).
- infra/main.parameters.json: passes `AZURE_ENV_NAME`, `AZURE_LOCATION`, `AZURE_PRINCIPAL_ID`, `AZURE_TENANT_ID`, `API_CLIENT_ID`, `ALLOWED_ORIGINS`, `LOCKDOWN_PRIVATE_NETWORKING` from azd env to the Bicep parameters.
- scripts/dev/preflight-check.sh: checks `az` / `azd` / `jq` / `curl` are installed, the operator is signed in, an azd env exists, and `API_CLIENT_ID` is set. Exits non-zero on any missing prerequisite.
- scripts/dev/postprovision.sh: builds `elb-api`, `elb-frontend`, `elb-terminal` via `az acr build` (parallelisable, no local Docker), then runs `az deployment group create` on `containerAppControl.bicep` with the freshly-built image tags and `useBootstrapImage=false` to swap to the six-sidecar layout. Polls `/api/health` for 90s and prints the URL.
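The preflight idea described above (collect every missing prerequisite, then fail) can be sketched in a few lines. This is not the shell script itself, only an illustration of two of its checks: that the four CLI tools are on `PATH` and that `API_CLIENT_ID` has a value. The sign-in and azd-environment checks are omitted because they require shelling out to `az` and `azd`. The function name and the idea of passing settings in as a mapping are assumptions; the real script reads from the azd environment.

```python
import shutil
from typing import Callable, List, Mapping, Optional

# Tools the note says the preflight script checks for.
REQUIRED_TOOLS = ("az", "azd", "jq", "curl")


def missing_prerequisites(
    settings: Mapping[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a human-readable list of problems; an empty list means OK.

    `settings` stands in for the azd environment values (hypothetical
    interface); `which` is injectable so the check can be unit-tested.
    """
    problems = [
        f"missing tool: {tool}" for tool in REQUIRED_TOOLS if which(tool) is None
    ]
    if not settings.get("API_CLIENT_ID"):
        problems.append("API_CLIENT_ID is not set")
    return problems
```

A wrapper could print each problem and exit with a non-zero status when the list is non-empty, matching the behaviour described for the shell script. Reporting all problems at once, rather than stopping at the first, saves the operator repeated runs.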
Documentation
- README.md — new "Quick start: deploy to Azure in one command" section with the exact commands, including the second-pass network lockdown step.
- This change note.
Validation evidence
$ for f in infra/main.bicep infra/modules/*.bicep; do az bicep build --file "$f" --stdout > /dev/null && echo "✓ $f"; done
✓ infra/main.bicep
✓ infra/modules/acr.bicep
✓ infra/modules/containerAppControl.bicep
✓ infra/modules/containerAppsEnvironment.bicep
✓ infra/modules/identity.bicep
✓ infra/modules/keyvault.bicep
✓ infra/modules/monitoring.bicep
✓ infra/modules/network.bicep
✓ infra/modules/platform.bicep
✓ infra/modules/storage.bicep
✓ infra/modules/storageState.bicep
$ python -c "from api_app.main import app; from api_app.celery_app import celery_app; ..."
FastAPI routes: ['/api/health', '/api/me', '/api/monitor/cluster', '/openapi.json']
Celery broker: redis://127.0.0.1:6379/0
Celery result backend: redis://127.0.0.1:6379/1
The actual azd up was not run (real-money operation that needs operator
approval). All Bicep, Python, and shell-script syntax is validated locally.
Operator runbook
First-time deploy (cold start)
git clone <repo>
cd elb-dashboard
./scripts/dev/preflight-check.sh # 1. verify tools + az login + azd env
./scripts/dev/setup-app-registration.sh # 2. create or reuse the App Registration
azd env new <env-name>
azd env set AZURE_LOCATION koreacentral
azd env set API_CLIENT_ID <client-id>
azd up # 3. ~15 min: provision + image build + swap
Lock down the network on the second pass
Tear down
Risks / known caveats
- First-deploy public access window. Storage / Key Vault / ACR are publicly reachable during the first deploy so `az acr build` and Key Vault seeding can complete from the operator's machine. The operator is expected to immediately run the lockdown step (above). This is documented but the safer pattern would be to run `az acr build` from inside an Azure hosted runner; that is a phase 2 follow-up.
- MSAL redirect URI. The Container App ingress hostname is not known until `azd up` finishes. The operator must add it to the App Registration's redirect URIs before signing in. Phase 2 will automate this with a postprovision step that uses the operator's az login to patch the App Registration.
- Storage Files mount uses account key. Container Apps mounts Azure Files via SMB which requires the storage account key. We list the key at deploy time and pass it to the Environment storage definition. The key itself is never committed and never exposed beyond the Bicep deployment call. A future improvement is to move to AAD over SMB once the Container Apps platform supports it generally.
- Bootstrap image → real image swap is two phases. The first provision creates the Container App with a hello-world image; the postprovision hook redeploys with the real layout. `azd up` runs both steps automatically; if the postprovision hook is interrupted, the next `azd provision` re-runs it idempotently.