2026-05-17 - End-to-end Get Started runbook
Motivation
The previous get-started guide covered local setup and first Azure deployment, but stopped before a researcher could prove that a fresh clone actually reaches a complete ElasticBLAST result. The onboarding path needed one concrete, smallest-cost smoke test from prerequisites through AKS, database preparation, BLAST submit, and result download.
User-facing change
- Rewrote `docs/get-started.md` as a phased runbook: tool install, clone, local checks, App Registration, `azd up`, redirect URI, deployed app sign-in, smallest BLAST smoke test, network lockdown, cleanup, and troubleshooting.
- Added a smallest end-to-end BLAST scenario using `16S_ribosomal_RNA`, `Standard_D2s_v3` system pool, `Standard_D8s_v3` workload pool, one workload node, `blastn`, XML output format 5, warmup off, and sharding off.
- Documented the required AKS Blob CSI / `azureblob-nfs-premium` check, the six-sidecar Container App check, terminal CLI verification, and the job-submit image rebuild caveat for Azure Blob NFS.
- Added an optional clean Azure VM validation appendix that creates an Ubuntu 24.04 `Standard_D4s_v5` VM with SSH restricted to the caller IP, then runs the Linux prerequisite, clone, backend test, and web build steps from the same document.
- Preserved the local API port correction to `127.0.0.1:8085`.
API / IaC diff summary
- `api/tasks/azure.py`: AKS provisioning now builds a cluster model with the Blob CSI driver enabled so `azureblob-nfs-premium` is available for ElasticBLAST PV mode.
- `api/services/image_tags.py` and `api/tasks/acr.py`: the ACR build task now shell-quotes pre-build commands safely and patches the `ncbi/elasticblast-job-submit:4.1.0` build context to copy all templates and skip the GCP-style VolumeSnapshot step unless `ELB_CLOUD_PROVIDER=gcp`.
- `terminal/Dockerfile` and `terminal/profile.sh`: terminal image PATH/dependency setup now exposes the vendored `elastic-blast` CLI and installs the sibling runtime requirements.
- `terminal/patch_elastic_blast.py`: AKS workload templates are patched with `workload=blast` tolerations and node selectors for init, submit, batch, and vmtouch workloads.
- `docs/get-started.md`: updated with live validation evidence and the operational checks discovered during the run.
- `.gitignore`: keeps the Python packaging `lib/` ignore while explicitly allowing `web/src/lib/**`, which contains TypeScript source modules required by the production build.
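The toleration and node-selector patching described for `terminal/patch_elastic_blast.py` can be illustrated with a minimal sketch. This is not the patcher's actual code: it works on an already-parsed pod spec dictionary rather than on ElasticBLAST template files, the function name `patch_pod_spec` is hypothetical, and the `NoSchedule` taint effect is an assumption (the source only states `workload=blast`). The point it shows is that the patch can be written so that applying it twice gives the same result as applying it once.

```python
from copy import deepcopy

# Assumed values: only "workload=blast" comes from the changelog;
# the operator and the NoSchedule effect are illustrative guesses.
TOLERATION = {
    "key": "workload",
    "operator": "Equal",
    "value": "blast",
    "effect": "NoSchedule",
}
NODE_SELECTOR = {"workload": "blast"}


def patch_pod_spec(pod_spec: dict) -> dict:
    """Return a copy of pod_spec with the workload toleration and node selector set."""
    patched = deepcopy(pod_spec)  # leave the caller's spec untouched

    # Append the toleration only if an identical one is not already present,
    # so repeated runs do not accumulate duplicates.
    tolerations = patched.setdefault("tolerations", [])
    if TOLERATION not in tolerations:
        tolerations.append(dict(TOLERATION))

    # Merging into nodeSelector is naturally idempotent.
    patched.setdefault("nodeSelector", {}).update(NODE_SELECTOR)
    return patched


spec = {"containers": [{"name": "blast", "image": "ncbi/elb:1.4.0"}]}
once = patch_pod_spec(spec)
twice = patch_pod_spec(once)
print(once == twice)
```

Checking that a second pass changes nothing is a cheap way to confirm a patcher is safe to re-run during image rebuilds.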
Validation evidence
- Clean Ubuntu 24.04 VM prerequisite replay succeeded: Azure CLI 2.86.0, azd 1.25.1, uv 0.11.14, Node v20.20.2, npm 10.8.2, jq 1.7, git 2.43.0, Python 3.12.13, `uv sync --all-groups`, and `npm ci`. A `Standard_B2s` validation VM was too small for the full backend test suite and exited with code 137, so the runbook now recommends `Standard_D4s_v5` for full clean-VM validation.
- Deployed health check against the active Container App succeeded after restoring the six-sidecar revision: `GET /api/health` returned `status=ok` on revision `ca-elb-control--0000040`.
- Runtime images built in ACR `acrelbnm5virmqrdi5c.azurecr.io`: `ncbi/elb:1.4.0`, `ncbi/elasticblast-job-submit:4.1.0`, `ncbi/elasticblast-query-split:0.1.4`, and `elb-openapi:4.9`.
- Storage private endpoints / DNS were repaired; `16S_ribosomal_RNA` preparation completed with 12 blobs and 18,433,197 bytes copied.
- AKS `elb-smoke-aks` provisioned in `rg-elb-ca` / `koreacentral`, Kubernetes 1.34.7, with `systempool=Standard_D2s_v3` x1 and `blastpool=Standard_D8s_v3` x1. Blob CSI was enabled and `azureblob-nfs-premium` existed.
- Kubernetes validation: `init-pv`, `submit-jobs`, `elb-finalizer`, and `blastn-batch-16s-ribosomal-rna-job-000` completed; workload pods ran on `blastpool` with `workload=blast` toleration and node selector.
- Result validation: downloaded `results/elb-smoke-16s-r3/job-6445053ac15a400d9e653b167013d929/batch_000-blastn-16S_ribosomal_RNA.out.gz` from the terminal sidecar; gzip size 1,971 bytes, decompressed XML size 17,918 bytes, and `<BlastOutput>` / `</BlastOutput>` were present.
- Targeted tests: `uv run pytest -q api/tests/test_azure_provision_aks.py api/tests/test_acr_build_task.py` -> 4 passed.
- Clean-clone reproducibility tests fixed during validation: the backend no longer depends on untracked `docs/temp` calibration output, and the frontend MSAL config no longer reads `window` in Node/Vitest imports.
- Targeted lint: `uv run ruff check api/tasks/azure.py api/tasks/acr.py api/services/image_tags.py api/tests/test_azure_provision_aks.py api/tests/test_acr_build_task.py terminal/patch_elastic_blast.py` -> all checks passed.
- Terminal patcher validation: `python3 -m py_compile terminal/patch_elastic_blast.py`; applied `terminal/patch_elastic_blast.py` twice to a temporary ElasticBLAST tree and confirmed workload tolerations/node selectors.
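The result validation step above (compressed size, decompressed size, and presence of the `<BlastOutput>` open and close tags) could be scripted roughly as follows. This is a sketch using only the Python standard library, not a tool shipped in the repository; the function name `validate_blast_xml_gz` and the report keys are hypothetical, and the sample payload is synthetic rather than real BLAST output.

```python
import gzip


def validate_blast_xml_gz(data: bytes) -> dict:
    """Summarize a gzip-compressed BLAST XML (output format 5) result."""
    xml = gzip.decompress(data)  # raises on a corrupt or truncated archive
    return {
        "gzip_bytes": len(data),
        "xml_bytes": len(xml),
        "has_open_tag": b"<BlastOutput>" in xml,
        "has_close_tag": b"</BlastOutput>" in xml,
    }


# Synthetic stand-in for a downloaded batch_000-*.out.gz file.
sample_xml = b'<?xml version="1.0"?>\n<BlastOutput>\n</BlastOutput>\n'
report = validate_blast_xml_gz(gzip.compress(sample_xml))
print(report)
```

For a real download, the bytes would come from reading the `.out.gz` file in binary mode. A missing closing tag is a useful signal that the job wrote partial output even though the archive itself decompressed cleanly.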