Warmup Search Space Submit Guard¶
Motivation¶
Precise sharded BLAST should not start until node-local warmup is actually ready. Sharded result equivalence also depends on sending the calibrated full-run effective search space to every shard with -searchsp; otherwise BLAST computes shard-local effective search spaces and E-values drift from the full DB run.
Warmup also took longer than expected on the 10 x E16 workload pool. The previous default launched 64 AzCopy workers per warmup pod, which can create about 640 concurrent transfers across the pool and increase Storage throttling risk.
User-facing Change¶
- Sharded BLAST submissions now wait in Celery for node-local warmup readiness instead of proceeding against a cold or partially warmed DB.
- If warmup is still loading, the job remains retryable in the
waiting_for_warmupphase and survives browser refresh. - Calibrated
core_ntsearch-space propagation is now covered by a regression test using the documented calibration artifact. - New warmup Jobs use lower AzCopy defaults: 16 concurrent transfers and 2 GiB buffer per pod.
API / Task Diff Summary¶
api.tasks.blast.submitchecks Kubernetes warmup status before sharded submit paths.api.tasks.blastnow treatssharding_mode != offordb_auto_partition=trueas requiring Ready node-local warmup unlessenable_warmup=falseis explicit.api.services.warmup_jobsreduces default AzCopy concurrency and buffer values for new warmup Jobs.- Tests lock the
core_ntcalibratedStatistics_eff-spacevalue32156241807668into the generated BLAST config as-searchsp 32156241807668.
Validation Evidence¶
uv run pytest -q api/tests/test_blast_tasks.py api/tests/test_warmup_jobs.py— 79 passed.uv run ruff check api/tasks/blast.py api/services/warmup_jobs.py api/tests/test_blast_tasks.py api/tests/test_warmup_jobs.py— passed.
Notes¶
The core_nt value is valid for the documented calibration conditions in docs/blast-searchsp-discovery.md. This change does not claim every future NCBI Web BLAST configuration has the same hidden effective search space; it guarantees that when calibrated metadata supplies a full-run effective search space, the dashboard-generated sharded config applies it uniformly to every shard.