BLAST Web Equivalence Hardening
Motivation
Actual MPXV F3L example runs exposed two user-visible gaps in the BLAST result flow:
- A submit task could briefly mark a dashboard job `completed` before parseable result files existed.
- The frontend defaulted to `-soft_masking true`, while the verified NCBI Web BLAST-compatible MPXV/core_nt configuration requires `-dust yes -soft_masking false` unless the user explicitly enables lookup-table-only masking.
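The two masking modes described above can be sketched as a small option builder. This is a minimal illustration, not the project's frontend code: the function name `blastn_masking_options` and its parameter are hypothetical, and only the `-dust` and `-soft_masking` flag values come from the text above.

```python
def blastn_masking_options(mask_lookup_table_only: bool = False) -> str:
    """Return BLASTN low-complexity masking flags as one option string.

    Hypothetical helper: the name and signature are assumptions made
    for illustration, not an API from the codebase.
    """
    # Web-compatible default: DUST filtering with soft masking disabled.
    # Soft masking is enabled only when the user explicitly asks for
    # lookup-table-only masking.
    soft_masking = "true" if mask_lookup_table_only else "false"
    return f"-dust yes -soft_masking {soft_masking}"


print(blastn_masking_options())
print(blastn_masking_options(mask_lookup_table_only=True))
```

Keeping the default on the `false` branch means a caller has to opt in to soft masking, which matches the intent that Web-compatible behavior is what an unmodified submission gets.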
User-facing Change
- Running jobs that have reached Kubernetes completion but have not produced parseable result files now remain `running/results_pending` instead of showing as complete.
- The result page explains that final BLAST result files are still being prepared instead of claiming the completed job has missing files.
- New BLAST searches default to Web-compatible hard masking for nucleotide low-complexity filtering. Users can still enable `Mask for lookup table only` to send `-soft_masking true` explicitly.
- Precise sharded submissions now opt into prepared DB-order oracle metadata when available, and strict tie-order oracle submissions widen the shard-local candidate pool so the finalizer can actually find the Web-selected accessions.
- Concurrent `elastic-blast submit` calls are serialized in the worker to avoid Kubernetes `init-ssd-*` immutable Job patch conflicts.
- The finalizer now looks for tie-order oracle metadata under both the ElasticBLAST internal result prefix and the parent dashboard job prefix, so oracle files uploaded before the internal `job-*` id is known are still applied during merge. Strict oracle merge reports also include missing oracle accession diagnostics for DB snapshot gaps.
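The completion gating in the first bullet can be sketched as a small decision function. This is a rough illustration under stated assumptions: the function name `gate_job_state`, its inputs, and the representation of state as a `(status, phase)` pair are hypothetical; only the `running`, `results_pending`, and `completed` labels come from the text.

```python
from typing import Optional, Tuple


def gate_job_state(
    k8s_complete: bool, parseable_result_files: int
) -> Tuple[str, Optional[str]]:
    """Decide the dashboard state for a BLAST job.

    Hypothetical helper: returns a (status, phase) pair. A job is only
    reported as completed once at least one parseable result file exists.
    """
    if not k8s_complete:
        # Kubernetes is still working on the job.
        return ("running", None)
    if parseable_result_files == 0:
        # Kubernetes finished, but results are not yet parseable.
        return ("running", "results_pending")
    return ("completed", None)


print(gate_job_state(k8s_complete=True, parseable_result_files=0))
```

The point of the sketch is the ordering of the checks: Kubernetes completion alone is never sufficient to report `completed`.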
API / UI Diff Summary
- `api.tasks.blast.submit` gates `completed` on parseable result artifacts.
- `api.services.blast_job_state._refresh_running_blast_state` also refuses to promote running jobs to completed until result artifacts exist.
- `api.tasks.blast.reconcile_stale_jobs` no longer treats Celery `SUCCESS` from a submit task as BLAST completion when the task result still says `running` or when completed artifacts are absent.
- `web/src/pages/blastSubmitModel.ts` and `web/src/pages/blastSubmit/useSubmitMutation.ts` default to `-soft_masking false` for BLASTN low-complexity filtering.
- `web/src/pages/blastSubmit/useSubmitMutation.ts` includes `use_db_order_oracle` for precise sharded submissions.
- `api.tasks.blast.submit` serializes terminal-side `elastic-blast submit` invocations with a Redis lease and expands strict tie-order oracle runs to `-max_target_seqs 5000` unless the user already requested a wider pool.
- `api.services.monitoring.k8s_check_blast_status` now filters Kubernetes Jobs by `elb-job-id` even before scoped pods exist, avoiding false completion from another active BLAST run.
- `terminal/patch_elastic_blast.py` patches the ElasticBLAST finalizer script to search both `${ELB_RESULTS}/metadata` and the parent dashboard-prefix `metadata` directory for `tie-order-oracle*.txt` files before invoking `merge-sharded-results.sh`.
- `terminal/merge-sharded-results.sh` records `tie_order_oracle_missing_count` and `tie_order_oracle_missing_queries` when strict oracle accessions are absent from the shard result pool.
- BLAST result tabs render pending-result copy while the job is still running.
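The candidate-pool widening for strict tie-order oracle runs amounts to taking a floor on `-max_target_seqs`. The following is a minimal sketch of that rule only; the constant and function names are hypothetical, and the value 5000 is the one stated above.

```python
# Floor applied to strict tie-order oracle runs (value taken from the text).
STRICT_ORACLE_MIN_TARGET_SEQS = 5000


def effective_max_target_seqs(requested: int, strict_tie_order_oracle: bool) -> int:
    """Return the -max_target_seqs value to submit.

    Hypothetical helper: strict oracle runs are widened to at least 5000,
    but a user who already requested a wider pool keeps their value.
    """
    if strict_tie_order_oracle:
        return max(requested, STRICT_ORACLE_MIN_TARGET_SEQS)
    return requested


print(effective_max_target_seqs(100, strict_tie_order_oracle=True))
```

Using `max` rather than an unconditional override is what preserves a user-requested pool larger than 5000.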
Validation Evidence
- Focused backend: `uv run ruff check api/tasks/blast/__init__.py api/services/blast_job_state.py api/tests/test_blast_tasks.py api/tests/test_local_to_blast_job.py`
- Focused backend: `PYTHONPATH=$PWD uv run pytest -q api/tests/test_blast_tasks.py::test_gate_completed_submit_waits_for_result_artifacts api/tests/test_blast_tasks.py::test_gate_completed_submit_allows_completed_with_result_artifacts api/tests/test_blast_tasks.py::test_reconcile_celery_success_marks_row_completed api/tests/test_blast_tasks.py::test_reconcile_submit_success_keeps_running_row_running api/tests/test_blast_tasks.py::test_reconcile_submit_completed_waits_for_result_artifacts api/tests/test_local_to_blast_job.py::test_refresh_running_blast_state_waits_for_result_artifacts`
- Focused frontend: `npm run test -- --run src/pages/blastResults/analytics/blastAnalyticsState.test.ts src/pages/blastSubmit/taxonomyFilter.test.ts`
- Browser: job `6db7b6f4-480a-40f6-9765-1201dac9e8ad` renders `RESULTS_PENDING` and explains that final BLAST result files are still being prepared.
- Real BLAST: job `9d633524-4633-42af-83c8-63ff789f7afc` produced `merged_results.out.gz`; aggregate parses `1 / 1` files with `100` hits.
- NCBI comparison evidence: Web RID `0TACF1Z1016` showed that `-soft_masking true` produces mismatched raw/bit scores (462/854.272) versus Web BLAST (448/828.419), motivating the default hard-masking fix.
- NCBI comparison evidence: corrected job `6db7b6f4-480a-40f6-9765-1201dac9e8ad` with `-dust yes -soft_masking false` matched Web BLAST primary HSP values (`score=448`, `bits=828.419`, `value_mismatch_count=0`).
- NCBI comparison evidence: DB-order oracle job `9400ebe6-4487-4457-a461-7077445a6f30` still had `top100_overlap=4` and `value_mismatch_count=0`, confirming exact Web row membership/order requires a Web accession oracle for this tied MPXV/core_nt window.
- Strict Web oracle run: job `eb5771a0-b20f-437b-a21f-ec62670c1bdf`, internal ElasticBLAST id `job-091df19b32144d09940cbb659b928ce9`, task `1cb94c03-dd3f-4ab8-ad91-3e3b956c0f86` completed and produced `merged_results.out.gz`.
- Strict Web oracle finalizer evidence: patched `elb-scripts` rerun produced `merge-report.json` with `ranking_basis=best_hsp_evalue_bitscore_oracle_ordinal`, `tie_order_oracle_accessions=100`, `tie_order_oracle_strict=true`, `tie_order_oracle_missing_count=1`, `first_missing_accessions=[OZ470124.1]`, and `total_output_hits=99`.
- NCBI comparison evidence: `docs/temp/blast-equivalence-20260520/compare-strict-oracle-v2-report.json` had `shared_accessions=99`, `top100_overlap=99`, and `value_mismatch_count=0`; the only Web-only accession was `OZ470124.1`.
- DB snapshot evidence: raw shard XML search found no `OZ470124.1`, and `blastdbcmd -db /blast/blastdb/core_nt_shard_00..09 -entry OZ470124.1` returned `Entry not found` on all 10 node-local shards. The remaining 1/100 Web difference is therefore a local `core_nt` snapshot gap, not a result parser, finalizer ordering, or UI issue.
- Browser evidence: `http://127.0.0.1:8090/blast/jobs/eb5771a0-b20f-437b-a21f-ec62670c1bdf` rendered `99 shown, 99 filtered of 99 hits` in Descriptions, and the Alignments tab rendered actual Query/Sbjct alignment blocks after data load.
- Focused backend after concurrency/status hardening: `PYTHONPATH=$PWD uv run pytest -q api/tests/test_k8s_blast_status.py api/tests/test_local_to_blast_job.py::test_refresh_running_blast_state_waits_for_result_artifacts api/tests/test_blast_submit_route_options.py api/tests/test_blast_oracles.py`
- Full backend: `PYTHONPATH=$PWD uv run pytest -q api/tests` (`722 passed`).
- Full frontend: `npm run test -- --run` (`193 passed`) and `npm run build` (passed with the existing chunk-size warning).
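The comparison fields cited above (`top100_overlap`, `value_mismatch_count`, and the Web-only accessions) can be read as simple set operations over the two hit lists. The sketch below shows one plausible reading, not the project's comparison script: the function `compare_hits`, the `Hit` tuple layout, and the placeholder accessions are all hypothetical, and the exact metric definitions are assumptions.

```python
from typing import Dict, List, Tuple

# Assumed layout: (accession, raw score, bit score) for the primary HSP.
Hit = Tuple[str, int, float]


def compare_hits(web: List[Hit], local: List[Hit], top_n: int = 100) -> Dict[str, object]:
    """Compare the first top_n Web BLAST hits with the first top_n local hits."""
    web_top = {acc: (score, bits) for acc, score, bits in web[:top_n]}
    local_top = {acc: (score, bits) for acc, score, bits in local[:top_n]}
    shared = web_top.keys() & local_top.keys()
    # A value mismatch is a shared accession whose scores differ.
    mismatches = sum(1 for acc in shared if web_top[acc] != local_top[acc])
    return {
        "top_overlap": len(shared),
        "value_mismatch_count": mismatches,
        "web_only": sorted(web_top.keys() - local_top.keys()),
    }


# Toy data with placeholder accessions (not real results).
web_hits = [("ACC1", 448, 828.419), ("ACC2", 448, 828.419), ("ACC3", 448, 828.419)]
local_hits = [("ACC1", 448, 828.419), ("ACC2", 462, 854.272)]
print(compare_hits(web_hits, local_hits))
```

Under this reading, a report with full value agreement but one Web-only accession points at database content rather than scoring, which is the conclusion drawn from the snapshot evidence.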