Local BLAST 16S Baseline Evidence
Motivation
The AKS one-shot pod path was too slow for the small EQ-07 Web-vs-local 16S
baseline check because pod scheduling and image readiness dominated the work.
Installing BLAST+ locally makes the small database comparison fast enough to
iterate.
User-Facing Change
No application behavior changed. The discovery evidence now records the local BLAST+ installation path and the first local 16S baseline comparison against the saved NCBI Web BLAST XML.
API/IaC Diff Summary
- No API changes.
- No IaC changes.
- Installed BLAST+ 2.17.0 under `~/.local/elb-tools/ncbi-blast-2.17.0+`.
- Downloaded the local `16S_ribosomal_RNA` BLAST database under `~/.cache/elb-dashboard/blastdb/16S_ribosomal_RNA/`.
- Added evidence files under `docs/temp/web-blast-equivalence/2026-05-17-16s-carnobacterium/`.
Validation Evidence
- `blastn -version` reports `BLASTN 2.17.0+`.
- `blastdbcmd -db 16S_ribosomal_RNA -info` reports `27,648 sequences` and `40,051,470 total bases`, matching the saved Web XML statistics.
- Local full baseline command completed in `2.54s` and produced 500 hits / 500 HSPs with the same top hit as Web RID `0JXX09HH016`.
- `scripts/dev/compare-blast-xml.py` generated `web-vs-local-16s-canonical-compare.json`; strict comparison does not pass yet (`difference_count=5991`). The first hit-order mismatch is rank 111, Web/local hit ID overlap is 438/500, and Web XML reports `Statistics_eff-space=0` while local BLAST+ reports `57425628120`.
- A synthetic local 4-shard 16S probe completed without AKS: FASTA extraction was `1.81s`, shard BLAST runs were `0.93s` to `1.06s` each, and merged XML matched the full local baseline statistics exactly (`db_len=40051470`, `db_num=27648`, `eff_space=57425628120`, `hsp_len=26`). Strict hit ordering still does not pass: contiguous shards preserve the same top accession and top-10 order, but first accession mismatch is rank 15 and top-500 accession overlap is 442/500.
- `db-snapshot-and-value-equivalence.json` confirms the important same-database condition for the 16S probe: Web XML and the local May 14 2026 database both report `db_len=40051470` and `db_num=27648`. Web-vs-local full has 438 shared accessions and all 438 have identical primary HSP/value fields. Local full-vs-sharded has 442 shared accessions; 437 have identical primary HSP/value fields, and the five mismatches are `Hit_def` GI-formatting differences caused by FASTA-based shard DB regeneration.
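
The ranking figures above (first mismatch rank, accession overlap) and the reported statistics can be cross-checked with a few lines of Python. This is a minimal sketch, not the logic of `scripts/dev/compare-blast-xml.py`: the function names are hypothetical, the accession lists are toy data, and the last function assumes the standard BLAST relation in which effective search space is the effective query length times `db_len - db_num * hsp_len`, with `hsp_len` read as the length adjustment.

```python
from typing import Optional, Sequence


def first_mismatch_rank(a: Sequence[str], b: Sequence[str]) -> Optional[int]:
    """Return the 1-based rank where two ordered accession lists first differ.

    Returns None when both lists are identical.
    """
    for rank, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return rank
    if len(a) != len(b):
        # One list is a strict prefix of the other.
        return min(len(a), len(b)) + 1
    return None


def accession_overlap(a: Sequence[str], b: Sequence[str]) -> int:
    """Count accessions present in both lists, ignoring order."""
    return len(set(a) & set(b))


def implied_effective_query_length(
    eff_space: int, db_len: int, db_num: int, hsp_len: int
) -> Optional[int]:
    """Divide eff_space by the effective database length.

    Assumes eff_space = effective_query_len * (db_len - db_num * hsp_len).
    Returns None if the division is not exact, which would suggest the
    statistics are not mutually consistent under that assumption.
    """
    effective_db_len = db_len - db_num * hsp_len
    quotient, remainder = divmod(eff_space, effective_db_len)
    return quotient if remainder == 0 else None


if __name__ == "__main__":
    # Toy accession lists (hypothetical, not taken from the evidence files).
    web = ["A", "B", "C", "D"]
    local = ["A", "B", "D", "E"]
    print(first_mismatch_rank(web, local))  # 3
    print(accession_overlap(web, local))    # 3

    # Statistics reported for the local full baseline and the merged shards.
    print(implied_effective_query_length(57425628120, 40051470, 27648, 26))  # 1460
```

Under that assumption the local statistics are internally consistent: `57425628120` divides exactly by `40051470 - 27648 * 26 = 39332622`, giving an effective query length of 1460. The Web XML value `Statistics_eff-space=0` cannot be checked this way, so it is better treated as a missing field than as a numeric difference.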