BLAST XML Results Preview
Motivation
Completed BLAST jobs that use -outfmt 5 produce XML result files, but the
dashboard analytics/export path only parsed tabular -outfmt 6 / -outfmt 7
rows. Researchers could download the XML, but could not quickly inspect hits in
the browser or export one consolidated CSV in the same way they would review an
NCBI Web BLAST results table.
User-facing change
- The results analytics page now opens on a Hits table that shows query, accession, organism/taxid, description, HSP query coverage, identity, length, e-value, bit score, source file, and a conservative review badge.
- The Hits / Alignments tabs now read all parseable result blobs by default instead of treating the first result file as the whole job. The UI reports returned, filtered, total-hit, and file-coverage counts and supports paging, sorting, accession text filtering, organism/taxid filtering, identity, HSP query-coverage, and e-value thresholds.
- BLAST XML results (-outfmt 5) are parsed into the same canonical hit model as tabular output, so overview stats, alignment cards, and CSV, TSV, and JSON export work for XML-backed jobs.
- Result exports include computed query/subject coverage, source blob, and diagnostic review fields when those values are available.
- HSP coverage is computed from query/subject coordinate spans when coordinates are available, preventing gapped alignments from inflating review badges via raw alignment length.
- Alignment preview parsing stops at a server-side hit safety cap and surfaces the response as partial instead of letting very large result sets exhaust the API sidecar.
- Numeric hit values are rendered defensively when a malformed tabular result field has to be preserved as text, preventing one bad row from breaking the preview.
- Partially readable result sets are surfaced as degraded in the UI instead of being mistaken for clean no-hit jobs.
- Gzipped result blobs such as merged_results.out.gz are read through a bounded decompression path before parsing.
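
The span-based coverage described in the list above can be illustrated with a minimal sketch. The function name `span_coverage` is hypothetical and is not taken from the codebase; it assumes 1-based inclusive coordinates expressed in the same units as the sequence length, which may not hold for translated searches.

```python
from typing import Optional


def span_coverage(start: int, end: int, total_length: int) -> Optional[float]:
    """Percent of a sequence covered by one HSP, from 1-based inclusive coordinates.

    Hypothetical helper for illustration only; the real parser may differ.
    """
    if total_length <= 0:
        # Without a usable sequence length, coverage cannot be computed.
        return None
    # Minus-strand hits report start > end, so take the absolute distance.
    span = abs(end - start) + 1
    # Clamp in case coordinates and length disagree in a malformed row.
    return min(100.0, 100.0 * span / total_length)


# A 200-base query with an HSP spanning query positions 11..110.
query_coverage = span_coverage(11, 110, 200)
```

For this HSP the span is 100 positions, so `query_coverage` is 50.0. If the same HSP had an alignment length of 112 because of gap columns, dividing the raw alignment length by the query length would report 56%, which is the kind of inflation the span-based calculation avoids.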
API / IaC diff summary
- api/services/blast_results_parser.py adds parse_blast_xml and parse_blast_result_content for XML/tabular auto-detection, with namespace-tolerant XML element traversal.
- api/services/storage_data.py adds read_result_blob_text, which inflates .gz result blobs with a decompressed byte cap.
- api/routes/stubs.py expands the result parser target set from .out only to .out, .out.gz, .xml, and .xml.gz, and makes the alignment endpoint page/sort/filter across all parseable result blobs by default.
- web/src/pages/BlastAnalytics.tsx adds the Web BLAST-style Hits table and makes it the default analytics tab, with review badges and richer table controls for molecular diagnostic review.
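
To make the decompressed byte cap concrete, here is a minimal sketch of bounded gzip inflation using only the standard library. The names `read_gzip_text_bounded` and `DecompressedSizeExceeded`, the chunk size, and the UTF-8 decoding policy are assumptions for illustration; the actual signature and behavior of read_result_blob_text are not shown in this summary.

```python
import gzip
import io


class DecompressedSizeExceeded(ValueError):
    """Raised when inflated content grows past the configured cap (hypothetical)."""


def read_gzip_text_bounded(data: bytes, max_bytes: int, chunk_size: int = 64 * 1024) -> str:
    """Inflate gzip bytes in chunks and stop once max_bytes is exceeded."""
    out = bytearray()
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as gz:
        while True:
            chunk = gz.read(chunk_size)
            if not chunk:
                break
            out.extend(chunk)
            # Check after every chunk so a small compressed blob cannot
            # expand far beyond the cap before it is rejected.
            if len(out) > max_bytes:
                raise DecompressedSizeExceeded(
                    f"decompressed size exceeds {max_bytes} bytes"
                )
    # Assumption: result files are text; undecodable bytes are replaced.
    return out.decode("utf-8", errors="replace")


payload = gzip.compress(b"q1\tABC123\t99.1\n" * 100)
text = read_gzip_text_bounded(payload, max_bytes=10_000)
```

Reading in chunks keeps peak memory near the cap plus one chunk, rather than inflating the whole blob before the size is checked.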
No IaC changes. No new dependencies.
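
As an illustration of how namespace-tolerant traversal and format detection can work with the standard library alone, the sketch below uses xml.etree.ElementTree. The element names (Iteration, Hit, Hsp, and their children) follow the NCBI BLAST XML format. The function names, the returned dictionary keys, and the detection heuristic are hypothetical and are not the project's parse_blast_xml or parse_blast_result_content.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterator, List, Optional


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_local(root: ET.Element, name: str) -> Iterator[ET.Element]:
    """Yield descendants whose local tag name matches, ignoring namespaces."""
    for el in root.iter():
        if isinstance(el.tag, str) and local_name(el.tag) == name:
            yield el


def child_text(parent: ET.Element, name: str) -> Optional[str]:
    """Return the stripped text of the first direct child with this local name."""
    for child in parent:
        if isinstance(child.tag, str) and local_name(child.tag) == name:
            return (child.text or "").strip()
    return None


def to_float(value: Optional[str]) -> Optional[float]:
    """Convert defensively so one malformed field does not abort parsing."""
    try:
        return float(value) if value is not None else None
    except ValueError:
        return None


def looks_like_xml(text: str) -> bool:
    """Heuristic (assumption): XML starts with '<' after whitespace or a BOM."""
    return text.lstrip("\ufeff \t\r\n").startswith("<")


def sketch_parse_blast_xml(text: str) -> List[Dict[str, object]]:
    """Flatten Iteration > Hit > Hsp into one row per HSP."""
    rows: List[Dict[str, object]] = []
    root = ET.fromstring(text)
    for iteration in iter_local(root, "Iteration"):
        query = child_text(iteration, "Iteration_query-def")
        for hit in iter_local(iteration, "Hit"):
            accession = child_text(hit, "Hit_accession")
            for hsp in iter_local(hit, "Hsp"):
                rows.append({
                    "query": query,
                    "accession": accession,
                    "evalue": to_float(child_text(hsp, "Hsp_evalue")),
                    "bit_score": to_float(child_text(hsp, "Hsp_bit-score")),
                })
    return rows
```

Matching on local names means the same traversal handles files with or without a default namespace declaration on the root element.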
Validation evidence
uv run pytest -q api/tests/test_blast_results_parser.py api/tests/test_blast_results_routes.py api/tests/test_storage_data.py
52 passed in 1.36s
uv run pytest -q api/tests
622 passed in 31.10s
uv run ruff check api
All checks passed!
cd web && npx eslint src/pages/BlastAnalytics.tsx src/api/blast.ts --max-warnings 0
passed
cd web && npm run build
✓ built in 5.06s
The full frontend build emitted the existing Vite chunk-size warning for the main bundle; it did not fail the build.