Descriptions / Taxonomy fast-path: fix artifact cache misses
Motivation
Opening the Recent searches → Descriptions tab took 5–15 seconds on the
first hit per session even though the backend already pre-bakes a
"default" alignments artifact on job completion. The two-line gate
that decides whether to serve the artifact was too strict: it only
matched callers who omitted page_size, but the SPA's default query
always sends page_size: 100 (the same size the artifact is baked
at). Result: every Descriptions tab open went down the cold parse
path — list blobs → download → parse → annotate → sort → paginate —
even when a ready artifact existed.
The Taxonomy tab had the analogous problem after the
2026-05-22-taxonomy-ncbi-parity change made the SPA always send
include_lineage=true. The pre-baked taxonomy artifact contained no
lineage, so the gate refused to serve it.
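The gate behaviour described above can be sketched roughly as follows. This is an illustration only: `AlignmentsQuery`, its fields, and the `page == 1` / no-sort conditions are assumptions made for the example, and the real `default_alignments_request` may check more than this. The value 100 for `RESULTS_DEFAULT_PAGE_SIZE` is taken from the page size the SPA is described as sending.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value: the SPA sends page_size: 100, the size the artifact is baked at.
RESULTS_DEFAULT_PAGE_SIZE = 100


@dataclass(frozen=True)
class AlignmentsQuery:
    """Hypothetical, reduced request shape; the real query likely has more fields."""
    page: int = 1
    page_size: Optional[int] = None
    sort: Optional[str] = None
    filters: tuple = ()


def is_default_before(q: AlignmentsQuery) -> bool:
    # Too strict: only callers that omitted page_size matched the gate.
    return q.page == 1 and q.sort is None and not q.filters and q.page_size is None


def is_default_after(q: AlignmentsQuery) -> bool:
    # Relaxed: an explicit page_size equal to the baked size also matches.
    return (
        q.page == 1
        and q.sort is None
        and not q.filters
        and q.page_size in (None, RESULTS_DEFAULT_PAGE_SIZE)
    )
```

With this shape, `AlignmentsQuery(page_size=100)` fails the old predicate and passes the relaxed one, while any filtered or differently sized request still fails both and falls through to the cold parser.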
User-facing change
- Descriptions tab opens in ~50–200 ms instead of 5–15 s on every subsequent open of a job whose artifact has been built. The first open of a brand-new job still runs the cold path AND enqueues the backfill so the second open is fast.
- Taxonomy tab Organism view now serves from the artifact too, with lineage / blast_name already populated, so there is no per-open eutils round-trip on the request thread.
- Filter, sort, and page-size behaviour is unchanged for the user: the gate only matches the exact "default" request the SPA sends on first open. Any user-applied filter still routes to the cold parser so results stay correct.
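The serving behaviour in the list above can be pictured as a small decision function. This is a minimal sketch under stated assumptions: `serve_results` and its callable parameters are hypothetical stand-ins for the real route, artifact reader, cold parser, and task queue, and the `"cold"` source label is a placeholder (only `source: "artifact"` and `artifact_state: "ready"` appear in this document).

```python
from typing import Callable, Optional


def serve_results(
    is_default_request: bool,
    read_artifact: Callable[[], Optional[dict]],
    cold_parse: Callable[[], dict],
    enqueue_backfill: Callable[[], None],
) -> dict:
    """Serve the pre-baked artifact when possible, else fall back to the cold path."""
    if is_default_request:
        payload = read_artifact()
        if payload is not None:
            return {"source": "artifact", "artifact_state": "ready", **payload}
        # No usable artifact yet (e.g. a brand-new job): schedule a build
        # so the next open of this job is fast.
        enqueue_backfill()
    # Non-default requests, and default requests without an artifact,
    # take the cold parse path. "cold" is a placeholder label.
    return {"source": "cold", **cold_parse()}
```

The point of the shape is that a non-default request never touches the artifact, so a user-applied filter cannot be answered from a payload baked for the default query.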
API / IaC diff summary
- api/routes/blast/result_helpers.py
    - default_alignments_request: accept page_size in (None, RESULTS_DEFAULT_PAGE_SIZE).
    - default_taxonomy_request: drop the include_lineage=false guard now that the artifact carries lineage.
- api/services/blast_result_artifacts.py
    - build_default_taxonomy_payload: enrich the rolled-up organisms with lineage / blast_name (top-20 organisms, eutils cached) so the artifact matches what the SPA's default Taxonomy query asks for. Lineage enrichment is wrapped in a best-effort try/except, so a transient eutils failure does not block the artifact bake.
- No infra / Bicep changes.
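The best-effort lineage enrichment could look roughly like the sketch below. The function name, the `taxid` key, and the `fetch_lineage` callable are assumptions made for illustration; `fetch_lineage` stands in for whatever cached eutils lookup the service uses. The organisms list is assumed to be already sorted so that the first 20 entries are the "top" ones.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)

TOP_N_ORGANISMS = 20  # the summary above mentions the top-20 organisms


def enrich_with_lineage(
    organisms: List[dict],
    fetch_lineage: Callable[[List[str]], Dict[str, dict]],
) -> List[dict]:
    """Attach lineage / blast_name to the top organisms in place; never raise."""
    top = organisms[:TOP_N_ORGANISMS]
    try:
        # Hypothetical lookup: maps taxid -> {"lineage": [...], "blast_name": ...}.
        found = fetch_lineage([org["taxid"] for org in top])
    except Exception:
        # Best effort: a transient lookup failure must not block the artifact bake.
        logger.warning("lineage enrichment failed; baking without lineage", exc_info=True)
        return organisms
    for org in top:
        info = found.get(org["taxid"], {})
        org["lineage"] = info.get("lineage", [])
        org["blast_name"] = info.get("blast_name")
    return organisms
```

Catching a broad `Exception` is deliberate in this sketch: the artifact without lineage is still more useful than no artifact, and the failure is logged rather than swallowed silently.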
Validation
- uv run pytest -q api/tests → 977 passed.
- uv run ruff check api/routes/blast/result_helpers.py api/services/blast_result_artifacts.py → clean.
- Manual: a job whose result_alignments artifact exists now returns source: "artifact" / artifact_state: "ready" for the SPA's default query (verified by sending the exact query the SPA sends and inspecting the response).
Follow-up: artifact schema versioning (same day)
The first pass shipped a hidden trap: once an analytics artifact was
written as status: ready, artifact_build_should_enqueue would
never re-trigger a rebuild, so any job whose artifact had been baked
by an older code version stayed stale forever. After Phase 1+2 went
out, the Taxonomy tab kept showing "Unclassified" for jobs whose
result_taxonomy artifact had been built before the stitle fallback
landed.
Fix:
- api/services/job_artifacts.py
    - New _ANALYTICS_ARTIFACT_MIN_SCHEMA_VERSION table: bump the entry for an artifact type whenever its builder's payload semantics change, and stamp the matching version in the builder.
    - read_result_analytics_artifact now reads the payload, checks artifact_schema_version, and on miss flips the state row to status: failed with error_code: schema_stale so the next request triggers a rebuild via artifact_build_should_enqueue.
- api/services/blast_result_artifacts.py
    - build_default_alignments_payload and build_default_taxonomy_payload now emit "artifact_schema_version": 2. The minimum table for both is set to 2, matching the Phase 2 rollup changes.
- api/tests/test_job_artifacts.py
    - Two new tests lock in the stale-detection contract (test_read_result_analytics_artifact_treats_missing_schema_as_stale, test_read_result_analytics_artifact_returns_fresh_payload).
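The stale-detection contract can be sketched as below. This is not the actual `read_result_analytics_artifact`: the function name, its signature, and the in-memory `state_row` dict are assumptions for the example, and the real code presumably reads the payload and state from storage. Treating a payload with no version stamp as version 0 is also an assumption, chosen to match the "missing schema is stale" test name.

```python
from typing import Optional

# Per the text above, both artifact types currently require version 2.
_ANALYTICS_ARTIFACT_MIN_SCHEMA_VERSION = {
    "result_alignments": 2,
    "result_taxonomy": 2,
}


def read_artifact_if_fresh(
    artifact_type: str, payload: dict, state_row: dict
) -> Optional[dict]:
    """Return the payload if its schema version is current, else mark the row stale."""
    minimum = _ANALYTICS_ARTIFACT_MIN_SCHEMA_VERSION.get(artifact_type, 0)
    # Payloads baked before versioning carry no stamp; treat them as version 0.
    version = payload.get("artifact_schema_version", 0)
    if version < minimum:
        # Flip the state so the enqueue check sees a non-ready artifact
        # and the next request triggers a rebuild.
        state_row["status"] = "failed"
        state_row["error_code"] = "schema_stale"
        return None
    return payload
```

A payload without a stamp returns `None` and leaves the row at `status: failed` / `error_code: schema_stale`, whereas a payload stamped with version 2 is returned unchanged and the row stays `ready`.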
Operational note: once deployed, this change takes effect automatically on the next request per job: the SPA reads the artifact, the route sees a stale state, the worker rebuilds, and the second request is fast. Local dev environments must restart the Celery worker after pulling so the new builder code is loaded; otherwise the stale flag would trigger a rebuild with the old code and the artifact would stay stuck.