Skip to content

2026-05-22 — Split oversized __init__.py files in api/tasks/

Motivation

Three api/tasks/<package>/__init__.py files had grown well past a single responsibility — they each defined the package's helpers and every Celery task body in one file:

File Lines (before) Tasks defined inside
api/tasks/storage/__init__.py 647 warmup_database, check_database_updates, reconcile_auto_warmup
api/tasks/openapi/__init__.py 637 deploy_openapi_service (plus 5 large private helpers)
api/tasks/azure/__init__.py 597 provision_aks, start_aks, stop_aks, delete_aks, assign_aks_roles, diag_noop

The charter (.github/copilot-instructions.md) treats the per-file context header as an SRP gate: if the Responsibility needs "and" chains or multiple architectural concerns, split before adding more code. These packages already failed that gate.

api/tasks/blast/ was already organised this way (thin facade + sibling submodules) — this change brings the other task packages in line.

User-facing change

None. All public Celery task names, route imports, and test monkeypatch targets are unchanged.

API / IaC diff summary

For each of the three packages, the monolithic __init__.py was replaced with a thin facade that re-exports the same symbols, and the task / helper code was extracted into dedicated sibling modules:

api/tasks/storage/ (50-line facade + four siblings) * helpers.pyBLAST_DATABASES, now_iso, publish_db_metadata_invalidate, update_state, record_task_progress, wait_for_warmup_jobs * warmup.pywarmup_database Celery task * update_check.pycheck_database_updates Celery task * reconcile.pyreconcile_auto_warmup Celery task

api/tasks/openapi/ (26-line facade + six siblings) * constants.pyMI_NAME, K8S_SA_NAME, K8S_NAMESPACE, FED_CRED_NAME, role definition IDs * helpers.pyblast_node_count, now_iso, record_progress * rbac.pyassign_role_idempotent, setup_workload_identity * manifests.pybuild_manifests * kubectl.pykubectl_apply * deploy.pydeploy_openapi_service Celery task

api/tasks/azure/ (55-line facade + six siblings) * helpers.pynow_iso, update_state * cluster_params.pybuild_cluster_params * rbac.pyattach_acr, grant_storage_blob_contributor_to_aks, ensure_aks_runtime_rbac, assign_aks_roles Celery task * provision.pyprovision_aks Celery task * lifecycle.pystart_aks, stop_aks, delete_aks Celery tasks * diagnostics.pydiag_noop Celery task

Compatibility surface preserved

  • All Celery task names (api.tasks.storage.warmup_database, api.tasks.azure.start_aks, api.tasks.openapi.deploy_openapi_service, …) are unchanged — Celery routes / beat schedules / route callers continue to work without edits.
  • All private helper names that tests or other tasks import directly (_attach_acr, _grant_storage_blob_contributor_to_aks, _ensure_aks_runtime_rbac, _build_cluster_params, _build_manifests, _kubectl_apply, _setup_workload_identity, _update_state, _record_task_progress, _now_iso, _select_warmup_shard_count, _program_to_mol_type, _build_elb_image, _publish_db_metadata_invalidate, _wait_for_warmup_jobs, _autowarmup_inflight_acquire) are re-exported by the facades and listed in __all__ to silence the RUF/F401 warnings flagged by ~/.config/agent/work-discipline.md (Facade re-export note, 2026-05-19).
  • Module-level Azure SDK clients (aks_client, acr_client, storage_client, get_credential) are also re-exported so existing monkeypatch.setattr(azure, "aks_client", ...) patterns in api/tests/test_azure_tasks.py continue to work.

Monkeypatch-friendly facade indirection

Tests use monkeypatch.setattr("api.tasks.storage.<name>", ...) and monkeypatch.setattr(azure, "<name>", ...) to override helpers/clients inside the running task body. Direct from … import X inside a task module binds the original symbol into that module's namespace, so a facade-level patch would no longer affect the call site.

To preserve the contract, the task modules now look those symbols up via the package at call time (import api.tasks.storage as _facade then _facade._update_state(...)). The pattern is used in api/tasks/storage/warmup.py, api/tasks/storage/reconcile.py, api/tasks/storage/update_check.py, api/tasks/azure/rbac.py, api/tasks/azure/provision.py, and api/tasks/azure/lifecycle.py.

Validation evidence

  • uv run pytest -q api/tests1021 passed in 32.97s (was 1017 + 9 failed pre-fix; the 4-test delta comes from previously-broken collection paths that now run).
  • uv run ruff check apiAll checks passed!

Largest remaining file in the touched packages:

475 api/tasks/storage/warmup.py      ← single Celery task body (warmup_database)
239 api/tasks/azure/rbac.py
205 api/tasks/openapi/deploy.py
198 api/tasks/openapi/manifests.py
 50 api/tasks/storage/__init__.py    (was 647)
 26 api/tasks/openapi/__init__.py    (was 637)
 55 api/tasks/azure/__init__.py      (was 597)

The largest sibling (storage/warmup.py, 475 lines) is the warmup_database Celery task body itself — a single coherent side-effecting operation, so it stays as one file per the implementation-discipline rule against unnecessary abstractions.

No infra / no UI / no deployment change

This is a pure refactor of Python module layout. No Bicep, no Container App template, no SPA bundle, no behaviour change — so no scripts/dev/quick-deploy.sh / postprovision.sh / azd provision required per charter §13 ("Do NOT redeploy for ordinary code changes").