AI & ML layer

The AI/ML layer turns a member’s lab results into safe, narrated, member-facing insights. It was built by mapping the legacy loop-platform “intelligence layer” onto existing platform infrastructure rather than re-porting it — the platform already had the homes (durable workflows, the AI providers/prompts packages, the clinical health-engine, the tool catalog), so the genuinely-new work was narrow: a canonical biomarker registry, expanded recommendation coverage, trend math, and the insight generate→store→serve loop.

Epic: LOO-2164. Build brief: docs/agent-briefs/ai-ml-layer-buildout-2026-06-19.md.

The pipeline

Separation of concerns. clinical owns what is clinically true and safe (deterministic, reviewable rules over the member’s own biomarkers); ai owns how it’s said, when, where it’s stored, and how it’s served (LLM narration, scheduling, persistence). This mirrors the pre-existing ai /v1/recommend → clinical /v1/recommendations/generate seam.

Components

Biomarker reference registry (`clinical`)

services/clinical/src/lib/biomarker-reference.ts is a generated, canonical registry of 44 markers (code, unit, sex-specific reference and optimal bands, synonyms). It is produced from vendored source data by scripts/gen-biomarker-reference.py. Do not hand-edit the generated file: edit scripts/data/*.json and re-run the generator. No CI drift guard will catch a hand edit, and the thresholds are clinical-decision data that require clinician sign-off.

resolveBiomarkerCode(label) maps lab synonyms and alternate spellings ("HbA1c", "anti-TPO", "VitaminD25OH") onto canonical codes, so a result row from any vendor matches the right rule.
The parser fills a reference range from the registry when a lab omits one, so the value classifies (low/high/critical) instead of silently defaulting to normal. Vendor-supplied ranges always win.
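
The two behaviours above (synonym resolution, then registry fallback for a missing range) can be sketched as follows. This is a minimal illustration, not the contents of biomarker-reference.ts: the entry shape, the normalization rule, the helper names `resolveCode` and `effectiveRange`, and the numeric ranges are all assumptions made for the example.

```typescript
// Hypothetical shapes; the generated registry may be structured differently.
interface Range { low: number; high: number }
interface RegistryEntry { code: string; unit: string; reference: Range; synonyms: string[] }

// Illustrative entries only. The numbers are placeholders, not clinical thresholds.
const REGISTRY: RegistryEntry[] = [
  { code: "hba1c", unit: "%", reference: { low: 4.0, high: 5.6 }, synonyms: ["HbA1c", "Hemoglobin A1c", "A1C"] },
  { code: "tpo_ab", unit: "IU/mL", reference: { low: 0, high: 34 }, synonyms: ["anti-TPO", "TPO antibodies"] },
];

// Assumed normalization: case-insensitive, punctuation and whitespace ignored.
const normalize = (label: string): string => label.toLowerCase().replace(/[^a-z0-9]/g, "");

// Build the synonym index once so each lookup is a single map read.
const SYNONYM_INDEX = new Map<string, string>();
for (const entry of REGISTRY) {
  for (const name of [entry.code, ...entry.synonyms]) {
    SYNONYM_INDEX.set(normalize(name), entry.code);
  }
}

function resolveCode(label: string): string | undefined {
  return SYNONYM_INDEX.get(normalize(label));
}

// Vendor-supplied ranges always win; the registry only fills a gap.
function effectiveRange(label: string, vendorRange?: Range): Range | undefined {
  if (vendorRange) return vendorRange;
  const code = resolveCode(label);
  return REGISTRY.find((e) => e.code === code)?.reference;
}
```

Returning `undefined` for an unknown label keeps the caller honest: a marker that cannot be resolved has no range and therefore cannot be classified as normal by accident.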

Sex-aware classification (safety)

Many markers (testosterone, estradiol, free-T, AMH, …) have distinct male/female bands. The registry’s sex-union default band is dangerous for these: a woman’s testosterone of 250 ng/dL (androgen excess) reads “normal” against the 15–1000 union. Therefore:

isSexSpecific(code) flags these markers.
Sex is sourced end-to-end: collected at intake → stored as sex_at_birth on the patient record → fetched by clinical from patient-graph at parse time. When sex is known, sex-specific markers classify against the correct band; when unknown, they are left unclassified rather than falsely reassuring.

See Sex-aware classification (shipped) for the end-to-end wiring.
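
A minimal sketch of the rule described in this section, assuming a hypothetical `Marker` shape and placeholder bands chosen only so that their union matches the 15 to 1000 example above. The names and numbers are illustrative, not the registry's data.

```typescript
type Sex = "male" | "female";
type Status = "low" | "normal" | "high" | "unclassified";
interface Band { low: number; high: number }

// Hypothetical shape: a marker is sex-specific when it carries per-sex bands.
interface Marker { code: string; union: Band; bySex?: Record<Sex, Band> }

// Placeholder bands (ng/dL) for illustration only, not clinical data.
const TESTOSTERONE: Marker = {
  code: "testosterone_total",
  union: { low: 15, high: 1000 },
  bySex: { female: { low: 15, high: 70 }, male: { low: 300, high: 1000 } },
};

function classify(marker: Marker, value: number, sex?: Sex): Status {
  let band: Band;
  if (marker.bySex) {
    // Unknown sex on a sex-specific marker: refuse to classify rather than
    // fall back to the union band, which could be falsely reassuring.
    if (!sex) return "unclassified";
    band = marker.bySex[sex];
  } else {
    band = marker.union;
  }
  if (value < band.low) return "low";
  if (value > band.high) return "high";
  return "normal";
}
```

With these placeholder bands, the same 250 ng/dL value classifies as high for a female member, low for a male member, and unclassified when sex is unknown, which is the behaviour the union band would have hidden.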

Recommendation engine (`clinical`)

recommendation-engine.ts maps a biomarker’s status → a typed recommendation (category, priority, advice text), keyed on canonical codes. Coverage spans the metabolic, lipid, thyroid (incl. antibodies), hormone, fertility, and inflammatory panels. POST /v1/recommendations/generate combines biomarker-, protocol-, and contraindication-sourced recommendations.
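
The status-to-recommendation mapping can be pictured as a lookup table keyed on canonical code and status. The sketch below is an assumption about shape only: the type names, the `RULES` table, the codes, and the advice strings are hypothetical and are not taken from recommendation-engine.ts.

```typescript
type BiomarkerStatus = "low" | "normal" | "high" | "critical";
type Priority = "low" | "medium" | "high";

interface Recommendation { code: string; category: string; priority: Priority; advice: string }
type Template = Omit<Recommendation, "code">;

// One optional template per status; statuses without a template produce nothing.
type Rule = Partial<Record<BiomarkerStatus, Template>>;

// Hypothetical rules for illustration. Real advice text is clinically reviewed.
const RULES: Record<string, Rule> = {
  ldl_c: {
    high: { category: "lipid", priority: "medium", advice: "Discuss lipid management with your clinician." },
  },
  tsh: {
    low: { category: "thyroid", priority: "medium", advice: "A low TSH may warrant a thyroid follow-up." },
    high: { category: "thyroid", priority: "medium", advice: "A high TSH may warrant a thyroid follow-up." },
  },
};

function recommend(results: { code: string; status: BiomarkerStatus }[]): Recommendation[] {
  const out: Recommendation[] = [];
  for (const r of results) {
    // Unknown codes and statuses without a template are skipped, never guessed.
    const template = RULES[r.code]?.[r.status];
    if (template) out.push({ code: r.code, ...template });
  }
  return out;
}
```

Keeping the table declarative means each rule can be reviewed on its own, which fits the deterministic, reviewable role this document assigns to clinical.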

Trend math (`clinical`)

biomarker-stats.ts is a pure-TypeScript reimplementation of the legacy Python stats: linear-regression trend (slope, r², p-value, direction, CI) and Welch changepoint detection, with Student-t helpers (Lanczos lnΓ + the regularized incomplete beta). It is golden-parity tested against scipy to ~9 significant digits — so we get trend analysis in-process without operating a separate Python service. Served at GET /v1/patients/{patient_id}/biomarkers/{code}/trend.
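
To make the regression part concrete, here is a minimal ordinary-least-squares sketch covering only slope, intercept, and r². It is not the code in biomarker-stats.ts: the `Point` and `Trend` shapes are hypothetical, direction is derived from the sign of the slope alone (an assumption), and the p-value, confidence interval, and changepoint detection are left out because they depend on the Student-t helpers.

```typescript
interface Point { t: number; value: number } // t: e.g. days since the first result
interface Trend { slope: number; intercept: number; r2: number; direction: "rising" | "falling" | "flat" }

function linearTrend(points: Point[]): Trend | undefined {
  const n = points.length;
  if (n < 2) return undefined; // a trend needs at least two observations

  const meanT = points.reduce((s, p) => s + p.t, 0) / n;
  const meanV = points.reduce((s, p) => s + p.value, 0) / n;

  // Centered sums of squares and cross-products.
  let sxx = 0, sxy = 0, syy = 0;
  for (const p of points) {
    const dt = p.t - meanT;
    const dv = p.value - meanV;
    sxx += dt * dt;
    sxy += dt * dv;
    syy += dv * dv;
  }
  if (sxx === 0) return undefined; // all results share one timestamp

  const slope = sxy / sxx;
  const intercept = meanV - slope * meanT;
  // r² is undefined for a constant series; this sketch reports 0 in that case.
  const r2 = syy === 0 ? 0 : (sxy * sxy) / (sxx * syy);
  const direction = slope > 0 ? "rising" : slope < 0 ? "falling" : "flat";
  return { slope, intercept, r2, direction };
}
```

A production version would likely gate the direction on statistical significance rather than on the raw sign, since a noisy flat series almost never has a slope of exactly zero.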

Insight service (`ai`)

InsightService calls clinical for structured recommendations, narrates each via the recommendation-narration prompt (@platform/ai-prompts), and stores the result in ai.insights. Writes are idempotent per (patient_id, kind, source_hash, generated_date), so a re-run is a no-op. It then emits ai.insight.generated.v1, which is PHI-safe and best-effort: a publish failure never fails an already-stored insight.

POST /v1/insights/generate — internal (admin:ai); called by the scheduler.
GET /v1/insights / GET /v1/insights/{id} — member reads, ownership-checked.

ai.insights stores narrated, patient-linked prose, so ai now holds PHI at rest (service.yaml compliance.phi: true). The ai erasure path deletes a member's rows from this table; audit details stay PHI-safe (counts/categories only).
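
The idempotency rule for ai.insights can be sketched with an in-memory store. Everything here is an assumption for illustration: how source_hash is actually derived is not specified in this document, so the sketch hashes a sorted, canonical rendering of the structured recommendations; in a database the same effect would typically come from a unique constraint on the four key columns.

```typescript
import { createHash } from "node:crypto";

interface InsightKey { patientId: string; kind: string; sourceHash: string; generatedDate: string }
interface RecSummary { code: string; category: string; priority: string }

// Assumed derivation: hash a canonical, order-independent view of the inputs
// so the same recommendations always produce the same hash.
function sourceHash(recs: RecSummary[]): string {
  const canonical = recs.map((r) => `${r.code}|${r.category}|${r.priority}`).sort().join("\n");
  return createHash("sha256").update(canonical).digest("hex");
}

// Hypothetical in-memory stand-in for the insights table.
class InMemoryInsightStore {
  private rows = new Map<string, string>();

  // Returns true when a row was written, false when the key already existed.
  insertIfAbsent(key: InsightKey, body: string): boolean {
    const k = [key.patientId, key.kind, key.sourceHash, key.generatedDate].join("::");
    if (this.rows.has(k)) return false; // re-run with identical inputs: no-op
    this.rows.set(k, body);
    return true;
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Because the hash is part of the key, a same-day re-run with unchanged recommendations writes nothing, while a change in the underlying recommendations yields a new hash and therefore a new row.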

Feature flags

Everything ships dark. Two flags split the surface so generation can run “dark” to validate output before any UI exposure:

| Flag | Gates |
| --- | --- |
| `feature.ai.insights` | the member-facing surface (UI) |
| `feature.ai.insights-generation` | the scheduled generator |

Both are beta-gated OFF: turn on feature.ai.insights-generation first to validate generated output dark, then flip feature.ai.insights to expose the member surface.
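
A small sketch of how the two flags separate generation from exposure. The flag names come from the table above; the `isEnabled` helper, the `Partial` flag map, and the function names are hypothetical and do not describe the actual flags package.

```typescript
type FlagName = "feature.ai.insights" | "feature.ai.insights-generation";

// A missing flag is treated as off, so a lookup gap never exposes anything.
function isEnabled(flags: Partial<Record<FlagName, boolean>>, name: FlagName): boolean {
  return flags[name] === true;
}

// Generation gate: when the flag is off the call is a no-op, not an error.
function generateIfEnabled(
  flags: Partial<Record<FlagName, boolean>>,
  run: () => number,
): { skipped: boolean; generated: number } {
  if (!isEnabled(flags, "feature.ai.insights-generation")) return { skipped: true, generated: 0 };
  return { skipped: false, generated: run() };
}

// Member surface gate: independent of the generation flag.
function canReadInsights(flags: Partial<Record<FlagName, boolean>>): boolean {
  return isEnabled(flags, "feature.ai.insights");
}
```

The dark-validation stage corresponds to `{ "feature.ai.insights-generation": true }` with the member flag still absent or false: insights are generated and stored, but no member can read them yet.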

Events

ai.insight.generated.v1 (consumer-optional) — identifiers + category/priority only, never the narrated body or lab values.
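
One way to keep the event PHI-safe is to build it by explicit allow-list rather than by spreading the stored row. The field names below are hypothetical (the real schema lives in packages/contracts/src/events/ai.ts), and `publishBestEffort` is an illustrative helper, not an existing function.

```typescript
// Hypothetical stored row: includes the narrated body, which is PHI.
interface StoredInsight {
  id: string;
  patientId: string;
  category: string;
  priority: string;
  body: string;
}

// Hypothetical event payload: identifiers plus category/priority only.
interface InsightGeneratedEvent {
  type: "ai.insight.generated.v1";
  insightId: string;
  patientId: string;
  category: string;
  priority: string;
}

// Copy fields one by one so a new column on the row can never leak by default.
function toEvent(insight: StoredInsight): InsightGeneratedEvent {
  return {
    type: "ai.insight.generated.v1",
    insightId: insight.id,
    patientId: insight.patientId,
    category: insight.category,
    priority: insight.priority,
  };
}

// Best-effort publish: a failure is reported to the caller but never thrown,
// so it cannot fail an insight that is already stored.
async function publishBestEffort(
  publish: (e: InsightGeneratedEvent) => Promise<void>,
  event: InsightGeneratedEvent,
): Promise<boolean> {
  try {
    await publish(event);
    return true;
  } catch {
    return false;
  }
}
```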

Sex-aware classification (shipped)

Biological sex at birth is now collected from every member at intake (promoted into a structured sex_at_birth on the patient record), and clinical fetches it from patient-graph at parse time to classify sex-specific markers against the correct band on every ingest path. When sex is unknown the fallback fails safe: sex-specific markers are left unclassified rather than misclassified.

Red-flag escalations

Critically abnormal values escalate to the care team via red-flag-rules.ts (the path to care-team alert / PagerDuty), not just a recommendation. Beyond the metabolic/lipid/thyroid baseline, the expanded panels escalate the true emergencies — CK ≥ 5000 (rhabdomyolysis), hemoglobin < 7 (severe anemia), ALT/AST > 1000 (acute liver injury), and a testosterone > 1500 warning (androgen-secreting source). Codes are synonym-resolved so vendor labels still match. Hormone “criticals” that are referrals (not emergencies) remain recommendation-only by design.
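
The expanded-panel escalations listed above can be written as a small rule table. The thresholds and comparison operators are the ones stated in this section; the rule shape, the canonical codes, and the `evaluate` helper are hypothetical, and values are assumed to arrive already converted to the unit each rule expects.

```typescript
type Severity = "emergency" | "warning";

interface RedFlagRule {
  code: string; // hypothetical canonical code, matched after synonym resolution
  severity: Severity;
  reason: string;
  trips: (value: number) => boolean;
}

const RED_FLAG_RULES: RedFlagRule[] = [
  { code: "ck", severity: "emergency", reason: "possible rhabdomyolysis", trips: (v) => v >= 5000 },
  { code: "hemoglobin", severity: "emergency", reason: "severe anemia", trips: (v) => v < 7 },
  { code: "alt", severity: "emergency", reason: "acute liver injury", trips: (v) => v > 1000 },
  { code: "ast", severity: "emergency", reason: "acute liver injury", trips: (v) => v > 1000 },
  { code: "testosterone_total", severity: "warning", reason: "possible androgen-secreting source", trips: (v) => v > 1500 },
];

// Returns every rule that fires for this result; an empty array means no escalation.
function evaluate(code: string, value: number): RedFlagRule[] {
  return RED_FLAG_RULES.filter((rule) => rule.code === code && rule.trips(value));
}
```

Note the boundary behaviour: CK is inclusive at 5000, while hemoglobin at exactly 7 and ALT or AST at exactly 1000 do not escalate, matching the operators given in the text.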

Scheduler + consumer (shipped)

Generation is driven two ways by the clinical-insights workflow (services/workflow): an event trigger on clinical.biomarker.parsed.v1 regenerates a patient’s insights the moment their labs are parsed, and a daily cron fans out over the active-member cohort (GET /v1/memberships/active). Both stay dark behind feature.ai.insights-generation (the ai generate endpoint no-ops when off). Members read their insights at loop-health /insights + /insights/[id], fail-closed behind feature.ai.insights.

Open items

Workflow orchestration. The expanded panel ports the legacy condition workflows’ advice text only — not their orchestration (contraindication gating, follow-up lab ordering, scheduled re-checks). That is a separate build on services/workflow + clinical.
Derived markers. LH:FSH ratio and HOMA-IR are not templated — no lab sends them; they must be computed from their components with unit-safe handling.
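
As a starting point for the derived-markers item, the sketch below computes HOMA-IR with the standard formula (fasting glucose in mmol/L times fasting insulin in µIU/mL, divided by 22.5; equivalently glucose in mg/dL times insulin divided by 405) and a unit-checked LH:FSH ratio. It is not an existing implementation: the `Measured` shape, the unit strings, and the decision to reject rather than convert unrecognized units are assumptions.

```typescript
interface Measured { value: number; unit: string }

// Conventional glucose conversion used by the 405 form of the formula (22.5 * 18 = 405).
const MG_DL_PER_MMOL_L_GLUCOSE = 18;

// Returns undefined instead of guessing when a unit is not explicitly handled.
function homaIr(fastingGlucose: Measured, fastingInsulin: Measured): number | undefined {
  // Only µIU/mL insulin is accepted in this sketch; other units are rejected, not converted.
  if (fastingInsulin.unit !== "uIU/mL") return undefined;

  let glucoseMmol: number;
  if (fastingGlucose.unit === "mmol/L") glucoseMmol = fastingGlucose.value;
  else if (fastingGlucose.unit === "mg/dL") glucoseMmol = fastingGlucose.value / MG_DL_PER_MMOL_L_GLUCOSE;
  else return undefined;

  return (glucoseMmol * fastingInsulin.value) / 22.5;
}

// A ratio is only meaningful when both components share a unit.
function lhFshRatio(lh: Measured, fsh: Measured): number | undefined {
  if (lh.unit !== fsh.unit || fsh.value <= 0) return undefined;
  return lh.value / fsh.value;
}
```

Rejecting unknown units keeps a derived value from being silently wrong by a constant factor, which matters more here than coverage because the result would feed classification and recommendations.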

Where things live

| Concern | Location |
| --- | --- |
| Biomarker registry + generator | `services/clinical/src/lib/biomarker-reference.ts`, `scripts/gen-biomarker-reference.py` |
| Recommendation engine | `services/clinical/src/services/recommendation-engine.ts` |
| Trend math | `services/clinical/src/lib/biomarker-stats.ts` |
| Insight service + routes | `services/ai/src/services/insight.service.ts`, `src/routes/insights.routes.ts` |
| Insights table | `services/ai/migrations/0005_insights.sql` |
| Event schema | `packages/contracts/src/events/ai.ts` |
| Flags | `packages/flags/src/taxonomy.ts` |
