orca — sirhco.dev

case study · agents · healthcare

Agentic platform for healthcare workflows — Rust core, Vertex AI, human-in-the-loop by default.

01 · problem

Problem

Healthcare workflows that benefit from agents (patient education, discharge planning, research synthesis) cannot tolerate the usual failure modes of open-loop LLM systems. PHI handling, auditability, and human review are requirements from the start, not afterthoughts. Off-the-shelf agent frameworks assume a trust model that does not hold in this domain.

02 · shape

Shape

A Rust service wraps a Plan-Act-Verify state machine. Each turn: the planner decomposes the request, the actor issues tool calls, a verifier LLM scores the result, and the coordinator decides to continue, escalate to a human reviewer, or halt. Firestore holds the run log and the memory tier; Vertex AI Gemini is the model behind every agent role.
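A minimal sketch of the coordinator's decision step, to make the continue / escalate / halt branch concrete. Everything named here (`Decision`, `Verdict`, `Coordinator`, `min_score`, `max_turns`, `run`) is hypothetical and not taken from the orca codebase. The sketch assumes the verifier returns a score in [0, 1] plus a goal-met flag, that "halt" means the run stops because the goal is met, and that an exhausted turn budget escalates rather than halts. The planner, actor, and verifier calls are collapsed into a single closure.

```rust
/// Outcome the coordinator picks after each turn.
#[derive(Debug, Clone, PartialEq)]
pub enum Decision {
    Continue,
    Escalate { reason: String },
    Halt,
}

/// What the verifier reports for one turn (assumed shape).
#[derive(Debug, Clone)]
pub struct Verdict {
    /// Verifier score in [0.0, 1.0].
    pub score: f32,
    /// Whether the verifier considers the request fully handled.
    pub goal_met: bool,
}

/// Hypothetical coordinator policy: a score floor and a turn budget.
pub struct Coordinator {
    pub min_score: f32,
    pub max_turns: u32,
}

impl Coordinator {
    /// Decide what happens after turn number `turn` (0-based).
    pub fn decide(&self, turn: u32, verdict: &Verdict) -> Decision {
        // Written as a negated `>=` so that a NaN score also escalates.
        if !(verdict.score >= self.min_score) {
            return Decision::Escalate {
                reason: format!("verifier score {} below {}", verdict.score, self.min_score),
            };
        }
        if verdict.goal_met {
            return Decision::Halt;
        }
        if turn + 1 >= self.max_turns {
            return Decision::Escalate {
                reason: "turn budget exhausted".to_string(),
            };
        }
        Decision::Continue
    }
}

/// Drive turns until the coordinator stops. `turn_fn` stands in for
/// plan + act + verify and returns the verifier's verdict for that turn.
/// Returns the terminal decision and the number of turns executed.
pub fn run(coord: &Coordinator, mut turn_fn: impl FnMut(u32) -> Verdict) -> (Decision, u32) {
    let mut turn = 0;
    loop {
        let verdict = turn_fn(turn);
        match coord.decide(turn, &verdict) {
            Decision::Continue => turn += 1,
            terminal => return (terminal, turn + 1),
        }
    }
}
```

The score check comes first, so a low-scoring turn goes to a reviewer even when the verifier also claims the goal is met. That ordering is a design choice of this sketch, not a documented property of the system.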

03 · build

Build

Memory is three-tier: raw transcripts, Gemini-summarized mid-term context, and distilled long-term facts. A guardrail pipeline runs on every tool call (confidence threshold, critic agent, cross-check against a second model, self-consistency vote, JSON schema enforcement), and any failed gate routes the turn to a reviewer queue. OpenTelemetry spans cover every stage, so a failed run is reconstructible from telemetry alone. Speech input and output use Cloud STT/TTS behind the same interface as text.
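The "any failed gate routes to review" rule can be sketched as a list of gates run in order, stopping at the first failure. This is an illustration under assumptions, not the project's implementation: the `Gate` trait, `Candidate`, `Route`, and `run_gates` are hypothetical names, and only two of the five gates are shown (confidence threshold and self-consistency vote). The critic, cross-model, and JSON schema gates would implement the same trait.

```rust
use std::collections::HashMap;

/// A proposed tool-call result plus the signals the gates inspect (assumed shape).
pub struct Candidate {
    /// Model-reported confidence in [0.0, 1.0].
    pub confidence: f32,
    /// Answers from repeated sampling, used for the self-consistency vote.
    pub samples: Vec<String>,
}

/// One guardrail check. `Err` carries a human-readable reason for the reviewer.
pub trait Gate {
    fn name(&self) -> &'static str;
    fn check(&self, c: &Candidate) -> Result<(), String>;
}

pub struct ConfidenceGate {
    pub min: f32,
}

impl Gate for ConfidenceGate {
    fn name(&self) -> &'static str {
        "confidence"
    }
    fn check(&self, c: &Candidate) -> Result<(), String> {
        if c.confidence >= self.min {
            Ok(())
        } else {
            Err(format!("confidence {} below {}", c.confidence, self.min))
        }
    }
}

pub struct SelfConsistencyGate {
    /// Minimum share of samples that must agree with the most common answer.
    pub min_agreement: f32,
}

impl Gate for SelfConsistencyGate {
    fn name(&self) -> &'static str {
        "self_consistency"
    }
    fn check(&self, c: &Candidate) -> Result<(), String> {
        if c.samples.is_empty() {
            return Err("no samples to vote on".to_string());
        }
        // Count identical answers and take the size of the largest group.
        let mut counts: HashMap<&str, usize> = HashMap::new();
        for s in &c.samples {
            *counts.entry(s.as_str()).or_insert(0) += 1;
        }
        let top = counts.values().copied().max().unwrap_or(0);
        let agreement = top as f32 / c.samples.len() as f32;
        if agreement >= self.min_agreement {
            Ok(())
        } else {
            Err(format!("agreement {} below {}", agreement, self.min_agreement))
        }
    }
}

/// Where the turn goes after the pipeline runs.
#[derive(Debug, PartialEq)]
pub enum Route {
    Proceed,
    Review { gate: &'static str, reason: String },
}

/// Run gates in order; the first failure sends the turn to the reviewer queue.
pub fn run_gates(gates: &[Box<dyn Gate>], c: &Candidate) -> Route {
    for g in gates {
        if let Err(reason) = g.check(c) {
            return Route::Review { gate: g.name(), reason };
        }
    }
    Route::Proceed
}
```

Returning the failing gate's name alongside the reason gives the reviewer queue something to sort and filter on. Exact string matching in the vote is a simplification; free-text answers would need normalization before counting.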

figure · service topology

04 · result

Result

Sub-second median latency per turn is the bar. The confidence gate routes the bottom quartile of responses to human review before they reach a patient-facing surface. The system is designed so that every decision a model made can be reconstructed from telemetry alone, so a reviewer never has to decode the raw run log.
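One way to read "bottom quartile" is as a cutoff at the first quartile of recent confidence scores. The sketch below assumes that reading. The function names, the nearest-rank quantile method, and the choice to send everything to review when there is no score history are illustrative assumptions, not details from the source.

```rust
/// Nearest-rank quantile of `scores` for `q` in [0, 1]; `None` if empty.
pub fn quantile(scores: &[f32], q: f32) -> Option<f32> {
    if scores.is_empty() {
        return None;
    }
    let mut sorted = scores.to_vec();
    // total_cmp gives a total order over f32, so sorting cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest-rank: the smallest rank covering at least q of the values.
    let rank = ((q.clamp(0.0, 1.0) * sorted.len() as f32).ceil() as usize).max(1);
    Some(sorted[rank - 1])
}

/// True when `score` falls at or below the first quartile of `recent`.
pub fn needs_review(score: f32, recent: &[f32]) -> bool {
    match quantile(recent, 0.25) {
        Some(cutoff) => score <= cutoff,
        // No history yet: fail closed and send the response to a reviewer.
        None => true,
    }
}
```

With eight recent scores, the two lowest sit at or below the cutoff, so roughly a quarter of responses are held for review regardless of how the absolute scores drift.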

stack

Rust · Axum · rig · Vertex AI Gemini · Firestore · GKE · OpenTelemetry