RubRIX: Rubric-Driven Risk Mitigation in Caregiver-AI Interactions
Drishti Goel, Jeongah Lee, Qiuyue Joy Zhong, Violeta J. Rodriguez, Daniel S. Brown, Ravi Karkar, Dong Whi Yoo, Koustuv Saha
Provides a structured method to audit AI systems before deployment in caregiving contexts. The framework makes explicit what clinicians already know: that identical phrasing can be supportive or harmful depending on the caregiver's expressed need state. Enables product teams to catch context-inappropriate responses that pass generic safety filters.
Generic AI safety frameworks (toxicity, hallucinations) miss caregiving-specific risks—when a stressed parent asks for help, an LLM's response could validate distress appropriately or escalate crisis.
Method: RubRIX introduces a clinician-validated framework that evaluates LLM responses across caregiving-specific dimensions: emotional validation appropriateness, distress escalation risk, and information accuracy for high-stakes parenting decisions. The rubric operationalizes what 'safe' means when AI mediates vulnerable emotional states, moving beyond generic content policy violations to context-dependent harm assessment.
Caveats: Framework requires clinical validation per deployment context. Caregiving spans infant care to elder care—rubric may need adaptation.
Reflections: Can the rubric be automated for real-time response filtering, or does it require human-in-the-loop review? · How do caregivers themselves rate risk compared to clinician assessments? · Does making risk dimensions explicit change how designers prompt or fine-tune models?