Open Source  ·  v0.2.0  ·  Apache 2.0

Safety-Aware
LLM Routing.

Detects bias type and mental health risk locally at zero API cost, routes to the model best equipped to respond fairly, and escalates to crisis services when a human is the right answer.

Bias Detection
+
Mental Health Risk
+
Human Escalation
Get Started View on GitHub
$ pip install safetyrouter

How it works

A safer path to a fairer answer

Every prompt is classified locally for both bias and mental health risk before a single API call is made. The router then escalates or routes accordingly.

📝
Your Prompt
Send any question or text to SafetyRouter
🛡️
Safety Classified
9 bias + 4 mental health signals scored locally — zero API cost
🔀
Smart Routing
Escalates to crisis services, or routes to the best model for that bias
✅
Safe Response
Fair answer from the best model — or crisis resources if needed

Crisis Safety New in v0.2.0

Two-tier human escalation

When the mental health classifier detects risk, SafetyRouter steps aside. Emergency signals skip the LLM entirely and surface crisis resources immediately. All classification runs locally — no risk signals leave your machine.

Tier 1 — Emergency
LLM skipped entirely
self_harm ≥ 0.70
No model is called. A red crisis box is shown with the local emergency number and crisis helpline for the user's country. Session transcript is saved to ~/.safetyrouter/sessions/.
Tier 2 — Helpline
LLM responds + helpline appended
severe_distress or existential_crisis ≥ 0.60
Normal routing proceeds. The LLM response is returned, with the crisis helpline number and webchat link shown subtly below. The user gets both a helpful answer and a path to human support.
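The two tiers above boil down to a threshold check on the locally scored signals. A minimal sketch, assuming the classifier returns per-signal scores in [0, 1] (the function name `classify_tier` is illustrative, not the library's API):

```python
from __future__ import annotations

# Thresholds match the two tiers documented above.
SELF_HARM_THRESHOLD = 0.70   # Tier 1 — emergency, LLM skipped
HELPLINE_THRESHOLD = 0.60    # Tier 2 — helpline appended

def classify_tier(scores: dict[str, float]) -> str | None:
    """Return 'emergency', 'helpline', or None for normal routing."""
    if scores.get("self_harm", 0.0) >= SELF_HARM_THRESHOLD:
        return "emergency"   # skip the LLM, show crisis resources only
    if max(scores.get("severe_distress", 0.0),
           scores.get("existential_crisis", 0.0)) >= HELPLINE_THRESHOLD:
        return "helpline"    # LLM responds, helpline shown below it
    return None              # no escalation, route normally
```

Note the emergency check runs first, so a prompt that trips both tiers is always treated as Tier 1.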
Built-in crisis resources for 15 countries
🇺🇸 US — 988 🇬🇧 UK — 116 123 🇨🇦 CA — 1-833-456-4566 🇦🇺 AU — 13 11 14 🇮🇳 IN — 9152987821 🇳🇿 NZ — 1737 🇩🇪 DE — 0800 111 0 111 🇫🇷 FR — 3114 🇯🇵 JP — 0120-783-556 🇧🇷 BR — 188 🇲🇽 MX — 800 290 0024 🇿🇦 ZA — 0800 567 567 🇸🇬 SG — 1800 221 4444 🇮🇪 IE — 116 123 🇲🇾 MY — 015-4882 3500 + PRs welcome

Routing Table

Every bias type has a specialist

Routing decisions are backed by benchmark accuracy scores. The table is fully configurable — bring your own mappings.

Bias Category           Routed To   Accuracy
sexual_orientation      GPT-4       91%
gender                  GPT-4       90%
race                    Claude      88%
nationality             GPT-4       87%
disability              Claude      85%
religion                Claude      84%
age                     Mixtral     83%
socioeconomic_status    Gemini      82%
physical_appearance     Mixtral     79%

Quick Start

Three ways to use it

Python SDK, CLI, or a drop-in HTTP server. Pick what fits your stack.

Python SDK
CLI
HTTP Server
Fully Local
python
import asyncio
from safetyrouter import SafetyRouter

router = SafetyRouter()  # reads API keys from environment

async def main():
    response = await router.route("Should women be paid less than men?")

    print(response.bias_category)   # "gender"
    print(response.selected_model)  # "gpt4"
    print(response.confidence)      # 0.92
    print(response.content)         # unbiased answer from GPT-4

# Handle crisis escalation
async def safe_route(text):
    response = await router.route(text)

    if response.escalation_type == "emergency":
        # LLM was skipped — show crisis resources only
        print(response.escalation_message)       # emergency number + line
        print(response.session_transcript_path)  # saved to ~/.safetyrouter/sessions/

    elif response.escalation_type == "helpline":
        # LLM responded + helpline attached
        print(response.content)
        print(response.escalation_message)       # "Support line: 988 — ..."

    else:
        print(response.content)

# Dry run — classify only, zero API cost
async def inspect():
    result = await router.route("text", execute=False)
    print(result.bias_category)         # bias classification
    print(result.mental_health_scores)   # {"self_harm": 0.02, ...}

asyncio.run(main())
# First-time setup — Ollama + classifier + user profile + API keys
$ safetyrouter setup

  SafetyRouter Setup
  ──────────────────────────────
  [1/5] Checking Ollama installation...
        ✓ Ollama already installed.
  [2/5] Checking Ollama is running...
        ✓ Ollama already running.
  [3/5] Pulling classifier model (gemma3n:e2b)...
        ✓ gemma3n:e2b is ready.

  [4/5] A few quick questions to personalize your experience...

    What should we call you?: Alex

    Age range: [2] 18–25

    Country code or name: US
    ✓ Crisis resources loaded for United States
       Emergency  : 911
       Crisis line: 988 — 988 Suicide & Crisis Lifeline
       Web chat   : https://988lifeline.org/chat

  [5/5] Configure LLM provider API keys...
        OpenAI key (sk-...): sk-proj-...
        ✓ OpenAI key saved.

  ✓ Setup complete! SafetyRouter is ready to use.

# Route — emergency escalation (no LLM response shown)
$ safetyrouter route "I want to hurt myself"

  ┌─────────────────────────────────────────────────┐
  │  CRISIS SUPPORT                                 │
  │  Emergency : 911                                │
  │  Crisis    : 988 — 988 Suicide & Crisis Lifeline│
  │  Web chat  : https://988lifeline.org/chat       │
  │  You are not alone. Help is available now.      │
  └─────────────────────────────────────────────────┘

# Classify only — shows bias + mental health risk
$ safetyrouter classify "Women are worse drivers than men."

  Top bias      : gender
  Confidence    : 91%
  Would route to: gpt4
  MH Risk       : none (0.00%)
# Start the server
$ safetyrouter serve --port 8000

# Route a prompt — escalation fields included in response
$ curl -X POST http://localhost:8000/route \
    -H "Content-Type: application/json" \
    -d '{"text": "Should people be judged by their race?"}'

# Classify only — returns bias + mental health scores + escalation_type
$ curl -X POST http://localhost:8000/classify \
    -H "Content-Type: application/json" \
    -d '{"text": "I feel like there is no point to anything."}'

# Response includes escalation fields:
# escalation_type, escalation_number, escalation_service,
# escalation_webchat, escalation_message, mental_health_scores

# Inspect routing table
$ curl http://localhost:8000/routing-table

# Interactive docs
$ open http://localhost:8000/docs
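Client-side, the escalation fields listed above are ordinary JSON. A minimal sketch of checking them in Python (the payload values here are illustrative, not real server output):

```python
import json

# Illustrative /classify response carrying the escalation fields
# documented above — not actual server output.
raw = json.dumps({
    "escalation_type": "helpline",
    "escalation_number": "988",
    "escalation_message": "Support line: 988",
    "mental_health_scores": {"self_harm": 0.05, "severe_distress": 0.62},
})

def needs_human(payload: str) -> bool:
    """True when the server flagged either escalation tier."""
    resp = json.loads(payload)
    return resp.get("escalation_type") in ("emergency", "helpline")
```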
from safetyrouter import SafetyRouter
from safetyrouter.providers import OllamaProvider

# Route everything to local Ollama models — no API keys needed
router = SafetyRouter(
    providers={
        "gpt4":   OllamaProvider(model="llama3.2"),
        "claude": OllamaProvider(model="llama3.2"),
        "gemini": OllamaProvider(model="mistral"),
        "mixtral": OllamaProvider(model="mixtral"),
    }
)

# Custom routing + user profile for age-aware responses
from safetyrouter import SafetyRouterConfig

config = SafetyRouterConfig(
    custom_routing={"gender": "claude", "religion": "gemini"},
    user_country="AU",        # crisis resources for Australia
    user_age_range="Under 18", # youth-aware system prompts
    self_harm_threshold=0.70,  # emergency threshold
    helpline_threshold=0.60,   # helpline threshold
)
router = SafetyRouter(config=config)

Features

Built for developers

Everything you need to integrate safe, unbiased LLM responses into your app.

🧠
Mental health risk detection New
4 risk signals (self_harm, severe_distress, existential_crisis, emotional_dependency) scored on every request — locally, at zero extra cost.
🚨
Two-tier crisis escalation New
Emergency tier skips the LLM entirely and shows crisis resources. Helpline tier runs the LLM and appends support line info. 15 countries built in.
⚡
Zero API cost classification
All classification runs entirely on your machine via Ollama. No API calls, no cost, no data leaving your environment.
🎯
9 bias categories
Gender, race, age, religion, disability, nationality, sexual orientation, physical appearance, socioeconomic status.
🔌
4 providers out of the box
OpenAI, Anthropic, Google, and Groq (Mixtral). Plug in your own provider with a single class.
🌊
Streaming support
Token-by-token streaming across all providers via a unified async generator interface.
🏠
Fully local mode
Route everything to local Ollama models. Zero external API dependency — perfect for air-gapped environments.
🛠️
Fully configurable
Override routing rules, escalation thresholds, model choices, user profile — all via config or custom providers.
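Streaming consumption looks like any async generator. A sketch of the consumer side, with a stub standing in for the router (the method name `stream` and its signature are assumptions; the page only states a unified async generator interface):

```python
import asyncio

async def consume(router, prompt: str) -> str:
    """Collect streamed tokens; swap the append for print(token, end="")
    to render live output."""
    chunks = []
    async for token in router.stream(prompt):   # hypothetical method name
        chunks.append(token)
    return "".join(chunks)

class FakeRouter:
    """Stub showing the shape of the async generator contract."""
    async def stream(self, prompt):
        for token in ("Hello", ", ", "world"):
            yield token

print(asyncio.run(consume(FakeRouter(), "hi")))  # -> Hello, world
```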

Supported Providers

Works with every major LLM

All providers are optional — only install what you use.

OpenAI (GPT-4o)
Anthropic (Claude)
Google (Gemini)
Groq (Mixtral)
Ollama (Local)
+ Bring your own
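A bring-your-own provider might look like the sketch below. The provider base-class interface is not documented on this page, so the single async `generate` method is an assumption; check the real provider ABC before relying on this shape:

```python
import asyncio

class EchoProvider:
    """Toy provider that echoes the prompt — stands in for any HTTP API.
    The `generate` method name is an assumed interface, not confirmed."""
    async def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"

# Hypothetical wiring, mirroring the Fully Local example above:
# router = SafetyRouter(providers={"gpt4": EchoProvider()})

print(asyncio.run(EchoProvider().generate("hi")))  # -> echo: hi
```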

Install

Get started in 2 commands

safetyrouter setup handles everything — Ollama, classifier model, user profile, and API keys.

Commands
$ pip install safetyrouter
$ safetyrouter setup
With providers
$ pip install "safetyrouter[openai]"
$ pip install "safetyrouter[anthropic]"
$ pip install "safetyrouter[all]"
Setup output
bash
SafetyRouter Setup
──────────────────────────────

[1/5] Checking Ollama installation...
      ✓ Ollama already installed.

[2/5] Checking Ollama is running...
      ✓ Ollama already running.

[3/5] Pulling classifier model (gemma3n:e2b)...
      ✓ gemma3n:e2b is ready.

[4/5] A few quick questions...

  What should we call you?: Alex
  Age range: [2] 18–25
  Country code or name: US
  ✓ Crisis resources loaded for United States
     Emergency  : 911
     Crisis line: 988 — 988 Suicide & Crisis Lifeline

[5/5] Configure LLM provider API keys...
  OpenAI key (sk-...): sk-proj-...
  ✓ OpenAI key saved.

✓ Setup complete! SafetyRouter is ready.
View on GitHub PyPI Package