Customer Triage

MIT License · View on GitHub

Message routing: Should an AI classify customer inquiries, or are keyword rules enough?

This experiment tests whether AI adds value over simple keyword matching for routing customer support messages to the right department. We compared 6 approaches—from basic keyword lookup to multi-agent orchestration with CrewAI and LangGraph—on 250 real support tickets across 8 categories (billing, technical, returns, shipping, account, product, general, and escalation). The results surprised us: keyword matching alone achieves 69% accuracy at zero cost. Adding a single LLM call boosts accuracy to 88% for under a penny per 100 messages. But the real insight? Framework choice barely matters—fixing a bug in our CrewAI implementation improved accuracy by 45%, more than any framework switch.
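
As a rough illustration of the keyword baseline described above, the sketch below routes a message by counting keyword hits per category. The category names come from the experiment. The keyword lists, the `keyword_route` function, and the fallback to `general` are assumptions made for illustration, not the project's actual rules.

```python
import re

# Hypothetical keyword lists; the project's real rules are not shown here.
KEYWORDS = {
    "billing": ["invoice", "charge", "charged", "payment", "bill"],
    "technical": ["error", "crash", "bug", "broken"],
    "returns": ["return", "refund", "exchange"],
    "shipping": ["shipping", "delivery", "tracking", "package"],
    "account": ["password", "login", "account"],
    "product": ["size", "color", "specs", "compatible"],
    "escalation": ["manager", "complaint", "lawyer"],
}


def keyword_route(message: str) -> str:
    """Return the category with the most keyword hits, or 'general' if none match."""
    words = re.findall(r"[a-z']+", message.lower())
    scores = {
        dept: sum(words.count(keyword) for keyword in keywords)
        for dept, keywords in KEYWORDS.items()
    }
    # On a tie, max() keeps the first category in dictionary order.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


print(keyword_route("I need to return my order"))  # returns
print(keyword_route("Hello there"))                # general
```

A lookup like this costs nothing per request, which is what makes it a useful floor for judging whether an LLM call earns its price.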

Benchmark Results

  • Baseline (keyword): 68.6% accuracy at $0 per request
  • Best value (Native): 87.6% accuracy at $0.000074/request, a 19-point lift over the keyword baseline
  • Best accuracy (CrewAI): 88.4% at 2.3x the cost, only 0.8 points better than Native
  • Framework spread was only 4 points (84-88%) across 6 implementations
  • All results use 250 samples; the earlier 50-sample pilot showed an inflated 94%
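
The figures above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the lift and the cost per 100 messages from the reported numbers. It also applies a normal-approximation margin of error, a standard formula that the project itself does not report, to show why a 50-sample pilot is noisier than the 250-sample run. It assumes the 2.3x CrewAI cost is measured relative to the Native cost; all names are illustrative.

```python
import math

# Reported figures from the benchmark list.
KEYWORD_ACC = 68.6      # percent
NATIVE_ACC = 87.6       # percent
CREWAI_ACC = 88.4       # percent
NATIVE_COST = 0.000074  # USD per request


def margin_of_error(accuracy_pct: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error, in percentage points, for n samples."""
    p = accuracy_pct / 100
    return z * math.sqrt(p * (1 - p) / n) * 100


lift = NATIVE_ACC - KEYWORD_ACC           # points gained over the keyword baseline
cost_per_100 = NATIVE_COST * 100          # USD per 100 messages
crewai_cost = NATIVE_COST * 2.3           # assumption: 2.3x is relative to Native
pilot_moe = margin_of_error(94.0, 50)     # 50-sample pilot
full_moe = margin_of_error(NATIVE_ACC, 250)  # 250-sample run

print(f"Lift over keywords: {lift:.1f} points")
print(f"Native cost per 100 messages: ${cost_per_100:.4f}")
print(f"Pilot margin of error: +/-{pilot_moe:.1f} points")
print(f"Full-run margin of error: +/-{full_moe:.1f} points")
```

Under these assumptions the pilot's margin of error is roughly 6.6 points against about 4.1 points for the full run, so a 94% pilot reading is roughly what sampling noise could produce for a classifier whose accuracy sits in the high 80s.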

```python
# Route customer messages to departments
from customer_triage import classify

result = classify("I need to return my order")
# → {"department": "returns", "confidence": 0.94}
```

Questions about this project? Open an issue on GitHub or contact us directly.