Benchmark Results
- GPT-4o-mini: 101% recall at $0.20 (over-extraction worked; the above-100% figure likely reflects more clauses extracted than the reference set contains)
- GPT-4o: 67% recall at $2.00 (10x the cost of GPT-4o-mini, with lower recall)
- Native baseline: 61% recall at $0.17 (best cost/performance balance)
- Regex baseline: 8% recall, a very low floor for comparison
- CrewAI hallucinated fake contract text after 3/30 documents
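
One way to compare the cost figures above is dollars spent per percentage point of recall. The sketch below uses only the numbers from the list. The helper `cost_per_recall_point` and the metric itself are assumptions for illustration, not necessarily how the benchmark judged cost/performance. The regex baseline and CrewAI are left out because no cost is listed for them.

```python
# Figures copied from the benchmark list: (recall in percent, cost in USD).
results = {
    "GPT-4o-mini": (101, 0.20),
    "GPT-4o": (67, 2.00),
    "Native baseline": (61, 0.17),
}


def cost_per_recall_point(recall_pct, cost_usd):
    """USD spent per percentage point of recall (hypothetical metric)."""
    return cost_usd / recall_pct


# Sort systems from cheapest to most expensive per recall point.
ranked = sorted(results, key=lambda name: cost_per_recall_point(*results[name]))

for name in ranked:
    recall, cost = results[name]
    print(f"{name}: ${cost_per_recall_point(recall, cost):.4f} per recall point")
```

This is a single-number view. It ignores precision, so a system that over-extracts can look cheap here while still returning spurious clauses.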
# Extract clauses from legal contracts
from legal_extraction import extract
clauses = extract(
    "contract.pdf",
    targets=["governing_law", "liability_cap", "indemnification"],
)