Legal Extraction

MIT License

Contract analysis: Extracting risk clauses and obligations from legal documents at scale.

This experiment tests AI on a task where it genuinely excels: extracting structured information from dense legal text. We analyzed 100 commercial contracts to identify liability caps, indemnification clauses, termination conditions, and governing law provisions, comparing five approaches that range from regex patterns to frontier LLMs. The counterintuitive result: GPT-4o-mini outperformed GPT-4o on extraction accuracy (87% vs 82%) at 10x lower cost. Claude 3.5 Sonnet achieved the highest accuracy (91%), but at premium pricing. For legal extraction, smaller models with better prompts beat larger models with generic prompts.
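To make the regex end of that comparison concrete, here is a minimal sketch of a pattern-based extractor for governing law provisions. The pattern and the function name `find_governing_law` are hypothetical, written for illustration only; they are not the project's actual baseline.

```python
import re

# Hypothetical pattern (an assumption, not the project's baseline):
# matches phrasing such as "governed by the laws of the State of New York".
GOVERNING_LAW = re.compile(
    r"governed by(?: and construed in accordance with)? the laws? of "
    r"(?:the )?(?:State of )?"
    r"([A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*)"
)


def find_governing_law(text: str) -> list[str]:
    """Return every jurisdiction name the pattern captures in the text."""
    return [match.group(1) for match in GOVERNING_LAW.finditer(text)]


standard = "This Agreement shall be governed by the laws of the State of New York."
paraphrased = "Any dispute will be resolved under Delaware law."

print(find_governing_law(standard))     # the literal phrasing is matched
print(find_governing_law(paraphrased))  # the paraphrase is missed entirely
```

The second call returns an empty list because the clause never uses the words the pattern expects. A pattern-based extractor only finds the wording it was written for, which is the kind of limitation an LLM-based approach is meant to overcome.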

Benchmark Results

GPT-4o-mini: 101% recall at $0.20 (over-extraction worked)
GPT-4o: 67% recall at $2.00 (10x the cost, worse results)
Native baseline: 61% recall at $0.17 (best cost/performance balance)
Regex baseline: 8% recall (a very low floor for comparison)
CrewAI hallucinated fake contract text after 3/30 documents
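The cost figures above can be put on a common scale with a little arithmetic. The sketch below computes percentage points per dollar from the listed numbers. That ratio is an assumption chosen for illustration, not necessarily the project's own definition of cost/performance, and it treats the 61% for the native baseline as recall, like the two model figures. The regex baseline is left out because no cost is listed for it.

```python
# Figures copied from the benchmark list: (listed percentage, listed cost in USD).
RESULTS = {
    "GPT-4o-mini": (101, 0.20),
    "GPT-4o": (67, 2.00),
    "Native baseline": (61, 0.17),
}


def recall_per_dollar(recall_pct: float, cost_usd: float) -> float:
    """Illustrative metric (an assumption): percentage points per dollar spent."""
    return recall_pct / cost_usd


def rank_by_efficiency(results: dict) -> list:
    """Sort approaches from most to least percentage points per dollar."""
    scored = {
        name: round(recall_per_dollar(pct, cost), 1)
        for name, (pct, cost) in results.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


for name, score in rank_by_efficiency(RESULTS):
    print(f"{name}: {score} points per dollar")
```

Points per dollar is only one way to weigh cost against quality. The labels in the list above may rest on a different definition.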

# Extract clauses from legal contracts
from legal_extraction import extract

clauses = extract("contract.pdf", 
    targets=["governing_law", "liability_cap", "indemnification"])

Questions about this project? Open an issue on GitHub or contact us directly.