mdtoken

MIT License · View on GitHub · pip install mdtoken

Pre-commit hook for markdown token limits. Blocks commits that violate limits, preventing context window bloat in AI-assisted development workflows.

Context window bloat is a real problem in AI-assisted development. Your CLAUDE.md starts at 500 tokens, grows to 5,000, then 50,000, and suddenly Claude's responses degrade because it's drowning in context.

mdtoken is a pre-commit hook that enforces token limits on markdown files. When a file exceeds your configured threshold, the commit fails with a clear message telling you which file is too large and by how much.

Why this matters:

  • Prevents gradual context creep that degrades AI assistant performance
  • Catches bloat at commit time, not after it's already in main
  • Supports multiple tokenizers (cl100k, p50k, GPT-4) for accurate counting
  • Configurable limits per file pattern: different thresholds for CLAUDE.md vs README.md

Simple tool, one job: keep your markdown files lean so your AI assistants stay effective.
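
The check described above can be sketched roughly as follows. This is an illustration of the idea, not mdtoken's actual implementation: `LIMITS`, `count_tokens_approx`, `limit_for`, and `check_files` are hypothetical names, the thresholds are made up, and the whitespace-based counter is only a stand-in for a real tokenizer such as cl100k.

```python
from fnmatch import fnmatchcase

# Hypothetical per-pattern limits (in tokens); the first matching pattern wins.
LIMITS = [
    ("CLAUDE.md", 2000),
    ("README.md", 8000),
    ("*.md", 4000),
]


def count_tokens_approx(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def limit_for(filename: str):
    """Return the limit of the first pattern matching filename, or None."""
    for pattern, limit in LIMITS:
        if fnmatchcase(filename, pattern):
            return limit
    return None


def check_files(files: dict) -> list:
    """Return one message per file that exceeds its limit.

    `files` maps a filename to its text content.
    """
    violations = []
    for name, text in files.items():
        limit = limit_for(name)
        if limit is None:
            continue  # no pattern applies, so the file is not checked
        count = count_tokens_approx(text)
        if count > limit:
            violations.append(
                f"{name}: {count} tokens exceeds limit of {limit} by {count - limit}"
            )
    return violations
```

A hook built along these lines would print the messages and exit with a non-zero status whenever the list is non-empty, which is what causes pre-commit to block the commit.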

Features

  • Accurate counts for Claude, GPT-4, and other models
  • Markdown-aware — respects code blocks, headers, formatting
  • CLI and Python API — fits any workflow
  • Cost estimation — know what you'll pay before you send

    # CLI usage
    mdtoken count document.md
    mdtoken count document.md --model gpt-4

    # Python API
    from mdtoken import count_tokens
    tokens = count_tokens("Your markdown content", model="claude-3")

Simple interface

pip install mdtoken
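
The arithmetic behind the cost-estimation feature listed above is simple enough to sketch. The snippet below is an assumption about how such an estimate could be computed, not mdtoken's own code: `estimate_cost` is a hypothetical helper, and the rate is a placeholder rather than any provider's real price.

```python
def estimate_cost(token_count: int, usd_per_million_tokens: float) -> float:
    """Estimate the input cost in USD for a token count at a given rate."""
    if token_count < 0 or usd_per_million_tokens < 0:
        raise ValueError("token_count and rate must be non-negative")
    return token_count / 1_000_000 * usd_per_million_tokens


# Placeholder rate for illustration only; check your provider's current pricing.
EXAMPLE_RATE_USD_PER_MILLION = 2.0

cost = estimate_cost(50_000, EXAMPLE_RATE_USD_PER_MILLION)
print(f"Estimated input cost: ${cost:.4f}")
```

Because the estimate scales linearly with the token count, a file that grows tenfold costs ten times as much on every request that includes it.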

Questions about this project? Open an issue on GitHub or contact us directly.