Main
- Start with wrap mode: `headroom wrap claude|codex|cursor` gives quick wins without rewriting your app stack.
- Use proxy mode for language-agnostic pipelines: point any OpenAI-compatible client at the proxy and keep data local (per README).
- Treat CCR as reversible: the README emphasizes that originals are retrievable, so you can compress aggressively without losing auditability.
- Measure savings: capture before/after token counts (the README demo shows 10,144 → 1,260) and tune only where it matters.
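The savings step above is simple arithmetic once you have the before/after counts. A minimal sketch using the numbers from the README demo:

```python
# Token counts from the README demo: 10,144 before compression, 1,260 after.
before_tokens = 10_144
after_tokens = 1_260

saved = before_tokens - after_tokens
reduction_pct = 100 * saved / before_tokens

print(f"saved {saved} tokens ({reduction_pct:.1f}% reduction)")
# saved 8884 tokens (87.6% reduction)
```

Tracking this per call site tells you which prompts actually benefit from tuning.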
Source-backed notes
- README lists three usage modes: library, proxy (`headroom proxy`), and agent wrap (`headroom wrap ...`).
- README states it provides an MCP server with tools like `headroom_compress`/`headroom_retrieve`/`headroom_stats`.
- README demo includes a concrete token reduction example (10,144 → 1,260) and describes CCR as reversible.
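Proxy mode means an OpenAI-compatible client needs nothing Headroom-specific beyond a different base URL. A minimal sketch, assuming a hypothetical local address `http://localhost:8787/v1` (the real host/port come from however you start `headroom proxy`); this only builds the request, since actually sending it requires the proxy to be running:

```python
import json

# Hypothetical proxy address; check your `headroom proxy` output for the real one.
BASE_URL = "http://localhost:8787/v1"

# A standard OpenAI-style chat-completions payload. The client side stays
# unchanged; that is the point of proxy mode.
payload = {
    "model": "gpt-4o-mini",  # whatever model your upstream provider serves
    "messages": [{"role": "user", "content": "Summarize this log..."}],
}

request = {
    "url": f"{BASE_URL}/chat/completions",
    "headers": {"Content-Type": "application/json"},
    "body": json.dumps(payload),
}
print(request["url"])
```

Any HTTP client or OpenAI SDK that accepts a custom base URL can send this as-is.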
FAQ
- Do I need to change my app? Not necessarily: start with `headroom wrap ...` or run `headroom proxy` as a drop-in endpoint.
- Is compression reversible? README says CCR keeps originals; the agent can retrieve raw content on demand.
- How do MCP clients use it? Install/enable the Headroom MCP server (README mentions MCP-native entrypoints) and call the compress/retrieve tools.
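MCP clients invoke server tools through the standard `tools/call` JSON-RPC method. A minimal sketch of what a `headroom_compress` call could look like; the tool name comes from the README, while the `text` argument is a hypothetical placeholder for whatever input schema the server actually declares:

```python
import json

# Standard MCP JSON-RPC envelope for a tool invocation. Only the tool name
# `headroom_compress` is from the README; the `text` argument is a
# hypothetical stand-in for the server's real input schema.
call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "headroom_compress",
        "arguments": {"text": "…large context to compress…"},
    },
}
print(json.dumps(call, ensure_ascii=False))
```

`headroom_retrieve` and `headroom_stats` would be called the same way with their own `name` and `arguments`.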