# Tokentap — Token Tracker for LLM CLIs

> Tokentap adds a live terminal dashboard and prompt archive for LLM CLI tools, so you can see token usage in real time while using Claude Code or Codex.

## Install

Install from PyPI with `pip install tokentap` (also shown in Quick Use below).

## Quick Use

```bash
pip install tokentap

# Terminal 1
tokentap start

# Terminal 2
tokentap claude   # or: tokentap codex
```

## Intro

Tokentap adds a live terminal token dashboard and prompt archive for LLM CLI tools: `tokentap start` launches the local proxy and dashboard, then running your tool through `tokentap claude` or `tokentap codex` lets you watch consumption in real time.

- **Best for:** power users of LLM CLIs who need visibility into token burn and context window pressure
- **Works with:** Python 3.10+; supports Claude Code, Codex, Gemini CLI (noted in the README as blocked by an upstream issue), and OpenAI-compatible providers (per README)
- **Setup time:** 5–15 minutes

## Practical Notes

- Per the README: shows a context "fuel gauge" (default limit 200,000 tokens) and saves prompts to Markdown + JSON.
- Useful for regression testing: compare token usage before and after prompt or tool changes.
- Combine with guardrails: when the fuel gauge hits 70–80%, switch to summarization or retrieval mode to avoid blowing the context window (a minimal budget-check sketch appears at the end of this page).

## Main

A simple workflow that pays off quickly:

1. Run your normal CLI session with Tokentap enabled.
2. When usage spikes, open the saved prompt archive and identify the culprit: retrieval payload, tool output, or template bloat.
3. Fix one thing at a time (shorten tool output, add truncation, or dedupe context), then measure again.

Treat token usage as a budget: you'll get better answers by spending tokens on *relevant evidence*, not repeated boilerplate.

### FAQ

**Q: Does it require certificates?**
A: Per the README, no: it is "zero configuration" and runs as a local proxy with path-prefix routing for OpenAI-compatible providers.

**Q: Can it run with Gemini CLI?**
A: The README notes that Gemini CLI is currently blocked by an upstream issue in which OAuth ignores the base URL; check the linked issue for status.

**Q: What should I store?**
A: Keep prompt archives in a private directory; they may contain secrets or code. Add redaction if you share logs.

## Source & Thanks

> Source: https://github.com/jmuncor/tokentap
> License: MIT
> GitHub stars: 798 · forks: 37

---

Source: https://tokrepo.com/en/workflows/tokentap-token-tracker-for-llm-clis
Author: Script Depot
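
---

## Appendix: Budget Guardrail Sketch

The 70–80% guardrail and the "shorten tool output / add truncation" step above can be wired into your own tooling. Below is a minimal, illustrative sketch, not part of tokentap's API: the 4-characters-per-token estimate is a rough assumption, the 200,000-token limit is the README's default fuel-gauge value, and the helper names (`context_pressure`, `truncate_to_tokens`) are hypothetical.

```python
# Hypothetical helpers, not part of tokentap: illustrate the
# "fuel gauge at 70-80% -> switch modes" guardrail described above.

DEFAULT_CONTEXT_LIMIT = 200_000   # README's default fuel-gauge limit
CHARS_PER_TOKEN = 4               # rough heuristic, assumption only


def estimate_tokens(text: str) -> int:
    """Crude token estimate; swap in a real tokenizer if you have one."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def context_pressure(used_tokens: int, limit: int = DEFAULT_CONTEXT_LIMIT) -> float:
    """Fraction of the context window already spent (0.0 to 1.0)."""
    return used_tokens / limit


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Keep the head of a tool output within a token budget."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[... truncated ...]"


if __name__ == "__main__":
    used = 150_000                      # e.g. read from your own accounting
    pressure = context_pressure(used)
    if pressure >= 0.7:                 # the 70-80% guardrail from above
        print(f"Context at {pressure:.0%}: switch to summarization/retrieval mode")

    long_tool_output = "x" * 50_000
    print(len(truncate_to_tokens(long_tool_output, max_tokens=2_000)))
```

The threshold and heuristic are starting points; measure against the archives Tokentap saves and adjust for your own prompts and tools.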