# Tokentap — Token Tracker for LLM CLIs

> Tokentap adds a live terminal dashboard and prompt archive for LLM CLI tools, so you can see token usage in real time while using Claude Code or Codex.

## Install

Install from PyPI with `pip install tokentap` (also shown in Quick Use below).

## Quick Use

```bash
pip install tokentap

# Terminal 1
tokentap start

# Terminal 2
tokentap claude   # or: tokentap codex
```

## Intro

Tokentap adds a live terminal token dashboard and prompt archive for LLM CLI tools: `tokentap start` launches the local proxy and dashboard, then running your tool through `tokentap claude` or `tokentap codex` lets you watch consumption in real time.

- **Best for:** power users of LLM CLIs who need visibility into token burn and context window pressure
- **Works with:** Python 3.10+; supports Claude Code, Codex, Gemini CLI (noted in the README as blocked by an upstream issue), and OpenAI-compatible providers (per README)
- **Setup time:** 5–15 minutes

## Practical Notes

- Per the README: shows a context "fuel gauge" (default limit 200,000 tokens) and saves prompts to Markdown + JSON.
- Useful for regression testing: compare token usage before and after prompt or tool changes.
- Combine with guardrails: when the fuel gauge hits 70–80%, switch to summarization or retrieval mode to avoid blowing the context window (a minimal budget-check sketch appears at the end of this page).

## Main

A simple workflow that pays off quickly:

1. Run your normal CLI session with Tokentap enabled.
2. When usage spikes, open the saved prompt archive and identify the culprit: retrieval payload, tool output, or template bloat.
3. Fix one thing at a time (shorten tool output, add truncation, or dedupe context), then measure again.

Treat token usage as a budget: you'll get better answers by spending tokens on *relevant evidence*, not repeated boilerplate.

### FAQ

**Q: Does it require certificates?**
A: Per the README, no: it is "zero configuration" and runs as a local proxy with path-prefix routing for OpenAI-compatible providers.

**Q: Can it run with Gemini CLI?**
A: The README notes that Gemini CLI is currently blocked by an upstream issue in which OAuth ignores the base URL; check the linked issue for status.

**Q: What should I store?**
A: Keep prompt archives in a private directory; they may contain secrets or code. Add redaction if you share logs.

## Source & Thanks

> Source: https://github.com/jmuncor/tokentap
> License: MIT
> GitHub stars: 798 · forks: 37

---

Source: https://tokrepo.com/en/workflows/tokentap-token-tracker-for-llm-clis
Author: Script Depot
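
---

## Appendix: Budget Guardrail Sketch

The 70–80% guardrail and the "shorten tool output / add truncation" step above can be wired into your own tooling. Below is a minimal, illustrative sketch, not part of tokentap's API: the 4-characters-per-token estimate is a rough assumption, the 200,000-token limit is the README's default fuel-gauge value, and the helper names (`context_pressure`, `truncate_to_tokens`) are hypothetical.

```python
# Hypothetical helpers, not part of tokentap: illustrate the
# "fuel gauge at 70-80% -> switch modes" guardrail described above.

DEFAULT_CONTEXT_LIMIT = 200_000   # README's default fuel-gauge limit
CHARS_PER_TOKEN = 4               # rough heuristic, assumption only


def estimate_tokens(text: str) -> int:
    """Crude token estimate; swap in a real tokenizer if you have one."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def context_pressure(used_tokens: int, limit: int = DEFAULT_CONTEXT_LIMIT) -> float:
    """Fraction of the context window already spent (0.0 to 1.0)."""
    return used_tokens / limit


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Keep the head of a tool output within a token budget."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[... truncated ...]"


if __name__ == "__main__":
    used = 150_000                      # e.g. read from your own accounting
    pressure = context_pressure(used)
    if pressure >= 0.7:                 # the 70-80% guardrail from above
        print(f"Context at {pressure:.0%}: switch to summarization/retrieval mode")

    long_tool_output = "x" * 50_000
    print(len(truncate_to_tokens(long_tool_output, max_tokens=2_000)))
```

The threshold and heuristic are starting points; measure against the archives Tokentap saves and adjust for your own prompts and tools.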