Knowledge · May 7, 2026 · 4 min read

LiteLLM Cost Tracking — Per-Project LLM Spend Dashboard

LiteLLM ships a built-in cost dashboard. Track LLM spend by project, user, model, tag. Hard budgets that block at the proxy. SOC2 / SSO via Pro tier.

Agent ready

This asset can be read and installed directly by agents.

TokRepo exposes the CLI command, metadata JSON, install plan, and raw content links so agents can judge fit, risk, and next actions.

Score: 15/100
Target: Claude Code
Kind: Knowledge
Install: Stage only
Trust: New
Entrypoint: Asset
CLI install command
npx tokrepo install 72b2e16c-71b4-4702-87ed-f6ea3ba99f69 --target codex
Intro

LiteLLM's cost-tracking layer attributes every LLM call to a project, user, and tag, then surfaces it in a built-in dashboard. Set hard budgets per team — when the budget is hit, the proxy returns 429 instead of forwarding the request. Best for: any team where 'who's burning our LLM budget' is a recurring question. Works with: LiteLLM Proxy (self-host), LiteLLM Cloud (managed). Setup time: 5 minutes (Postgres + Redis + .env).


Enable in proxy config

# config.yaml
general_settings:
  master_key: sk-master
  database_url: postgresql://litellm:pass@db:5432/litellm
  store_model_in_db: true
  spend_logs_max_age: "90d"  # auto-prune

litellm_settings:
  callbacks: ["langfuse", "prometheus"]  # optional, ship to your observability
Then start the stack:

docker compose up -d  # spins up proxy + Postgres
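The compose file behind that command isn't shown above. A minimal sketch of what it might contain — image tag, service names, and credentials are illustrative placeholders, not from the source:

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports: ["4000:4000"]
    environment:
      DATABASE_URL: postgresql://litellm:pass@db:5432/litellm
      LITELLM_MASTER_KEY: sk-master
    volumes:
      - ./config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml"]
    depends_on: [db]
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: litellm
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: litellm
```

Match the `DATABASE_URL` here to the `database_url` in config.yaml, and keep real credentials out of the file (use an .env).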

Generate keys with budgets

# Per-team
curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"team_alias": "frontend-team", "max_budget": 1000, "budget_duration": "30d"}'

# Per-user (within a team)
curl -X POST http://localhost:4000/user/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "team_id": "frontend-team", "max_budget": 50}'

# Generate a key for that user
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "max_budget": 50}'

When alice@acme.com hits $50, all subsequent calls return 429.
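The enforcement logic is easy to reason about. A hypothetical sketch of the check a budget-enforcing proxy applies before forwarding — an illustration, not LiteLLM's actual code:

```python
def budget_status(spend_usd: float, max_budget_usd: float) -> int:
    """Return the HTTP status a budget-enforcing proxy would use:
    200 = forward the request upstream, 429 = block it."""
    return 429 if spend_usd >= max_budget_usd else 200

# alice@acme.com has a $50 budget
print(budget_status(49.12, 50.0))  # still under budget: forwarded
print(budget_status(50.00, 50.0))  # budget exhausted: blocked
```

Because the check runs at the proxy, a blocked request never reaches the provider, so it costs nothing.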

Tag every request

client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[...],
    extra_body={
        "tags": ["feature:onboarding", "env:prod", "user-tier:enterprise"],
    },
)

The dashboard then shows spend grouped by any tag combination — "How much did onboarding cost last week?", "What's enterprise vs. free-tier spend?"
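Under the hood, tag grouping is just a sum over logged calls. A stdlib-only illustration of the aggregation the dashboard performs — the row shape here is assumed for the example, not LiteLLM's exact log schema:

```python
from collections import defaultdict

# Assumed shape of spend-log rows; real logs carry many more fields.
logs = [
    {"spend": 0.12, "tags": ["feature:onboarding", "env:prod"]},
    {"spend": 0.30, "tags": ["feature:search", "env:prod"]},
    {"spend": 0.05, "tags": ["feature:onboarding", "env:staging"]},
]

spend_by_tag: dict[str, float] = defaultdict(float)
for row in logs:
    for tag in row["tags"]:
        spend_by_tag[tag] += row["spend"]

print(round(spend_by_tag["feature:onboarding"], 2))  # 0.17
print(round(spend_by_tag["env:prod"], 2))            # 0.42
```

Note that a call's spend is counted once per tag, so tag totals overlap and don't sum to total spend.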

Built-in dashboard

Visit http://localhost:4000/ui (credentials come from the UI_USERNAME / UI_PASSWORD environment variables). Tabs:

  • Spend — by team, user, model, tag, date
  • Keys — generate, rotate, revoke
  • Models — health status, RPM/TPM consumed
  • Logs — every call with prompt + response (configurable retention)

Export to your warehouse

litellm_settings:
  callbacks: ["s3"]
  s3_callback_params:
    s3_bucket_name: my-llm-logs
    s3_region_name: us-east-1

Each call lands as a JSON line in S3 — query with Athena / DuckDB.
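For instance, a per-model rollup in DuckDB might look like the following — the `spend` and `model` field names are assumptions about the exported log schema; adjust them to what your export actually contains:

```sql
SELECT model, ROUND(SUM(spend), 2) AS total_usd
FROM read_json_auto('s3://my-llm-logs/**/*.json')
GROUP BY model
ORDER BY total_usd DESC;
```

DuckDB's read_json_auto infers the schema directly from the JSON lines, so no table definition is needed.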


FAQ

Q: Does this require LiteLLM Pro? A: No — cost tracking, the dashboard, per-team budgets, and S3 export are all in the open-source proxy. Pro adds SOC2 attestation, SSO/SAML, and managed hosting.

Q: How accurate is the cost tracking? A: LiteLLM multiplies the token counts each provider returns by its per-model pricing table to compute a cost per call. For models without published prices (e.g. local Ollama), set input_cost_per_token / output_cost_per_token in config.yaml; otherwise those calls log as $0.
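The arithmetic itself is straightforward. A sketch of the per-call calculation, using illustrative prices rather than any real provider's rates:

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_cost_per_token: float, output_cost_per_token: float) -> float:
    """Cost of one call: tokens in each direction times the per-token rate."""
    return (input_tokens * input_cost_per_token
            + output_tokens * output_cost_per_token)

# e.g. 1,000 prompt tokens + 500 completion tokens,
# at $3 / 1M input tokens and $15 / 1M output tokens (illustrative rates):
cost = call_cost(1_000, 500, 3 / 1_000_000, 15 / 1_000_000)
print(round(cost, 4))  # 0.0105
```

This is also exactly what the config.yaml override does for unpriced local models: you supply the two per-token rates yourself.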

Q: Can I block PII before sending to providers? A: Yes — LiteLLM has a Guardrails layer (regex-based or via Presidio / Lakera) that runs before the request goes upstream. Combined with cost tracking, you can also block specific tags from going to specific providers.
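As a rough picture of what a regex-based guardrail does before a request goes upstream — an illustration only, not LiteLLM's Guardrails API:

```python
import re

# Illustrative patterns: email addresses and US-SSN-shaped strings.
PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-like
]

def block_if_pii(prompt: str) -> str:
    """Reject the prompt before it is forwarded if any pattern matches."""
    for pattern in PII_PATTERNS:
        if pattern.search(prompt):
            raise ValueError("blocked: prompt contains PII")
    return prompt

block_if_pii("Summarize this support ticket")      # passes through
# block_if_pii("Contact alice@acme.com")           # would raise ValueError
```

Presidio / Lakera replace the regex list with ML-based detectors, but the control flow — inspect, then block or forward — is the same.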


Quick Use

  1. Have LiteLLM Proxy running with a Postgres database_url
  2. Visit http://localhost:4000/ui and log in with UI_USERNAME / UI_PASSWORD
  3. Generate a team + user with budgets via the API snippets above



Source & Thanks

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — Spend Tracking — ⭐ 17,000+

