Knowledge · May 7, 2026 · 4 min read

LiteLLM Cost Tracking — Per-Project LLM Spend Dashboard

LiteLLM ships a built-in cost dashboard. Track LLM spend by project, user, model, and tag. Set hard budgets that block at the proxy. SOC2 / SSO come with the Pro tier.

Introduction

LiteLLM's cost-tracking layer attributes every LLM call to a project, user, and tag, then surfaces it in a built-in dashboard. Set hard budgets per team — when the budget is hit, the proxy returns 429 instead of forwarding the request. Best for: any team where 'who's burning our LLM budget' is a recurring question. Works with: LiteLLM Proxy (self-host), LiteLLM Cloud (managed). Setup time: 5 minutes (Postgres + Redis + .env).


Enable in proxy config

# config.yaml
general_settings:
  master_key: sk-master
  database_url: postgresql://litellm:pass@db:5432/litellm
  store_model_in_db: true
  spend_logs_max_age: "90d"  # auto-prune

litellm_settings:
  callbacks: ["langfuse", "prometheus"]  # optional, ship to your observability

Then start the stack:

docker compose up -d  # spins up proxy + Postgres
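The compose step assumes a file defining the proxy and its Postgres. A minimal sketch — image tag, credentials, and the UI login are placeholders to adjust for your setup:

```yaml
# docker-compose.yml — minimal sketch; credentials and UI login are placeholders
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000"
    environment:
      DATABASE_URL: postgresql://litellm:pass@db:5432/litellm
      LITELLM_MASTER_KEY: sk-master
      UI_USERNAME: admin
      UI_PASSWORD: change-me
    volumes:
      - ./config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml"]
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: litellm
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: litellm
```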

Generate keys with budgets

# Per-team
curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"team_alias": "frontend-team", "max_budget": 1000, "budget_duration": "30d"}'

# Per-user (within a team)
curl -X POST http://localhost:4000/user/new \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "team_id": "frontend-team", "max_budget": 50}'

# Generate a key for that user
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-master" \
  -d '{"user_id": "alice@acme.com", "max_budget": 50}'

When alice@acme.com hits $50, all subsequent calls return 429.

Tag every request

from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-...")  # a key issued by the proxy

client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[...],
    extra_body={
        "tags": ["feature:onboarding", "env:prod", "user-tier:enterprise"],
    },
)

The dashboard then shows spend grouped by any tag combination — "How much did onboarding cost last week?" "What's enterprise vs. free spend?"
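Grouping only works if tags stay consistent, so it can pay to enforce the `key:value` shape before requests go out. A hypothetical helper — the allowed prefixes are an example policy, not part of LiteLLM:

```python
# Hypothetical helper — enforces the key:value tag convention used above.
# The allowed prefixes are an example policy, not part of LiteLLM.
ALLOWED_PREFIXES = {"feature", "env", "user-tier"}

def validate_tags(tags: list[str]) -> list[str]:
    for tag in tags:
        key, sep, value = tag.partition(":")
        if not sep or not value or key not in ALLOWED_PREFIXES:
            raise ValueError(f"malformed tag: {tag!r}")
    return tags
```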

Built-in dashboard

Visit http://localhost:4000/ui (credentials come from the UI_USERNAME / UI_PASSWORD environment variables). Tabs:

  • Spend — by team, user, model, tag, date
  • Keys — generate, rotate, revoke
  • Models — health status, RPM/TPM consumed
  • Logs — every call with prompt + response (configurable retention)

Export to your warehouse

litellm_settings:
  callbacks: ["s3"]
  s3_callback_params:
    s3_bucket_name: my-llm-logs
    s3_region_name: us-east-1

Each call lands as a JSON line in S3 — query with Athena / DuckDB.
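Once the logs are in object storage, per-tag rollups are a few lines of scripting. A sketch over a local copy of the JSON lines — the `spend` and `request_tags` field names are assumptions here; inspect your own exported records:

```python
# Sketch: aggregate exported spend logs by tag. The "spend" and
# "request_tags" field names are assumptions — check your own records.
import json
from collections import defaultdict

def spend_by_tag(jsonl_lines):
    totals = defaultdict(float)
    for line in jsonl_lines:
        record = json.loads(line)
        for tag in record.get("request_tags", []):
            totals[tag] += record.get("spend", 0.0)
    return dict(totals)
```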


FAQ

Q: Does this require LiteLLM Pro? A: No — cost tracking, the dashboard, per-team budgets, and S3 export are all in the open-source proxy. Pro adds SOC2 attestation, SSO/SAML, and managed hosting.

Q: How accurate is the cost tracking? A: LiteLLM multiplies each provider's official token counts by its per-model pricing table to calculate cost per call. For models without published prices (e.g. local Ollama), set input_cost_per_token / output_cost_per_token in config.yaml; otherwise those calls log as $0.
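A config sketch for pricing a local model — the model name, api_base, and the per-token numbers are placeholders to replace with your own amortized costs:

```yaml
# Sketch: assign custom per-token prices to a local model.
# Model name, api_base, and the numbers are placeholders.
model_list:
  - model_name: local-llama
    litellm_params:
      model: ollama/llama3
      api_base: http://localhost:11434
      input_cost_per_token: 0.0000002
      output_cost_per_token: 0.0000006
```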

Q: Can I block PII before sending to providers? A: Yes — LiteLLM has a Guardrails layer (regex-based or via Presidio / Lakera) that runs before the request goes upstream. Combined with cost tracking, you can also block specific tags from going to specific providers.
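As an illustration of the regex-based flavor — a standalone sketch of the idea, not LiteLLM's actual Guardrails API:

```python
# Standalone sketch of regex-based PII masking — illustrates the idea,
# not LiteLLM's Guardrails API.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact(text: str) -> str:
    return EMAIL.sub("[EMAIL]", text)
```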


Quick Use

  1. Have LiteLLM Proxy running with a Postgres database_url
  2. Visit http://localhost:4000/ui and log in with UI_USERNAME / UI_PASSWORD
  3. Generate a team + user with budgets via the API snippets above


Source & Thanks

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — Spend Tracking — ⭐ 17,000+

