

Knowledge · May 7, 2026 · 4 min read

LiteLLM Cost Tracking — Per-Project LLM Spend Dashboard

Name: LiteLLM Cost Tracking — Per-Project LLM Spend Dashboard
Author: LiteLLM (BerriAI)

LiteLLM ships a built-in cost dashboard that tracks LLM spend by project, user, model, and tag, with hard budgets enforced at the proxy. SOC2 and SSO are available in the Pro tier.


Agent-ready

This asset can be read and installed directly by agents. TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and raw content so that agents can assess compatibility, risk, and next steps.

Agent surface: Any MCP/CLI agent
Type: Knowledge
Installation: Stage only (15/100)
Trust: New
Entry: Asset

Universal CLI command

npx tokrepo install 72b2e16c-71b4-4702-87ed-f6ea3ba99f69

Introduction

LiteLLM's cost-tracking layer attributes every LLM call to a project, user, and tag, then surfaces the totals in a built-in dashboard. Hard budgets can be set per team: once a budget is exhausted, the proxy returns 429 instead of forwarding the request.

Best for: any team where "who's burning our LLM budget?" is a recurring question.
Works with: LiteLLM Proxy (self-hosted), LiteLLM Cloud (managed).
Setup time: 5 minutes (Postgres + Redis + .env).
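The budget gate described above can be pictured with a minimal sketch. This is not LiteLLM's implementation; the `Budget` class and the `gate` and `record_cost` functions are hypothetical names used only to illustrate "reject with 429 once spend reaches the limit".

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical record of a team's limit and spend so far, in USD."""
    max_budget: float
    spend: float = 0.0


def gate(budget: Budget) -> int:
    """Return the HTTP status a budget-enforcing proxy would answer with.

    200 stands for "forward the request upstream"; 429 stands for
    "budget exhausted, reject without calling the provider".
    """
    return 429 if budget.spend >= budget.max_budget else 200


def record_cost(budget: Budget, cost: float) -> None:
    """Add the cost of a completed call to the running total."""
    budget.spend += cost


team = Budget(max_budget=1000.0)
print(gate(team))          # under budget, so the request is forwarded
record_cost(team, 1000.0)
print(gate(team))          # budget reached, so the request is rejected
```

The point of the design is that the check happens before the upstream call, so a team that has exhausted its budget generates no further provider charges.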

Enable in proxy config

# config.yaml
general_settings:
  master_key: sk-master
  database_url: postgresql://litellm:pass@db:5432/litellm
  store_model_in_db: true
  spend_logs_max_age: "90d"  # auto-prune

litellm_settings:
  callbacks: ["langfuse", "prometheus"]  # optional, ship to your observability

docker compose up -d  # spins up proxy + Postgres

Generate keys with budgets

# Per-team (team_id is set explicitly so the user call below can reference it)
curl -X POST http://localhost:4000/team/new \
  -H "Authorization: Bearer sk-master" \
  -H "Content-Type: application/json" \
  -d '{"team_id": "frontend-team", "team_alias": "frontend-team", "max_budget": 1000, "budget_duration": "30d"}'

# Per-user (within a team)
curl -X POST http://localhost:4000/user/new \
  -H "Authorization: Bearer sk-master" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "alice@acme.com", "team_id": "frontend-team", "max_budget": 50}'

# Generate a key for that user
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-master" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "alice@acme.com", "max_budget": 50}'

When alice@acme.com hits $50, all subsequent calls return 429.
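For scripting the same calls, a standard-library sketch along these lines may help. The `build_request` and `send` helpers are hypothetical names; the URL, master key, paths, and payload fields are taken from the curl examples, and nothing is sent until `send` is called against a running proxy.

```python
import json
import urllib.request

BASE_URL = "http://localhost:4000"   # proxy address used in the curl examples
MASTER_KEY = "sk-master"             # master_key from config.yaml


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST to the proxy."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {MASTER_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


key_req = build_request(
    "/key/generate",
    {"user_id": "alice@acme.com", "max_budget": 50},
)
print(key_req.get_method(), key_req.full_url)
# Call send(key_req) once the proxy is running.
```

With `urllib`, a non-2xx answer such as the 429 mentioned above surfaces as `urllib.error.HTTPError`, whose `code` attribute carries the status.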

Tag every request

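from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM proxy; api_key is a key from /key/generate.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-...")

client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[...],  # your chat messages
    extra_body={
        # Tags travel with the request and are used for spend grouping.
        "tags": ["feature:onboarding", "env:prod", "user-tier:enterprise"],
    },
)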

The dashboard then shows spend grouped by any combination of tags, answering questions such as "How much did onboarding cost last week?" or "What is enterprise vs. free spend?"
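To make the grouping concrete, here is a small sketch of tag-based aggregation over made-up records. The record shape (`tags`, `spend`) and both function names are assumptions for illustration, not LiteLLM's schema or API.

```python
from collections import defaultdict

# Hypothetical spend records; the field names are illustrative.
calls = [
    {"tags": ["feature:onboarding", "env:prod"], "spend": 0.25},
    {"tags": ["feature:search", "env:prod"], "spend": 0.5},
    {"tags": ["feature:onboarding", "env:staging"], "spend": 0.125},
]


def spend_by_tag(records):
    """Sum spend per tag; a call with several tags counts toward each one."""
    totals = defaultdict(float)
    for record in records:
        for tag in record["tags"]:
            totals[tag] += record["spend"]
    return dict(totals)


def spend_for_all(records, required_tags):
    """Sum spend over calls that carry every tag in required_tags."""
    required = set(required_tags)
    return sum(r["spend"] for r in records if required <= set(r["tags"]))


print(spend_by_tag(calls)["feature:onboarding"])                  # 0.375
print(spend_for_all(calls, ["feature:onboarding", "env:prod"]))   # 0.25
```

Because a call counts toward every tag it carries, per-tag totals can add up to more than total spend; filtering on a tag combination avoids that double counting.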

Built-in dashboard

Visit http://localhost:4000/ui and log in with the credentials set in UI_USERNAME / UI_PASSWORD. Tabs:

Spend — by team, user, model, tag, date
Keys — generate, rotate, revoke
Models — health status, RPM/TPM consumed
Logs — every call with prompt + response (configurable retention)

Export to your warehouse

litellm_settings:
  callbacks: ["s3"]
  s3_callback_params:
    s3_bucket_name: my-llm-logs
    s3_region_name: us-east-1

Each call lands as a JSON line in S3 — query with Athena / DuckDB.
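Before wiring up Athena or DuckDB, a downloaded log file can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions: the key names `model` and `response_cost`, the second model name, and the sample lines are made up for illustration, so check them against a real exported object first.

```python
import json
from collections import Counter

# Two-field sample lines; real exports carry many more fields, and the
# key names "model" and "response_cost" are assumptions for this sketch.
sample = """\
{"model": "claude-3-5-sonnet", "response_cost": 0.5}
{"model": "other-model", "response_cost": 0.25}
{"model": "claude-3-5-sonnet", "response_cost": 0.25}
"""


def cost_by_model(jsonl_text: str) -> dict:
    """Total the cost field per model across newline-delimited JSON."""
    totals = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)
        totals[row["model"]] += row["response_cost"]
    return dict(totals)


print(cost_by_model(sample))
```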

FAQ

Q: Does this require LiteLLM Pro? A: No — cost tracking, the dashboard, per-team budgets, and S3 export are all in the open-source proxy. Pro adds SOC2 attestation, SSO/SAML, and managed hosting.

Q: How accurate is the cost tracking? A: LiteLLM calculates the cost of each call from the provider's official token counts and a per-model pricing table. For models without published prices (e.g. local Ollama), set input_cost_per_token / output_cost_per_token in config.yaml; otherwise the call is logged as $0.
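The arithmetic behind that answer is simple enough to sketch. The `call_cost` function is a hypothetical helper, and the prices are illustrative numbers rather than any provider's real rates; only the parameter names input_cost_per_token and output_cost_per_token come from the text above.

```python
def call_cost(prompt_tokens, completion_tokens,
              input_cost_per_token=0.0, output_cost_per_token=0.0):
    """Cost of one call: tokens in each direction times the per-token price.

    Both prices default to 0.0, mirroring a model with no configured price.
    """
    return (prompt_tokens * input_cost_per_token
            + completion_tokens * output_cost_per_token)


# Illustrative prices only, in USD per token.
priced = call_cost(1000, 500,
                   input_cost_per_token=3e-6,
                   output_cost_per_token=15e-6)
unpriced = call_cost(1000, 500)

print(f"{priced:.4f}")   # 0.0105
print(unpriced)          # 0.0
```

The second call shows why an unpriced local model silently appears as $0 in spend reports until its per-token costs are configured.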

Q: Can I block PII before sending to providers? A: Yes — LiteLLM has a Guardrails layer (regex-based or via Presidio / Lakera) that runs before the request goes upstream. Combined with cost tracking, you can also block specific tags from going to specific providers.

Quick Use

Have LiteLLM Proxy running with a Postgres database_url
Visit http://localhost:4000/ui and log in with UI_USERNAME / UI_PASSWORD
Generate a team + user with budgets via the API snippets above


Source & Thanks

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — Spend Tracking — ⭐ 17,000+
