Knowledge · May 19, 2026 · 2 min read

Embedding Drift Monitoring — Retrieval Regression Runbook

Embedding drift monitoring runbook for RAG and agent search. Uses golden queries, recall@K, rank delta, and rollback gates.

Direct install command
npx -y tokrepo@latest install ea696ee5-0736-48e3-a789-f5a026223bd0 --target codex

Run after confirming the plan with a dry run.

Metrics That Matter

Metric | Use
Recall@K on golden queries | Catches lost must-return documents.
Rank delta for critical docs | Shows whether important docs fell below the fold.
Top-K overlap | Detects broad distribution shifts.
Empty-result rate | Finds tokenizer, filter, or metadata regressions.
Click or install follow-through | Confirms search quality after launch.
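
As a rough illustration, the offline metrics above can be computed from ranked lists of document IDs. This is a minimal sketch, not the runbook's own tooling: the function names, the Jaccard definition of top-K overlap, and the convention of ranking a missing document one position past the end of the list are all assumptions made for the example.

```python
from typing import Dict, Sequence, Set


def recall_at_k(results: Sequence[str], must_return: Set[str], k: int) -> float:
    """Fraction of must-return document IDs present in the top-k results."""
    if not must_return:
        return 1.0  # nothing is required, so nothing can be lost
    return len(set(results[:k]) & must_return) / len(must_return)


def rank_delta(baseline: Sequence[str], candidate: Sequence[str], doc_id: str) -> int:
    """Rank change for one document; a positive value means it dropped.

    Assumption: a document missing from a list is ranked one past its end.
    """
    def rank(results: Sequence[str]) -> int:
        items = list(results)
        return items.index(doc_id) + 1 if doc_id in items else len(items) + 1

    return rank(candidate) - rank(baseline)


def top_k_overlap(baseline: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """Jaccard overlap between the two top-k sets (an assumed definition)."""
    a, b = set(baseline[:k]), set(candidate[:k])
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def empty_result_rate(runs: Dict[str, Sequence[str]]) -> float:
    """Share of queries that returned no results at all."""
    if not runs:
        return 0.0
    return sum(1 for results in runs.values() if not results) / len(runs)


# Hypothetical golden query: same query against the old and new index.
baseline = ["doc-a", "doc-b", "doc-c", "doc-d"]
candidate = ["doc-b", "doc-x", "doc-a", "doc-y"]

print(recall_at_k(candidate, {"doc-a", "doc-c"}, k=3))
print(rank_delta(baseline, candidate, "doc-a"))
print(top_k_overlap(baseline, candidate, k=3))
print(empty_result_rate({"q1": ["doc-a"], "q2": []}))
```

Click or install follow-through is left out because it depends on post-launch analytics rather than on an offline comparison of two indexes.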

Vector distance alone is not enough. A lower average distance can still be worse if the wrong assets now rank above the exact answer.
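
A small worked example makes the point concrete. The document IDs and distances below are invented for illustration, and `rank_of` is a hypothetical helper; the sketch only assumes that each result is a (document ID, distance) pair where a smaller distance ranks higher.

```python
from statistics import mean
from typing import List, Optional, Tuple

Result = Tuple[str, float]  # (document ID, vector distance), smaller is closer


def rank_of(results: List[Result], doc_id: str) -> Optional[int]:
    """1-based rank of doc_id after sorting by ascending distance, or None."""
    ordered = sorted(results, key=lambda pair: pair[1])
    for position, (candidate_id, _) in enumerate(ordered, start=1):
        if candidate_id == doc_id:
            return position
    return None


# Made-up results for one query before and after an embedding change.
baseline = [("exact-answer", 0.30), ("related-a", 0.50), ("related-b", 0.60)]
candidate = [("related-a", 0.20), ("related-b", 0.22), ("exact-answer", 0.25)]

baseline_mean = mean(distance for _, distance in baseline)
candidate_mean = mean(distance for _, distance in candidate)

# The candidate has the lower average distance...
print(candidate_mean < baseline_mean)
# ...yet the exact answer fell from rank 1 to rank 3.
print(rank_of(baseline, "exact-answer"), rank_of(candidate, "exact-answer"))
```

Every distance shrank in the candidate run, so an average-distance dashboard would look healthier, while a rank-based check on the exact answer would flag the regression.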

Change Types To Test

  • Embedding model upgrade or provider switch.
  • Chunk size, overlap, or markdown parsing change.
  • Metadata filter changes such as visibility, asset_kind, or language.
  • Hybrid ranking weight changes between BM25 and vector score.
  • Corpus refresh that adds many near-duplicate documents.

Ship Gate

Ship only when:

  1. Must-include recall does not regress.
  2. Empty-result rate does not increase for high-intent queries.
  3. Critical docs stay within the top 3 or top 5 results, whichever cutoff is expected for the query.
  4. Any intentional ranking shift is documented with examples.
  5. Rollback is available: old index, old embedding model, or old ranker config.
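
The five conditions above could be encoded as an automated check along these lines. This is a sketch under stated assumptions: `GateInput`, its field names, and `ship_gate` are hypothetical, and the inputs are presumed to come from a comparison of the baseline and candidate index on the golden queries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class GateInput:
    """Hypothetical summary of a baseline-versus-candidate evaluation run."""
    baseline_recall: float            # must-include recall on the old index
    candidate_recall: float           # must-include recall on the new index
    baseline_empty_rate: float        # empty-result rate, high-intent queries
    candidate_empty_rate: float
    # doc ID -> (rank in candidate results or None if absent, max allowed rank)
    critical_doc_ranks: Dict[str, Tuple[Optional[int], int]] = field(default_factory=dict)
    undocumented_shifts: List[str] = field(default_factory=list)
    rollback_available: bool = False


def ship_gate(g: GateInput) -> List[str]:
    """Return the list of failed conditions; an empty list means ship."""
    failures: List[str] = []
    if g.candidate_recall < g.baseline_recall:
        failures.append("must-include recall regressed")
    if g.candidate_empty_rate > g.baseline_empty_rate:
        failures.append("empty-result rate increased")
    for doc_id, (rank, max_rank) in g.critical_doc_ranks.items():
        if rank is None or rank > max_rank:
            failures.append(f"critical doc {doc_id} outside top {max_rank}")
    for shift in g.undocumented_shifts:
        failures.append(f"ranking shift not documented: {shift}")
    if not g.rollback_available:
        failures.append("no rollback path")
    return failures


# Example with invented numbers: one critical doc slipped past its cutoff.
report = ship_gate(GateInput(
    baseline_recall=0.92,
    candidate_recall=0.94,
    baseline_empty_rate=0.01,
    candidate_empty_rate=0.01,
    critical_doc_ranks={"install-guide": (2, 3), "pricing-faq": (7, 5)},
    rollback_available=True,
))
print(report)
```

Returning the failed conditions rather than a single boolean keeps the reason for a blocked release visible in the evaluation log.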

Source and acknowledgments

This is an original TokRepo runbook by William Wang. It uses standard IR evaluation ideas such as recall@K and rank movement, and applies them to vector search systems commonly used with RAG and agent registries.
