Prompts · May 14, 2026 · 2 min read

Defender — Prompt Injection Guardrails for Agents

Defender is an open-source library that detects and neutralizes prompt injection in tool outputs. It has 97 GitHub stars (verified) and bundles a ~22MB ONNX model.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and raw content so that agents can evaluate compatibility, risk, and next steps.

Native · 96/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: Prompt
Installation: Single
Trust: Community
Entry: Asset
Universal CLI command:
npx tokrepo install 21666b2c-58cb-50da-b8ac-5a3b476463b1

Introduction

Best for: Agent builders who pipe email/docs/PRs into an LLM and want injection defense before the model sees text

Works with: Node.js/TypeScript agents (MCP/CLI/function-calling) that can gate tool outputs before LLM calls

Setup time: 6-15 minutes

Key facts (verified)

  • GitHub: 97 stars · 9 forks · pushed 2026-05-13.
  • License: Apache-2.0 · owner avatar + repo URL verified via GitHub API.
  • README-backed entrypoint: npm install @stackone/defender.

Main

  • Make the defense boundary explicit: treat tool results as untrusted input and gate them before they enter model context.

  • Start conservative: block high-risk results first, then allowlist or override individual tool fields once you understand the false positives.

  • Log evidence: store riskLevel, tier2Score, and matched detections so you can tune safely over time.
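
The three points above can be sketched as a single gate function. This is a minimal illustration, not Defender's API: the field names riskLevel, tier2Score, and detections come from the text, while `Scanner`, `gateToolOutput`, `perToolLimit`, the tool name `internal_wiki`, and the toy scanner are all hypothetical. In a real agent the `scan` argument would wrap whatever call the library's README documents.

```typescript
// Field names riskLevel, tier2Score, and detections follow the text above;
// every type, function, and tool name below is hypothetical.
type RiskLevel = "low" | "medium" | "high";

interface ScanResult {
  riskLevel: RiskLevel;
  tier2Score: number;   // classifier score, assumed to lie in [0, 1]
  detections: string[]; // names of matched patterns
}

type Scanner = (text: string) => ScanResult;

const RANK: Record<RiskLevel, number> = { low: 0, medium: 1, high: 2 };

// Conservative default: anything above "medium" is blocked.
const DEFAULT_LIMIT: RiskLevel = "medium";

// Per-tool overrides, added only after reviewing false positives.
const perToolLimit: Partial<Record<string, RiskLevel>> = {
  internal_wiki: "high", // hypothetical tool name
};

interface GateDecision {
  allowed: boolean;
  text: string; // what may enter model context
  evidence: ScanResult & { tool: string };
}

function gateToolOutput(
  tool: string,
  output: string,
  scan: Scanner,
  log: (evidence: object) => void,
): GateDecision {
  const result = scan(output); // tool output is treated as untrusted
  const limit = perToolLimit[tool] ?? DEFAULT_LIMIT;
  const allowed = RANK[result.riskLevel] <= RANK[limit];
  const evidence = { tool, ...result };
  log(evidence); // keep evidence so thresholds can be tuned later
  return {
    allowed,
    text: allowed ? output : "[tool output withheld: possible prompt injection]",
    evidence,
  };
}

// Toy scanner so the sketch runs on its own; it is NOT the library's detector.
const toyScan: Scanner = (text) => {
  const hit = /ignore (previous|prior) instructions/i.test(text);
  return {
    riskLevel: hit ? "high" : "low",
    tier2Score: hit ? 0.95 : 0.05,
    detections: hit ? ["override_instructions"] : [],
  };
};
```

Only `decision.text` should be appended to the model context. Passing the logger in as a parameter keeps the gate easy to test and lets the evidence go to whatever store you already use.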

Source-backed notes

  • README states the ONNX model (~22MB) is bundled — no extra downloads required.
  • README describes a two-tier pipeline (pattern detection + ML classifier) and mentions ~10ms/sample after warmup.
  • README positions it for MCP/CLI/tool-call agents to sanitize tool results (emails, documents, PRs) before LLM use.
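
To make the two-tier idea concrete, here is a rough sketch of how pattern detection and a classifier score might be combined into one risk level. It is not the library's implementation: the regexes, the 0.5 and 0.8 thresholds, the `classify` function, and the stub classifier are assumptions chosen for illustration, and the real second tier is the bundled ONNX model rather than a keyword check.

```typescript
type RiskLevel = "low" | "medium" | "high";

interface TierResult {
  riskLevel: RiskLevel;
  tier2Score: number;   // assumed to lie in [0, 1]
  detections: string[]; // names of tier-1 patterns that matched
}

// Tier 1: cheap pattern matching. These regexes are illustrative only.
const PATTERNS: { name: string; re: RegExp }[] = [
  { name: "override_instructions", re: /ignore (all |any )?(previous|prior) instructions/i },
  { name: "role_hijack", re: /you are now (a|an|the) /i },
];

// Tier 2: stand-in for an ML classifier that returns a score in [0, 1].
type Classifier = (text: string) => number;

function classify(text: string, tier2: Classifier): TierResult {
  const detections = PATTERNS.filter((p) => p.re.test(text)).map((p) => p.name);
  const tier2Score = tier2(text);

  // Assumed combination rule: any pattern hit or a high score means "high".
  let riskLevel: RiskLevel = "low";
  if (detections.length > 0 || tier2Score >= 0.8) {
    riskLevel = "high";
  } else if (tier2Score >= 0.5) {
    riskLevel = "medium";
  }
  return { riskLevel, tier2Score, detections };
}

// Stub classifier so the sketch runs without a model file.
const stubClassifier: Classifier = (text) =>
  /system prompt|secret/i.test(text) ? 0.6 : 0.1;
```

The patterns catch known phrasings cheaply, while the score is meant to cover paraphrases the patterns miss. Where the thresholds should sit depends on your tolerance for false positives.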

FAQ

  • Does this replace secure prompting? No. It is an extra guardrail; still keep strong system prompts and tool permissioning.
  • Will it slow down my agent? README cites ~10ms/sample after warmup; measure on your workload and cache where possible.
  • Where should I apply it? At the boundary: right after receiving tool output and before adding it to model context.
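
The advice to measure and cache can be sketched with two small helpers. Both are hypothetical and independent of the library: `withCache` memoizes any scan function by exact input text, and `meanLatencyMs` gives a coarse per-sample average after a warmup pass. `Date.now()` only has millisecond resolution, so use enough samples for the average to be meaningful.

```typescript
// Memoize a scan function by exact input text, with simple FIFO eviction.
// Assumes scan results are deterministic for a given text.
function withCache<R>(
  scan: (text: string) => R,
  maxEntries = 1000,
): (text: string) => R {
  const cache = new Map<string, R>();
  return (text: string): R => {
    if (cache.has(text)) {
      return cache.get(text) as R;
    }
    const result = scan(text);
    if (cache.size >= maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(text, result);
    return result;
  };
}

// Coarse mean latency per sample in milliseconds, after warmup passes.
// Pass the uncached scanner here, or the warmup will fill the cache.
function meanLatencyMs(
  scan: (text: string) => unknown,
  samples: string[],
  warmupPasses = 1,
): number {
  if (samples.length === 0) return 0;
  for (let i = 0; i < warmupPasses; i++) {
    for (const s of samples) scan(s);
  }
  const start = Date.now();
  for (const s of samples) scan(s);
  return (Date.now() - start) / samples.length;
}
```

Caching by full text helps most when the same tool output is fetched repeatedly, such as a polled document. For large payloads, keying on a content hash instead of the raw string would reduce memory use.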

Source and acknowledgments

Source: https://github.com/StackOneHQ/defender · License: Apache-2.0 · GitHub stars: 97 · forks: 9
