Prompts · May 14, 2026 · 2 min read

Defender — Prompt Injection Guardrails for Agents

Defender is an open-source library that detects and neutralizes prompt injection in tool outputs. Its GitHub repository has 97 stars (verified), and the package bundles a ~22MB ONNX model.

Agent ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and the raw content to help agents assess fit, risk, and next steps.

Native · 96/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: Prompt
Installation: Single
Trust: Community
Entry point: Asset
Universal CLI command:
npx tokrepo install 21666b2c-58cb-50da-b8ac-5a3b476463b1

Introduction

Best for: Agent builders who pipe email/docs/PRs into an LLM and want injection defense before the model sees text

Works with: Node.js/TypeScript agents (MCP/CLI/function-calling) that can gate tool outputs before LLM calls

Setup time: 6-15 minutes

Key facts (verified)

  • GitHub: 97 stars · 9 forks · pushed 2026-05-13.
  • License: Apache-2.0 · owner avatar + repo URL verified via GitHub API.
  • README-backed entrypoint: npm install @stackone/defender.

Main

  • Make the defense boundary explicit: treat tool results as untrusted input and gate them before they enter model context.

  • Start conservative: block high-risk results, then whitelist/override per-tool fields once you understand false positives.

  • Log evidence: store riskLevel, tier2Score, and matched detections so you can tune safely over time.
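The three points above can be sketched as a small policy layer that sits between the scanner and the model. This is a minimal illustration, not Defender's API: only the field names riskLevel, tier2Score, and detections come from the text above, while their types, the `gate` and `decide` helpers, and the per-tool override set are hypothetical. The scan result is passed in as a plain value, so any scanner can be plugged in.

```typescript
// Field names riskLevel, tier2Score, and detections come from the text above.
// Their types and value ranges are assumptions, not a documented schema.
type RiskLevel = "low" | "medium" | "high";

interface ScanResult {
  riskLevel: RiskLevel;
  tier2Score: number; // assumed: classifier score, higher means more suspicious
  detections: string[]; // assumed: identifiers of matched patterns
}

interface EvidenceRecord extends ScanResult {
  tool: string;
  blocked: boolean;
  timestamp: string;
}

// Hypothetical per-tool overrides: tools listed here are logged but never blocked.
// Start with an empty set and add entries only after reviewing false positives.
type Overrides = ReadonlySet<string>;

// Conservative default: block every high-risk result unless the tool is overridden.
function decide(tool: string, result: ScanResult, overrides: Overrides): EvidenceRecord {
  const blocked = result.riskLevel === "high" && !overrides.has(tool);
  return { ...result, tool, blocked, timestamp: new Date().toISOString() };
}

// Returns the text that may enter model context, or a neutral placeholder when blocked.
function gate(
  tool: string,
  output: string,
  result: ScanResult,
  overrides: Overrides,
  log: (record: EvidenceRecord) => void,
): string {
  const record = decide(tool, result, overrides);
  log(record); // evidence is stored for every call, blocked or not
  return record.blocked ? `[${tool} output withheld: flagged as high risk]` : output;
}
```

Keeping the decision separate from the scan makes the policy easy to test and tune: the logged records show which tools would be affected before an override is added.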

Source-backed notes

  • README states the ONNX model (~22MB) is bundled — no extra downloads required.
  • README describes a two-tier pipeline (pattern detection + ML classifier) and mentions ~10ms/sample after warmup.
  • README positions it for MCP/CLI/tool-call agents to sanitize tool results (emails, documents, PRs) before LLM use.
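To make the two-tier idea concrete, here is a toy sketch of a pattern pass followed by a classifier score. It is not taken from Defender's source: the regular expressions, the 0.8 threshold, the rule for combining tiers into a risk level, and the names `scanTwoTier` and `Classifier` are all assumptions chosen for illustration. The classifier is a plain function so that a stub can stand in for a real model.

```typescript
// Illustrative two-tier scan. Patterns, threshold, and the combination rule
// are assumptions for this sketch, not the library's actual behavior.
const PATTERNS: { id: string; re: RegExp }[] = [
  { id: "ignore_instructions", re: /ignore (all )?(previous|prior) instructions/i },
  { id: "role_override", re: /you are now\b/i },
];

// Tier 2 is pluggable; a real setup would call an ML classifier here.
type Classifier = (text: string) => number; // assumed score in [0, 1]

interface TwoTierResult {
  detections: string[];
  tier2Score: number;
  riskLevel: "low" | "medium" | "high";
}

function scanTwoTier(text: string, classify: Classifier, threshold = 0.8): TwoTierResult {
  // Tier 1: cheap pattern matching over known injection phrasings.
  const detections = PATTERNS.filter((p) => p.re.test(text)).map((p) => p.id);

  // Tier 2: classifier score for phrasings the patterns miss.
  const tier2Score = classify(text);
  const flaggedByModel = tier2Score >= threshold;

  // Assumed rule: both tiers agree -> high, one tier fires -> medium, else low.
  const matched = detections.length > 0;
  const riskLevel = matched && flaggedByModel ? "high" : matched || flaggedByModel ? "medium" : "low";

  return { detections, tier2Score, riskLevel };
}
```

The pattern tier catches known phrasings cheaply, while the classifier tier covers paraphrases; how the two signals are combined is a policy choice that should be checked against the library's own documentation.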

FAQ

  • Does this replace secure prompting? No. It is an extra guardrail; keep strong system prompts and tool permissioning in place.
  • Will it slow down my agent? The README cites ~10ms/sample after warmup; measure on your own workload and cache results where possible.
  • Where should I apply it? At the boundary: right after receiving tool output and before adding it to model context.
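The advice to measure and cache can be sketched with two small helpers. Both are hypothetical and independent of any particular scanner: `withCache` memoizes results by exact input text, and `averageMs` reports average wall-clock time per sample after one warmup call. A synchronous scan function is assumed for brevity; for an asynchronous scanner, `withCache` would cache the returned promises, and the timing loop would need to await each call.

```typescript
// Wraps a scan function with an in-memory cache keyed by the exact input text.
// The scan function is a placeholder for whatever scanner is in use.
function withCache<T>(scan: (text: string) => T, maxEntries = 1000): (text: string) => T {
  const cache = new Map<string, T>();
  return (text: string): T => {
    if (cache.has(text)) return cache.get(text) as T;
    const result = scan(text);
    if (cache.size >= maxEntries) {
      // Evict the oldest entry; Map preserves insertion order.
      const oldest = cache.keys().next().value as string;
      cache.delete(oldest);
    }
    cache.set(text, result);
    return result;
  };
}

// Average wall-clock milliseconds per sample, after one untimed warmup call.
// Date.now() has millisecond resolution, so use enough samples to average out.
function averageMs(scan: (text: string) => unknown, samples: string[]): number {
  const first = samples[0];
  if (first === undefined) return 0;
  scan(first); // warmup, excluded from timing
  const start = Date.now();
  for (const sample of samples) scan(sample);
  return (Date.now() - start) / samples.length;
}
```

Time the uncached scanner when benchmarking, since cache hits would otherwise hide the real per-sample cost. Caching helps most when the same tool output is seen repeatedly, for example a document fetched several times in one session.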
Source and acknowledgments

Source: https://github.com/StackOneHQ/defender
License: Apache-2.0
GitHub stars: 97 · forks: 9
