# Defender: Prompt Injection Guardrails for Agents

> Defender is an OSS library to detect and neutralize prompt injection in tool outputs; verified 97★ and bundles a ~22MB ONNX model.

## Quick Use

```bash
npm install @stackone/defender
# In your agent loop: run defendToolResult(toolOutput, toolName) before passing to the LLM.
# Start with blockHighRisk=true and log riskLevel + detections for tuning.
```

## Intro

Defender scans tool outputs such as emails, documents, and PRs for prompt injection before the text reaches the model. The README cites roughly 10ms per sample after warmup.

**Best for:** Agent builders who pipe email/docs/PRs into an LLM and want injection defense before the model sees the text

**Works with:** Node.js/TypeScript agents (MCP/CLI/function-calling) that can gate tool outputs before LLM calls

**Setup time:** 6-15 minutes

### Key facts (verified)

- GitHub: 97 stars · 9 forks · pushed 2026-05-13.
- License: Apache-2.0 · owner avatar + repo URL verified via GitHub API.
- README-backed entrypoint: `npm install @stackone/defender`.

## Main

- Make the defense boundary explicit: treat tool results as untrusted input and gate them before they enter model context.
- Start conservative: block high-risk results first, then add per-tool field allowlists or overrides once you understand the false positives.
- Log evidence: store `riskLevel`, `tier2Score`, and matched detections so you can tune thresholds safely over time.

### Source-backed notes

- The README states that the ONNX model (~22MB) is bundled, so no extra downloads are required.
- The README describes a two-tier pipeline (pattern detection + ML classifier) and mentions ~10ms/sample after warmup.
- The README positions it for MCP/CLI/tool-call agents that need to sanitize tool results (emails, documents, PRs) before LLM use.

### FAQ

- **Does this replace secure prompting?**: No. It is an extra guardrail; still keep strong system prompts and tool permissioning.
- **Will it slow down my agent?**: The README cites ~10ms/sample after warmup; measure on your own workload, and cache or batch where possible.
- **Where should I apply it?**: At the boundary: right after receiving tool output and before adding it to model context.

## Source & Thanks

> Source: https://github.com/StackOneHQ/defender
> License: Apache-2.0
> GitHub stars: 97 · forks: 9

---

Source: https://tokrepo.com/en/workflows/defender-prompt-injection-guardrails-for-agents
Author: Prompt Lab
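
The gating pattern described under Main (treat tool results as untrusted, block high-risk results, log the evidence) can be sketched roughly as below. This is a minimal illustration, not the library's actual API: only `defendToolResult(toolOutput, toolName)`, `blockHighRisk`, `riskLevel`, `tier2Score`, and detections are named on this page. The `ToolDefender` interface, the `allowed` and `sanitized` fields, the risk-level values, the async return type, and the placeholder string are all assumptions made for the example; check the README for the real export names and result shape.

```typescript
// Assumed risk levels; the actual values used by the library may differ.
type RiskLevel = "low" | "medium" | "high" | "critical";

// Assumed result shape. riskLevel, tier2Score, and detections are named on
// this page; `allowed` and `sanitized` are hypothetical fields.
interface DefenseResult {
  allowed: boolean;
  riskLevel: RiskLevel;
  tier2Score?: number;
  detections: string[];
  sanitized: unknown;
}

// Hypothetical stand-in for the library object, so the sketch stays
// self-contained. Wire in the real defender per the README.
interface ToolDefender {
  defendToolResult(toolOutput: unknown, toolName: string): Promise<DefenseResult>;
}

interface GateDecision {
  forwardToModel: boolean;
  content: unknown;
  log: {
    toolName: string;
    riskLevel: RiskLevel;
    tier2Score: number | null;
    detections: string[];
  };
}

const BLOCKED_PLACEHOLDER = "[tool output withheld: possible prompt injection]";

// Pure decision step: block when the defender disallows the result, or when
// blockHighRisk is on and the risk level is high or critical.
function decide(
  toolName: string,
  result: DefenseResult,
  blockHighRisk: boolean = true,
): GateDecision {
  const highRisk = result.riskLevel === "high" || result.riskLevel === "critical";
  const blocked = !result.allowed || (blockHighRisk && highRisk);
  return {
    forwardToModel: !blocked,
    content: blocked ? BLOCKED_PLACEHOLDER : result.sanitized,
    log: {
      toolName,
      riskLevel: result.riskLevel,
      tier2Score: result.tier2Score ?? null,
      detections: result.detections,
    },
  };
}

// Boundary step: scan the tool output, log the evidence, and return what
// (if anything) may be added to model context.
async function gateToolOutput(
  defender: ToolDefender,
  toolName: string,
  toolOutput: unknown,
): Promise<GateDecision> {
  const result = await defender.defendToolResult(toolOutput, toolName);
  const decision = decide(toolName, result);
  console.log(JSON.stringify(decision.log));
  return decision;
}
```

Keeping `decide` separate from the scanning call makes the blocking policy easy to unit test and to loosen per tool later, without touching the code that talks to the defender.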