# Defender — Prompt Injection Guardrails for Agents

> Defender is an open-source library that detects and neutralizes prompt injection in tool outputs. The repository has 97 GitHub stars (verified), and the package bundles a ~22MB ONNX model.

## Quick Use

```bash
npm install @stackone/defender
# In your agent loop: run defendToolResult(toolOutput, toolName) before passing to the LLM.
# Start with blockHighRisk=true and log riskLevel + detections for tuning.
```

## Intro

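Defender is an open-source library that detects and neutralizes prompt injection in tool outputs such as emails, documents, and PRs. The repository has 97 GitHub stars (verified), the package bundles a ~22MB ONNX model, and the README cites roughly 10ms per sample after warmup.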

**Best for:** Agent builders who pipe email/docs/PRs into an LLM and want injection defense before the model sees text

**Works with:** Node.js/TypeScript agents (MCP/CLI/function-calling) that can gate tool outputs before LLM calls

**Setup time:** 6-15 minutes

### Key facts (verified)

- GitHub: 97 stars · 9 forks · last pushed 2026-05-13.
- License: Apache-2.0. Owner avatar and repo URL verified via the GitHub API.
- README-backed entrypoint: `npm install @stackone/defender`.

## Main

- Make the defense boundary explicit: treat tool results as untrusted input and gate them before they enter model context.

- Start conservative: block high-risk results first, then add allowlist entries or overrides for specific per-tool fields once you understand where false positives occur.

- Log evidence: store `riskLevel`, `tier2Score`, and matched detections so you can tune safely over time.
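
The three points above can be sketched as a small gate that sits between the scan and the model context. This is a minimal sketch, not Defender's API: it assumes you already hold a scan result, and the `DefenseResult` shape (including the `sanitized` field and the risk labels), `GatePolicy`, `overrideTools`, and `gateToolResult` are hypothetical names introduced here for illustration. Only `riskLevel`, `tier2Score`, detections, and `blockHighRisk` are named in the text above.

```typescript
// Assumed risk labels; check the library's real values before relying on these.
type RiskLevel = "low" | "medium" | "high" | "critical";

// Hypothetical shape of a scan result (field names partly taken from the text above).
interface DefenseResult {
  riskLevel: RiskLevel;
  tier2Score: number; // assumed: ML classifier score in [0, 1]
  detections: string[]; // assumed: names of matched patterns
  sanitized: string; // assumed: tool output after neutralization
}

// Hypothetical policy: block high-risk results unless a tool is explicitly overridden.
interface GatePolicy {
  blockHighRisk: boolean;
  overrideTools: ReadonlySet<string>;
}

interface AuditRecord {
  toolName: string;
  riskLevel: RiskLevel;
  tier2Score: number;
  detections: string[];
  blocked: boolean;
}

interface GateDecision {
  blocked: boolean;
  contextText: string; // what is allowed to enter model context
  audit: AuditRecord; // evidence to persist for later tuning
}

function gateToolResult(
  toolName: string,
  result: DefenseResult,
  policy: GatePolicy,
): GateDecision {
  const highRisk = result.riskLevel === "high" || result.riskLevel === "critical";
  const blocked = policy.blockHighRisk && highRisk && !policy.overrideTools.has(toolName);
  return {
    blocked,
    // A blocked result is replaced by a short placeholder so the model never sees the payload.
    contextText: blocked
      ? `[${toolName} result withheld: risk=${result.riskLevel}]`
      : result.sanitized,
    audit: {
      toolName,
      riskLevel: result.riskLevel,
      tier2Score: result.tier2Score,
      detections: [...result.detections],
      blocked,
    },
  };
}
```

In an agent loop, the `audit` record would be written to your log store on every call, blocked or not, so that thresholds and overrides can be tuned against real traffic.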

### Source-backed notes

- README states the ONNX model (~22MB) is bundled — no extra downloads required.
- README describes a two-tier pipeline (pattern detection + ML classifier) and mentions ~10ms/sample after warmup.
- README positions it for MCP/CLI/tool-call agents to sanitize tool results (emails, documents, PRs) before LLM use.
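
To make the two-tier idea concrete, here is a toy sketch of a pattern pass followed by a classifier score. It is not Defender's implementation: the regexes, the `twoTierScan` name, the 0.5 threshold, and the rule that either tier alone can flag a text are all assumptions, and the ML tier is represented by a caller-supplied function standing in for the bundled ONNX model.

```typescript
interface Tier1Hit {
  pattern: string;
  match: string;
}

// Illustrative patterns only; a real rule set would be far broader.
const TIER1_PATTERNS: { name: string; regex: RegExp }[] = [
  { name: "ignore_previous", regex: /ignore (all )?(previous|prior) instructions/i },
  { name: "role_override", regex: /you are now\b/i },
  { name: "system_tag", regex: /<\/?system>/i },
];

// Tier 1: cheap pattern matching over the raw tool output.
function tier1Scan(text: string): Tier1Hit[] {
  const hits: Tier1Hit[] = [];
  for (const { name, regex } of TIER1_PATTERNS) {
    const m = regex.exec(text);
    if (m) hits.push({ pattern: name, match: m[0] });
  }
  return hits;
}

// Tier 2 stand-in: any function that maps text to a score in [0, 1].
type Classifier = (text: string) => number;

function twoTierScan(text: string, classify: Classifier, threshold = 0.5) {
  const detections = tier1Scan(text);
  const tier2Score = classify(text);
  // Assumed combination rule: either tier on its own is enough to flag the text.
  const flagged = detections.length > 0 || tier2Score >= threshold;
  return { detections, tier2Score, flagged };
}
```

The design point is that the pattern tier catches well-known phrasings cheaply, while the classifier tier covers paraphrases that no fixed pattern anticipates.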

### FAQ

- **Does this replace secure prompting?**: No — it’s an extra guardrail; still keep strong system prompts and tool permissioning.
- **Will it slow down my agent?**: README cites ~10ms/sample after warmup; measure on your workload and cache where possible.
- **Where should I apply it?**: At the boundary: right after receiving tool output and before adding it to model context.
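
For the "cache where possible" advice, one option is to memoize scans by tool name and content, so that an identical tool output seen again in a session is not rescanned. The sketch below is an assumption-laden illustration: `memoizeScan` and its `scan` parameter are hypothetical, and `scan` stands in for whatever function actually runs the check in your setup.

```typescript
// Wraps a scan function with a small bounded cache keyed by tool name and content.
function memoizeScan<T>(
  scan: (toolOutput: string, toolName: string) => T,
  maxEntries = 1000,
): (toolOutput: string, toolName: string) => T {
  const cache = new Map<string, T>();
  return (toolOutput, toolName) => {
    // A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    const key = `${toolName}\u0000${toolOutput}`;
    if (cache.has(key)) return cache.get(key) as T;
    const value = scan(toolOutput, toolName);
    if (cache.size >= maxEntries) {
      // Map preserves insertion order, so the first key is the oldest entry.
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(key, value);
    return value;
  };
}
```

If `scan` returns a Promise, the wrapper caches the Promise itself, which also deduplicates concurrent scans of the same text. Note that a rejected Promise would be cached as well, so a production version may want to evict on failure. For very large tool outputs, hashing the content before using it as a key would reduce memory use.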

## Source & Thanks

> Source: https://github.com/StackOneHQ/defender
> License: Apache-2.0
> GitHub stars: 97 · forks: 9

---

<!-- ZH -->

## Intro

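Defender is an open-source guardrail library that detects and neutralizes prompt injection in tool outputs (emails, documents, PRs, and so on). The repository has 97 GitHub stars (verified), the package ships with a ~22MB ONNX model, and latency is about 10ms per sample after warmup.

**Best for:** Agent developers who feed tool results such as emails, documents, and PRs into an LLM and need injection defense before the text reaches the model

**Works with:** Node.js/TypeScript agents (MCP/CLI/function calling) that can screen tool outputs before calling the LLM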

**Setup time:** 6-15 minutes

### Key facts (verified)

- GitHub: 97 stars · 9 forks; last updated 2026-05-13.
- License: Apache-2.0; owner avatar and repository link both rechecked via the GitHub API.
- Entry command that can be checked against the README: `npm install @stackone/defender`.

## Main

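- Draw the boundary clearly: treat every tool result as untrusted input, and screen and rule on it before it enters model context.

- Start conservative, then loosen: block high-risk results outright first, then add allowlists or overrides per tool field to bring false positives down step by step.

- Record evidence: persist `riskLevel`, `tier2Score`, and the matched detections so later tuning can be done safely.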

### Source-backed notes

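- The README states that the ONNX model (about 22MB) ships with the package, so no extra download is needed.
- The README describes two tiers of defense (rule-based detection plus an ML classifier) and mentions latency on the order of 10ms per sample after warmup.
- The README positions it as a tool-result defense layer for MCP/CLI/tool-call agents that sanitizes and rules on results before they reach the LLM.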

### FAQ

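- **Can it replace secure prompting?**: No. It is an additional guardrail; you still need a solid system prompt and tool permission controls.
- **Will it slow down the agent?**: The README mentions about 10ms per sample after warmup; measure on your own workload, and cache or batch where you can.
- **Where in the pipeline should it go?**: At the boundary: process tool output as soon as you receive it, before it enters model context.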


---
Source: https://tokrepo.com/en/workflows/defender-prompt-injection-guardrails-for-agents
Author: Prompt Lab