What is Defender — Prompt Injection Guardrails for Agents?

Defender is an open-source library that detects and neutralizes prompt injection in tool outputs. The listing records 97★ (verified), and the library bundles a ~22MB ONNX model.

Is Defender — Prompt Injection Guardrails for Agents free to use?

Yes. Defender — Prompt Injection Guardrails for Agents is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Defender — Prompt Injection Guardrails for Agents?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Defender — Prompt Injection Guardrails for Agents

Main

Make the defense boundary explicit: treat tool results as untrusted input and gate them before they enter model context.
Start conservative: block high-risk results, then whitelist/override per-tool fields once you understand false positives.
Log evidence: store riskLevel, tier2Score, and matched detections so you can tune safely over time.
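The three points above can be sketched as a single gate function. This is a minimal illustration, not Defender's API: `gate_tool_result`, `ALLOW_HIGH_RISK_TOOLS`, `BLOCKED_PLACEHOLDER`, and `fake_scan` are hypothetical names, and the result shape (a dict with `riskLevel`, `tier2Score`, `detections`) is assumed from the field names mentioned above.

```python
import json
import logging
from typing import Callable

logger = logging.getLogger("tool_gate")

# Hypothetical per-tool overrides: tools whose "high" results are still passed through.
# Start with this empty and add entries only after reviewing logged false positives.
ALLOW_HIGH_RISK_TOOLS: set = set()

BLOCKED_PLACEHOLDER = "[tool result withheld: flagged as possible prompt injection]"


def gate_tool_result(tool_name: str, text: str, scan: Callable[[str], dict]) -> str:
    """Scan an untrusted tool result and return what may enter model context."""
    # Assumed result shape: {"riskLevel": str, "tier2Score": float, "detections": list}
    verdict = scan(text)
    blocked = verdict["riskLevel"] == "high" and tool_name not in ALLOW_HIGH_RISK_TOOLS
    # Log the evidence for every decision so thresholds can be tuned later.
    logger.info(json.dumps({
        "tool": tool_name,
        "riskLevel": verdict["riskLevel"],
        "tier2Score": verdict["tier2Score"],
        "detections": verdict["detections"],
        "blocked": blocked,
    }))
    return BLOCKED_PLACEHOLDER if blocked else text


def fake_scan(text: str) -> dict:
    """Stand-in scanner for the demo only; this is not Defender's detection logic."""
    hit = "ignore previous instructions" in text.lower()
    return {
        "riskLevel": "high" if hit else "low",
        "tier2Score": 0.97 if hit else 0.02,
        "detections": ["instruction_override"] if hit else [],
    }
```

In an agent loop, the call would sit between the tool call and the step that appends the result to the conversation, for example `safe_text = gate_tool_result("email", raw_text, fake_scan)` with the real scanner swapped in for `fake_scan`.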

Source-backed notes

README states the ONNX model (~22MB) is bundled — no extra downloads required.
README describes a two-tier pipeline (pattern detection + ML classifier) and mentions ~10ms/sample after warmup.
README positions it for MCP/CLI/tool-call agents to sanitize tool results (emails, documents, PRs) before LLM use.
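To make the two-tier idea concrete, here is a toy sketch of a pattern tier followed by a classifier tier. Everything in it is an assumption made for illustration: the patterns, the `scan` signature, the 0.5 threshold, the rule for combining the tiers into a risk level, and `keyword_classifier`, which stands in for the bundled ONNX model. It does not reproduce Defender's actual rules or model.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Illustrative tier-1 patterns; a real rule set would be much larger.
PATTERNS = {
    "instruction_override": re.compile(
        r"ignore (all |any )?(previous|prior) instructions", re.I
    ),
    "role_hijack": re.compile(r"you are now\b", re.I),
}


@dataclass
class ScanResult:
    riskLevel: str
    tier2Score: float
    detections: List[str] = field(default_factory=list)


def scan(text: str, classifier: Callable[[str], float], threshold: float = 0.5) -> ScanResult:
    # Tier 1: cheap pattern matching over the raw text.
    detections = [name for name, rx in PATTERNS.items() if rx.search(text)]
    # Tier 2: a classifier score in [0, 1]; higher means more likely an injection.
    score = classifier(text)
    # Assumed combination rule: both tiers agree -> high, one tier fires -> medium.
    if detections and score >= threshold:
        risk = "high"
    elif detections or score >= threshold:
        risk = "medium"
    else:
        risk = "low"
    return ScanResult(risk, score, detections)


def keyword_classifier(text: str) -> float:
    """Toy stand-in for the ML tier; a real deployment would run a trained model."""
    return 0.9 if "instructions" in text.lower() else 0.1
```

Keeping the classifier as a parameter lets the pattern tier be tested on its own and makes it easy to swap the stub for a real model call later.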

FAQ

Does this replace secure prompting? No. It is an extra guardrail; keep strong system prompts and tool permissioning as well.
Will it slow down my agent? README cites ~10ms/sample after warmup; measure on your own workload and cache results where possible.
Where should I apply it? At the boundary: right after receiving tool output and before adding it to model context.
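The advice to measure on your own workload and cache where possible could look like the sketch below. The helper names `median_latency_ms` and `cached` are hypothetical, and `scan` is any callable that takes a string; nothing here depends on Defender's actual interface.

```python
import hashlib
import statistics
import time
from typing import Any, Callable, Dict, List


def median_latency_ms(scan: Callable[[str], Any], samples: List[str], warmup: int = 3) -> float:
    """Median per-sample latency in milliseconds, measured after a few warm-up calls."""
    for text in samples[:warmup]:
        scan(text)  # warm-up calls are not timed
    timings = []
    for text in samples:
        start = time.perf_counter()
        scan(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def cached(scan: Callable[[str], Any]) -> Callable[[str], Any]:
    """Wrap a scanner so identical tool outputs are scanned only once."""
    cache: Dict[str, Any] = {}

    def wrapper(text: str) -> Any:
        # Key on a content hash so large outputs are not stored twice as dict keys.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = scan(text)
        return cache[key]

    return wrapper
```

The cache in this sketch is unbounded; a long-running agent would likely want a size limit or expiry. Caching only helps when the same tool output recurs, for example repeated reads of the same document.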

Related Assets

Prompt Hardener — Prompt-Injection Risk Analyzer

Prompt Injection Defense — Security Guide for LLM Apps

Anamorpher — Image-Scaling Prompt Injection Lab

Superagent SDK — Guardrails Against Prompt Injection