Practical Notes
A reliable rollout pattern: start with one high-signal guard (prompt injection or secrets detection) in monitor mode, log detections, then switch to block/redact. Keep scanner configs versioned, and add allowlists for known-safe internal tools to reduce false positives.
Safety note: Do not rely on prompt instructions alone to prevent injection; enforce guardrails in code, backed by logs, tests, and allowlists.
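The Python sketch below illustrates that rollout: one guard that starts in monitor mode (log only), can be flipped to block or redact, and skips known-safe sources via an allowlist. `SecretsGuard`, its mode names, the example regex, and the allowlist entry are all hypothetical stand-ins, not any specific scanner library's API.

```python
import logging
import re
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("guard")

@dataclass
class SecretsGuard:
    # "monitor" only logs detections; switch to "block" or "redact"
    # after reviewing logged false positives.
    mode: str = "monitor"
    # Allowlist of known-safe internal sources to reduce false positives.
    allowlist: set[str] = field(default_factory=lambda: {"internal-ci-bot"})
    # Illustrative high-signal pattern (AWS-style access key IDs).
    pattern: re.Pattern = re.compile(r"AKIA[0-9A-Z]{16}")

    def scan(self, text: str, source: str) -> str:
        if source in self.allowlist:
            return text  # known-safe source, skip scanning
        if self.pattern.search(text):
            log.warning("secrets guard hit (source=%s, mode=%s)", source, self.mode)
            if self.mode == "block":
                raise ValueError("blocked: possible secret in input")
            if self.mode == "redact":
                return self.pattern.sub("[REDACTED]", text)
        return text

guard = SecretsGuard(mode="monitor")  # start here; flip to "block" after review
print(guard.scan("deploy key AKIAABCDEFGHIJKLMNOP", source="user-chat"))
```

Versioning the config here means treating `mode`, `allowlist`, and `pattern` as reviewable data (e.g. a checked-in config file), so a mode flip is a diff rather than a code change.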
FAQ
Q: What problem does it solve? A: It adds an explicit scanning/guard layer on LLM inputs and outputs to reduce prompt injection, data leakage, and harmful content.
Q: Is it a model or a rule engine? A: It’s a toolkit. You compose scanners/filters (rules + detectors) around whichever LLM you already use.
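To make "compose scanners around your LLM" concrete, here is a hedged sketch under the assumption that a scanner is just a function from text to text that may rewrite or reject it; `deny_phrases`, `compose`, and the pipeline wiring are illustrative, not any particular toolkit's interface.

```python
from typing import Callable

# A scanner takes text and returns (possibly rewritten) text, or raises to reject.
Scanner = Callable[[str], str]

def deny_phrases(*phrases: str) -> Scanner:
    def scan(text: str) -> str:
        lowered = text.lower()
        for p in phrases:
            if p.lower() in lowered:
                raise ValueError(f"blocked by rule: {p!r}")
        return text
    return scan

def compose(*scanners: Scanner) -> Scanner:
    def run(text: str) -> str:
        for s in scanners:
            text = s(text)  # each scanner may rewrite or reject the text
        return text
    return run

# Rule-based scanner here; an ML detector (e.g. an injection classifier)
# would slot into the same pipeline as another Scanner.
prompt_pipeline = compose(deny_phrases("ignore previous instructions"))
print(prompt_pipeline("What changed in the last release?"))
```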
Q: Where should I enforce it? A: On both edges: scan the prompt before the model call and scan the output before returning it to users.
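A self-contained sketch of that two-edge enforcement follows; `scan_input`, `scan_output`, and the placeholder `call_model` are assumed names for illustration, not a real client or library API.

```python
import re

SECRET = re.compile(r"AKIA[0-9A-Z]{16}")  # example high-signal secret pattern

def scan_input(prompt: str) -> str:
    # Edge 1: reject suspicious prompts before they reach the model.
    if "ignore previous instructions" in prompt.lower():
        raise ValueError("blocked: possible prompt injection")
    return prompt

def scan_output(text: str) -> str:
    # Edge 2: redact on the way out, so users still get an answer.
    return SECRET.sub("[REDACTED]", text)

def call_model(prompt: str) -> str:
    return f"model reply to: {prompt}"  # placeholder for your actual LLM client

def guarded_completion(prompt: str) -> str:
    safe_prompt = scan_input(prompt)  # before the model call
    raw = call_model(safe_prompt)
    return scan_output(raw)           # before returning to users

print(guarded_completion("Summarize today's incident report."))
```

Note the asymmetry in the sketch: blocking on the input edge stops an attack early, while redacting on the output edge degrades gracefully instead of swallowing the whole response.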