How do I install NeMo Guardrails — Programmable Safety for LLM Applications?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

NeMo Guardrails — Programmable Safety for LLM Applications

Introduction

NeMo Guardrails lets developers define safety boundaries for LLM applications using a combination of declarative rules and LLM-based checks. It intercepts user inputs and model outputs, applying configurable moderation, topic control, and factual grounding before responses reach the user.

What NeMo Guardrails Does

Filters harmful or off-topic user inputs before they reach the LLM
Checks LLM outputs for hallucinations, toxicity, and policy violations
Detects and blocks jailbreak attempts and prompt injection attacks
Controls dialog flow to keep conversations on predefined topics
Integrates with external knowledge bases for fact-checking responses

Architecture Overview

The framework processes each conversation turn through a pipeline of rails (input rails, dialog rails, retrieval rails, output rails). Each rail is a chain of actions that can invoke LLM calls, external APIs, or custom Python functions. Dialog management uses Colang, a modeling language that defines canonical conversation flows. The runtime maintains conversation state and matches user messages against defined patterns to select appropriate flows. Guardrails can be composed and layered for defense in depth.

Self-Hosting & Configuration

Install via pip: pip install nemoguardrails
Define guardrail behavior in YAML config files and Colang flow definitions
Configure the LLM provider (OpenAI, Azure, NVIDIA NIM, or any OpenAI-compatible API)
Add custom actions by writing Python functions registered via decorators
Deploy as a middleware server between your application and the LLM provider

Key Features

Colang modeling language provides precise control over dialog behavior
Built-in rails for content safety, topic control, and jailbreak detection
Supports NVIDIA AI Foundation models and safety classifiers
Extensible action system for integrating custom moderation logic
Can function as a transparent proxy, adding safety to existing LLM deployments

Comparison with Similar Tools

Guardrails AI — focuses on structured output validation; NeMo Guardrails provides dialog management and input/output moderation
LLM Guard — standalone input/output scanner; NeMo Guardrails adds dialog flow control and Colang language
Rebuff — prompt injection detection; NeMo Guardrails covers injection plus topic control, fact-checking, and output moderation
Prompt Armor — API-based prompt security; NeMo Guardrails is self-hosted and open-source
Lakera Guard — commercial prompt injection defense; NeMo Guardrails is free and integrates with NVIDIA's AI stack

FAQ

Q: What is Colang? A: Colang is a domain-specific language for defining conversational flows and guardrail rules in a human-readable format.

Q: Can I use NeMo Guardrails with any LLM? A: Yes. It supports any LLM accessible via an OpenAI-compatible API, including local models served by vLLM or Ollama.

Q: Does it add latency to LLM responses? A: Guardrail checks add some latency (typically one additional LLM call for input/output checking). The exact impact depends on the number and type of rails configured.

Q: Can I use it in production? A: Yes. NeMo Guardrails can be deployed as a server and supports async processing for concurrent requests.

NeMo Guardrails — Programmable Safety for LLM Applications

This asset can be read and installed directly by agents

Introduction

What NeMo Guardrails Does

Architecture Overview

Self-Hosting & Configuration

Key Features

Comparison with Similar Tools

FAQ

Sources

Discussion

Related Assets

Guardrails — Validate & Secure LLM Outputs

VoltAgent — TypeScript AI Agent Framework

ZeroTier — Programmable Layer-2 Overlay Network

Pingora — Fast Programmable HTTP Proxy Framework by Cloudflare