ScriptsMay 19, 2026·3 min read

NeMo Guardrails — Programmable Safety for LLM Applications

NeMo Guardrails is an open-source toolkit by NVIDIA for adding programmable guardrails to LLM-based conversational systems. It provides input/output moderation, fact-checking, hallucination detection, jailbreak prevention, and dialog management via a declarative Colang configuration language.

Agent ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, install contract, metadata JSON, adapter-aware plan, and raw content links so agents can judge fit, risk, and next actions.

Native · 98/100Policy: allow
Agent surface
Any MCP/CLI agent
Kind
Skill
Install
Single
Trust
Trust: Established
Entrypoint
NeMo Guardrails Overview
Universal CLI install command
npx tokrepo install e3c9db87-537e-11f1-9bc6-00163e2b0d79

Introduction

NeMo Guardrails lets developers define safety boundaries for LLM applications using a combination of declarative rules and LLM-based checks. It intercepts user inputs and model outputs, applying configurable moderation, topic control, and factual grounding before responses reach the user.

What NeMo Guardrails Does

  • Filters harmful or off-topic user inputs before they reach the LLM
  • Checks LLM outputs for hallucinations, toxicity, and policy violations
  • Detects and blocks jailbreak attempts and prompt injection attacks
  • Controls dialog flow to keep conversations on predefined topics
  • Integrates with external knowledge bases for fact-checking responses

Architecture Overview

The framework processes each conversation turn through a pipeline of rails (input rails, dialog rails, retrieval rails, output rails). Each rail is a chain of actions that can invoke LLM calls, external APIs, or custom Python functions. Dialog management uses Colang, a modeling language that defines canonical conversation flows. The runtime maintains conversation state and matches user messages against defined patterns to select appropriate flows. Guardrails can be composed and layered for defense in depth.

Self-Hosting & Configuration

  • Install via pip: pip install nemoguardrails
  • Define guardrail behavior in YAML config files and Colang flow definitions
  • Configure the LLM provider (OpenAI, Azure, NVIDIA NIM, or any OpenAI-compatible API)
  • Add custom actions by writing Python functions registered via decorators
  • Deploy as a middleware server between your application and the LLM provider

Key Features

  • Colang modeling language provides precise control over dialog behavior
  • Built-in rails for content safety, topic control, and jailbreak detection
  • Supports NVIDIA AI Foundation models and safety classifiers
  • Extensible action system for integrating custom moderation logic
  • Can function as a transparent proxy, adding safety to existing LLM deployments

Comparison with Similar Tools

  • Guardrails AI — focuses on structured output validation; NeMo Guardrails provides dialog management and input/output moderation
  • LLM Guard — standalone input/output scanner; NeMo Guardrails adds dialog flow control and Colang language
  • Rebuff — prompt injection detection; NeMo Guardrails covers injection plus topic control, fact-checking, and output moderation
  • Prompt Armor — API-based prompt security; NeMo Guardrails is self-hosted and open-source
  • Lakera Guard — commercial prompt injection defense; NeMo Guardrails is free and integrates with NVIDIA's AI stack

FAQ

Q: What is Colang? A: Colang is a domain-specific language for defining conversational flows and guardrail rules in a human-readable format.

Q: Can I use NeMo Guardrails with any LLM? A: Yes. It supports any LLM accessible via an OpenAI-compatible API, including local models served by vLLM or Ollama.

Q: Does it add latency to LLM responses? A: Guardrail checks add some latency (typically one additional LLM call for input/output checking). The exact impact depends on the number and type of rails configured.

Q: Can I use it in production? A: Yes. NeMo Guardrails can be deployed as a server and supports async processing for concurrent requests.

Sources

Discussion

Sign in to join the discussion.
No comments yet. Be the first to share your thoughts.

Related Assets