Skills · Apr 8, 2026 · 2 min read

Guardrails AI — Validate LLM Outputs in Production

Add validation and guardrails to any LLM output. Guardrails AI checks for hallucination, toxicity, PII leakage, and format compliance with 50+ built-in validators.

Agent Toolkit · Community

Agent ready

Install with prior review

This asset requires review. The copied prompt requests a dry run, shows the writes, and continues only after confirmation.

Needs Confirmation · 66/100 · Policy: confirm

Agent surface

Any MCP/CLI agent

Type

Skill

Install

Single

Trust

Trust: Established

Entry

Guardrails AI — Validate LLM Outputs in Production

Command with prior review

npx -y tokrepo@latest install ffbad589-cd32-4eca-9518-fdcf9167ca21 --target codex

Do a dry run first, confirm the writes, and then run this command.

TL;DR

Guardrails AI adds validation layers to any LLM output, checking for hallucination, PII, toxicity, and format errors.

§01

What it is

Guardrails AI is a Python library that adds validation and safety checks to LLM outputs. It provides 50+ built-in validators that check for hallucination, toxicity, PII leakage, format compliance, and factual accuracy. You wrap your LLM calls with a Guard object, and Guardrails automatically validates and optionally retries when outputs fail validation.

Guardrails AI is designed for teams deploying LLMs in production who need to ensure outputs meet quality and safety standards. It works with any LLM provider including OpenAI, Anthropic, and local models.

§02

How it saves time or tokens

Without Guardrails, you write custom validation logic for every LLM output format. Guardrails provides a declarative approach: define your validators once, and the library handles validation, error reporting, and automatic retries. The Guardrails Hub of pre-built validators saves development time by providing tested implementations of common checks.

Automatic retries with corrective prompts reduce manual intervention. When an output fails validation, Guardrails re-prompts the LLM with the error details, often producing a valid output on the second attempt.
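
The retry behavior described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Guardrails implementation: reask_until_valid, the validate callback, the fake model, and the wording of the corrective prompt are all hypothetical.

```python
from typing import Callable, Optional, Tuple

def reask_until_valid(
    llm: Callable[[str], str],
    prompt: str,
    validate: Callable[[str], Optional[str]],
    max_reasks: int = 1,
) -> Tuple[str, int]:
    """Call llm and re-prompt with the error text while validation fails.

    validate returns None when the output is acceptable, else an error message.
    Returns the valid output and the number of LLM calls made.
    """
    current_prompt = prompt
    error: Optional[str] = None
    for call_number in range(1, max_reasks + 2):
        output = llm(current_prompt)
        error = validate(output)
        if error is None:
            return output, call_number
        # Hypothetical corrective prompt: original request plus the failure reason.
        current_prompt = (
            f"{prompt}\n\nPrevious answer: {output}\nProblem: {error}\nPlease fix it."
        )
    raise ValueError(f"still invalid after {max_reasks} reask(s): {error}")

# Fake model for the demo: answers badly first, then correctly once it sees the error.
def fake_llm(prompt: str) -> str:
    return "yes" if "Problem:" in prompt else "maybe"

def must_be_yes_or_no(text: str) -> Optional[str]:
    return None if text in {"yes", "no"} else "answer must be 'yes' or 'no'"

answer, calls = reask_until_valid(fake_llm, "Is water wet?", must_be_yes_or_no)
print(answer, calls)
```

Each pass through the loop is one extra model call, which is why a retry limit matters for cost.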

§03

How to use

Install Guardrails and a validator:

pip install guardrails-ai
guardrails hub install hub://guardrails/regex_match

Create a Guard with validators:

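from guardrails import Guard
from guardrails.hub import RegexMatch

# on_fail='reask' asks the model again when the output does not match the pattern.
guard = Guard().use(RegexMatch(regex=r'^\d{3}-\d{2}-\d{4}$', on_fail='reask'))

# Calls the model and validates the response.
result = guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Generate a sample SSN format'}]
)
print(result.validated_output)

The Guard validates the LLM output against the regex pattern. With on_fail='reask', it re-prompts the model when the output does not match.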
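
To see what this pattern accepts, the check can be reproduced with the standard re module. This is a sketch that assumes full-string matching; matches_ssn_format is a hypothetical helper, and how the hub validator applies the pattern depends on its own settings.

```python
import re

# Same pattern as in the Guard above.
SSN_PATTERN = re.compile(r'^\d{3}-\d{2}-\d{4}$')

def matches_ssn_format(text: str) -> bool:
    """Return True only when the whole string has the form ddd-dd-dddd."""
    return SSN_PATTERN.fullmatch(text) is not None

print(matches_ssn_format('123-45-6789'))
print(matches_ssn_format('Sure! Here is one: 123-45-6789'))
```

A model reply that wraps the value in extra prose fails an anchored pattern like this one, so the prompt should ask for the bare value only.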

§04

Example

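# Requires: guardrails hub install hub://guardrails/detect_pii hub://guardrails/toxic_language
from guardrails import Guard
from guardrails.hub import DetectPII, ToxicLanguage
from pydantic import BaseModel, Field

# Schema the LLM output must conform to.
class SafeResponse(BaseModel):
    answer: str = Field(description='The response text')
    confidence: float = Field(ge=0.0, le=1.0)

# Recent Guardrails releases also expose this constructor as Guard.for_pydantic.
guard = Guard.from_pydantic(SafeResponse)
# 'fix' redacts detected PII; entity names follow Presidio (e.g. EMAIL_ADDRESS).
guard.use(DetectPII(pii_entities=['EMAIL_ADDRESS', 'PHONE_NUMBER'], on_fail='fix'))
# 'reask' sends the output back to the model when toxicity exceeds the threshold.
guard.use(ToxicLanguage(threshold=0.5, on_fail='reask'))

result = guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Summarize this customer feedback'}]
)

print(result.validated_output)
print(f'Validation passed: {result.validation_passed}')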

This ensures the LLM output matches the Pydantic schema, contains no PII, and is not toxic. Failed PII checks are auto-fixed; toxic outputs trigger a retry.
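
A deterministic fix of the kind described above can be illustrated without calling a model. The sketch below is a simplified stand-in, not how DetectPII works internally: the two regular expressions, the redact_pii helper, and the placeholder strings are assumptions chosen for the example.

```python
import re

# Deliberately simple patterns for illustration; real PII detection is broader.
EMAIL_RE = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')
PHONE_RE = re.compile(r'\+?\d[\d\s().-]{7,}\d')

def redact_pii(text: str) -> str:
    """Replace email addresses and phone numbers with placeholder tags."""
    text = EMAIL_RE.sub('<EMAIL_ADDRESS>', text)
    return PHONE_RE.sub('<PHONE_NUMBER>', text)

print(redact_pii('Contact jane.doe@example.com or +1 415-555-0100 for a refund.'))
```

Because the repair is a local text substitution, it needs no additional model call, unlike a retry.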

§05

Related on TokRepo

AI agent tools -- Explore tools for building reliable AI agent pipelines
AI monitoring tools -- Browse tools for monitoring AI systems in production

§06

Common pitfalls

Validators from the hub need separate installation. Running guard.use(SomeValidator()) without first installing via guardrails hub install produces import errors.
Stacking many validators increases latency. Each validator runs sequentially by default. For latency-sensitive applications, limit validators to the most critical checks.
The on_fail='reask' strategy sends additional LLM calls, increasing token costs. Use on_fail='fix' for deterministic corrections or on_fail='exception' when retries are not acceptable.
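
The cost impact of retries can be estimated with a short calculation. This sketch assumes each attempt fails independently with the same probability, which is a simplification; expected_llm_calls and the 20% failure rate are hypothetical.

```python
def expected_llm_calls(failure_rate: float, max_reasks: int) -> float:
    """Expected number of LLM calls when each attempt fails with failure_rate."""
    # Attempt k+1 only happens if the first k attempts all failed,
    # which has probability failure_rate ** k.
    return sum(failure_rate ** k for k in range(max_reasks + 1))

for reasks in (0, 1, 3):
    print(reasks, round(expected_llm_calls(0.2, reasks), 3))
```

Under these assumptions, allowing retries at a 20% failure rate raises the average call count by roughly a fifth to a quarter, and token cost grows at least in proportion because corrective prompts are longer than the original.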

Frequently asked questions

What validators are available in the Guardrails Hub?

The hub includes 50+ validators covering PII detection, toxicity filtering, hallucination checking, regex matching, JSON schema validation, SQL injection prevention, prompt injection detection, and more. You can browse available validators at hub.guardrailsai.com.

Does Guardrails work with local LLMs?

Yes. Guardrails works with any LLM that returns text. For local models via Ollama or vLLM, you can use the LiteLLM integration or pass raw text to the Guard for validation only. The validation logic is model-agnostic.

Can I write custom validators?

Yes. Guardrails provides a Validator base class that you extend with your custom validation logic. Custom validators integrate with the same retry and error handling mechanisms as built-in validators. You can also publish custom validators to the hub.
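
The general shape of a custom check can be sketched with stand-in classes. Nothing below is the Guardrails API: CheckResult and MaxLengthCheck are hypothetical names used to show the idea of returning a pass or a failure with an error message and an optional repaired value. Consult the Guardrails documentation for the actual base class and result types.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CheckResult:
    """Hypothetical result type: pass flag, error text, optional repaired value."""
    passed: bool
    error: Optional[str] = None
    fix_value: Optional[str] = None

class MaxLengthCheck:
    """Hypothetical custom check: output must not exceed max_chars characters."""

    def __init__(self, max_chars: int):
        self.max_chars = max_chars

    def validate(self, value: str) -> CheckResult:
        if len(value) <= self.max_chars:
            return CheckResult(passed=True)
        return CheckResult(
            passed=False,
            error=f"output has {len(value)} characters, limit is {self.max_chars}",
            fix_value=value[: self.max_chars],  # deterministic repair: truncate
        )

check = MaxLengthCheck(max_chars=10)
print(check.validate("short"))
print(check.validate("this answer is too long"))
```

The error text is what a retry would feed back to the model, and the repaired value is what a fix strategy would substitute.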

How does the retry mechanism work?

When a validator fails with on_fail='reask', Guardrails re-sends the prompt to the LLM with the validation error appended. The LLM sees what went wrong and generates a corrected output. The maximum number of retries is configurable. Each retry consumes additional tokens.

Does Guardrails support streaming outputs?

Yes. Guardrails supports streaming validation where outputs are checked incrementally as tokens arrive. This enables early detection of validation failures without waiting for the complete response, reducing wasted tokens on invalid outputs.
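
Incremental checking can be illustrated with a generator that validates the text accumulated so far and stops at the first failure. This is a conceptual sketch only: validate_stream, no_banned_words, and the banned-word rule are hypothetical, and how Guardrails chunks and buffers a real stream is not shown here.

```python
from typing import Callable, Iterable, Iterator

def validate_stream(
    chunks: Iterable[str],
    is_valid_so_far: Callable[[str], bool],
) -> Iterator[str]:
    """Yield chunks while the accumulated text passes; raise at the first failure."""
    seen = ""
    for chunk in chunks:
        seen += chunk
        if not is_valid_so_far(seen):
            raise ValueError(f"validation failed after {len(seen)} characters")
        yield chunk

BANNED = ("password",)

def no_banned_words(text: str) -> bool:
    return not any(word in text.lower() for word in BANNED)

stream = ["The ", "admin ", "password ", "is ", "hunter2"]
emitted = []
try:
    for chunk in validate_stream(stream, no_banned_words):
        emitted.append(chunk)
except ValueError as exc:
    print("stopped early:", exc)
print(emitted)
```

The remaining chunks are never consumed, which is where the token savings come from when the source is a live model stream.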

Source and acknowledgments

Created by Guardrails AI. Licensed under Apache 2.0.

guardrails-ai/guardrails — 4k+ stars
