Guardrails AI — Validate LLM Outputs in Production
Add validation and guardrails to any LLM output. Guardrails AI checks for hallucination, toxicity, PII leakage, and format compliance with 50+ built-in validators.
What it is
Guardrails AI is a Python library that adds validation and safety checks to LLM outputs. It provides 50+ built-in validators that check for hallucination, toxicity, PII leakage, format compliance, and factual accuracy. You wrap your LLM calls with a Guard object, and Guardrails automatically validates and optionally retries when outputs fail validation.
Guardrails AI is designed for teams deploying LLMs in production who need to ensure outputs meet quality and safety standards. It works with any LLM provider including OpenAI, Anthropic, and local models.
How it saves time or tokens
Without Guardrails, you write custom validation logic for every LLM output format. Guardrails provides a declarative approach: define your validators once, and the library handles validation, error reporting, and automatic retries. The Guardrails Hub of pre-built validators saves development time by providing tested implementations for common checks.
Automatic retries with corrective prompts reduce manual intervention. When an output fails validation, Guardrails re-prompts the LLM with the error details, often producing a valid output on the second attempt.
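A minimal sketch of this retry loop, using the RegexMatch validator shown in the next section and assuming the num_reasks keyword accepted by Guard calls in recent releases (it caps the number of corrective re-prompts):
from guardrails import Guard
from guardrails.hub import RegexMatch

# on_fail='reask' re-prompts the model with the validation error;
# num_reasks caps how many corrective round trips are attempted.
guard = Guard().use(
    RegexMatch(regex=r'^\d{3}-\d{2}-\d{4}$', on_fail='reask')
)
result = guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Generate a sample SSN format'}],
    num_reasks=2,  # at most two corrective re-prompts
)
print(result.validation_passed)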
How to use
- Install Guardrails and a validator:
pip install guardrails-ai
guardrails hub install hub://guardrails/regex_match
- Create a Guard with validators:
from guardrails import Guard
from guardrails.hub import RegexMatch
guard = Guard().use(RegexMatch(regex=r'^\d{3}-\d{2}-\d{4}$'))
result = guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Generate a sample SSN format'}]
)
print(result.validated_output)
- The Guard validates the LLM output against the regex pattern; on a mismatch it retries, fixes, or raises depending on the validator's on_fail setting.
Example
from guardrails import Guard
from guardrails.hub import DetectPII, ToxicLanguage
from pydantic import BaseModel, Field
class SafeResponse(BaseModel):
    answer: str = Field(description='The response text')
    confidence: float = Field(ge=0.0, le=1.0)
guard = Guard.from_pydantic(SafeResponse)
guard.use(DetectPII(pii_entities=['EMAIL_ADDRESS', 'PHONE_NUMBER'], on_fail='fix'))
guard.use(ToxicLanguage(threshold=0.5, on_fail='reask'))
result = guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Summarize this customer feedback'}]
)
print(result.validated_output)
print(f'Validation passed: {result.validation_passed}')
This ensures the LLM output matches the Pydantic schema, contains no PII, and is not toxic. Failed PII checks are auto-fixed; toxic outputs trigger a retry.
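To see what the fix action changed, compare the raw model output with the validated one; both are attributes of the returned outcome object:
# The outcome retains both versions of the text.
print(result.raw_llm_output)    # as the model produced it
print(result.validated_output)  # after PII fixes were applied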
Related on TokRepo
- AI agent tools -- Explore tools for building reliable AI agent pipelines
- AI monitoring tools -- Browse tools for monitoring AI systems in production
Common pitfalls
- Validators from the hub need separate installation. Running guard.use(SomeValidator()) without first installing via guardrails hub install produces import errors.
- Stacking many validators increases latency. Each validator runs sequentially by default. For latency-sensitive applications, limit validators to the most critical checks.
- The on_fail='reask' strategy sends additional LLM calls, increasing token costs. Use on_fail='fix' for deterministic corrections or on_fail='exception' when retries are not acceptable (see the sketch below).
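When retries are not acceptable, failing fast looks like the following sketch; the concrete exception class varies across Guardrails versions, so a broad except is shown:
from guardrails import Guard
from guardrails.hub import RegexMatch

guard = Guard().use(
    RegexMatch(regex=r'^\d{3}-\d{2}-\d{4}$', on_fail='exception')
)
try:
    result = guard(
        model='gpt-4o',
        messages=[{'role': 'user', 'content': 'Generate a sample SSN format'}]
    )
except Exception as err:
    # Guardrails raises a validation error here; the exact exception
    # class depends on the installed version.
    print(f'Validation failed, no retry attempted: {err}')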
Frequently Asked Questions
What validators are available in the Guardrails Hub?
The hub includes 50+ validators covering PII detection, toxicity filtering, hallucination checking, regex matching, JSON schema validation, SQL injection prevention, prompt injection detection, and more. You can browse available validators at hub.guardrailsai.com.
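For example, the two validators used in the Example section above are installed individually (hub URIs as listed on hub.guardrailsai.com):
guardrails hub install hub://guardrails/detect_pii
guardrails hub install hub://guardrails/toxic_language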
Does Guardrails work with local or open-source models?
Yes. Guardrails works with any LLM that returns text. For local models via Ollama or vLLM, you can use the LiteLLM integration or pass raw text to the Guard for validation only. The validation logic is model-agnostic.
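A sketch of the validation-only path, assuming the guard.validate() method available in current releases; the text can come from Ollama, vLLM, or any other source:
from guardrails import Guard
from guardrails.hub import ToxicLanguage

guard = Guard().use(ToxicLanguage(threshold=0.5, on_fail='noop'))

# No LLM call is made here; Guardrails only checks the given text.
local_output = 'Some text produced by a local model.'
outcome = guard.validate(local_output)
print(outcome.validation_passed)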
Can I write my own custom validators?
Yes. Guardrails provides a Validator base class that you extend with your custom validation logic. Custom validators integrate with the same retry and error handling mechanisms as built-in validators. You can also publish custom validators to the hub.
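A minimal custom-validator sketch; the import paths follow the documented pattern for recent releases (the exact module layout varies by version), and the greeting rule itself is a made-up example:
from guardrails.validators import (
    FailResult,
    PassResult,
    ValidationResult,
    Validator,
    register_validator,
)

@register_validator(name='starts-with-greeting', data_type='string')
class StartsWithGreeting(Validator):
    # Hypothetical rule: the output must open with a greeting.
    def validate(self, value: str, metadata: dict) -> ValidationResult:
        if value.lower().startswith(('hi', 'hello')):
            return PassResult()
        return FailResult(
            error_message='Output must start with a greeting.',
            fix_value=f'Hello! {value}',  # used when on_fail="fix"
        )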
What happens when on_fail='reask' is triggered?
When a validator fails with on_fail='reask', Guardrails re-sends the prompt to the LLM with the validation error appended. The LLM sees what went wrong and generates a corrected output. The maximum number of retries is configurable. Each retry consumes additional tokens.
Does Guardrails support streaming?
Yes. Guardrails supports streaming validation where outputs are checked incrementally as tokens arrive. This enables early detection of validation failures without waiting for the complete response, reducing wasted tokens on invalid outputs.
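A hedged sketch of the streaming path, assuming stream=True on a Guard call yields incremental validation outcomes as described in the docs:
from guardrails import Guard
from guardrails.hub import ToxicLanguage

guard = Guard().use(ToxicLanguage(threshold=0.5, on_fail='exception'))

# With stream=True the call returns a generator of partial outcomes,
# so a failing chunk surfaces before the full response is paid for.
for fragment in guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Summarize this customer feedback'}],
    stream=True,
):
    print(fragment.validated_output, end='')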
Citations (3)
- Guardrails AI GitHub — 50+ built-in validators for LLM output validation
- Guardrails Hub — community and pre-built validators
- Guardrails Documentation — Pydantic-based structured output validation
Source & Thanks
Created by Guardrails AI. Licensed under Apache 2.0.
guardrails-ai/guardrails — 4k+ stars