Skills · April 8, 2026 · 1 min read

Guardrails AI — Validate LLM Outputs in Production

Add validation and guardrails to any LLM output. Guardrails AI checks for hallucination, toxicity, PII leakage, and format compliance with 50+ built-in validators.

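Agent ready

Review before installing

This asset needs review first. The copied instructions ask the agent to do a dry run, list the items it would write, and continue only after confirmation.

Needs Confirmation · 66/100 · Policy: confirmation required
Agent entry point
Any MCP/CLI agent
Type
Skill
Install
Single
Trust
Trust level: Established
Entry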
Guardrails AI — Validate LLM Outputs in Production
Review-first command
npx -y tokrepo@latest install ffbad589-cd32-4eca-9518-fdcf9167ca21 --target codex

Do a dry run first, confirm the items to be written, and only then run this command.

TL;DR
Guardrails AI adds validation layers to any LLM output, checking for hallucination, PII, toxicity, and format errors.
§01

What it is

Guardrails AI is a Python library that adds validation and safety checks to LLM outputs. It provides 50+ built-in validators that check for hallucination, toxicity, PII leakage, format compliance, and factual accuracy. You wrap your LLM calls with a Guard object, and Guardrails automatically validates and optionally retries when outputs fail validation.

Guardrails AI is designed for teams deploying LLMs in production who need to ensure outputs meet quality and safety standards. It works with any LLM provider including OpenAI, Anthropic, and local models.

§02

How it saves time or tokens

Without Guardrails, you write custom validation logic for every LLM output format. Guardrails provides a declarative approach: define your validators once, and the library handles validation, error reporting, and automatic retries. The hub of pre-built validators (guardrails hub) saves development time by providing tested implementations for common checks.

Automatic retries with corrective prompts reduce manual intervention. When an output fails validation, Guardrails re-prompts the LLM with the error details, often producing a valid output on the second attempt.
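The retry idea described above can be sketched without the library. This is a minimal illustration of the general re-prompt pattern, not Guardrails' internals: `check_format`, `generate_with_reask`, `fake_llm`, and the wording of the corrective prompt are all hypothetical names and choices made for this example.

```python
import re
from typing import Callable, Optional

SSN_FORMAT = re.compile(r"\d{3}-\d{2}-\d{4}")


def check_format(text: str) -> Optional[str]:
    """Return None if the text is valid, otherwise a short error message."""
    if SSN_FORMAT.fullmatch(text.strip()):
        return None
    return f"Output {text!r} does not match the pattern NNN-NN-NNNN."


def generate_with_reask(
    llm: Callable[[str], str],
    prompt: str,
    max_reasks: int = 2,
) -> tuple[str, int]:
    """Call the model, validate, and re-prompt with the error on failure.

    Returns the valid output and the number of model calls used.
    Raises ValueError if no attempt passes validation.
    """
    current_prompt = prompt
    # One initial call plus up to max_reasks corrective calls.
    for attempt in range(1, max_reasks + 2):
        output = llm(current_prompt)
        error = check_format(output)
        if error is None:
            return output.strip(), attempt
        # Append the validation error so the model can see what went wrong.
        current_prompt = (
            f"{prompt}\n\nYour previous answer was rejected: {error}\n"
            "Return only the corrected value."
        )
    raise ValueError(f"No valid output after {max_reasks + 1} calls")


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: wrong first, right after seeing the error."""
    if "rejected" in prompt:
        return "123-45-6789"
    return "Sure! Here is one: 123456789"


result, calls = generate_with_reask(fake_llm, "Generate a sample SSN format")
print(result, calls)
```

Every corrective call is a full extra model request, which is why the retry budget is an explicit parameter in this sketch.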

§03

How to use

  1. Install Guardrails and a validator:
pip install guardrails-ai
guardrails hub install hub://guardrails/regex_match
  2. Create a Guard with validators:
from guardrails import Guard
from guardrails.hub import RegexMatch

guard = Guard().use(RegexMatch(regex=r'^\d{3}-\d{2}-\d{4}$'))

result = guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Generate a sample SSN format'}]
)
print(result.validated_output)
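  3. The Guard validates the LLM output against the regex pattern. To have it re-prompt the model when the output does not match, set on_fail='reask' on the validator; other on_fail strategies handle the failure without a retry.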
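To see what the pattern from step 2 accepts, it can be tried with Python's standard `re` module, with no LLM call involved. This is only a sketch of the pattern itself; the helper name `matches_ssn_format` is hypothetical, and how RegexMatch applies the pattern internally is not shown here.

```python
import re

# Same pattern as in the Guard above: three digits, two digits, four digits.
PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def matches_ssn_format(text: str) -> bool:
    """Return True only if the whole string has the NNN-NN-NNNN shape."""
    return PATTERN.fullmatch(text) is not None


print(matches_ssn_format("123-45-6789"))
print(matches_ssn_format("Here is a sample: 123-45-6789"))
```

The second call shows a common failure: a model that wraps the value in a sentence will not match an anchored pattern, so the prompt should ask for the bare value only.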
§04

Example

from guardrails import Guard
from guardrails.hub import DetectPII, ToxicLanguage
from pydantic import BaseModel, Field

class SafeResponse(BaseModel):
    answer: str = Field(description='The response text')
    confidence: float = Field(ge=0.0, le=1.0)

guard = Guard.from_pydantic(SafeResponse)
guard.use(DetectPII(pii_entities=['EMAIL', 'PHONE_NUMBER'], on_fail='fix'))
guard.use(ToxicLanguage(threshold=0.5, on_fail='reask'))

result = guard(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Summarize this customer feedback'}]
)

print(result.validated_output)
print(f'Validation passed: {result.validation_passed}')

This ensures the LLM output matches the Pydantic schema, contains no PII, and is not toxic. Failed PII checks are auto-fixed; toxic outputs trigger a retry.

§06

Common pitfalls

  • Validators from the hub need separate installation. Running guard.use(SomeValidator()) without first installing via guardrails hub install produces import errors.
  • Stacking many validators increases latency. Each validator runs sequentially by default. For latency-sensitive applications, limit validators to the most critical checks.
  • The on_fail='reask' strategy sends additional LLM calls, increasing token costs. Use on_fail='fix' for deterministic corrections or on_fail='exception' when retries are not acceptable.
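The difference between a deterministic fix and a hard failure can be sketched in plain Python. This is an illustration of the two strategies only, assuming a simple email regex as the PII check; `handle_pii`, `ValidationFailed`, and the `<EMAIL>` placeholder are hypothetical and do not reproduce how the DetectPII validator works.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ValidationFailed(Exception):
    """Raised when the output fails validation and no fix is requested."""


def handle_pii(text: str, on_fail: str) -> str:
    """Apply a 'fix' or 'exception' strategy when an email address is found."""
    if not EMAIL.search(text):
        return text  # nothing to do, the output passes
    if on_fail == "fix":
        # Deterministic correction: no extra model call, no extra tokens.
        return EMAIL.sub("<EMAIL>", text)
    if on_fail == "exception":
        # Fail fast and let the caller decide what to do.
        raise ValidationFailed("output contains an email address")
    raise ValueError(f"unsupported on_fail value in this sketch: {on_fail!r}")


print(handle_pii("Contact bob@example.com today", on_fail="fix"))
```

Neither branch calls the model again, which is the cost difference from a reask strategy.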

FAQ

What validators are available in the Guardrails Hub?

The hub includes 50+ validators covering PII detection, toxicity filtering, hallucination checking, regex matching, JSON schema validation, SQL injection prevention, prompt injection detection, and more. You can browse available validators at hub.guardrailsai.com.

Does Guardrails work with local LLMs?

Yes. Guardrails works with any LLM that returns text. For local models via Ollama or vLLM, you can use the LiteLLM integration or pass raw text to the Guard for validation only. The validation logic is model-agnostic.

Can I write custom validators?

Yes. Guardrails provides a Validator base class that you extend with your custom validation logic. Custom validators integrate with the same retry and error handling mechanisms as built-in validators. You can also publish custom validators to the hub.

How does the retry mechanism work?

When a validator fails with on_fail='reask', Guardrails re-sends the prompt to the LLM with the validation error appended. The LLM sees what went wrong and generates a corrected output. The maximum number of retries is configurable. Each retry consumes additional tokens.

Does Guardrails support streaming outputs?

Yes. Guardrails supports streaming validation where outputs are checked incrementally as tokens arrive. This enables early detection of validation failures without waiting for the complete response, reducing wasted tokens on invalid outputs.
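The idea of checking output incrementally can be sketched with a plain generator. This is a conceptual illustration, not Guardrails' streaming API: `validate_stream`, `StreamValidationError`, and the `BANNED_TERMS` deny-list are hypothetical, and the chunks are hard-coded instead of coming from a model.

```python
from typing import Iterable, Iterator

BANNED_TERMS = ("password", "ssn")  # hypothetical deny-list for this sketch


class StreamValidationError(Exception):
    """Raised as soon as the text received so far fails the check."""


def validate_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield chunks as they arrive, stopping once the accumulated text fails."""
    seen = ""
    for chunk in chunks:
        seen += chunk
        lowered = seen.lower()
        # Check the accumulated text so terms split across chunks are caught.
        for term in BANNED_TERMS:
            if term in lowered:
                raise StreamValidationError(
                    f"banned term {term!r} found after {len(seen)} characters"
                )
        yield chunk


chunks = ["Your pass", "word is ", "hunter2", " and more text"]
emitted = []
stopped_early = False
try:
    for chunk in validate_stream(chunks):
        emitted.append(chunk)
except StreamValidationError as err:
    stopped_early = True
    print(f"stopped: {err}")

print(emitted)
```

The trade-off is visible in the result: the stream stops before the remaining chunks are consumed, but the chunk emitted before the failure was detected has already reached the caller.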

Source and acknowledgments

guardrails-ai/guardrails — 4k+ stars, Apache 2.0
