Guardrails AI is an open-source framework that intercepts LLM inputs and outputs to enforce validation, structure, and quality guarantees. The core abstraction is the Guard — a composable pipeline of validators that check LLM responses against defined criteria and take corrective actions like re-prompting, filtering, or raising exceptions when validation fails. Unlike conversational guardrails that control dialogue flow, Guardrails AI focuses on output contract enforcement: ensuring the LLM returns properly formatted JSON, stays within topic boundaries, avoids toxic language, and produces factually grounded responses.
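The Guard-as-pipeline idea can be sketched in plain Python. This is a toy illustration of the concept, not the library's real API: validators run in order over an LLM output, and each carries a corrective action (raise an exception or filter the output) applied when its check fails. All class and function names here are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ValidationResult:
    passed: bool
    message: str = ""

class ToyGuard:
    """Toy sketch of a Guard: a composable pipeline of validators,
    each paired with a corrective action taken on failure."""

    def __init__(self):
        self._validators: list[tuple[Callable[[str], ValidationResult], str]] = []

    def use(self, validator, on_fail="exception"):
        self._validators.append((validator, on_fail))
        return self  # chainable, mirroring the composable-pipeline idea

    def validate(self, output: str) -> str:
        for validator, on_fail in self._validators:
            result = validator(output)
            if not result.passed:
                if on_fail == "exception":
                    raise ValueError(result.message)
                if on_fail == "filter":
                    output = ""  # drop the failing output entirely
        return output

# Example validator: reject outputs containing a banned word.
def no_banned_words(text: str) -> ValidationResult:
    banned = {"darn"}
    hits = banned & set(text.lower().split())
    return ValidationResult(not hits, f"banned words found: {hits}")

guard = ToyGuard().use(no_banned_words, on_fail="filter")
print(guard.validate("hello world"))   # passes through unchanged
print(guard.validate("well darn it"))  # fails the check, filtered to ""
```

A real Guard adds a third corrective action the toy omits: re-prompting the LLM with the validation error attached so the model can correct itself.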
The Guardrails Hub is a registry of pre-built validators covering a wide range of checks: regex matching for phone numbers and emails, PII detection and masking, competitor mention filtering, toxic language detection, jailbreak prompt detection, bias checking, hallucination scoring against retrieved context, code bug detection, SQL injection prevention, reading time limits, and LLM-as-judge evaluation. Validators compose together — you can chain content safety, structural validation, and domain-specific checks into a single Guard. For structured output, Guards wrap Pydantic models and add schema information to the prompt so even LLMs without function calling can generate valid JSON.
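The structured-output mechanic described above can be illustrated with the standard library alone. This sketch uses invented names and a hand-rolled type check in place of a Pydantic model: the schema is appended to the prompt so a model without function calling can still emit matching JSON, and the reply is parsed and checked field by field, with any failure raised so a caller could trigger a re-ask.

```python
import json

# Illustrative schema: field name -> expected type name.
SCHEMA = {"name": "str", "age": "int"}
TYPES = {"str": str, "int": int}

def build_prompt(question: str) -> str:
    # Injecting the schema into the prompt is what lets models
    # without function calling still produce valid JSON.
    return (
        f"{question}\n"
        f"Respond ONLY with JSON matching this schema: {json.dumps(SCHEMA)}"
    )

def validate_reply(reply: str) -> dict:
    data = json.loads(reply)  # malformed JSON raises here
    for field, type_name in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], TYPES[type_name]):
            raise ValueError(f"{field} should be {type_name}")
    return data

prompt = build_prompt("Extract the person mentioned in the text.")
# A fake LLM reply standing in for a real model call:
print(validate_reply('{"name": "Ada", "age": 36}'))
```

In the real framework the Guard derives this schema from a Pydantic model and re-prompts automatically on failure; the sketch only shows the validation half of that loop.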
Guardrails AI works with any LLM provider through LiteLLM integration and supports both Python and JavaScript. It can run as a standalone Flask-based API server via the guardrails start command for microservice deployments. The framework integrates with NVIDIA NeMo Guardrails for combined flow control and output validation, and with OpenAI's Agents SDK via a GuardrailAgent class. Custom validators can be built and contributed back to the Hub. Installation is a pip install, and the CLI handles Hub configuration, validator installation, and dev server management.
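Under the assumption of a standard setup, the install and CLI flow might look like the following; the specific Hub validator URI is only an example, and flags may differ by version.

```shell
pip install guardrails-ai                               # install the framework
guardrails configure                                    # set up Hub access
guardrails hub install hub://guardrails/toxic_language  # example: install a Hub validator
guardrails start                                        # run the Flask-based dev API server
```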