LLM Guard sits as a middleware layer between your application and its language model, scanning both inbound prompts and outbound responses against configurable security rules. Its 15 input scanners cover prompt injection detection (using a fine-tuned DeBERTa model), PII anonymization (replacing names, emails, phone numbers, and credit card numbers with placeholders), toxicity filtering, secrets detection via Yelp's detect-secrets library, ban lists for competitors, substrings, topics, and code, invisible-text detection for Unicode-based attacks, token limit enforcement, and language restriction. Each scanner returns three things: a sanitized version of the text, a validity flag, and a risk score between 0 and 1.
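The per-scanner contract (text in; sanitized text, validity flag, and risk score out) can be sketched in plain Python. This is an illustrative stand-in, not LLM Guard's actual implementation; the class name echoes the real `BanSubstrings` scanner, but the logic here is a toy:

```python
from dataclasses import dataclass

@dataclass
class BanSubstrings:
    """Toy stand-in for a ban-list scanner (not LLM Guard's real class)."""
    banned: tuple

    def scan(self, text: str):
        """Return (sanitized_text, is_valid, risk_score)."""
        hits = [s for s in self.banned if s.lower() in text.lower()]
        sanitized = text
        for s in hits:
            sanitized = sanitized.replace(s, "[REDACTED]")
        # Risk score in [0, 1]; in this toy version, simply whether anything matched.
        return sanitized, not hits, 1.0 if hits else 0.0

scanner = BanSubstrings(banned=("AcmeCorp",))
sanitized, is_valid, risk = scanner.scan("Compare us to AcmeCorp.")
```

Because every scanner exposes the same three-tuple result, callers can treat validity as a gate and the risk score as a tunable threshold.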
The 20 output scanners cover the response side: deanonymization to restore PII placeholders after processing, bias detection, relevance scoring against the original prompt, factual-consistency checking, malicious-URL detection and reachability verification, sensitive-data exposure prevention, no-refusal detection to catch when the model inappropriately refuses valid requests, language detection, and code-output filtering by programming language. Scanners are composable through the scan_prompt and scan_output functions, which execute them in sequence; an optional fail_fast mode stops at the first violation. The entire pipeline can also be deployed as a standalone API server for team-wide use.
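The sequential composition with fail_fast can be sketched as follows. This is a plain-Python illustration of the behavior described above, not LLM Guard's actual `scan_prompt`; the two toy scanners are assumptions standing in for the real `Anonymize` and `TokenLimit` classes:

```python
import re

class Anonymize:
    """Toy stand-in: replaces email addresses with a placeholder."""
    def scan(self, text: str):
        redacted, n = re.subn(r"\S+@\S+\.\S+", "[EMAIL]", text)
        return redacted, n == 0, 1.0 if n else 0.0

class TokenLimit:
    """Toy stand-in: flags prompts longer than max_words words."""
    def __init__(self, max_words: int):
        self.max_words = max_words

    def scan(self, text: str):
        ok = len(text.split()) <= self.max_words
        return text, ok, 0.0 if ok else 1.0

def scan_prompt(scanners, prompt, fail_fast=False):
    """Run scanners in sequence; each receives the previous one's
    sanitized output. Returns (sanitized, validity_by_scanner, scores)."""
    sanitized = prompt
    validity, scores = {}, {}
    for scanner in scanners:
        name = type(scanner).__name__
        sanitized, ok, risk = scanner.scan(sanitized)
        validity[name], scores[name] = ok, risk
        if fail_fast and not ok:
            break  # stop at the first violation
    return sanitized, validity, scores

sanitized, validity, scores = scan_prompt(
    [Anonymize(), TokenLimit(max_words=50)],
    "Contact bob@example.com about the incident.",
)
```

Note that sanitization is cumulative: each scanner sees the already-cleaned text, so ordering matters (anonymizing before a ban-list check, for example, changes what the ban list sees).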
LLM Guard is engineered for cost-effective CPU inference (the team claims 5x lower inference costs on CPU compared to GPU), which matters for production deployments where scanning runs on every request. Because it operates on text strings rather than model internals, the toolkit integrates with any LLM framework, including LangChain, Azure OpenAI, and Amazon Bedrock. Protect AI hosts an interactive playground on Hugging Face Spaces for testing scanners without installation. The latest release is v0.3.16; while the release cadence has slowed from its initial rapid development, the scanner collection remains one of the most comprehensive open-source LLM security toolkits available.
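Because the toolkit operates on plain strings, wiring it around any model call reduces to a thin wrapper. The sketch below is an assumption for illustration: `guarded_call`, the stub model, and the `max_len` scanner factory are invented here, not part of LLM Guard's API, but the shape (scan prompt, call model, scan response) is the integration pattern the paragraph describes:

```python
def guarded_call(llm, prompt, input_scanners, output_scanners):
    """Scan the prompt, call the model, scan the response.
    Each scanner is a callable: text -> (sanitized, is_valid, risk)."""
    for scan in input_scanners:
        prompt, ok, risk = scan(prompt)
        if not ok:
            raise ValueError(f"prompt blocked (risk={risk:.2f})")
    response = llm(prompt)
    for scan in output_scanners:
        response, ok, risk = scan(response)
        if not ok:
            raise ValueError(f"response blocked (risk={risk:.2f})")
    return response

# Stub model and a trivial length scanner, for illustration only.
def stub_llm(prompt):
    return f"Echo: {prompt}"

def max_len(limit):
    def scan(text):
        ok = len(text) <= limit
        return text, ok, 0.0 if ok else 1.0
    return scan

out = guarded_call(stub_llm, "hello", [max_len(100)], [max_len(200)])
```

Since the model is just a `text -> text` callable here, the same wrapper applies whether the underlying call goes through LangChain, Azure OpenAI, or Amazon Bedrock.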