LLM Guard

Input and output security scanners for LLM applications


LLM Guard is an open-source security toolkit by Protect AI that provides 15 input scanners and 20 output scanners to protect LLM applications from prompt injection, PII leakage, toxic content, secrets exposure, and data exfiltration. Each scanner is modular and independent: pick the ones you need, configure thresholds, and chain them into a pipeline. The library works with any LLM and has been downloaded over 2.5 million times. It is MIT licensed and requires Python 3.9 or later.

LLM Guard sits as a middleware layer between your application and its language model, scanning both inbound prompts and outbound responses against configurable security rules. The 15 input scanners cover prompt injection detection using a fine-tuned DeBERTa model; PII anonymization that replaces names, emails, phone numbers, and credit card numbers with placeholders; toxicity filtering; secrets detection via Yelp's detect-secrets library; ban lists for competitors, substrings, topics, and code; invisible text detection for Unicode-based attacks; token limit enforcement; and language restriction. Each scanner returns a sanitized version of the text, a validity flag, and a risk score between 0 and 1.
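The three-part return value described above (sanitized text, validity flag, risk score) can be illustrated with a minimal, dependency-free sketch. This is not LLM Guard's own code: the class names ToyEmailAnonymizer and ToyBanSubstrings, the regex, and the placeholder format are all hypothetical, chosen only to show the shape of the contract.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical, simplified pattern; real PII detection is far more involved.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class ToyEmailAnonymizer:
    """Toy scanner: swaps email addresses for numbered placeholders."""

    def __init__(self) -> None:
        # placeholder -> original value, so a later step could restore it
        self.vault: Dict[str, str] = {}

    def scan(self, prompt: str) -> Tuple[str, bool, float]:
        def _replace(match: re.Match) -> str:
            placeholder = f"[EMAIL_{len(self.vault) + 1}]"
            self.vault[placeholder] = match.group(0)
            return placeholder

        sanitized = EMAIL_RE.sub(_replace, prompt)
        found = sanitized != prompt
        # Assumption: redaction repairs the prompt, so it stays valid;
        # the score only records that something was found.
        return sanitized, True, 1.0 if found else 0.0


class ToyBanSubstrings:
    """Toy scanner: flags prompts containing any banned phrase."""

    def __init__(self, banned: List[str]) -> None:
        self.banned = [phrase.lower() for phrase in banned]

    def scan(self, prompt: str) -> Tuple[str, bool, float]:
        lowered = prompt.lower()
        hit = any(phrase in lowered for phrase in self.banned)
        # Text is returned unchanged; only the flag and score change.
        return prompt, not hit, 1.0 if hit else 0.0


if __name__ == "__main__":
    anonymizer = ToyEmailAnonymizer()
    print(anonymizer.scan("Email jane.doe@example.com about the invoice."))
    print(ToyBanSubstrings(["ignore previous instructions"]).scan("Hello"))
```

Keeping every scanner to the same tuple shape is what makes chaining possible: a caller can treat a regex check and a model-backed classifier identically.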

The 20 output scanners cover the response side: deanonymization to restore PII placeholders after processing, bias detection, relevance scoring against the original prompt, factual consistency checking, malicious URL detection and reachability verification, sensitive data exposure prevention, no-refusal detection to catch when the model inappropriately refuses valid requests, language detection, and code output filtering by programming language. Scanners are composable through scan_prompt and scan_output functions that execute them in sequence with an optional fail_fast mode that stops at the first violation. The entire pipeline can be deployed as a standalone API server for team-wide use.
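Sequential execution with an optional fail_fast stop can be sketched as follows. The function run_pipeline and the two toy scanners are hypothetical stand-ins written for illustration; they mirror the behavior described above but are not the library's scan_prompt or scan_output implementations.

```python
from typing import Callable, Dict, List, Tuple

ScanResult = Tuple[str, bool, float]
NamedScanner = Tuple[str, Callable[[str], ScanResult]]

# A few zero-width code points, used here as a stand-in for "invisible text".
ZERO_WIDTH = "\u200b\u200c\u200d\u2060"


def strip_zero_width(text: str) -> ScanResult:
    """Toy scanner: removes zero-width characters and flags their presence."""
    cleaned = text.translate({ord(ch): None for ch in ZERO_WIDTH})
    clean = cleaned == text
    return cleaned, clean, 0.0 if clean else 1.0


def word_limit(limit: int) -> Callable[[str], ScanResult]:
    """Toy scanner factory: truncates text to at most `limit` words."""
    def scan(text: str) -> ScanResult:
        words = text.split()
        if len(words) <= limit:
            return text, True, 0.0
        return " ".join(words[:limit]), False, 1.0
    return scan


def run_pipeline(
    scanners: List[NamedScanner], text: str, fail_fast: bool = False
) -> Tuple[str, Dict[str, bool], Dict[str, float]]:
    """Runs scanners in order, feeding each one the previous sanitized text."""
    valid: Dict[str, bool] = {}
    scores: Dict[str, float] = {}
    for name, scan in scanners:
        text, is_valid, score = scan(text)
        valid[name] = is_valid
        scores[name] = score
        if fail_fast and not is_valid:
            break  # skip the remaining scanners after the first violation
    return text, valid, scores


if __name__ == "__main__":
    pipeline = [("invisible_text", strip_zero_width), ("word_limit", word_limit(3))]
    print(run_pipeline(pipeline, "hi\u200b there friend again", fail_fast=True))
    print(run_pipeline(pipeline, "hi\u200b there friend again", fail_fast=False))
```

With fail_fast enabled, later scanners never run once one fails, which saves inference time but means the result dictionaries only report on the scanners that actually executed.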

LLM Guard is engineered for cost-effective CPU inference: the team claims 5x lower inference costs on CPU compared to GPU, which matters for production deployments where scanning runs on every request. Because it operates on text strings rather than model internals, the toolkit integrates with any LLM framework, including LangChain, Azure OpenAI, and Amazon Bedrock. Protect AI hosts an interactive playground on Hugging Face Spaces for testing scanners without installation. The latest release is v0.3.16, and while the release cadence has slowed from its initial rapid development, the scanner collection remains one of the most comprehensive open-source LLM security toolkits available.

Pricing

Free open-source under MIT license

Platforms

Python 3.9+, pip, standalone API server, CPU-optimized inference

Related Tools

Agent Governance Toolkit

Microsoft’s public-preview runtime governance toolkit for policy, identity, sandboxing, audit, and MCP security around AI agents.

Agent Governance Toolkit is Microsoft’s MIT-licensed public-preview toolkit for governing AI agent runtimes. It adds policy enforcement, zero-trust identity, execution sandboxing, audit, reliability, and MCP security-gateway patterns around tool calls and autonomous actions, helping platform teams move beyond prompt-only guardrails while preserving architecture review requirements.

Open Source, Telemetry

Baz

Telemetry-aware AI code reviewer that checks how pull requests may affect real services.

Baz is an AI code-review platform focused on production-aware pull requests. Instead of only reading the diff, Baz connects code changes to application telemetry so reviewers can understand what endpoints, services, and runtime behavior may be affected. That makes it a useful complement to existing AI PR bots when the question is not just whether a change looks correct, but whether it could break a live system.

Freemium, Telemetry

Rampart

Microsoft’s pytest-native red teaming framework for turning AI agent safety findings into CI tests.

RAMPART is an open-source Microsoft framework for safety and security testing of agentic AI applications. It brings red-team findings into a pytest-native workflow so teams can turn prompt injection, unsafe tool use, and behavioral boundary failures into repeatable regression tests. Its main strength is developer workflow: RAMPART makes agent safety part of CI/CD instead of a one-off security review.

Open Source

Statewright

State-machine guardrails for controlling which tools AI coding agents can use at each phase.

Statewright is a guardrail layer for AI coding agents that uses explicit state machines to control what an agent can do at each stage of a workflow. Instead of relying only on prompt instructions, teams can model phases such as plan, implement, test, and review, then constrain tool access for clients like Claude Code, Codex, Cursor, opencode, and related MCP workflows.

Open Source

Magika

AI-powered file-type detection at Google scale

Magika is an open-source, AI-powered file-type detection tool from Google. It uses a custom deep-learning model of only a few megabytes to identify more than 200 binary and textual content types in milliseconds, even on a single CPU. Magika ships as a CLI, Python package, JavaScript/TypeScript library, and an ONNX model, achieves around 99% accuracy on its test set, and is already used at Google scale across Gmail, Drive, and Safe Browsing as well as by VirusTotal and abuse.ch.

Free, Open Source

Trent AI

Agentic AI security posture management

Trent AI is a specialized security platform for agentic AI applications, providing AI Security Posture Management that compounds with every development cycle. It scans, judges, mitigates, and evaluates AI agent security, detecting threats that traditional tools miss, including prompt injection attacks, tool misuse, unintended autonomous actions, data exfiltration through agent chains, and privilege escalation. It offers continuous assessment, with remediation plans executed through Claude Code.

paid

Comparisons