Shannon and Garak both operate in the AI security space but target fundamentally different attack surfaces. Shannon is an autonomous penetration testing agent that attacks web applications and APIs — finding SQL injection, XSS, authentication bypasses, and other traditional vulnerabilities using AI-powered reasoning. Garak tests the LLM models themselves — probing for prompt injection susceptibility, jailbreak vectors, toxic output generation, and data leakage. Understanding which layer of security you need to address determines which tool to deploy.
Shannon's approach mimics a skilled human pentester. Its multi-agent pipeline performs reconnaissance to map the application's attack surface, analyzes potential vulnerability points, attempts exploitation to confirm findings, and generates detailed reports with reproduction steps. Built on Anthropic's Agent SDK with Playwright for browser interaction, it interacts with applications the way a real attacker would — navigating forms, submitting payloads, and observing responses. Its 96.15 percent success rate on the XBOW benchmark significantly exceeds industry averages.
Garak operates at the model layer. It sends adversarial prompts to LLMs and evaluates the responses for undesirable behaviors — does the model leak training data, can it be coerced into generating harmful content, does it follow system prompt instructions when confronted with jailbreak attempts. The probe library covers dozens of attack categories including encoding-based bypasses, multi-turn manipulation, and role-playing exploits. This is essential for teams deploying LLM-powered features who need to understand their model's failure modes.
The technical stacks reflect their different targets. Shannon requires a Temporal cluster for durable workflow execution and Playwright for browser automation — it needs to interact with running web applications over HTTP. Garak is a Python package that sends API calls to LLM providers — it needs only network access to the model endpoint. Shannon's infrastructure requirements are heavier, but its testing is more comprehensive for application-level security.
Pricing models differ accordingly. Shannon Lite is open source under AGPL-3.0 but costs approximately fifty dollars per run in LLM API fees since it uses Claude for reasoning through complex attack scenarios. Garak is fully open source and significantly cheaper to run since its probes are predefined adversarial prompts rather than open-ended reasoning tasks. For budget-constrained teams, Garak provides immediate value at lower cost.
Discovery capability illustrates the difference. Shannon has found seven zero-day vulnerabilities in real-world applications — novel bugs that were not in any existing vulnerability database. Garak discovers whether known categories of model vulnerabilities affect your specific deployment. Shannon finds unknown unknowns at the application level; Garak confirms known unknowns at the model level.