What fast-agent Does
fast-agent is an Apache-licensed Python framework that lets developers stand up a fully-featured LLM coding agent in one command. The flagship invocation uvx fast-agent-mcp@latest -x drops you into an interactive shell where you can attach MCP servers, load Skills, and route across Anthropic, OpenAI Codex, HuggingFace, or local llama.cpp models without writing wrapper code. The same runtime works as an evaluation harness, a CI agent, or a custom subagent dispatcher — making it less of a chatbot SDK and more of a platform for serious agentic development workflows.
MCP and ACP: Protocol-First Architecture
Where LangChain or CrewAI treat the Model Context Protocol as a checklist feature, fast-agent is engineered around it. The framework implements the full MCP spec — including Sampling and Elicitations, which most competing implementations skip — so any MCP server you point it at behaves consistently whether it speaks stdio or HTTP with OAuth. The /connect slash command attaches a new server mid-session, and configuration lives in a single fastagent.config.yaml that survives across restarts.
Layered on top is ACP, the Agent Communication Protocol that fast-agent uses for subagent-to-subagent messaging. ACP turns the framework into a meaningful multi-agent runtime: a planner agent can spawn workers, hand off intermediate state, and reclaim control without losing context. For teams already invested in MCP tooling, this protocol-first stance is the differentiator that justifies switching from a generic agent framework.
Multi-Model Support and Provider Flexibility
fast-agent ships with first-class adapters for Anthropic Claude, OpenAI Codex, HuggingFace Inference, llama.cpp, and a generic local-OpenAI-compatible provider. The --model flag accepts shorthand aliases like opus, sonnet, codex-1, or hf-dev and the --pack flag loads pre-tuned configurations for common scenarios. --pack hf-dev wires up HuggingFace's dev models with sensible defaults for code generation; --pack codex pre-configures the Codex API for low-latency tool use.
Provider switching is non-disruptive — the same prompt, MCP servers, and Skills work across backends, so you can run an evaluation matrix to compare model quality on your actual tools rather than synthetic benchmarks. This is uncommon: most agent frameworks either lock you to a single provider or require rewriting prompts when you switch. fast-agent treats the model as a configuration concern, not an architectural one.
Skills System and Interactive Shell
The /skills command exposes a built-in registry of reusable agent capabilities — LSP integration, repo tools, web search, file operations — that you can mount and unmount during a session. Skills follow the Anthropic Skills spec, so any skill bundle published for Claude Code or another compatible runtime drops in without modification. Combined with the !-prefix for native shell commands inside the interactive REPL, the shell experience feels closer to a power-user terminal than to a vendor SDK.
The --smart flag layers automatic subagent routing and compaction on top of Skills. When the main agent's context approaches the model's window limit, fast-agent compacts older turns and offloads tangential work to ephemeral subagents — useful for long refactors or evaluation runs that span hundreds of tool calls. The behavior is configurable and observable, not a black box.
Evaluation and Automation Use Cases
Beyond interactive use, fast-agent is designed to run unattended. The same binary that powers the REPL can be driven from scripts, CI pipelines, or evaluation harnesses. Teams have used it to benchmark agent reliability across model providers, gate pull requests with autonomous QA agents, and run nightly regression suites on multi-turn tool use. Because the framework keeps state in plain YAML and JSONL transcripts, results are diffable and version-controllable.
Telemetry is opt-in by default — there is no phone-home behavior baked into the runtime — which matters for teams shipping agents to regulated industries. Pair fast-agent with a local llama.cpp model and you have a fully air-gapped coding agent that still speaks MCP. For evaluation specifically, this combination is hard to beat.
The Bottom Line
fast-agent occupies a thin, valuable strip of the agent framework landscape: heavier than a raw SDK like the official Anthropic Python client, lighter than LangChain, and more standards-aligned than CrewAI. If your workflow is MCP-centric, terminal-first, and benefits from protocol fidelity over abstraction layers, it is the most complete option available today. With 3,700+ GitHub stars, an active Discord, and steady releases under Apache 2.0, it has crossed the threshold from experiment to production-credible tool — though documentation and third-party integrations still trail the bigger frameworks.