Persistent memory layer for AI coding agents — keeps Claude Code, Codex, Cursor, and any MCP agent in context across sessions
agentmemory is an open-source MCP server that gives AI coding agents persistent, cross-session memory. Built on hybrid vector-graph search, it achieves 95.2% recall on the LongMemEval-S benchmark while using up to 92% fewer context tokens than naive context injection. It works out of the box with Claude Code, Codex, Cursor, Windsurf, Cline, OpenCode, Kilo Code, Hermes, and any MCP client through 51 MCP tools, 12 hooks, and 4 skills.
MCP, ACP and Skills support for building production coding agents — interactive or automated.
fast-agent is an Apache-licensed Python framework for building and running LLM agents with full MCP (Model Context Protocol) and ACP support. It ships with an interactive shell mode, Skills management, and multi-model routing — making it a practical platform for coding agents, workflow automation, and agent evaluation across Claude, Codex, HuggingFace, and local models.
Command center for Claude Code and Codex — monitor, steer, and voice-control your AI agents from any device.
Omnara is a command center for AI coding agents, letting you run, monitor, and steer Claude Code and Codex sessions from your phone, web browser, Apple Watch, or any device while the agent runs on your machine. Sessions migrate to the cloud when your laptop goes offline, and the voice-first interface lets you guide your agent hands-free. Built by a YC S25 team and available with a free tier plus paid plans across desktop, web, and mobile clients.
Vectorless, reasoning-based RAG that reads documents like a human expert — no vector DB, no chunking.
PageIndex is a vectorless, reasoning-based RAG system that builds hierarchical tree indexes from long documents and uses LLMs to navigate them like a human expert would. Instead of chunking text and comparing embeddings, it constructs a table-of-contents-style structure and reasons its way to the right sections — no vector database required. Available as an open-source Python package, cloud API, MCP server, and chat platform.
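The tree-navigation idea can be illustrated with a minimal sketch. This is not PageIndex's API: the `Node` class, `pick_child`, and `navigate` are hypothetical names, and a simple word-overlap score stands in for the LLM reasoning step that a real system would use to choose a section.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a table-of-contents-style tree (hypothetical structure)."""
    title: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def pick_child(query: str, children: list[Node]) -> Node:
    # Stand-in for the LLM's reasoning step (an assumption for this sketch):
    # choose the child whose title shares the most words with the query.
    words = set(query.lower().split())
    return max(children, key=lambda c: len(words & set(c.title.lower().split())))


def navigate(root: Node, query: str) -> Node:
    # Walk from the root to a leaf, choosing one section per level.
    # No embeddings or chunk similarity search are involved.
    node = root
    while node.children:
        node = pick_child(query, node.children)
    return node


doc = Node("Annual report", children=[
    Node("Financial results", children=[
        Node("Revenue by segment", "Segment revenue table and commentary."),
        Node("Operating costs", "Cost breakdown by department."),
    ]),
    Node("Risk factors", children=[
        Node("Regulatory risk", "Pending regulation summary."),
    ]),
])

leaf = navigate(doc, "revenue by segment in financial results")
print(leaf.title)
```

Swapping `pick_child` for a model call that sees the query and the child titles would give the reasoning-based behavior the description refers to, under the same traversal loop.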
Production-grade browser automation with AI self-healing and Playwright code ownership
Intuned is a code-first browser automation platform that turns natural language prompts into production-ready Playwright code, deploys it, and self-heals it when target sites change. It supports TypeScript and Python, with integrations for Anthropic Computer Use, OpenAI CUA, Stagehand, Browser-Use, and Gemini Computer Use. Built-in features include stealth, captcha solving, auth session management, and scheduled runs with concurrency control. There is no vendor lock-in: you own the code.
Open-source post-building layer for agents — tracing, evals, and online monitoring
Judgeval is the open-source post-building layer for AI agents from Judgment Labs, providing OpenTelemetry-based tracing, hosted and custom evaluation scorers, and online behavior monitoring for LLM-powered applications. Instrument any function with a single decorator, score live production traffic against faithfulness and instruction-adherence checks, and feed real-world failures back into reinforcement learning or supervised fine-tuning loops.
Open-source observability and self-healing layer for AI agents
TraceRoot is a YC S25-backed open-source observability platform purpose-built for AI agents and LLM apps. It combines OpenTelemetry-compatible tracing with an agentic debugging runtime that reads your source code, correlates failures with recent commits, and proposes fix PRs automatically. BYOK support spans seven LLM providers; the entire stack runs self-hosted via Docker Compose, with TraceRoot Cloud available for managed deployments.
Always-on cloud engineer that lives in Slack and ships verified PRs
Roomote is a Slack-first cloud coding agent from RooCodeInc that takes prompts end-to-end across GitHub, Linear, Notion, Sentry, and your own dev environment, then opens self-verified pull requests for review. It comes from the team behind the 23k-star Roo Code, which is going all-in on cloud agents: plug it into your stack, mention it in Slack, and it answers questions, drafts plans, and ships verified PRs without asking engineers to leave their flow.
Sandboxes for coding agents — Linux VMs, Git, and deploys in one box
Freestyle is YC-backed sandbox infrastructure built for AI coding agents, shipping secure Linux VMs with nested virtualization, Git servers, and one-click web deploys. It lets agents run real workloads, branch repos, and deploy apps under short-lived identities while billing only for active compute. Used in production by vly.ai, Rork, and Vibeflow.
Rust-native multi-agent orchestration for production
GraphBit is a Rust-native, multi-agent orchestration framework built for production. It targets the gap between Python-first frameworks like LangGraph and the operational expectations of enterprise systems — predictable memory, low latency, deterministic concurrency, and the ability to embed an agent runtime in services that already run Rust without dragging in a Python interpreter.
Open-source toolkit for building AI SRE incident response agents
OpenSRE is an open-source Python toolkit from Tracer Cloud for building AI SRE agents that investigate and respond to production incidents. It ships with connectors to Prometheus, Grafana, Kubernetes and incident platforms, plus a simulation harness that replays past incidents so teams can benchmark agent accuracy before trusting it on live pager rotations.
Self-evolution engine for AI agents with auditable updates
Evolver is an open-source self-evolution engine for AI agents that turns run logs into auditable, reviewable updates via its Genome Evolution Protocol. Instead of ad hoc prompt tweaking, teams collect traces and Evolver proposes versioned diffs to prompts, tools and workflows that engineers can approve, reject or roll back like code.
Official Chrome DevTools MCP server for coding agents
chrome-devtools-mcp is the Chrome DevTools team's official MCP server that lets coding agents control and inspect a live Chrome browser with first-party Chrome DevTools Protocol fidelity. It exposes Network inspection, Performance traces, Lighthouse audits, console output, and structured DOM snapshots as typed MCP tools, so agents can debug real pages and ship reliable web performance investigations without resorting to brittle DOM scraping.
Self-evolving local computer agent with a reusable skill tree
GenericAgent is a minimal, self-evolving autonomous agent in roughly 3K lines of Python that gives LLMs system-level control of a local computer. It writes files, runs shell commands, and browses the web, but its defining feature is skill crystallization: successful task runs are saved as reusable skills inside a growing skill tree that cuts token cost on repeats.
Open-source async coding agent you can run in your own sandbox
Open SWE is an open-source framework from LangChain AI for building your organization's internal coding agent, following the same pattern as Stripe's Minions, Ramp's Inspect, and Coinbase's Cloudbot. Built on LangGraph and Deep Agents, Open SWE runs each task in an isolated cloud sandbox (Modal, Daytona, Runloop, or LangSmith), can be invoked from Slack, Linear, or GitHub, orchestrates subagents, and opens pull requests autonomously. It is customizable end-to-end for your codebase and conventions.
Build modular, scalable LLM applications in Rust
Rig is an open-source Rust library for building scalable, modular, and ergonomic LLM-powered applications. It unifies 20+ model providers (OpenAI, Anthropic, Mistral, DeepSeek, Ollama, and more) and 10+ vector stores behind one trait-based interface, and supports completion and embedding workflows, multi-turn streaming, and transcription/audio/image generation. With full GenAI Semantic Convention compatibility and a WASM-ready core library, it provides production agentic infrastructure for Rust teams.
AI-powered task management for agentic coding workflows
Claude Task Master is an AI-powered task management system designed for agentic development workflows in IDEs like Cursor, Windsurf, Lovable, and Roo. It breaks complex projects into structured task trees with dependencies, priorities, and complexity scores so AI coding agents can execute work methodically. The MCP server integration enables direct task operations from any compatible client, while tagged task lists support multi-context management across branches and environments.
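To make the idea of a dependency-aware task tree concrete, here is a minimal sketch of how tasks with dependencies and priorities could be ordered for execution. It is not Claude Task Master's implementation or data format: the `tasks` dictionary, its field names, and `execution_order` are assumptions for illustration, built on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical task list: each task names the tasks it depends on
# and carries a priority (lower number = more urgent).
tasks = {
    "schema": {"deps": [], "priority": 1},
    "api": {"deps": ["schema"], "priority": 2},
    "ui": {"deps": ["api"], "priority": 3},
    "docs": {"deps": ["schema"], "priority": 5},
}


def execution_order(tasks):
    """Return task names so that every task comes after its dependencies.

    Among tasks that are ready at the same time, the more urgent one
    (lower priority number) is scheduled first.
    """
    sorter = TopologicalSorter({name: t["deps"] for name, t in tasks.items()})
    sorter.prepare()  # raises graphlib.CycleError if dependencies are cyclic
    order = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda n: tasks[n]["priority"])
        for name in ready:
            order.append(name)
            sorter.done(name)  # unblocks tasks that depended on this one
    return order


print(execution_order(tasks))  # ['schema', 'api', 'docs', 'ui']
```

An agent working through such a list would pick the next item from this order, which is one way to keep work methodical when tasks block one another.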
Context engineering platform for AI agents with temporal knowledge graphs
Zep is a context engineering platform that assembles relationship-aware context for AI agents from conversations, business data, documents, and events. It maintains a temporal knowledge graph that automatically extracts entities and relationships, tracking how context evolves over time. Zep delivers formatted context blocks optimized for LLMs with sub-200ms latency, integrating with LangChain, LlamaIndex, AutoGen, and Google ADK through Python, TypeScript, and Go SDKs.
Agent memory system that learns, not just remembers
Hindsight is an agent memory system that enables AI agents to learn from experience rather than just store conversations. It organizes memories into three biomimetic categories: World knowledge for facts, Experiences for agent events, and Mental Models for learned understanding. The system provides retain, recall, and reflect operations backed by a temporal knowledge graph with parallel retrieval strategies including semantic, keyword, graph traversal, and temporal search.
Agentic IM chatbot platform with multi-platform LLM integration
AstrBot is open-source agentic chatbot infrastructure that connects multiple instant messaging platforms, including Telegram, Discord, Slack, WeChat, QQ, Feishu, and DingTalk, to AI language models. It supports multi-provider LLM integration, MCP, knowledge bases, persona management, multimodal input, and a plugin ecosystem with over 1,000 community extensions. Features include a web management UI, sandbox code execution, and auto-context compression for efficient conversations.
LangChain-powered agent harness with planning and subagents
Deep Agents is a production-ready agent framework built on LangChain and LangGraph for complex agentic workflows. It features a planning system for task decomposition, a filesystem backend for persistent operations, sandboxed shell execution, and isolated subagents with independent context windows. Automatic context summarization keeps agents coherent across long sessions, while smart defaults simplify prompt engineering for multi-step autonomous tasks.
Autonomous coding agents that ship while you sleep
Twill is an autonomous coding agent platform that implements features, fixes bugs, and ships pull requests without manual intervention. It uses a structured workflow: research, planning, human review, implementation in an isolated sandbox, AI code review, then merge. It supports custom agent configurations with multiple LLM providers, isolated dev environments for verification, and integrations with GitHub, Linear, Sentry, Notion, and cloud platforms for end-to-end engineering automation.
Agentic AI security posture management
Trent AI is a specialized security platform for agentic AI applications, providing AI Security Posture Management that compounds with every development cycle. It scans, judges, mitigates, and evaluates AI agent security, detecting threats that traditional tools miss, including prompt injection attacks, tool misuse, unintended autonomous actions, data exfiltration through agent chains, and privilege escalation. It offers continuous assessment with remediation plan execution through Claude Code.
Control plane for autonomous AI agents
Keycard is the control plane for autonomous agents, providing identity verification, policy enforcement, and scoped access management. It resolves agent identity, enforces security policies, and issues time-limited, resource-specific access tokens. It provides full visibility into every agent action with drift detection, automatic remediation, and integrations with Datadog, Linear, GitHub, and other services for agent-driven incident response and security operations.
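The notion of a time-limited, resource-specific token can be sketched in a few lines. This is a generic illustration, not Keycard's token format or API: `issue_token`, `check_token`, the claim names, and the demo secret are all hypothetical, and a production system would use an established standard and managed keys.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only


def issue_token(agent_id, resource, ttl_seconds, now=None):
    """Issue a signed token valid for one resource until an expiry time."""
    now = time.time() if now is None else now
    claims = {"agent": agent_id, "resource": resource, "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig


def check_token(token, resource, now=None):
    """Accept the token only if it is untampered, unexpired, and scoped to `resource`."""
    now = time.time() if now is None else now
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # signature mismatch: token was altered or forged
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims["resource"] == resource and now < claims["exp"]


token = issue_token("agent-1", "repo:read", ttl_seconds=60, now=1000)
print(check_token(token, "repo:read", now=1030))   # True: in scope, not expired
print(check_token(token, "repo:write", now=1030))  # False: wrong resource
print(check_token(token, "repo:read", now=1061))   # False: expired
```

Binding the resource and expiry into the signed payload means a leaked token is only useful for one resource and only for a short window.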