# github
61 tools tagged
agmsg
Cross-agent messaging for CLI coding agents
agmsg is an MIT-licensed Bash and SQLite messaging layer for CLI coding agents. It lets Claude Code, Codex, Gemini CLI, GitHub Copilot CLI, Antigravity, OpenCode, Hermes, and other terminal agents exchange messages through a shared local database instead of relying on a human copy-paste relay. It is intentionally not MCP, not a broker, and not a subagent framework.
Baz
Telemetry-aware AI code reviewer that checks how pull requests may affect real services.
Baz is an AI code-review platform focused on production-aware pull requests. Instead of only reading the diff, Baz connects code changes to application telemetry so reviewers can understand what endpoints, services, and runtime behavior may be affected. That makes it a useful complement to existing AI PR bots when the question is not just whether a change looks correct, but whether it could break a live system.
Open SWE
Open-source async coding agent you can run in your own sandbox
Open SWE is an open-source framework from LangChain AI for building an organization's internal coding agent, following the same pattern as Stripe's Minions, Ramp's Inspect, and Coinbase's Cloudbot. Built on LangGraph and Deep Agents, it runs each task in an isolated cloud sandbox (Modal, Daytona, Runloop, or LangSmith), can be invoked from Slack, Linear, or GitHub, orchestrates subagents, and opens pull requests autonomously. It is customizable end-to-end for your codebase and conventions.
Act
Run GitHub Actions locally for fast feedback
Act is an open-source tool that runs GitHub Actions workflows locally using Docker containers that match GitHub's execution environment. It provides instant feedback on workflow changes without pushing to a repository, supports matrix builds, secret management, and artifact handling. Act can also replace Makefiles by using workflow files as task definitions, making it useful for both CI/CD development and local task automation across development teams.
mrge
LSP-based AI code review agent backed by Y Combinator
mrge is a YC-backed AI code review agent that uses Language Server Protocol analysis to provide deep, context-aware pull request reviews. It goes beyond surface-level pattern matching by understanding project structure, type information, and cross-file dependencies. Integrates with GitHub and GitLab to deliver automated reviews that catch logic errors, security issues, and architectural inconsistencies.
Git Bayesect
Bayesian git bisection for finding commits that caused flaky tests
Git Bayesect applies Bayesian inference to git bisection, solving the problem of finding commits that introduced non-deterministic bugs like flaky tests. Unlike standard git bisect which requires binary pass-fail results, Git Bayesect handles probabilistic outcomes where a test might pass sometimes and fail sometimes, using entropy minimization to efficiently narrow down the culprit commit.
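The idea of Bayesian bisection with entropy minimization can be illustrated with a minimal sketch. This is not Git Bayesect's actual code or interface: the function names (`update`, `next_commit`) and the two failure rates (`p_old` for commits before the culprit, `p_new` for the culprit and later commits) are assumptions chosen for illustration. Commits are indexed oldest to newest, and `posterior[c]` is the belief that commit `c` introduced the flakiness.

```python
import math


def update(posterior, commit, failed, p_old=0.05, p_new=0.5):
    """Bayes update after one test run at `commit`.

    Assumed model (hypothetical rates): a run fails with probability
    p_new at or after the culprit commit, and p_old before it.
    """
    weights = []
    for culprit, prior in enumerate(posterior):
        p_fail = p_new if commit >= culprit else p_old
        likelihood = p_fail if failed else 1.0 - p_fail
        weights.append(prior * likelihood)
    total = sum(weights)
    return [w / total for w in weights]


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def expected_entropy(posterior, commit, p_old=0.05, p_new=0.5):
    """Expected posterior entropy if the next run happens at `commit`."""
    p_fail = sum(
        prior * (p_new if commit >= culprit else p_old)
        for culprit, prior in enumerate(posterior)
    )
    h_fail = entropy(update(posterior, commit, True, p_old, p_new))
    h_pass = entropy(update(posterior, commit, False, p_old, p_new))
    return p_fail * h_fail + (1.0 - p_fail) * h_pass


def next_commit(posterior, p_old=0.05, p_new=0.5):
    """Pick the commit whose test run minimizes expected entropy."""
    return min(
        range(len(posterior)),
        key=lambda c: expected_entropy(posterior, c, p_old, p_new),
    )


if __name__ == "__main__":
    # Eight commits, uniform prior, and a hand-written list of observations
    # given as (commit index, did the run fail?).
    posterior = [1 / 8] * 8
    observations = [(7, True), (3, False), (3, False), (5, True),
                    (4, False), (4, False), (4, False)]
    for commit, failed in observations:
        posterior = update(posterior, commit, failed)
    best = max(range(len(posterior)), key=posterior.__getitem__)
    print("most likely culprit:", best)
    print("suggested next commit to test:", next_commit(posterior))
```

Unlike a plain bisect, a single passing run never rules a commit out here; it only shifts probability mass, so repeated runs at the same commit gradually sharpen the estimate.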
Aviator
Developer productivity platform with merge queues and flaky test detection
Aviator is a developer productivity platform combining merge queues, stacked PRs, automated code review, and flaky test management. Its merge queue prevents broken main branches by testing PRs in order before merging, and its flaky test detection identifies unreliable tests that cause CI failures. The company was founded by ex-Google engineers who built internal developer tools at scale, is YC-backed with a $2.3M seed from Elad Gil, and is used by Bosch, Benchling, and Lightspeed.
Incident.io
Slack-native incident management with AI SRE agent
Incident.io is a Slack- and Microsoft Teams-native incident management platform with AI SRE investigation, on-call scheduling, status pages, and post-incident learning in one product. Vendor case studies cite Buffer reducing critical incidents by 70% and Favor reducing MTTR by 37%. It integrates with PagerDuty, Datadog, GitHub, Jira, and 100+ tools for incident response and operational workflows.
Karate DSL
Unified API, performance, and contract testing DSL
Karate is an open-source testing framework that unifies API testing, performance testing, UI automation, and contract testing in a single BDD-style DSL. Write tests in plain Gherkin-like syntax without any Java knowledge. Built-in assertions, data-driven testing, parallel execution, and HTML reports. 8,900+ GitHub stars, MIT licensed. Mature, actively maintained project with commercial support options for comprehensive API quality assurance.
TruLens
LLM evaluation and tracking with RAG triad metrics
TruLens is an open-source framework for evaluating and tracking LLM experiments with feedback functions, RAG triad metrics (answer relevance, context relevance, groundedness), and Honest/Harmless/Helpful evaluations. Features a unified Metric API for systematic evaluation of RAG pipelines and AI agents. 3,200+ GitHub stars, MIT licensed. Snowflake partnership adds enterprise integration. Supports LangChain, LlamaIndex, and custom LLM applications.
PurpleLlama
Meta's open-source LLM security suite with Llama Guard and CodeShield
PurpleLlama is Meta's open-source suite of tools for evaluating and improving LLM safety. It includes Llama Guard models for input/output content safety classification, LlamaFirewall for multi-layer defense, CodeShield for insecure code detection, and CyberSecEval benchmarks for measuring LLM security. Llama Guard 4 supports multimodal safety across text and images. 4,100+ GitHub stars, backed by Meta AI with 44+ contributors.
PrivateGPT
100% private document Q&A powered by local LLMs
PrivateGPT enables fully private document interaction using GPT-powered RAG without any data leaving your machine. Ingest documents (PDF, DOCX, TXT, and more) and chat with them using local LLMs via Ollama or remote providers. Built on LlamaIndex with Qdrant vector storage. 57,200+ GitHub stars, Apache 2.0 licensed. The go-to solution for air-gapped environments, regulated industries, and anyone who needs document Q&A without cloud data exposure.
Hatchet
Modern task queue and workflow orchestration built on PostgreSQL
Hatchet is an open-source task queue and workflow orchestration platform designed as a modern alternative to Celery and BullMQ. Built on PostgreSQL for durability, it handles background jobs, AI agent workflows, RAG pipelines, and GPU task scheduling with TypeScript and Python SDKs. YC W24 batch with 7,400+ GitHub stars, MIT licensed. Supports fan-out, rate limiting, retries, and real-time observability through a web dashboard.
GitMCP
Instant MCP server for any GitHub repository
GitMCP is a free, open-source remote MCP server that turns any GitHub repository or GitHub Pages site into an AI-accessible documentation hub. Replace github.com with gitmcp.io in any repo URL to give AI assistants grounded context about that project, which is meant to reduce code hallucinations and requires no configuration.
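The host swap described above is simple enough to script. The following is a minimal sketch using only the Python standard library; the helper name `to_gitmcp_url` and the `owner/repo` path are hypothetical placeholders, and GitHub Pages URLs are deliberately not handled because the listing does not say how they map.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitmcp_url(repo_url: str) -> str:
    """Swap the github.com host for gitmcp.io, keeping the rest of the URL.

    Hypothetical helper for illustration; it only accepts plain
    github.com repository URLs.
    """
    parts = urlsplit(repo_url)
    if parts.hostname != "github.com":
        raise ValueError(f"not a github.com URL: {repo_url!r}")
    return urlunsplit(parts._replace(netloc="gitmcp.io"))


if __name__ == "__main__":
    # "owner/repo" is a placeholder, not a real repository.
    print(to_gitmcp_url("https://github.com/owner/repo"))
```

The resulting URL is what you would give to an MCP-capable assistant as the server address; how a particular client registers it depends on that client's own configuration.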
Argos CI
Visual regression testing for CI/CD pipelines
Argos CI is a visual regression testing platform that automatically catches unintended UI changes in CI/CD pipelines. It integrates with Playwright, Cypress, Storybook, and Puppeteer, featuring a stabilization engine that filters flaky pixel differences from genuine regressions. Used by teams at Meta and MUI for frontend quality gates.
GitHub MCP Server
Official MCP server for GitHub repo operations
GitHub MCP Server is the official Model Context Protocol server from GitHub that connects AI assistants to repositories, issues, pull requests, workflows, and code search. It exposes 100+ operations with toolset filtering, permission scoping, and audit logging, available in both remote-hosted and self-hosted Docker deployment modes.
Port
Internal developer portal for self-service engineering
Port is an internal developer portal platform that provides self-service interfaces for engineering teams. It offers a software catalog for tracking services, environments, and dependencies, along with self-service actions for common workflows like spinning up environments, deploying services, and managing resources. Features scorecards for engineering standards compliance and integrates with GitHub, GitLab, K8s, and cloud providers.
WhatTheDiff
AI-powered pull request summaries and code review
WhatTheDiff is an AI tool that generates human-readable pull request summaries and suggests code improvements. It analyzes code diffs to explain what changed and why in plain language, helping reviewers understand PRs faster. Integrates with GitHub and supports automated refactoring suggestions through a /wtd command. Useful for teams wanting to improve PR review speed and maintain changelog quality.
Backstage
Spotify's open-source developer portal framework
Backstage is Spotify's open-source framework for building internal developer portals, now a CNCF incubating project with 27,000+ GitHub stars. It provides a unified software catalog, service documentation, CI/CD pipeline views, and a plugin architecture with 200+ community plugins. Teams use it to create a single pane of glass for service ownership, API documentation, infrastructure management, and developer self-service workflows.
Jules
Google async coding agent for GitHub tasks, plans, and PRs
Jules is Google's async coding agent for GitHub repositories. Users start work from a prompt, GitHub issue label, scheduled task, or opt-in Suggested Task; Jules runs in a Google Cloud VM, proposes a plan, and opens PR-ready diffs. Free Jules offers 15 tasks/day and 3 concurrent tasks on Gemini 2.5 Pro; Pro/Ultra raise limits and start with Gemini 3 Pro access.
Trunk
AI-powered CI reliability and flaky test management
Trunk is a developer tools platform that tackles CI reliability through AI-powered flaky test detection, automatic quarantine, and merge queue management. It uses ML-based statistical analysis to identify flaky tests, isolates them to prevent pipeline blocks, and creates GitHub issues for resolution. Used by Zillow, Brex, and Faire, with $28.5M in funding and support for all major test frameworks.
PitchDocs
AI-powered marketing-ready READMEs from code
PitchDocs is an AI documentation generator that scans codebases to produce marketing-ready READMEs, changelogs, and AI context files like llms.txt. It bridges the gap between raw source code and consumer-ready technical communication, targeting the growing need for high-quality README files that serve both human developers and AI agents accessing documentation.
Umaku
Context-aware AI review with business logic validation
Umaku is a context-aware AI code review agent that understands full codebase and business logic context, detecting inconsistencies and assessing quality and risk beyond syntax-level analysis. It auto-generates and validates QA test cases from reviewed code, making it particularly effective for reviewing AI-generated code where product-intent validation matters more than style checking.
GitStart
Pull Requests as a Service with AI + developers
GitStart is a YC-backed platform that delivers merge-ready pull requests by combining AI coding agents with human developer oversight. Teams assign sprint-sized tickets, and the AI Ticket Studio converts vague requirements into well-scoped specs. Hybrid agents then generate production-ready code through a five-stage quality process, with a reported 98% merge rate across customer teams.