59 tools tagged
Open-source async coding agent you can run in your own sandbox
Open-source framework from LangChain AI for building your organization's internal coding agent, following the same pattern as Stripe's Minions, Ramp's Inspect, and Coinbase's Cloudbot. Built on LangGraph and Deep Agents, Open SWE runs each task in an isolated cloud sandbox (Modal, Daytona, Runloop, or LangSmith), can be invoked from Slack, Linear, or GitHub, orchestrates subagents, and opens pull requests autonomously. It is customizable end-to-end for your codebase and conventions.
Run GitHub Actions locally for fast feedback
Act is an open-source tool that runs GitHub Actions workflows locally using Docker containers that match GitHub's execution environment. It provides instant feedback on workflow changes without pushing to a repository, supports matrix builds, secret management, and artifact handling. Act can also replace Makefiles by using workflow files as task definitions, making it useful for both CI/CD development and local task automation across development teams.
LSP-based AI code review agent backed by Y Combinator
mrge is a YC-backed AI code review agent that uses Language Server Protocol analysis to provide deep, context-aware pull request reviews. It goes beyond surface-level pattern matching by understanding project structure, type information, and cross-file dependencies. Integrates with GitHub and GitLab to deliver automated reviews that catch logic errors, security issues, and architectural inconsistencies.
Bayesian git bisection for finding commits that caused flaky tests
Git Bayesect applies Bayesian inference to git bisection, solving the problem of finding commits that introduced non-deterministic bugs such as flaky tests. Unlike standard git bisect, which requires a binary pass/fail result at each step, Git Bayesect handles probabilistic outcomes where a test sometimes passes and sometimes fails, using entropy minimization to narrow down the culprit commit efficiently.
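The idea of Bayesian bisection with entropy minimization can be illustrated with a minimal sketch. This is not Git Bayesect's actual code: the function names, the uniform prior, and the single known failure rate `p_fail` are all assumptions made for illustration. The model assumes a test never fails before the culprit commit and fails with probability `p_fail` at or after it.

```python
import math


def update(posterior, commit, failed, p_fail):
    """Bayes update of P(culprit == k) after one test run at `commit`.

    Assumed model (illustrative only): before the culprit the test always
    passes; at or after the culprit it fails with probability `p_fail`.
    """
    unnormalized = []
    for k, prior in enumerate(posterior):
        bug_present = k <= commit  # culprit is at or before the tested commit
        if bug_present:
            likelihood = p_fail if failed else 1.0 - p_fail
        else:
            likelihood = 0.0 if failed else 1.0
        unnormalized.append(prior * likelihood)
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation is impossible under the current beliefs")
    return [x / total for x in unnormalized]


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def expected_entropy(posterior, commit, p_fail):
    """Expected posterior entropy if we run the test once at `commit`."""
    prob_fail = sum(posterior[: commit + 1]) * p_fail
    result = 0.0
    for failed, prob in ((True, prob_fail), (False, 1.0 - prob_fail)):
        if prob > 0:
            result += prob * entropy(update(posterior, commit, failed, p_fail))
    return result


def next_commit(posterior, p_fail):
    """Pick the commit whose test result is expected to be most informative."""
    return min(
        range(len(posterior)),
        key=lambda c: expected_entropy(posterior, c, p_fail),
    )
```

With `p_fail = 1.0` the selection rule reduces to ordinary bisection (it picks the midpoint of a uniform prior); with a smaller `p_fail`, a passing run only shifts belief rather than ruling commits out, so the same commit may need to be tested more than once.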
Developer productivity platform with merge queues and flaky test detection
Aviator is a developer productivity platform combining merge queues, stacked PRs, automated code review, and flaky test management. Its merge queue prevents broken main branches by testing PRs in order before merging. Flaky test detection identifies unreliable tests causing CI failures. Founded by ex-Google engineers who built internal developer tools at scale. YC-backed with $2.3M seed from Elad Gil. Used by Bosch, Benchling, and Lightspeed.
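The "test PRs in order before merging" behaviour of a merge queue can be sketched in a few lines. This is a generic illustration, not Aviator's implementation: `run_merge_queue` and the `passes_ci` callback are hypothetical names, and a real queue would typically test candidates in parallel and handle retries.

```python
from collections import deque


def run_merge_queue(main, prs, passes_ci):
    """Process PRs in FIFO order, merging each only if main + PR passes CI.

    `main` is a list of already-merged changes, `prs` an iterable of queued
    changes, and `passes_ci` a caller-supplied function (hypothetical) that
    takes the candidate list of changes and returns True or False.
    """
    queue = deque(prs)
    merged = list(main)
    rejected = []
    while queue:
        pr = queue.popleft()
        candidate = merged + [pr]  # what main would look like after merging
        if passes_ci(candidate):
            merged = candidate     # main only ever advances to a tested state
        else:
            rejected.append(pr)    # ejected from the queue; main is untouched
    return merged, rejected
```

Because each PR is tested against the result of the PRs ahead of it, two changes that pass CI individually but conflict with each other cannot both land.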
Slack-native incident management with AI SRE agent
Incident.io is a Slack-native incident management platform with an AI SRE that autonomously investigates alerts, correlates deployments with telemetry, and drafts fix pull requests. Used by Buffer (70% fewer critical incidents), Favor (37% MTTR reduction), Intercom, and Productboard. Features include automated workflows, on-call scheduling, post-incident learning, and status pages. Integrates with PagerDuty, Datadog, GitHub, Jira, and 100+ tools.
Unified API, performance, and contract testing DSL
Karate is an open-source testing framework that unifies API testing, performance testing, UI automation, and contract testing in a single BDD-style DSL. Write tests in plain Gherkin-like syntax without any Java knowledge. Built-in assertions, data-driven testing, parallel execution, and HTML reports. 8,200+ GitHub stars, MIT licensed. 7+ years of active development with Global 2000 enterprise adoption for comprehensive API quality assurance.
LLM evaluation and tracking with RAG triad metrics
TruLens is an open-source framework for evaluating and tracking LLM experiments with feedback functions, RAG triad metrics (answer relevance, context relevance, groundedness), and Honest/Harmless/Helpful evaluations. Features a unified Metric API for systematic evaluation of RAG pipelines and AI agents. 3,200+ GitHub stars, MIT licensed. Snowflake partnership adds enterprise integration. Supports LangChain, LlamaIndex, and custom LLM applications.
Meta's open-source LLM security suite with Llama Guard and CodeShield
PurpleLlama is Meta's open-source suite of tools for evaluating and improving LLM safety. It includes Llama Guard models for input/output content safety classification, LlamaFirewall for multi-layer defense, CodeShield for insecure code detection, and CyberSecEval benchmarks for measuring LLM security. Llama Guard 4 supports multimodal safety across text and images. 4,100+ GitHub stars, backed by Meta AI with 44+ contributors.
100% private document Q&A powered by local LLMs
PrivateGPT enables fully private document interaction using GPT-powered RAG without any data leaving your machine. Ingest documents (PDF, DOCX, TXT, and more) and chat with them using local LLMs via Ollama or remote providers. Built on LlamaIndex with Qdrant vector storage. 57,200+ GitHub stars, Apache 2.0 licensed. The go-to solution for air-gapped environments, regulated industries, and anyone who needs document Q&A without cloud data exposure.
Modern task queue and workflow orchestration built on PostgreSQL
Hatchet is an open-source task queue and workflow orchestration platform designed as a modern alternative to Celery and BullMQ. Built on PostgreSQL for durability, it handles background jobs, AI agent workflows, RAG pipelines, and GPU task scheduling with TypeScript and Python SDKs. YC W24 batch with 2,800+ GitHub stars, MIT licensed. Supports fan-out, rate limiting, retries, and real-time observability through a web dashboard.
Instant MCP server for any GitHub repository
GitMCP is a free, open-source remote MCP server that turns any GitHub repository or GitHub Pages site into an AI-accessible documentation hub. Replace github.com with gitmcp.io in any repo URL to give AI assistants grounded context about that project, reducing code hallucinations with no configuration required.
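The URL substitution described above can be sketched as a small helper. The function name `to_gitmcp_url` and the `owner/repo` path in the usage note are hypothetical; only the github.com to gitmcp.io host swap comes from the description, and the GitHub Pages case is not covered here.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitmcp_url(repo_url: str) -> str:
    """Swap the github.com host for gitmcp.io, keeping the rest of the URL."""
    parts = urlsplit(repo_url)
    if parts.netloc.lower() != "github.com":
        raise ValueError(f"not a github.com URL: {repo_url!r}")
    return urlunsplit(parts._replace(netloc="gitmcp.io"))
```

For a placeholder repository, `to_gitmcp_url("https://github.com/owner/repo")` yields `https://gitmcp.io/owner/repo`, which is the address an MCP-capable assistant would then be pointed at.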
Visual regression testing for CI/CD pipelines
Argos CI is a visual regression testing platform that automatically catches unintended UI changes in CI/CD pipelines. It integrates with Playwright, Cypress, Storybook, and Puppeteer, featuring a stabilization engine that filters flaky pixel differences from genuine regressions. Used by teams at Meta and MUI for frontend quality gates.
Official MCP server for GitHub repo operations
GitHub MCP Server is the official Model Context Protocol server from GitHub that connects AI assistants to repositories, issues, pull requests, workflows, and code search. It exposes 100+ operations with toolset filtering, permission scoping, and audit logging, available in both remote-hosted and self-hosted Docker deployment modes.
Internal developer portal for self-service engineering
Port is an internal developer portal platform that provides self-service interfaces for engineering teams. It offers a software catalog for tracking services, environments, and dependencies, along with self-service actions for common workflows like spinning up environments, deploying services, and managing resources. Features scorecards for engineering standards compliance and integrates with GitHub, GitLab, K8s, and cloud providers.
AI-powered pull request summaries and code review
WhatTheDiff is an AI tool that generates human-readable pull request summaries and suggests code improvements. It analyzes code diffs to explain what changed and why in plain language, helping reviewers understand PRs faster. Integrates with GitHub and supports automated refactoring suggestions through a /wtd command. Useful for teams wanting to improve PR review speed and maintain changelog quality.
Spotify's open-source developer portal framework
Backstage is Spotify's open-source framework for building internal developer portals, now a CNCF incubating project with 27,000+ GitHub stars. It provides a unified software catalog, service documentation, CI/CD pipeline views, and a plugin architecture with 200+ community plugins. Teams use it to create a single pane of glass for service ownership, API documentation, infrastructure management, and developer self-service workflows.
Google's proactive coding agent for async repository maintenance
Jules is Google's coding agent that proactively scans repositories for TODO comments, bug patterns, and improvement opportunities, proposing code changes without explicit user requests. Built on Gemini models, it operates asynchronously in the background, completing over 140,000 code improvements. Handles routine maintenance tasks like dependency updates, code cleanup, and follow-on work from completed features.
AI-powered CI reliability and flaky test management
Trunk is a developer tools platform that tackles CI reliability through AI-powered flaky test detection, automatic quarantine, and merge queue management. It uses ML-based statistical analysis to identify flaky tests, isolates them to prevent pipeline blocks, and creates GitHub issues for resolution. Used by Zillow, Brex, and Faire, with $28.5M in funding and support for all major test frameworks.
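One common baseline for flaky test detection is to flag any test that both passed and failed on the same commit, since the code did not change between runs. The sketch below shows that heuristic only; it is not Trunk's ML-based analysis, and the `(test_name, commit_sha, passed)` record shape is an assumption made for illustration.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return tests that both passed and failed on at least one commit.

    `runs` is an iterable of (test_name, commit_sha, passed) tuples, an
    assumed record shape chosen for this example.
    """
    outcomes = defaultdict(set)
    for test_name, commit_sha, passed in runs:
        outcomes[(test_name, commit_sha)].add(bool(passed))
    flaky = {
        test_name
        for (test_name, _sha), seen in outcomes.items()
        if len(seen) == 2  # saw both True and False for identical code
    }
    return sorted(flaky)
```

A test that fails on one commit and passes on another is not flagged, because that pattern is equally consistent with a real regression that was later fixed.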
AI-powered marketing-ready READMEs from code
PitchDocs is an AI documentation generator that scans codebases to produce marketing-ready READMEs, changelogs, and AI context files like llms.txt. It bridges the gap between raw source code and consumer-ready technical communication, targeting the growing need for high-quality README files that serve both human developers and AI agents accessing documentation.
Context-aware AI review with business logic validation
Umaku is a context-aware AI code review agent that understands full codebase and business logic context, detecting inconsistencies and assessing quality and risk beyond syntax-level analysis. It auto-generates and validates QA test cases from reviewed code, making it particularly effective for reviewing AI-generated code where product-intent validation matters more than style checking.
Pull Requests as a Service with AI + developers
GitStart is a YC-backed platform that delivers merge-ready pull requests by combining AI coding agents with human developer oversight. Teams assign sprint-sized tickets, and the AI Ticket Studio converts vague requirements into well-scoped specs. Hybrid agents then generate production-ready code through a five-stage quality process, with a reported 98% merge rate across customer teams.
Agentic DevOps automation via ChatOps
Kubiya is an agentic automation platform for DevOps and platform teams that uses specialized agents with connectors for Kubernetes, AWS, GitHub, Jira, and Terraform to automate operational tasks through Slack or web portals. It provides Terraform module support for infrastructure-as-code configuration and manages agent behaviors with policy-based controls for enterprise-grade governance.
IDE documentation that stays synced with code
Swimm uses AI to keep code documentation in sync with real-time code changes, providing interactive walkthroughs directly in VS Code and JetBrains IDEs. It solves the stale documentation problem by making docs part of the development workflow rather than a separate artifact, automatically detecting when code changes invalidate existing documentation and suggesting updates.