Writing tests, visual regression, and automated QA workflows
GitHub's Kubernetes controller for autoscaling GitHub Actions runners
actions-runner-controller (ARC) is GitHub's official Kubernetes controller for managing self-hosted GitHub Actions runners. It automatically scales runner pods up and down based on workflow demand, provisioning runners when jobs queue and terminating them when the jobs complete. It supports runner groups, custom runner images, and organization-level runner management. Over 6,100 GitHub stars.
Microsoft's MCP server for structured browser automation by AI agents
Playwright MCP is Microsoft's Model Context Protocol server that enables AI agents to automate web browsers through structured tool calls. It exposes Playwright's browser automation capabilities as MCP tools that LLMs can invoke for navigating pages, clicking elements, filling forms, extracting content, and taking screenshots. Provides structured, reliable browser interaction for AI agent workflows.
Git-friendly API client by the creator of Insomnia
Yaak is an API client built by the original creator of Insomnia, designed with Git-friendly file storage from the ground up. API requests, environments, and collections are stored as human-readable files that can be version-controlled alongside application code. Supports REST, GraphQL, gRPC, and WebSocket with a clean, keyboard-driven interface focused on developer productivity.
AI-powered test generation agent for automated code coverage improvement
qodo-cover (formerly Cover Agent) is an open-source AI agent that automatically generates meaningful unit tests to improve code coverage. It analyzes existing code and test patterns to produce tests that follow project conventions and target uncovered branches. Uses an iterative approach where generated tests are verified by running them, discarding those that fail. MIT licensed with over 5,300 GitHub stars.
CNCF Sandbox chaos engineering framework for Kubernetes resilience
Krkn is a CNCF Sandbox chaos engineering tool that tests Kubernetes cluster resilience by injecting controlled failures. It simulates pod kills, node failures, network partitions, CPU/memory pressure, and zone outages. Krkn-AI adds AI-powered scenario generation that suggests chaos experiments based on cluster topology. Supports CI/CD integration for automated resilience testing in deployment pipelines.
GenAI-powered test agent with natural language test authoring
KaneAI is LambdaTest's GenAI-powered test automation agent that creates, evolves, and debugs tests from natural language descriptions. It generates test scripts in multiple frameworks including Selenium, Playwright, and Cypress from plain English instructions. Features intelligent test maintenance that automatically updates tests when application UI changes and two-way editing between natural language and code.
Codeless browser testing with visual test recorder and scheduling
Ghost Inspector provides codeless browser testing through a visual recorder that captures user interactions and converts them into automated test suites. Tests run on managed infrastructure with scheduled execution, CI/CD integration, and Slack notifications. Features visual comparison for UI regression detection, API testing, and test organization with folders and tags for managing large test suites.
AI-powered E2E testing with plain English test authoring
testRigor enables end-to-end test creation in plain English without coding or element selectors. Tests describe user actions in natural language, such as 'click on the Submit button', and testRigor's AI interprets and executes them across web, mobile, and API. Self-healing tests automatically adapt to UI changes. Supports cross-browser testing, visual validation, and integration with CI/CD pipelines.
Property-based API fuzz testing from OpenAPI and GraphQL schemas
Schemathesis automatically generates test cases from OpenAPI and GraphQL schemas to find crashes, validation errors, and specification violations in APIs. It uses property-based testing and fuzzing techniques to explore edge cases that manual test writing misses. CLI tool and Python library with CI/CD integration. Over 3,200 GitHub stars with support for authentication, custom checks, and stateful testing.
Shift-left DAST platform built for CI/CD pipeline integration
StackHawk is a dynamic application security testing platform designed for CI/CD pipeline integration. It tests running web applications and APIs for OWASP Top 10 vulnerabilities including SQL injection, XSS, and authentication flaws during the development process. Built on ZAP with a developer-friendly CLI and YAML configuration, it provides actionable findings with reproducer requests and fix guidance.
AI-powered DAST platform specializing in API and GraphQL security
Escape is an AI-powered dynamic application security testing platform focused on API security including REST, GraphQL, and gRPC endpoints. It automatically discovers and tests API endpoints for vulnerabilities without requiring source code access. Features business logic testing that goes beyond OWASP patterns, CI/CD integration for shift-left security, and detailed remediation guidance for developers.
CyberArk's open-source LLM fuzzing framework for AI security testing
FuzzyAI is CyberArk's open-source framework for fuzzing large language models to discover vulnerabilities like jailbreaks, prompt injection, guardrail bypasses, and harmful content generation. It systematically tests LLM deployments with over 20 attack techniques and generates detailed reports. Supports testing any model accessible via API including OpenAI, Anthropic, and self-hosted models.
Managed Docker build acceleration with up to 40x faster builds
Depot provides managed infrastructure for dramatically faster Docker image builds. It uses persistent build caches, native Intel and ARM builders, and optimized build scheduling to achieve up to 40x faster builds compared to standard Docker build workflows. Drop-in replacement for docker build that requires no Dockerfile changes. Used by major engineering teams to cut CI/CD pipeline times.
Hybrid CI/CD platform with self-hosted agents and cloud orchestration
Buildkite is a hybrid CI/CD platform that separates orchestration from execution. A cloud-hosted control plane manages pipeline coordination and the UI, while open-source agents run builds on your own infrastructure. Used by Shopify, Airbnb, Uber, and Tinder for internet-scale deployments. Supports 100,000+ parallel jobs, with billing based on 95th-percentile (P95) usage so that short usage spikes are ignored.
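To see why a P95 figure ignores short spikes, here is a small worked example using the generic nearest-rank percentile method. The usage numbers are made up, and the listing does not say exactly how Buildkite computes its billing figure, so treat `p95` as an illustrative assumption rather than the vendor's formula.

```python
from typing import Sequence


def p95(values: Sequence[int]) -> int:
    """Nearest-rank 95th percentile: the smallest value that is
    greater than or equal to at least 95% of the samples."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical daily usage: 19 ordinary days and one spike.
daily_usage = [10] * 19 + [100]

print("peak:", max(daily_usage))
print("p95:", p95(daily_usage))
```

With 20 samples the rank is 19, so the single spike on the twentieth value falls outside the billed figure.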
Multi-agent AI coding platform with orchestrated specialist agents
Zencoder is an AI coding platform that uses multi-agent orchestration to handle complex development tasks. Specialist agents collaborate on different aspects of implementation including code generation, testing, documentation, and review. Integrates with VS Code and JetBrains IDEs. Features repository-aware context that understands project architecture and coding standards.
LSP-based AI code review agent backed by Y Combinator
mrge is a YC-backed AI code review agent that uses Language Server Protocol analysis to provide deep, context-aware pull request reviews. It goes beyond surface-level pattern matching by understanding project structure, type information, and cross-file dependencies. Integrates with GitHub and GitLab to deliver automated reviews that catch logic errors, security issues, and architectural inconsistencies.
Blazing-fast PHP linter, formatter, and static analyzer in Rust
Mago is a comprehensive PHP toolchain written in Rust that unifies linting, formatting, and static analysis into a single binary. It enforces PER-CS formatting standards, catches code smells with 100+ lint rules, and performs deep type inference for semantic analysis. Inspired by Clippy and OXC from the Rust ecosystem, it delivers performance orders of magnitude faster than PHPStan and Psalm while requiring no PHP runtime to execute.
Browser automation CLI built for AI agents by Vercel Labs
Agent Browser is a Rust-based browser automation CLI designed specifically for AI agent workflows rather than traditional testing. Developed by Vercel Labs, it provides semantic element selection through a refs system, accessibility tree snapshots, session persistence, and authentication vaults. Unlike Playwright or Puppeteer, which target test automation, Agent Browser optimizes for token efficiency and deterministic element selection, giving LLMs reliable browser interaction capabilities.
Agent harness performance system with 30+ agents and 136 skills
Everything Claude Code is a comprehensive agent harness performance optimization system providing 30 specialized agents, 136 skills, 60 commands, and automated hook workflows for AI-assisted development. Born from an Anthropic hackathon winner and evolved over 10+ months of intensive daily use, it works across Claude Code, Codex, Cursor, and OpenCode with built-in security scanning via AgentShield, continuous learning, and research-first development patterns.
Virtual engineering team as Claude Code skills by YC CEO Garry Tan
GStack transforms Claude Code into a structured virtual engineering team through 23 opinionated slash command skills created by Y Combinator CEO Garry Tan. It assigns specialist roles including CEO product review, engineering manager architecture oversight, designer visual audit, QA lead with real browser testing, and release engineer deployment. Each skill enforces focused workflows with clear decision principles for running parallel coding sessions.
Benchmark for evaluating AI coding agents on real GitHub issues
SWE-bench is a benchmark from Princeton NLP that evaluates AI coding agents by testing their ability to resolve real GitHub issues from popular open-source projects. Each task provides an issue description and repository state, and the agent must produce a working patch that passes the project's test suite. With 4,600+ GitHub stars, it has become the standard yardstick for comparing autonomous coding tools like Devin, Claude Code, and OpenHands.
Python toolkit for assessing and mitigating ML model fairness issues
Fairlearn is a Microsoft-backed open-source Python toolkit that helps developers assess and improve the fairness of machine learning models. It provides metrics for measuring disparity across groups defined by sensitive features, mitigation algorithms that reduce unfairness while maintaining model performance, and an interactive visualization dashboard for exploring fairness-accuracy trade-offs. Integrated with scikit-learn and Azure ML's Responsible AI dashboard.
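As an illustration of what "disparity across groups" means, the sketch below computes per-group selection rates and their gap (often called the demographic parity difference) in plain Python. It deliberately does not use Fairlearn's own API; the function names and the toy data are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Sequence


def selection_rates(y_pred: Sequence[int], groups: Sequence[str]) -> Dict[str, float]:
    """Fraction of positive predictions (1s) within each group."""
    totals: Dict[str, int] = defaultdict(int)
    positives: Dict[str, int] = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def parity_difference(y_pred: Sequence[int], groups: Sequence[str]) -> float:
    """Gap between the highest and lowest group selection rate."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy predictions for two groups defined by a sensitive feature.
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(y_pred, groups)
gap = parity_difference(y_pred, groups)
print(rates, gap)
```

A gap of 0 would mean both groups receive positive predictions at the same rate; larger values indicate greater disparity.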
Human-in-the-loop approval and oversight layer for AI coding agents
HumanLayer is a YC-backed platform that adds human approval, oversight, and escalation workflows to AI coding agents. Instead of letting agents execute autonomously, HumanLayer provides checkpoints where humans review and approve agent actions before they touch real codebases and infrastructure. It bridges the gap between autonomous AI coding and enterprise-safe deployment by making human oversight programmable.
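The idea of a programmable approval checkpoint can be sketched as a decorator that consults a reviewer before an action runs. This is a generic pattern, not HumanLayer's SDK: `require_approval`, `reviewer`, and `deploy` are hypothetical names, and the reviewer here is a simple function standing in for a real human decision.

```python
import functools
from typing import Any, Callable


def require_approval(approve: Callable[[str, tuple, dict], bool]):
    """Wrap a function so it only runs when approve(...) returns True."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if not approve(fn.__name__, args, kwargs):
                raise PermissionError(f"{fn.__name__} was not approved")
            return fn(*args, **kwargs)
        return wrapper
    return decorator


def reviewer(action: str, args: tuple, kwargs: dict) -> bool:
    # Stand-in for a human reviewer: reject anything aimed at production.
    return kwargs.get("env") != "production"


@require_approval(reviewer)
def deploy(env: str) -> str:
    return f"deployed to {env}"


staging_result = deploy(env="staging")

try:
    deploy(env="production")
    production_blocked = False
except PermissionError:
    production_blocked = True

print(staging_result, production_blocked)
```

In a real deployment the `approve` callback would block on an external channel such as chat or email rather than return immediately.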
ByteDance's open-source multimodal desktop agent with vision-based GUI automation
UI-TARS Desktop is ByteDance's open-source multimodal AI agent that automates desktop and browser interactions using computer vision rather than DOM selectors or accessibility APIs. Powered by the UI-TARS vision model, it can understand and operate any graphical interface by looking at screenshots, making it capable of automating applications that traditional browser automation tools cannot reach, including native desktop apps and complex web UIs.