Automating CI/CD pipelines, infrastructure provisioning, and deployment workflows
OpenTelemetry-native observability with AI tracing, logs, traces, metrics, and session replay — self-hosted in 90 seconds.
Traceway is an open-source, OpenTelemetry-native observability platform that combines logs, traces, metrics, exceptions, session replay, and AI tracing in a single self-hosted system. MIT licensed with no open-core restrictions, it deploys in 90 seconds via Docker Compose and accepts OTLP/HTTP from any OTel SDK without a Collector or per-language vendor SDK.
Open-source post-building layer for agents — tracing, evals, and online monitoring
Judgeval is the open-source post-building layer for AI agents from Judgment Labs, providing OpenTelemetry-based tracing, hosted and custom evaluation scorers, and online behavior monitoring for LLM-powered applications. Instrument any function with a single decorator, score live production traffic against faithfulness and instruction-adherence checks, and feed real-world failures back into reinforcement learning or supervised fine-tuning loops.
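As a rough illustration of what single-decorator instrumentation looks like, the sketch below wraps a function and records its name, status, and duration. It uses only the Python standard library and is not Judgeval's API: `traced`, `TRACE_LOG`, and `answer` are hypothetical names chosen for this example.

```python
import functools
import time

# Hypothetical in-memory sink; a real tracer would export spans elsewhere.
TRACE_LOG = []


def traced(func):
    """Record name, status, and wall-clock duration for each call to func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise  # re-raise so callers still see the failure
        finally:
            TRACE_LOG.append(
                {
                    "name": func.__name__,
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                }
            )

    return wrapper


@traced
def answer(question: str) -> str:
    # Stand-in for an LLM call; the body is illustrative only.
    return question.upper()


if __name__ == "__main__":
    answer("hello")
    print(TRACE_LOG[-1]["name"], TRACE_LOG[-1]["status"])
```

The decorator leaves the wrapped function's return value and exceptions unchanged, so it can be added without touching call sites.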
Open-source observability and self-healing layer for AI agents
TraceRoot is a YC S25-backed open-source observability platform purpose-built for AI agents and LLM apps. It combines OpenTelemetry-compatible tracing with an agentic debugging runtime that reads your source code, correlates failures with recent commits, and proposes fix PRs automatically. BYOK support spans seven LLM providers; the entire stack runs self-hosted via Docker Compose, with TraceRoot Cloud available for managed deployments.
Rust-native multi-agent orchestration for production
GraphBit is a Rust-native, multi-agent orchestration framework built for production. It targets the gap between Python-first frameworks like LangGraph and the operational expectations of enterprise systems — predictable memory, low latency, deterministic concurrency, and the ability to embed an agent runtime in services that already run Rust without dragging in a Python interpreter.
Open-source toolkit for building AI SRE incident response agents
OpenSRE is an open-source Python toolkit from Tracer Cloud for building AI SRE agents that investigate and respond to production incidents. It ships with connectors to Prometheus, Grafana, Kubernetes and incident platforms, plus a simulation harness that replays past incidents so teams can benchmark agent accuracy before trusting it on live pager rotations.
Official Chrome DevTools MCP server for coding agents
chrome-devtools-mcp is the Chrome DevTools team's official MCP server that lets coding agents control and inspect a live Chrome browser with first-party Chrome DevTools Protocol fidelity. It exposes Network inspection, Performance traces, Lighthouse audits, console output, and structured DOM snapshots as typed MCP tools, so agents can debug real pages and run reliable web performance investigations without resorting to brittle DOM scraping.
Self-evolving local computer agent with a reusable skill tree
GenericAgent is a minimal, self-evolving autonomous agent in roughly 3K lines of Python that gives LLMs system-level control of a local computer. It writes files, runs shell commands, and browses the web, but its defining feature is skill crystallization: successful task runs are saved as reusable skills inside a growing skill tree that cuts token cost on repeats.
See where your AI coding tokens actually go
CodeBurn is an open-source TUI dashboard and CLI that shows where your AI coding tokens actually go, broken down by task type, tool, model, MCP server, and project. It reads local session data directly from Claude Code, Codex, Cursor, OpenCode, Pi, and GitHub Copilot, with no wrapper, proxy, or API keys, and layers on one-shot success rates so you can see whether the AI nails work on the first try or burns budget on edit/test/fix retries. It ships with a macOS menu bar widget and CSV/JSON export.
AI-powered file-type detection at Google scale
Magika is an open-source, AI-powered file-type detection tool from Google. It uses a custom deep-learning model of only a few megabytes to identify more than 200 binary and textual content types in milliseconds, even on a single CPU. Magika ships as a CLI, Python package, JavaScript/TypeScript library, and an ONNX model, achieves around 99% accuracy on its test set, and is already used at Google scale across Gmail, Drive, and Safe Browsing as well as by VirusTotal and abuse.ch.
AI-powered production incident resolution
Resolve AI automates production incident investigation, diagnosis, and remediation, acting as an AI SRE that participates in every on-call rotation. It investigates incidents autonomously, pursuing multiple hypotheses in parallel and validating them against real evidence. It also creates code snippets, drafts PRs, generates post-mortems, and onboards new teammates with instant answers about code and infrastructure. The vendor claims 5x faster MTTR and 87% faster incident investigations.
Task runner for Python with Poetry and uv
Poethepoet (poe) is a batteries-included task runner for Python projects that integrates with the Poetry and uv package managers. Define tasks in pyproject.toml, compose them in sequential, parallel, or DAG workflows, and execute them with full virtual environment context. It supports shell commands, Python scripts, environment variables, .env file loading, and auto-generated shell completion for bash, zsh, and fish.
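To make the idea of a DAG workflow concrete, the sketch below resolves a task dependency graph into batches that could run in parallel. It uses Python's standard `graphlib` module and is not poe's implementation or configuration format; the task names and the `execution_batches` helper are assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
TASK_DEPS = {
    "lint": set(),
    "build": {"lint"},
    "test": {"build"},
    "docs": {"lint"},
    "release": {"test", "docs"},
}


def execution_batches(deps):
    """Group tasks into batches whose dependencies are already satisfied."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock dependents
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(execution_batches(TASK_DEPS), start=1):
        print(number, batch)
```

Tasks within one batch have no dependencies on each other, so a runner could execute them concurrently and move to the next batch once all have finished.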
Container-based CI/CD automation system
Concourse is an open-source CI/CD system built on composable primitives: resources for external artifacts, tasks for containerized work units, and jobs for orchestration. All pipelines are declarative YAML with version control, every task runs in an isolated container, and stateless workers enable horizontal scaling. Deployable via BOSH, Helm, Docker Compose, or standalone binary across any infrastructure.
Open-source feature flag management platform
Unleash is the largest open-source feature flag platform, enabling teams to decouple deployment from release with gradual rollouts, A/B testing, and trunk-based development. It provides 15+ official SDKs for server and client frameworks, a web-based admin dashboard for managing feature toggles, and supports activation strategies like percentage rollout, user targeting, and environment-based rules. Self-hostable via Docker with PostgreSQL storage.
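A percentage rollout is usually built on a stable hash of the user identifier, so the same user always gets the same answer and raising the percentage only ever adds users. The sketch below shows that general technique with SHA-256 from the standard library. It is not the Unleash SDK or its hashing scheme, and `in_rollout` is a hypothetical helper.

```python
import hashlib


def in_rollout(user_id: str, flag_name: str, percentage: int) -> bool:
    """Return True if the user falls inside the flag's rollout percentage."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Hash the flag name together with the user so that different flags
    # select different subsets of users.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < percentage


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    enabled = sum(in_rollout(u, "new-checkout", 25) for u in users)
    print(f"{enabled / len(users):.1%} of users enabled")
```

Because the bucket depends only on the flag name and user ID, a user enabled at 25% stays enabled when the rollout moves to 50%.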
Java diagnostic and troubleshooting tool
Arthas is Alibaba's open-source Java diagnostic tool that lets developers troubleshoot production issues without modifying code or restarting servers. It attaches to running JVM processes to inspect class loading, decompile classes, trace method invocations, monitor performance metrics, and view real-time stack traces. Supports JDK 6+ with both telnet and WebSocket interfaces for local and remote diagnostics across Linux, macOS, and Windows.
Production monitoring platform for AI agent reliability
Sentrial is a YC W26-backed monitoring platform for AI agent reliability in production. It semantically detects loops, hallucinations, tool misuse, and user frustration in real time, then diagnoses root causes and recommends fixes. The platform claims a 70% MTTR reduction via automated remediation, including rollback, model retraining triggers, and webhooks. Sentrial positions itself as the Datadog for teams deploying autonomous AI agents at scale.
AI production engineer that auto-triages and fixes alerts
Sonarly is a YC W26-backed AI production engineer that autonomously triages production alerts, deduplicates them by root cause, and sends ready-to-merge pull request fixes. It connects to monitoring tools like Sentry and Datadog, analyzes alert patterns to identify the underlying issue, and generates code fixes or optimization recommendations. Built on Claude APIs, Sonarly reduces mean time to resolution for production incidents while minimizing alert fatigue for engineering teams.
Enterprise-grade sandbox for AI agent code execution
OpenSandbox is an open-source sandbox platform from Alibaba providing secure, isolated execution environments for AI coding agents. It provides Python, Java, JavaScript, and C# SDKs with a unified Sandbox Protocol for custom runtimes. It integrates with Docker and Kubernetes, offering isolation through gVisor, Kata Containers, and Firecracker microVMs with per-sandbox network controls.
Cloud-native POSIX filesystem on object storage
JuiceFS is a high-performance distributed POSIX filesystem built on object storage like S3 and metadata engines like Redis or MySQL. It enables seamless data sharing across thousands of clients with low latency and elastic throughput. JuiceFS ships with a Kubernetes CSI driver, Hadoop SDK compatibility, and FUSE mount support for AI training, big data analytics, and shared storage workloads. Apache 2.0 licensed with 13K+ GitHub stars.
Kafka-compatible streaming platform, no JVM required
Redpanda is a Kafka-compatible streaming data platform written in C++ using the Seastar framework. It eliminates the need for ZooKeeper and the JVM, delivering up to 10x lower tail latencies and significantly reduced operational complexity. Redpanda ships as a single binary with a built-in schema registry, HTTP proxy, and message broker. It supports the Kafka wire protocol, so existing producers, consumers, and tools work without code changes. The project is backed by $165M+ in funding and has 12.0K GitHub stars.
Rust-powered JavaScript bundler for Vite
Rolldown is a high-performance JavaScript and TypeScript bundler written in Rust, built as the next-generation bundler for Vite. Created by Evan You and VoidZero, it offers a Rollup-compatible plugin API with 10-30x faster builds. It combines esbuild-level speed with full Rollup ecosystem compatibility, supporting tree-shaking, code splitting, and advanced optimizations natively. With 13K+ stars and an MIT license, it is set to become the default bundler for Vite 8.
Modern application delivery platform for Kubernetes
KubeVela is a CNCF incubating project that provides a modern application delivery platform built on Kubernetes and the Open Application Model. It abstracts away infrastructure complexity by letting developers define applications declaratively with components, traits, and policies, while platform teams manage delivery workflows. KubeVela supports multi-cluster deployment, canary rollouts, GitOps integration, and an extensible addon system.
Modern open-source server management panel
1Panel is a modern open-source Linux server management panel built with Go that provides a clean web interface for managing websites, databases, containers, and system resources. It features a marketplace with 165+ one-click app installs including Nextcloud and Bitwarden, automatic SSL provisioning with Let's Encrypt, visual Docker container management, and built-in firewall configuration. 1Panel also supports native AI agent deployment through Ollama integration.
Simple open-source personal cloud system
CasaOS is an elegant open-source personal cloud operating system that turns any hardware into a private home server with a one-line installation. It provides a beautiful web dashboard for managing Docker containers, a curated app store with one-click installs for tools like Nextcloud and Jellyfin, and built-in file management. CasaOS runs on Raspberry Pi, Intel NUC, old laptops, and cloud VMs with full support for Ubuntu, Debian, and Raspberry Pi OS.
Modern data pipeline orchestration with built-in AI
Mage AI is an open-source data pipeline orchestration tool positioned as a modern alternative to Apache Airflow. It provides a visual pipeline editor, native AI integrations for generating pipeline code, real-time streaming support, and built-in data quality checks. Mage handles batch and streaming workloads with a developer-friendly notebook-style interface and deploys to any cloud provider.