OpenSRE

Open-source toolkit for building AI SRE incident response agents

Open Source

OpenSRE is an open-source Python toolkit from Tracer Cloud for building AI SRE agents that investigate and respond to production incidents. It ships with connectors to Prometheus, Grafana, Kubernetes and incident platforms, plus a simulation harness that replays past incidents so teams can benchmark agent accuracy before trusting it on live pager rotations.

Review by the aicoolies team

OpenSRE is an open-source toolkit from Tracer Cloud for building AI SRE agents that investigate and respond to production incidents. Rather than a generic chat-over-your-logs product, OpenSRE provides the scaffolding (connectors to common observability stacks, an incident workflow state machine, and an evaluation harness) so teams can assemble an agent that behaves like a junior on-call engineer: pulling metrics, correlating traces, reading recent deploys, and proposing a root-cause hypothesis.

Out of the box, the framework integrates with the usual SRE surface: Prometheus, Grafana, Datadog-style metric queries, log backends, Kubernetes, and incident platforms. A notable design choice is the simulation and benchmarking layer: teams can replay past incidents against the agent to measure how well it diagnoses real outages before letting it touch a live pager. Because that accuracy is measured rather than assumed, OpenSRE is easier to trust in production than a from-scratch LangChain pipeline that would need its own evaluation harness.
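To make the replay idea concrete, here is a minimal Python sketch of benchmarking an agent against recorded incidents. It is not OpenSRE's API: `PastIncident`, `replay_accuracy`, `keyword_agent`, and the sample data are all hypothetical names invented for illustration, and the "agent" is a toy rule standing in for an LLM-backed one.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PastIncident:
    """Hypothetical record: signals captured at the time plus the confirmed cause."""
    incident_id: str
    signals: dict[str, str]
    confirmed_root_cause: str


# Hypothetical agent interface: recorded signals in, root-cause label out.
Agent = Callable[[dict[str, str]], str]


def replay_accuracy(agent: Agent, incidents: list[PastIncident]) -> float:
    """Replay each incident through the agent; return the fraction diagnosed correctly."""
    if not incidents:
        raise ValueError("need at least one recorded incident")
    correct = sum(
        1 for inc in incidents if agent(inc.signals) == inc.confirmed_root_cause
    )
    return correct / len(incidents)


def keyword_agent(signals: dict[str, str]) -> str:
    """Toy stand-in for a real agent: blames any recent deploy, otherwise gives up."""
    if signals.get("recent_deploy") == "yes":
        return "bad_deploy"
    return "unknown"


# Made-up history for illustration only.
HISTORY = [
    PastIncident("inc-1", {"recent_deploy": "yes", "alert": "5xx spike"}, "bad_deploy"),
    PastIncident("inc-2", {"recent_deploy": "no", "alert": "disk full"}, "disk_pressure"),
    PastIncident("inc-3", {"recent_deploy": "yes", "alert": "latency"}, "bad_deploy"),
    PastIncident("inc-4", {"recent_deploy": "yes", "alert": "OOMKilled"}, "memory_leak"),
]

accuracy = replay_accuracy(keyword_agent, HISTORY)
print(f"diagnosed {accuracy:.0%} of replayed incidents correctly")
```

A team could gate rollout on a score like this, for example by keeping the agent in shadow mode until it clears an agreed accuracy threshold on its own incident history.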

The project is Apache-2.0 licensed and written in Python, which fits the typical DevOps toolchain and makes custom connectors straightforward to add. It is a strong fit for platform and SRE teams who want an agentic incident workflow they can self-host, extend, and evaluate — without buying into a closed AIOps vendor stack.

Pricing

Free and open source under the Apache-2.0 license. Self-hosted: you pay for your own LLM provider, observability stack, and infrastructure; the toolkit itself has no hosted tier.

Platforms

Python, self-hosted — integrates with Prometheus, Grafana, Kubernetes, and major incident management platforms

Related Tools

eve by Vercel

Filesystem-first framework for durable AI agents

Eve is Vercel's filesystem-first TypeScript framework for building durable AI agents as ordinary project files. It combines Markdown instructions and skills, typed tools, channels, connections, subagents, schedules, sandboxes, and evals with Vercel's agent runtime so teams can ship deployable agents without hand-rolling orchestration. The current beta fits Vercel-native backend agent projects.

Open Source

BrowserOS

Open-source agentic browser that runs local AI agents in your browsing workflow.

BrowserOS is a privacy-first, open-source agentic browser for running AI assistants locally inside real browsing sessions instead of handing every task to a remote cloud browser.

Open Source

Agent Governance Toolkit

Microsoft’s open-source toolkit for adding policy enforcement, identity, sandboxing, and audit controls to production AI agents.

Agent Governance Toolkit is an open-source Microsoft project for teams moving AI agents from demos into controlled production workflows. It focuses on runtime policy enforcement, zero-trust identity, sandboxed execution, and reliability patterns around autonomous agents, giving security and platform teams a governance layer around tool calls and agent actions rather than another prompt-only guardrail.

Open Source, Telemetry

RAMPART

Microsoft’s pytest-native red teaming framework for turning AI agent safety findings into CI tests.

RAMPART is an open-source Microsoft framework for safety and security testing of agentic AI applications. It brings red-team findings into a pytest-native workflow so teams can turn prompt injection, unsafe tool use, and behavioral boundary failures into repeatable regression tests. From the aicoolies perspective, its strongest angle is developer workflow: RAMPART makes agent safety part of CI/CD instead of a one-off security review.

Open Source

OpenHuman

Local-first personal AI agent with memory trees, desktop integrations, and private workspace context.

OpenHuman is an open-source, local-first personal AI agent from TinyHumans. It combines a desktop app, persistent memory trees, Obsidian-compatible storage, OAuth integrations, and local model support into a private assistant harness. It is most interesting for users who want agentic workflows and long-term memory without handing every context detail to a fully cloud-hosted assistant.

Open Source, Telemetry

Unabyss

MCP-native personal context vault for keeping AI agents aligned with your work, voice, and projects.

Unabyss is a personal context headquarters for AI agents. It syncs sources such as email, Slack, Notion, Drive, meetings, and professional profiles into structured context files that can be served to MCP-capable clients. The strongest angle is not generic note taking; it is permissioned, reusable context for Claude, Cursor, custom agents, and other tools that otherwise need the same background explained repeatedly.

Freemium, Telemetry