22 tools tagged
Open-source post-building layer for agents — tracing, evals, and online monitoring
Judgeval is the open-source post-building layer for AI agents from Judgment Labs, providing OpenTelemetry-based tracing, hosted and custom evaluation scorers, and online behavior monitoring for LLM-powered applications. Instrument any function with a single decorator, score live production traffic against faithfulness and instruction-adherence checks, and feed real-world failures back into reinforcement learning or supervised fine-tuning loops.
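As a rough illustration of what decorator-based instrumentation generally looks like, the sketch below records a span for each call to a wrapped function. It is not Judgeval's API: the `traced` decorator, the in-memory `SPANS` list, and the `answer` function are hypothetical names chosen for this example, and a real tracer would export spans to a backend rather than append them to a list.

```python
import functools
import time

# Hypothetical in-memory span store; a real tracer would export spans instead.
SPANS = []


def traced(func):
    """Record the name, status, and duration of every call to func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            # Runs on both success and failure, so every call yields one span.
            SPANS.append(
                {
                    "name": func.__name__,
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                }
            )

    return wrapper


@traced
def answer(question):
    # Stand-in for an LLM call (hypothetical).
    return f"echo: {question}"
```

Calling `answer("hi")` returns the function's normal result and appends one span dictionary to `SPANS`, which is the property that lets instrumentation stay out of the application logic.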
ML experiment tracking and model monitoring
Weights and Biases is an AI developer platform for experiment tracking, model monitoring, and ML workflow orchestration. Weave extends W&B with LLMOps capabilities for prompt engineering, evaluation, and deployment. It enables teams to track experiments, monitor model performance in production, manage datasets, log LLM application traces, and collaborate on ML projects through visualization dashboards and automated logging, with enterprise SSO and RBAC support.
AI-powered production incident resolution
Resolve AI automates production incident investigation, diagnosis, and remediation, acting as an AI SRE that participates in every on-call rotation. It autonomously investigates incidents by pursuing multiple hypotheses in parallel, validates them against real evidence, creates code snippets and drafts PRs, generates post-mortems, and onboards new teammates with instant answers about code and infrastructure. The vendor reports 5x faster MTTR and 87% faster incident investigations.
Security operations resilience for SOC teams
Fig provides a Security Operations Resilience platform designed for modern SOC teams facing both unplanned and planned changes. Features include drift detection to catch unplanned infrastructure changes, automated drift repair with testing, planned change modeling to simulate initiatives before deployment, version control, and automatic deployment with rollbacks. It aims to help teams maintain security coverage while shipping changes faster (the vendor cites 10x speed) and focusing on strategic cyber work.
Open-source observability for AI agents
Laminar is an open-source observability platform for AI agents providing tracing, evaluation, and analytics for LLM applications. It integrates with Vercel AI SDK, LangChain, OpenAI, and Anthropic with a single line of code. Features include OpenTelemetry-native SDKs, an extensible evaluation framework with CI/CD support, SQL access to traces and metrics, and a visual debugging timeline for agent reasoning and actions.
Open-source AIOps alert management platform
Keep is an open-source AIOps platform that provides a single pane of glass for alerts from monitoring tools such as Datadog, PagerDuty, and Grafana, with 50+ integrations. It uses AI to correlate, deduplicate, and enrich alerts, reducing noise and helping on-call teams focus on real incidents. Keep includes workflow automation, bidirectional sync with ticketing systems, and a modern web dashboard.
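To make the idea of alert deduplication concrete, here is a minimal sketch of one common approach: hash a few identifying fields into a fingerprint and collapse alerts that share it. This is an assumption-laden illustration, not Keep's implementation; the field names (`source`, `service`, `name`, `timestamp`) and both functions are hypothetical.

```python
import hashlib
import json


def fingerprint(alert, keys=("source", "service", "name")):
    """Hash the identifying fields of an alert (field names are hypothetical)."""
    payload = json.dumps({k: alert.get(k) for k in keys}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(alerts):
    """Collapse alerts with the same fingerprint, counting repeats."""
    groups = {}
    for alert in alerts:
        fp = fingerprint(alert)
        if fp in groups:
            groups[fp]["count"] += 1
            groups[fp]["last_seen"] = alert.get("timestamp")
        else:
            # First occurrence: keep the alert and start its counter.
            groups[fp] = {**alert, "count": 1, "last_seen": alert.get("timestamp")}
    return list(groups.values())
```

Which fields go into the fingerprint is the main design choice: too few and unrelated alerts merge, too many (for example, including the timestamp) and nothing is ever deduplicated.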
Autonomous Kubernetes and GPU infrastructure optimization
ScaleOps provides autonomous real-time management of Kubernetes and GPU infrastructure, reducing cloud costs by up to 80% without manual configuration. Backed by $130 million in Series C funding at an $800 million valuation, it serves enterprises including Adobe, Wiz, DocuSign, and Salesforce. The platform continuously rightsizes pods, optimizes replicas, manages nodes, and allocates GPUs based on live workload demand rather than static configurations.
All-in-one open-source observability — logs, metrics, traces, RUM
OpenObserve is an open-source observability platform that unifies logs, metrics, traces, and real user monitoring in a single binary. It claims 140x lower storage costs than Elasticsearch through columnar storage and compression, with native OpenTelemetry support, a built-in query UI, dashboards, and alerts. Designed for AI and cloud-native workloads at petabyte scale. Over 15,000 GitHub stars.
Observability data accessible to AI agents via MCP
Netdata's MCP integration exposes infrastructure monitoring, discovery, and root-cause analysis capabilities to AI agents. Built into the 78K+ star Netdata monitoring platform, it lets agents query real-time metrics, explore system health, investigate incidents, and generate observability reports through the Model Context Protocol.
AI-driven log analysis with zero false positives
Dash0 is an AI-driven observability platform focused on log analysis that auto-structures unstructured logs, provides instant alerting that the vendor describes as having zero false positives, and delivers full-stack tracing capabilities. It uses AI to transform raw log data into structured, searchable events without requiring manual parsing configuration, which speeds up log-based debugging for engineering teams.
AI-powered incident management in Slack and Teams
Rootly is an AI-native incident management platform that runs entirely within Slack and Microsoft Teams, automating incident workflows from detection through postmortem. It reduces manual incident overhead with AI-generated summaries, automated role assignments, escalation paths, and postmortem drafts, and lists SOC 2 Type II, GDPR, and HIPAA compliance for enterprise use.
Evaluation-first LLM and agent observability
Confident AI is an evaluation-first observability platform that scores every trace and span with 50+ metrics, alerting on quality drops in LLM and agent applications. It goes beyond traditional APM by treating evaluation as core observability, providing actionable insights that help teams understand not just whether their AI applications are running but whether they are producing correct and useful outputs.
AI observability with security posture management
Coralogix uses AI to provide actionable insights across logs and traces, with a dedicated AI-SPM dashboard for tracking prompt injections and data leaks in AI applications. It offers pay-per-use pricing with no upfront fees and integrates security posture management directly into the observability stack, which suits teams running both traditional and AI-powered production workloads.
Cloud-native observability with AI correlation
Middleware is a cloud-native observability platform that provides real-time insights into Kubernetes environments using AI to correlate metrics, logs, and traces for faster troubleshooting. It simplifies the debugging of complex microservice clusters by automatically connecting related signals across distributed systems, with a freemium model accessible to teams of all sizes.
Data and AI observability for enterprise teams
Monte Carlo is a data and AI observability platform that uses ML to monitor pipelines, warehouses, and lakes for quality issues. It detects freshness delays, volume anomalies, schema changes, and distribution shifts before they impact analytics. With 500+ deployments, including at Nasdaq, Honeywell, and Roche, it provides automated root cause analysis, field-level lineage, and incident management. Available on AWS and Azure Marketplace.
Open-source ML and LLM monitoring with 100+ metrics
Evidently AI is an open-source platform with 100+ pre-built metrics for monitoring data quality, model performance, and data drift in AI/ML pipelines. Available under Apache 2.0 with a cloud version, it helps teams detect when production data shifts away from training distributions, LLM output quality degrades, or feature pipelines introduce anomalies that silently degrade model accuracy.
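For readers unfamiliar with how data drift is quantified, the sketch below computes the Population Stability Index (PSI), one widely used drift statistic, between a reference sample and a production sample. It uses only the standard library and is not Evidently's API; the function name `psi`, the equal-width binning, and the `eps` floor for empty bins are choices made for this example.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges come from the range of the reference (expected) sample;
    values outside that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]
```

With these inputs, `psi(reference, reference)` is zero because the binned distributions are identical, while `psi(reference, shifted)` is positive; larger values indicate a larger shift between the two distributions.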
Full-stack observability with AI-powered monitoring
New Relic is a full-stack observability platform combining APM, infrastructure monitoring, log management, distributed tracing, browser and mobile monitoring, synthetics, and AIOps in one unified dashboard with 50+ capabilities and 780+ integrations. It features AI-powered anomaly detection, incident correlation, root cause analysis, and an SRE agent for automated troubleshooting. Pricing is usage-based, with 100 GB of free data ingest monthly and one free full platform user. Used by 16,000+ organizations.
Open-source AI observability for models and data pipelines
WhyLabs is an AI observability platform for monitoring ML models, LLM apps, and data pipelines, now fully open-sourced. It is built on whylogs for privacy-preserving data logging and LangKit for LLM monitoring. It provides continuous drift detection, data quality monitoring, anomaly alerting, and LLM security including prompt injection and hallucination detection. Processes 100% of data without sampling across tabular, image, text, and embedding types. Incubated at the Allen Institute for AI.
Open-source monitoring and alerting toolkit, widely used for metrics collection in the CNCF ecosystem
Prometheus is an open-source monitoring system and time-series database. It is a CNCF graduated project and has become the de facto standard for metrics collection in cloud-native environments. It features a powerful query language (PromQL), pull-based metrics collection, a multi-dimensional data model, and built-in alerting via Alertmanager. It is the foundation of many Kubernetes observability stacks.
Application monitoring and error tracking that helps developers fix issues faster
Sentry is an error tracking and performance monitoring platform for developers. It captures and aggregates errors with full stack traces, breadcrumbs, and context across 100+ platforms. Used by over 100,000 organizations. Features include session replay, performance tracing, and code-level profiling. A self-hosted option with published source is available.
Cloud-scale monitoring, security, and analytics platform for modern infrastructure
Datadog is a comprehensive observability platform that unifies metrics, traces, logs, and security signals across your entire stack. Used by thousands of enterprises to monitor cloud infrastructure and applications in real time. Offers 750+ integrations, AI-powered alerting, and end-to-end visibility. The industry leader in cloud monitoring with support for AWS, Azure, GCP, and Kubernetes.
Open-source observability platform for metrics, logs, and traces visualization
Grafana is a leading open-source platform for monitoring and observability visualization. It connects to virtually any data source (Prometheus, Elasticsearch, InfluxDB, PostgreSQL, CloudWatch, Datadog, and 150+ others) to create interactive dashboards. Used by millions of users at companies like Bloomberg, JPMorgan, eBay, and PayPal. Grafana Cloud offers a fully managed experience with a generous free tier. Widely used across the CNCF ecosystem for metrics visualization.