Kubernetes management, troubleshooting, virtual clusters, cost optimization, and AI-powered K8s operations.
CNCF Sandbox chaos engineering framework for Kubernetes resilience
Krkn is a CNCF Sandbox chaos engineering tool that tests Kubernetes cluster resilience by injecting controlled failures. It simulates pod kills, node failures, network partitions, CPU/memory pressure, and zone outages. Krkn-AI adds AI-powered scenario generation that suggests chaos experiments based on cluster topology. Supports CI/CD integration for automated resilience testing in deployment pipelines.
Lightweight Kubernetes distribution for edge, IoT, and development
k3s is a CNCF Sandbox lightweight Kubernetes distribution packaged as a single binary under 100MB. Created by Rancher Labs and now maintained by SUSE, it strips non-essential components and bundles containerd, Flannel, CoreDNS, and Traefik into a minimal but fully conformant K8s distribution. Ideal for edge computing, IoT, ARM devices, and local development environments.
Leading open-source service mesh for Kubernetes microservices
Istio is the most widely adopted open-source service mesh for Kubernetes, providing traffic management, security, and observability for microservice architectures. It uses Envoy proxy sidecars to intercept and manage service-to-service communication with mutual TLS, fine-grained traffic routing, circuit breaking, and distributed tracing. CNCF Graduated project used in production by Google, IBM, and Salesforce.
CNCF Sandbox Kubernetes alert enrichment and automation platform
Robusta is a CNCF Sandbox project that enriches Kubernetes alerts with diagnostic context and automates remediation workflows. It intercepts Prometheus alerts, attaches relevant logs, pod status, resource metrics, and troubleshooting suggestions before delivering them to Slack, Teams, or PagerDuty. Supports custom playbooks for automated incident response and AI-powered root cause analysis.
Kubernetes ChatOps bot for Slack, Teams, and Discord
Botkube is a Kubernetes ChatOps platform that brings cluster management into Slack, Microsoft Teams, and Discord. It provides real-time alerts for cluster events, enables kubectl command execution from chat, and supports automated workflows triggered by Kubernetes resource changes. It features a plugin architecture for extensibility and RBAC-based access control for team collaboration.
AI-powered SRE agent for Kubernetes troubleshooting
Metoro is an AI SRE platform for Kubernetes that combines observability with autonomous troubleshooting. Its Guardian agent monitors cluster health, correlates metrics, logs, and traces to identify root causes, and suggests remediation actions. Features an MCP server for integration with AI coding agents and natural language querying of infrastructure state.
Zero-instrumentation Kubernetes observability powered by eBPF
Coroot is an open-source observability platform that uses eBPF to automatically instrument Kubernetes applications without code changes. It provides application maps, latency analysis, log correlation, and continuous profiling with automatic anomaly detection. Its agents capture metrics, traces, and logs at the kernel level, removing the need for manual instrumentation.
Kubernetes cost monitoring and optimization platform
Kubecost provides real-time cost monitoring and optimization for Kubernetes clusters. It allocates infrastructure costs down to individual namespaces, deployments, pods, and labels. Now owned by IBM, it is widely used for K8s cost visibility. Features include savings recommendations, budget alerts, cluster right-sizing, and multi-cluster cost aggregation across AWS, GCP, and Azure.
Zero-friction single-binary Kubernetes distribution by Mirantis
k0s is a lightweight, CNCF-certified Kubernetes distribution packaged as a single binary with zero host dependencies. Backed by Mirantis, it simplifies cluster deployment by bundling all required components into one executable that works on any Linux system. Supports x86-64, ARM64, and ARMv7 architectures with automatic upgrades and a built-in control plane load balancer.
eBPF-based networking, security, and observability for Kubernetes
Cilium is a CNCF Graduated project that provides networking, security, and observability for Kubernetes using eBPF technology. It offers an eBPF-based replacement for kube-proxy load balancing, enforces L3-L7 network policies using identity-based security, and includes Hubble for network flow observability and Tetragon for runtime security enforcement. Adopted by Google GKE, AWS EKS Anywhere, and Azure AKS.
Autonomous Kubernetes and GPU infrastructure optimization
ScaleOps provides autonomous real-time management of Kubernetes and GPU infrastructure, claiming cloud cost reductions of up to 80% without manual configuration. Backed by $130 million in Series C funding at an $800 million valuation, it serves enterprises including Adobe, Wiz, DocuSign, and Salesforce. The platform continuously rightsizes pods, optimizes replicas, manages nodes, and allocates GPUs based on live workload demand rather than static configurations.
Kubernetes-native framework for DevOps AI agents
kagent is a Kubernetes-native AI agent framework developed at Solo.io and accepted into the CNCF Sandbox. It provides a structured environment for running DevOps-focused agents directly within Kubernetes clusters, with a dedicated kmcp toolkit for cloud-native operations. Unlike general-purpose agent frameworks, kagent targets platform engineers and SREs who need AI assistance with cluster management, troubleshooting, and infrastructure automation workflows.
Trusted runtime environments for AI agents in production infrastructure
Teleport Beams provides cryptographically verified, policy-gated access for AI agents to interact with production infrastructure including servers, Kubernetes clusters, and databases. Launched at KubeCon EU 2026, Beams extends Teleport's zero-trust access platform with agent-specific runtime controls, audit trails, and policy enforcement to ensure AI agents operate within defined boundaries when deployed in production environments.
Free and open-source Kubernetes IDE for managing clusters visually
Freelens is a free open-source Kubernetes IDE that provides a visual desktop interface for managing clusters, workloads, and configurations. Forked from the original Lens project after its licensing change, Freelens offers the same powerful cluster management experience with real-time monitoring, log viewing, and resource editing under the MIT license.
Run local code inside your Kubernetes cluster without deploying
mirrord lets developers run local processes as if they were inside their Kubernetes cluster, intercepting network traffic, environment variables, and file access at the OS level without any deployment or configuration changes. Backed by $12.5M in seed funding with investors including Sentry's co-founder, it claims up to 98% faster iteration cycles and 30% fewer production bugs by eliminating the gap between local and cluster environments.
Kubernetes-native cloud infrastructure control plane
Crossplane is a CNCF Graduated open-source project that extends Kubernetes to manage cloud infrastructure through declarative APIs. Platform teams compose custom infrastructure abstractions as Compositions and publish them as self-service APIs. It provisions resources across AWS, Azure, GCP, and 200+ providers directly from kubectl. Used by 450+ organizations with 11,000+ GitHub stars.
Open-source Kubernetes security platform for risk analysis and compliance
Kubescape is a CNCF-backed open-source Kubernetes security platform that scans clusters, manifests, and container images for vulnerabilities, misconfigurations, and compliance violations. It checks against NSA-CISA, MITRE ATT&CK, and CIS benchmarks, integrates into CI/CD pipelines, and provides runtime threat detection via eBPF. It also supports SBOM generation. Originally created by ARMO, it is seeing growing enterprise adoption in cloud-native security.
Kubernetes-native model inference platform
KServe is an open-source Kubernetes-native platform for deploying and managing ML model inference at scale. It provides standardized inference protocols, autoscaling including scale-to-zero, canary rollouts, A/B testing, and multi-model serving. KServe supports all major ML frameworks including TensorFlow, PyTorch, scikit-learn, XGBoost, and LLM runtimes like vLLM and Triton through pluggable serving runtimes.
Cloud native runtime security for Kubernetes
Falco is a CNCF Graduated open-source runtime security tool that detects unexpected behavior and threats across containers, Kubernetes, and cloud workloads in real time. Originally created by Sysdig, Falco monitors Linux kernel syscalls using eBPF and applies customizable detection rules to alert on malicious activity like container escapes, cryptojacking, unauthorized file access, and anomalous network connections. It supports 50+ alert output channels including SIEM integration.
Kubernetes dashboard with 360-degree visibility
Devtron is an open-source Kubernetes management dashboard that provides a 360-degree view of cluster resources with fine-grained RBAC for multi-cluster environments. Its upcoming agentic AI feature automates debugging and cluster optimization, while the current platform offers centralized visibility, GitOps-based deployment workflows, and security policy enforcement across distributed Kubernetes infrastructure.
Open-source MLOps platform for Kubernetes
Kubeflow is an open-source CNCF MLOps platform for deploying and managing machine learning workflows on Kubernetes, with 14,000+ GitHub stars. It provides notebooks for experimentation, scalable training pipelines with distributed computing support, model serving with autoscaling, and pipeline orchestration for teams running AI/ML workloads in cloud-native environments.
AI copilot for the Lens Kubernetes IDE
Lens Prism is an AI copilot integrated into the Lens Kubernetes IDE (the world's most popular K8s desktop client) that troubleshoots clusters, explains errors in plain English, and helps manage multi-cluster environments visually. It simplifies Kubernetes complexity for developers who prefer visual tools over CLI, providing AI-powered debugging and cluster management within a familiar desktop interface.
Agentic DevOps automation via ChatOps
Kubiya is an agentic automation platform for DevOps and platform teams that uses specialized agents with connectors for Kubernetes, AWS, GitHub, Jira, and Terraform to automate operational tasks through Slack or web portals. It provides Terraform module support for infrastructure-as-code configuration and manages agent behaviors with policy-based controls for enterprise-grade governance.
Kubernetes troubleshooting with event context
Komodor is a Kubernetes troubleshooting platform that extracts event and change context from clusters, correlating deployments, config changes, and infrastructure events to quickly identify the root cause of pod failures. Its Slack integration delivers incident context directly into team channels, helping SRE and platform teams reduce mean time to resolution by connecting the dots between what changed and what broke.