Krkn

CNCF Sandbox chaos engineering framework for Kubernetes resilience

Krkn is a CNCF Sandbox chaos engineering tool that tests Kubernetes cluster resilience by injecting controlled failures. It simulates pod kills, node failures, network partitions, CPU/memory pressure, and zone outages. Krkn-AI adds AI-powered scenario generation that suggests chaos experiments based on cluster topology. Krkn also supports CI/CD integration for automated resilience testing in deployment pipelines.

Krkn validates Kubernetes cluster resilience by injecting controlled failures that simulate real-world infrastructure problems. Its chaos scenarios include pod deletion, node draining, network partition simulation, CPU and memory resource pressure, zone and region outages, and time skew injection. Each scenario is configurable with parameters for intensity, duration, and target selection, so teams can shape an experiment around the specific resilience question they need to answer.
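As a rough illustration of what "configurable with parameters for intensity, duration, and target selection" could look like, the sketch below models a scenario as a small Python dataclass. The class name `ChaosScenario`, its fields, and the validation rules are hypothetical and chosen only for illustration. They are not Krkn's actual configuration schema, which is defined in the project's own documentation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChaosScenario:
    """Hypothetical scenario description; not Krkn's real schema."""

    kind: str                      # e.g. "pod_deletion", "cpu_pressure"
    duration_seconds: int          # how long the failure is held
    intensity: float               # fraction of targets/resources affected, in (0, 1]
    namespace: str = "default"     # target selection: namespace
    label_selector: dict = field(default_factory=dict)  # target selection: labels

    def __post_init__(self) -> None:
        # Reject obviously invalid experiments before anything is injected.
        if self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")
        if not 0.0 < self.intensity <= 1.0:
            raise ValueError("intensity must be in (0, 1]")

    def selector_string(self) -> str:
        """Render labels in the key=value,key=value form used by Kubernetes selectors."""
        return ",".join(f"{k}={v}" for k, v in sorted(self.label_selector.items()))


# Example values are made up for illustration.
scenario = ChaosScenario(
    kind="pod_deletion",
    duration_seconds=120,
    intensity=0.5,
    namespace="payments",
    label_selector={"app": "checkout", "tier": "backend"},
)
print(scenario.selector_string())
```

Validating at construction time means a malformed experiment fails fast, before any failure is injected into a cluster.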

The Krkn-AI companion project analyzes cluster topology, workload distribution, and resource dependencies to suggest the chaos experiments most likely to reveal weaknesses. Rather than running generic failure scenarios, AI-guided experiments target the failure modes that the cluster's architecture is most vulnerable to, so each chaos engineering session yields more useful findings.
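To make the idea of topology-driven targeting concrete, here is a deliberately simple toy heuristic. It is not how Krkn-AI works; the `Workload` type, the `risk_score` formula, and the sample data are all assumptions invented for this sketch. It only shows the general shape of ranking candidates by replica count, zone spread, and number of dependents.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    replicas: int     # assumed to be >= 1
    zones: int        # distinct zones the replicas run in, assumed >= 1
    dependents: int   # number of other workloads that call this one


def risk_score(w: Workload) -> float:
    """Toy heuristic: fewer replicas/zones and more dependents mean higher risk."""
    return w.dependents / (w.replicas * w.zones)


def rank_targets(workloads: list[Workload]) -> list[str]:
    """Return workload names ordered from most to least risky."""
    return [w.name for w in sorted(workloads, key=risk_score, reverse=True)]


# Made-up inventory for illustration.
workloads = [
    Workload("frontend", replicas=6, zones=3, dependents=1),
    Workload("auth", replicas=1, zones=1, dependents=5),
    Workload("cache", replicas=2, zones=1, dependents=3),
]
print(rank_targets(workloads))
```

In this made-up inventory, the single-replica `auth` service that many workloads depend on ranks first, which matches the intuition that experiments should start where one failure would spread furthest.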

As a CNCF Sandbox project, Krkn benefits from community governance and contributions from multiple organizations with production Kubernetes experience. Integration with CI/CD pipelines enables automated resilience testing as part of deployment workflows, catching reliability regressions before they reach production. The framework exports metrics and results to Prometheus and produces structured reports documenting which scenarios the cluster survived and which revealed weaknesses that need engineering attention.
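The CI/CD use described above usually comes down to a gate step: read the structured results of a chaos run and fail the pipeline if any scenario exposed a weakness. The sketch below assumes a hypothetical JSON report shape (a list of objects with `scenario` and `passed` keys). Krkn's real report format may differ, so treat the field names and the `failed_scenarios` and `exit_code` helpers as illustrative only.

```python
import json


def failed_scenarios(report_json: str) -> list[str]:
    """Return names of scenarios not marked as passed in a hypothetical report."""
    results = json.loads(report_json)
    # A missing "passed" key is treated as a failure, to stay on the safe side.
    return [r["scenario"] for r in results if not r.get("passed", False)]


def exit_code(report_json: str) -> int:
    """Map a report to a process exit code: 0 if all passed, 1 otherwise."""
    return 1 if failed_scenarios(report_json) else 0


# Example report in the assumed (hypothetical) format.
example_report = json.dumps([
    {"scenario": "pod_deletion", "passed": True},
    {"scenario": "zone_outage", "passed": False},
])

print(failed_scenarios(example_report))
print(exit_code(example_report))
```

In a pipeline, a wrapper script would pass the result of `exit_code(...)` to `sys.exit`, so that a failed scenario blocks the deployment stage.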

Pricing

Free and open-source under Apache 2.0

Platforms

Python, Kubernetes, CI/CD integration

Related Tools

Safari MCP Server

Apple's Safari-native MCP server for web debugging agents

Safari MCP Server is Apple's safaridriver-based MCP server in Safari Technology Preview, giving compatible coding agents local access to Safari page content, console logs, network requests, screenshots, JavaScript evaluation, interactions, viewport controls, and accessibility/performance checks.

Free, Telemetry

KubeAI

Kubernetes operator for serving AI inference workloads

KubeAI is an Apache-2.0 Kubernetes operator for deploying and scaling AI inference workloads, including LLMs, embeddings, reranking, and speech-to-text. It gives platform teams OpenAI-compatible endpoints, model proxy/controller primitives, model caching, scale-from-zero behavior, and cluster-native resource management for self-hosted inference on Kubernetes.

Open Source

kubectl-ai

Google’s open-source Kubernetes assistant that translates natural-language intent into precise cluster operations.

kubectl-ai is an AI-powered Kubernetes assistant from Google Cloud Platform. It acts as an intelligent interface for cluster work, translating operator intent into Kubernetes commands and workflows. The key distinction from reactive diagnosis tools is that kubectl-ai is designed as an interactive natural-language interface for planning and executing Kubernetes operations, with provider configuration and MCP-oriented workflows around the CLI.

Open Source, Telemetry

Rampart

Microsoft’s pytest-native red teaming framework for turning AI agent safety findings into CI tests.

RAMPART is an open-source Microsoft framework for safety and security testing of agentic AI applications. It brings red-team findings into a pytest-native workflow so teams can turn prompt injection, unsafe tool use, and behavioral boundary failures into repeatable regression tests. Its main strength is developer workflow: RAMPART makes agent safety part of CI/CD instead of a one-off security review.

Open Source

Vald

Cloud-native distributed vector search engine built for Kubernetes with automatic indexing and horizontal scaling.

Vald is a highly scalable distributed approximate nearest neighbor (ANN) vector search engine designed for cloud-native, Kubernetes-based architectures. Maintained by LY Corporation and listed in the CNCF Landscape, it uses the NGT algorithm (developed at Yahoo Japan), supports automatic incremental index backup, and handles billion-scale datasets across loosely coupled microservice components that scale horizontally via Helm.

Open Source

Requestly

One tool for intercepting, mocking, and replaying HTTP — acquired by BrowserStack

Requestly is a BrowserStack-backed API client, HTTP interceptor, mock server, and session replay tool for frontend and QA teams. Its current product is a commercial offering centered on the API client, while the legacy interceptor code is open source under AGPLv3. The free plan covers individual workflows, and Pro lists at $12/user/month billed monthly or $9/user/month billed annually for collaborative QA and frontend debugging teams.

freemium