Krkn validates Kubernetes cluster resilience by injecting controlled failures that simulate real-world infrastructure problems. The framework supports chaos scenarios including pod deletion, node draining, network partitions, CPU and memory pressure, zone and region outages, and time skew. Each scenario can be configured for intensity, duration, and target selection, so that a run matches the specific resilience question a team needs to answer.
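To make the three configuration axes concrete, here is a minimal Python sketch of a scenario description with intensity, duration, and target selection. It is an illustration only: `ScenarioSpec`, `select_targets`, and every field name are hypothetical and do not reflect Krkn's actual configuration schema.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ScenarioSpec:
    """Hypothetical description of one chaos scenario (not Krkn's real schema)."""
    kind: str                  # e.g. "pod_deletion" or "cpu_pressure"
    duration_s: int            # how long the failure is held, in seconds
    intensity: float           # fraction of matching targets to hit, in (0, 1]
    namespace: str = "default"
    label_selector: dict = field(default_factory=dict)

    def validate(self):
        """Return a list of problems; an empty list means the spec is usable."""
        problems = []
        if self.duration_s <= 0:
            problems.append("duration_s must be positive")
        if not 0.0 < self.intensity <= 1.0:
            problems.append("intensity must be in (0, 1]")
        if not self.label_selector:
            problems.append("label_selector is empty; refusing to target everything")
        return problems


def select_targets(spec, pods):
    """Pick pod names the spec applies to.

    Each pod is a dict with 'name', 'namespace', and 'labels' keys.
    """
    matching = [
        p for p in pods
        if p["namespace"] == spec.namespace
        and all(p["labels"].get(k) == v for k, v in spec.label_selector.items())
    ]
    if not matching:
        return []
    # Intensity scales how many matching pods are hit, rounding up.
    count = math.ceil(len(matching) * spec.intensity)
    return [p["name"] for p in sorted(matching, key=lambda p: p["name"])[:count]]
```

Validating the spec before selecting targets is a deliberate choice in this sketch: an empty selector is rejected so that a misconfigured run cannot silently target every pod in a namespace.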
The Krkn-AI companion project analyzes cluster topology, workload distribution, and resource dependencies to suggest the chaos experiments most likely to reveal weaknesses. Rather than running generic failure scenarios, AI-guided experiments target the failure modes to which the cluster's architecture is most vulnerable, so that each chaos engineering session yields more useful findings.
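The idea of ranking experiments by how likely they are to expose a weakness can be shown with a toy heuristic. The sketch below is not Krkn-AI's algorithm, which the text does not describe. It is a hand-written scoring rule, and the input shape (`replicas`, `nodes`, `dependents`) and the weights are assumptions chosen only for illustration.

```python
def rank_experiments(workloads):
    """Rank candidate experiments, highest score first.

    Each workload is a dict with hypothetical keys:
      name       - workload name
      replicas   - number of running replicas
      nodes      - node name for each replica
      dependents - names of workloads that call this one
    """
    ranked = []
    for w in workloads:
        score = 0
        reasons = []
        experiment = "pod_deletion"
        if w["replicas"] <= 1:
            # A single replica means any pod loss is a full outage.
            score += 3
            reasons.append("single replica")
        elif len(set(w["nodes"])) == 1:
            # Replicated, but one node failure takes out every replica.
            score += 2
            reasons.append("all replicas on one node")
            experiment = "node_outage"
        if w["dependents"]:
            # Each dependent widens the blast radius of a failure.
            score += len(w["dependents"])
            reasons.append(f"{len(w['dependents'])} dependents")
        ranked.append({
            "target": w["name"],
            "experiment": experiment,
            "score": score,
            "reasons": reasons,
        })
    # Sort by descending score, then by name for a stable order.
    return sorted(ranked, key=lambda r: (-r["score"], r["target"]))
```

Recording the reasons alongside each score keeps the suggestion explainable, so a reviewer can see why a given experiment was prioritized before running it.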
As a CNCF Sandbox project, Krkn benefits from community governance and contributions from multiple organizations with production Kubernetes experience. Integration with CI/CD pipelines enables automated resilience testing as part of deployment workflows, catching reliability regressions before they reach production. The framework exports metrics and results to Prometheus and produces structured reports documenting which scenarios the cluster survived and which revealed weaknesses that need engineering attention.
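One way a pipeline could act on a structured report is to fail the build when any scenario was not survived. The sketch below assumes a hypothetical JSON report with a `scenarios` list whose entries carry `name` and `survived` fields. Krkn's real report format may differ, so treat the schema and the function names as placeholders.

```python
import json


def evaluate_report(report_json):
    """Return (exit_code, failed_names) for a hypothetical JSON report.

    exit_code is 0 when every scenario was survived and 1 otherwise,
    which matches the convention CI systems use to fail a job.
    """
    report = json.loads(report_json)
    failed = [s["name"] for s in report["scenarios"] if not s["survived"]]
    return (1 if failed else 0), failed


def summarize(report_json):
    """Build a one-line, human-readable summary for the CI log."""
    exit_code, failed = evaluate_report(report_json)
    total = len(json.loads(report_json)["scenarios"])
    if exit_code == 0:
        return f"resilience gate passed: {total}/{total} scenarios survived"
    return (
        f"resilience gate failed: {total - len(failed)}/{total} scenarios survived; "
        f"failed: {', '.join(failed)}"
    )
```

In a pipeline step, the script would print `summarize(...)` and pass the exit code from `evaluate_report(...)` to `sys.exit`, so a reliability regression blocks the deployment instead of only appearing in a report.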