Kubernetes, often abbreviated as K8s, is an open-source container orchestration platform originally developed by Google for automating the deployment, scaling, and management of containerized applications. It groups containers into logical units called pods, schedules them across a cluster of worker nodes using declarative manifests, and continuously reconciles actual state with desired state — providing self-healing, rolling updates, horizontal autoscaling, and service discovery out of the box.
Its feature set includes Deployments, StatefulSets, and DaemonSets for different workload patterns, Services and Ingress for networking and load balancing, ConfigMaps and Secrets for configuration management, PersistentVolumes for storage, Jobs and CronJobs for batch work, HorizontalPodAutoscalers for reactive scaling, and a powerful extension model through Custom Resource Definitions (CRDs) and Operators. The ecosystem around Kubernetes is enormous — Helm, Istio, Argo CD, Prometheus, Cert-manager, and thousands of other CNCF projects build on top of its core APIs.
Kubernetes powers most modern cloud-native infrastructure at companies ranging from early-stage startups to the largest enterprises, and is available as managed services (EKS, GKE, AKS, DigitalOcean Kubernetes) or self-hosted distributions (Rancher, OpenShift, Talos). Lightweight variants like k3s, kind, and minikube run full clusters on a developer laptop. It sits at the bottom of the AI-infrastructure stack for teams running GPU-heavy inference workloads, agent fleets, or RAG pipelines at scale.
