dstack is a control plane for AI infrastructure that solves the operational complexity of running training and inference workloads across diverse GPU environments. Modern AI teams face a fragmented landscape where GPU availability, pricing, and APIs differ across every cloud provider and on-premises setup. dstack provides a single declarative interface where developers specify what they need — GPU type, count, memory, and framework — and the platform handles provisioning, scheduling, and lifecycle management across all configured backends.
The platform supports NVIDIA, AMD, and Google TPU accelerators across AWS, GCP, Azure, Lambda Cloud, and self-managed Kubernetes or bare-metal clusters. Workloads are defined in YAML configuration files that specify resource requirements, Docker images, and execution commands. dstack's fleet management automatically discovers available GPUs, tracks utilization, and schedules jobs to minimize cost and maximize throughput. The auto-scaling engine provisions and deprovisions cloud instances based on queue depth.
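To illustrate, a minimal task configuration might look like the following sketch. Field names follow dstack's documented schema, but the exact options and defaults vary by version, and the image and command shown here are placeholders:

```yaml
# .dstack.yml — a hypothetical training task definition
type: task
name: train-example

# Container image to run in (placeholder)
image: pytorch/pytorch:latest

# Commands executed inside the container
commands:
  - pip install -r requirements.txt
  - python train.py

# Declarative resource requirements; dstack matches these
# against available GPUs across configured backends
resources:
  gpu: A100:4
```

A configuration like this is submitted with the dstack CLI, which provisions a matching instance on whichever configured backend can satisfy the resource spec at the lowest cost.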
dstack has raised venture funding and maintains an active open-source project with over 2,000 GitHub stars. The MPL-2.0 license allows commercial use while requiring that modifications to MPL-licensed source files be published under the same license. For AI teams that have outgrown manually SSH-ing into GPU instances or clicking through cloud console UIs, dstack provides an infrastructure abstraction layer that makes multi-cloud GPU orchestration as straightforward as container orchestration with Kubernetes.