Production Kubernetes for ML/AI workloads: GPU infrastructure, control plane internals, and model serving patterns for engineers running inference at scale.