Beyond Integer GPUs: Mastering DRA for ML Workloads
Stop treating a $30K A100 like a boolean. Dynamic Resource Allocation (GA in Kubernetes 1.34) lets you claim GPUs by VRAM, compute capability, interconnect topology, and MIG profile, then share them safely across workloads. This article walks through every pattern with real manifests.
The Problem: GPUs Are Not Integers
For years, requesting a GPU…
Sidecar Pattern in K8s MLOps
Over the last 1.5 years, I have built a lot of POCs and end-to-end products leveraging ML models, LLMs, etc. With Gemini and Claude at your disposal, I am sure many of you have done the same. By the end of 2025, my home lab was serving 20+ models with a mix of Docker, EKS, 100+ exporters…