Your container does not have GPU drivers installed
So, how does PyTorch inside it actually use the host's GPU?
Let me explain 🧵
First, understand the host side
The NVIDIA kernel driver exposes GPUs as device files: /dev/nvidia0, /dev/nvidiactl, etc
This is how ANY application talks to the GPU — through these device files
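You can see this from inside Python with a minimal sketch (it assumes an NVIDIA driver is loaded on the host):

import glob

# The kernel driver creates one device file per GPU (/dev/nvidia0, /dev/nvidia1, ...)
# plus control devices such as /dev/nvidiactl and /dev/nvidia-uvm
for path in sorted(glob.glob("/dev/nvidia*")):
    print(path)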
PyTorch doesn't call the driver directly
It uses the CUDA Runtime, a high-level API that handles memory management, kernel launches, and synchronization
That runtime library, libcudart.so, lives inside your container
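Here's a minimal sketch of that chain in action (it assumes a CUDA-enabled PyTorch build inside the container):

import torch

# is_available() initializes the CUDA Runtime, which in turn opens the
# /dev/nvidia* device files exposed by the host driver
if torch.cuda.is_available():
    x = torch.randn(1024, 1024, device="cuda")  # the runtime allocates GPU memory
    y = x @ x                                   # the runtime launches a CUDA kernel
    torch.cuda.synchronize()                    # the runtime waits for the GPU to finish
    print(torch.cuda.get_device_name(0))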
You probably know there are some iptables rules somewhere, but do you know the exact sequence of chains involved in routing traffic to a ClusterIP?
What about a NodePort? Is that different?
🧵
1/
Services rely on the Linux kernel's networking stack and the Netfilter framework to modify and redirect network traffic. The Netfilter framework provides hooks at different stages of the networking stack where rules can be inserted to filter, change, or redirect packets
2/
The Netfilter framework offers five hooks to modify network traffic: PRE_ROUTING, INPUT, FORWARD, OUTPUT, and POST_ROUTING. These hooks represent different stages in the networking stack, allowing you to intercept and modify packets at various points in their journey
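To make those paths concrete, here's a toy Python model of the order the hooks fire in (hook_path is a hypothetical helper, not kernel code):

def hook_path(locally_generated: bool, destined_for_host: bool) -> list[str]:
    if locally_generated:
        # Traffic created on this node
        return ["OUTPUT", "POST_ROUTING"]
    if destined_for_host:
        # Inbound traffic for a local process
        return ["PRE_ROUTING", "INPUT"]
    # Traffic passing through the node, e.g. forwarded to a pod
    return ["PRE_ROUTING", "FORWARD", "POST_ROUTING"]

print(hook_path(locally_generated=False, destined_for_host=False))
# ['PRE_ROUTING', 'FORWARD', 'POST_ROUTING']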