Requests are good for:
- setting a baseline (give me at least X of CPU)
- setting relationships between pods (this pod A uses twice as much CPU as the other)
but are not useful for setting hard limits.
For that you need CPU limits.
If your container has a hard limit and wants more CPU, it has to wait for the next period.
Your process is throttled.
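To make the "wait for the next period" part concrete, here's a rough sketch of how throttling stretches wall-clock time. It assumes a 100ms enforcement period (the usual Linux CFS default) and a single-threaded burst of work; `wall_clock_ms` and its parameters are names made up for this illustration, not a real API.

```python
PERIOD_MS = 100  # assumed CFS period; 100ms is the usual default


def wall_clock_ms(cpu_needed_ms: float, limit_millicores: int) -> float:
    """Estimate wall-clock time for a single-threaded burst under a CPU limit.

    Simplified model: the container may run for `quota_ms` in every period,
    then it is throttled until the next period starts.
    """
    if limit_millicores <= 0:
        raise ValueError("limit must be positive")
    # CPU time the container may consume in each period.
    quota_ms = PERIOD_MS * limit_millicores / 1000
    # A single thread cannot use more than one full period of CPU per period.
    quota_ms = min(quota_ms, PERIOD_MS)

    elapsed = 0.0
    remaining = cpu_needed_ms
    while remaining > quota_ms:
        remaining -= quota_ms   # use up this period's quota
        elapsed += PERIOD_MS    # then sit throttled until the period ends
    return elapsed + remaining  # the last slice fits within the quota


if __name__ == "__main__":
    # 120ms of CPU work with a 500m limit: 50ms of work per 100ms period.
    print(wall_clock_ms(120, 500))
```

With a 500m limit, 120ms of CPU work takes roughly 220ms of wall-clock time in this model, because the process spends two half-periods throttled.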
So what should you use as CPU requests and limits in your Pods?
For requests calculate the smallest CPU unit as:
REQUEST = NODE CORES * 1000 / MAX NUMBER OF PODS PER NODE
For a 1 vCPU node and max 10 Pods that's 1 * 1000 / 10 = 100m request (m = millicores, not Mi)
Assign the smallest unit or a multiple of it to your containers.
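The formula above as a small sketch. The function names are hypothetical helpers for this thread, and the result is rounded down to a whole millicore.

```python
def smallest_request_millicores(node_cores: int, max_pods_per_node: int) -> int:
    """REQUEST = NODE CORES * 1000 / MAX NUMBER OF PODS PER NODE (in millicores)."""
    if max_pods_per_node <= 0:
        raise ValueError("max_pods_per_node must be positive")
    # Integer division: round down to a whole millicore.
    return node_cores * 1000 // max_pods_per_node


def as_cpu_quantity(millicores: int) -> str:
    """Format millicores the way Kubernetes manifests write them, e.g. '100m'."""
    return f"{millicores}m"


if __name__ == "__main__":
    unit = smallest_request_millicores(node_cores=1, max_pods_per_node=10)
    print(as_cpu_quantity(unit))      # request for a 1-unit container
    print(as_cpu_quantity(unit * 2))  # request for a 2-unit container
```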
Example
I don't know how much CPU I need for containerA but I know that it is twice as CPU intensive as containerB.
CPU request for B: 100m (1 unit)
CPU request for A: 200m (2 units)
If both containers want more CPU than they requested, they share whatever CPU is available in the same 1:2 ratio (B:A)
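A quick sketch of that proportional split. It assumes every container is busy and wants as much CPU as it can get, and that the available CPU is divided purely by request weight; `share_cpu` is a hypothetical helper, not how the kernel scheduler is actually implemented.

```python
def share_cpu(requests_millicores: dict[str, int], available_millicores: float) -> dict[str, float]:
    """Split available CPU between busy containers in proportion to their requests."""
    total = sum(requests_millicores.values())
    if total <= 0:
        raise ValueError("at least one container needs a positive request")
    return {
        name: available_millicores * request / total
        for name, request in requests_millicores.items()
    }


if __name__ == "__main__":
    # containerA requests 200m, containerB requests 100m, 900m is up for grabs.
    print(share_cpu({"containerA": 200, "containerB": 100}, 900))
```

In this model A ends up with 600m and B with 300m: twice the request, twice the share.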
What about limits?
- your app might already have a "hard" limit. E.g. Node.js is single-threaded and uses up to 1 core
- you could have: limit = Node CPU - (CPU reserved)
If you need to be more specific, profiling is the way to go. Set the limits at the 99th percentile of the CPU usage you measured, plus 50%
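Here's one way to turn profiling data into a limit. It's a sketch with two assumptions: "99th percentile + 50%" is read as p99 * 1.5, and the percentile uses the simple nearest-rank method. The samples are CPU usage readings in millicores that you collected yourself; both function names are hypothetical.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def suggested_limit_millicores(samples: list[float], headroom: float = 0.5) -> int:
    """CPU limit = 99th percentile of observed usage plus headroom (50% by default)."""
    return math.ceil(percentile(samples, 99) * (1 + headroom))


if __name__ == "__main__":
    # Made-up usage readings in millicores, just to exercise the function.
    usage = [float(v) for v in range(1, 101)]
    print(suggested_limit_millicores(usage))
```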
That's it!
This thread is part of the @learnk8s research on Kubernetes Node sizing: docs.google.com/spreadsheets/d…
You will see more CPU charts and comparisons coming soon in that doc.
Also if you want to dig in deeper here you can find a few relevant links:
blog.kubecost.com/blog/requests-…
medium.com/@betz.mark/und…
nodramadevops.com/2019/10/docker…