THREAD: How does the scheduler work in Kubernetes?

The scheduler is in charge of deciding where your pods are deployed in the cluster.

It might sound like an easy job, but it's rather complicated!

Let's dive into it.
1/8

Every time a Pod is created, it is also added to the scheduler's queue.

The scheduler processes Pods one by one through two phases:

1. Scheduling phase (what node should I pick?)
2. Binding phase (let's write to the database that this Pod belongs to that node)
2/8
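The outcome of the Binding phase is visible on the Pod itself: the chosen node is persisted in spec.nodeName. Here's a minimal sketch of what a bound Pod looks like (the Pod and node names are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app            # hypothetical name
spec:
  # Empty when the Pod is created; the scheduler sets it during the Binding phase
  nodeName: node-1
  containers:
    - name: app
      image: nginx:1.25
```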

The Scheduling phase is divided into two parts. The scheduler:

1. Filters the relevant nodes (using a list of functions called predicates)
2. Ranks the remaining nodes (using a list of functions called priorities)

Let's look at an example.
3/8

You want to deploy a Pod that requires a GPU. You submit the Pod to the cluster and:

1. The scheduler filters out all Nodes that don't have GPUs
2. The scheduler ranks the remaining nodes and picks the least utilised one
3. The Pod is scheduled on that node
4/8
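Here's a minimal sketch of such a Pod, assuming the nodes expose GPUs through the NVIDIA device plugin (the Pod name and image are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod                                # hypothetical name
spec:
  containers:
    - name: training
      image: tensorflow/tensorflow:latest-gpu  # hypothetical image
      resources:
        limits:
          # Extended resource advertised by the NVIDIA device plugin;
          # nodes that don't expose it are filtered out in step 1
          nvidia.com/gpu: 1
```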

At the time of writing, the filtering phase has 13 predicates.

That's 13 functions that decide whether the scheduler should discard a node as a possible target for the Pod.

The scoring phase also has 13 priorities.

Those are 13 functions that decide how to score and rank the remaining nodes.
5/8

How can you influence the scheduler's decisions? There are four mechanisms (two are sketched below):

- nodeSelector
- Node affinity
- Pod affinity/anti-affinity
- Taints and tolerations
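
Here's a minimal sketch combining the first and the last mechanism (the label, taint key and values are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod        # hypothetical name
spec:
  # Only nodes labelled disktype=ssd survive the filtering phase
  nodeSelector:
    disktype: ssd
  # Lets the Pod land on nodes tainted with dedicated=gpu-team:NoSchedule
  tolerations:
    - key: dedicated
      operator: Equal
      value: gpu-team
      effect: NoSchedule
  containers:
    - name: app
      image: nginx:1.25
```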

And what if you want to customise the scheduler?
6/8

You can write your own plugins for the scheduler and customise any block in the Scheduling phase.
The Binding phase doesn't expose any public API, though.
7/8
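With the scheduling framework, plugins are wired in through the scheduler's configuration file. A minimal sketch (the profile name and MyCustomScore are hypothetical; NodeResourcesBalancedAllocation is one of the default score plugins):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: my-scheduler                     # hypothetical profile name
    plugins:
      score:
        disabled:
          - name: NodeResourcesBalancedAllocation   # switch off a default score plugin
        enabled:
          - name: MyCustomScore                     # hypothetical out-of-tree plugin
            weight: 2
```

Pods then opt into this profile by setting spec.schedulerName: my-scheduler.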

You can learn more about the scheduler here:

- Kubernetes scheduler kubernetes.io/docs/concepts/…
- Scheduling policies kubernetes.io/docs/reference…
- Scheduling framework kubernetes.io/docs/concepts/…
8/8

Did you like this thread?

You might enjoy the previous threads too! You can find all of them here:

What would you like to see next? Please reply and let me know!
