...or "How Kubernetes Just Repeats Good Old Deployment Patterns"
1. For a long time, people had been deploying services as groups of virtual (or physical) machines.
But VMs were often slow and bulky. Hence, not very efficient.
2. Then containers gained quite some popularity.
With containers, it became easier to distribute services. Reproducibility also improved. But containers haven't become a replacement for VMs.
Mainly because of their deliberate focus on being an environment for running a single app.
3. Instead of containers, another abstraction took off - Kubernetes Pods!
A Pod is a group of semi-fused containers. The external boundary is preserved, but some of the internal isolation between the containers constituting a Pod is weakened - e.g., they share a network namespace.
As an abstraction, a Pod is much closer to a VM.
4. A single (virtual or physical) machine can run many independent Pods.
In Kubernetes, the machines constituting a cluster are called Nodes, but developers are rarely concerned with this abstraction. For them, Kubernetes is serverless! 🙈
More Pods per server means better packing.
5. Deployment of Pods happens through replicating a Pod template.
There is a Deployment object in Kubernetes that holds the desired Pod template and the needed number of "copies." But logically, there is not much difference between scaling Pods and VMs.
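The idea can be sketched as a minimal Deployment manifest - a Pod template plus the desired number of "copies" (names and image below are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3            # the desired number of Pod "copies"
  selector:
    matchLabels:
      app: my-app
  template:              # the Pod template to replicate
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: nginx:1.25
```

Bumping `replicas` here is logically the same as adding one more VM to a group.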
6. Kubernetes Service is a means of grouping Pods behind a logical name.
Kubernetes comes with built-in service discovery.
The implementation is neither client- nor server-side (rather network-side). But from the client's standpoint, it feels like a good old reverse proxy.
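A minimal Service manifest sketch (names and ports are illustrative): it selects Pods by label and exposes them behind one stable name and virtual IP.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app           # clients just call http://my-app
spec:
  selector:
    app: my-app          # groups all Pods carrying this label
  ports:
  - port: 80             # the Service's own port
    targetPort: 8080     # the Pods' port behind it
```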
Grasping Kubernetes Pods, Deployments, and Services 🧵
...through the lens of "old school" Virtual Machines.
Before the rise of Cloud Native:
- A VM was a typical deployment unit (a box)
- A group of VMs would form a service
- Everyone would build their own Service Discovery
Then, Docker containers showed up.
A container attempted to become a new deployment unit...
However, Docker's one-process-per-container design was too limiting. Many apps weren't built that way, and people needed more VM-ish boxes.
Kubernetes got the deployment unit right.
In Kubernetes, the minimal runnable unit is a Pod - a group of semi-fused containers.
Now, you can run (and scale!) the main app and its satellite daemons (sidecars) as a single unit.
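A sketch of such a unit (image names are hypothetical): a Pod with a main container and a log-shipping sidecar, scheduled and scaled together and sharing one network namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
  - name: app                  # the main application
    image: my-app:1.0
  - name: log-shipper          # sidecar: reaches the app via localhost
    image: my-log-shipper:1.0
```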
Docker relies on containerd, a lower-level container runtime, to run its containers. It is possible to use containerd from the command line directly, but the UX might be quite rough at times.
1. Network namespaces - a Linux facility to virtualize network stacks.
Every container gets its own isolated network stack with (virtual) network devices, a dedicated routing table, a scratch set of iptables rules, and more.
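A quick way to see such an isolated stack, assuming a Linux box with util-linux `unshare` and unprivileged user namespaces enabled (no root needed): spawn a scratch network namespace and list its interfaces.

```shell
# Create a throwaway user+network namespace and look inside:
# the fresh namespace contains nothing but an isolated loopback.
unshare --map-root-user --net ip link show
```

The host's interfaces (eth0 and friends) are invisible from inside - that's the isolation containers build upon.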
2. Virtual Ethernet Devices (veth) - a means to interconnect network namespaces.
A container's network interfaces are invisible from the host - the latter runs in its own (root) network namespace.
To punch through the namespace boundary, a special virtual Ethernet (veth) pair can be used.
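A sketch of creating such a pair (here both ends stay in one scratch namespace for safety; a real container runtime would move one end into the container's namespace with `ip link set ... netns ...`):

```shell
# Create a veth pair inside a scratch user+network namespace.
# The two ends appear as separate interfaces; packets sent into
# one end come out of the other.
unshare --map-root-user --net sh -c '
  ip link add veth0 type veth peer name veth1
  ip link show type veth
'
```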
3. The need for a (virtual) switch device.
When multiple containers run in the same IP network, leaving the host ends of the veth devices dangling in the root namespace will make their routes clash. So, you won't be able to reach (some of) the containers.
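Hence the common fix: attach the host ends to a virtual bridge instead of leaving them dangling. A minimal sketch, again inside a scratch namespace so it's safe to run:

```shell
# Create a bridge and enslave the "host" end of a veth pair to it.
# In container setups (e.g. Docker's docker0), all host-side veth
# ends hang off one bridge, so a single route covers all containers.
unshare --map-root-user --net sh -c '
  ip link add br0 type bridge
  ip link add veth0 type veth peer name veth1
  ip link set veth0 master br0
  ip link show master br0
'
```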
What is Service Discovery - in general, and in Kubernetes 🧵
Services (in Kubernetes or not) tend to run in multiple instances (containers, pods, VMs). But from the client's standpoint, a service is usually just a single address.
How is this single point of entry achieved?
1⃣ Server-Side Service Discovery
A single load balancer (a.k.a. reverse proxy) in front of the service's instances is a common way to solve the Service Discovery problem.
It can be just one Nginx (or HAProxy) or a group of machines sharing the same address 👇
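A minimal Nginx sketch of the pattern (addresses are illustrative): the `upstream` block is the roster of instances, and Nginx spreads requests across it while clients see a single address.

```nginx
upstream my_service {
    server 10.0.0.1:8080;   # service instance 1
    server 10.0.0.2:8080;   # service instance 2
}

server {
    listen 80;
    location / {
        proxy_pass http://my_service;  # one address for all clients
    }
}
```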
2⃣ Client-Side Service Discovery
The centralized LB layer is relatively easy to provision, but it can become a bottleneck and a single point of failure.
An alternative solution is to distribute the roster of service addresses to every client and let each client pick an instance on its own.
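The core of client-side discovery fits in a few lines - a round-robin pick from a static roster (addresses are illustrative; a real client would keep refreshing the list from a registry or DNS):

```shell
# A client that holds the full roster and chooses instances itself.
roster="10.0.0.1:8080 10.0.0.2:8080 10.0.0.3:8080"
n=$(echo "$roster" | wc -w)

i=0
while [ "$i" -lt 4 ]; do
  # Round-robin: request i goes to instance (i mod n).
  pick=$(echo "$roster" | cut -d' ' -f$(( i % n + 1 )))
  echo "request $i -> $pick"
  i=$((i + 1))
done
```

Request 3 wraps around to the first instance again - no central proxy involved.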