In preparation for fixing broken Kubernetes clusters live on @rawkode's #Klustered event, I reminded myself of some of the core K8s debugging commands and techniques

Here are my top 10 tips for platform engineers debugging Kubernetes and the machinery underneath the covers 🧵 👇
First off, use kubectl to take a look at the cluster infra:

$ kubectl get nodes
$ kubectl cluster-info dump

These commands typically give you a good idea of where to start debugging, e.g. broken nodes, infra issues, or resource constraints
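On a bigger cluster the node list gets long, so it can help to filter it down to the nodes that need attention. This is only a minimal sketch: `not_ready_nodes` is a hypothetical helper name (not a kubectl feature), and it assumes the default column layout where NAME comes first and STATUS second.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints every node whose STATUS column is not exactly "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") are printed too.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1, $2 }'
}

# Usage (needs a working kubectl):
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reports a plain Ready status.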
If kubectl doesn't work, it's time for some Linux debugging! SSH to the control plane node and run:

$ ps aux | grep kube

This will show you if core components such as the kubelet, kube-apiserver, and kube-scheduler are up and running
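If you'd rather get a direct answer than scan the process list by eye, the same check can be sketched as a small function. `missing_components` is a hypothetical helper, and the component names passed to it are just the ones mentioned above; adjust them for your cluster. It matches exact process names, so a flag that merely contains "kubelet" on another component's command line doesn't count as the kubelet running.

```shell
# Hypothetical helper: reads process names on stdin (one per line) and prints
# each expected component, passed as an argument, that is NOT in that list.
missing_components() {
  mc_procs=$(cat)
  for mc_name in "$@"; do
    # -x: whole-line match, -F: treat the name as a fixed string
    printf '%s\n' "$mc_procs" | grep -qxF -- "$mc_name" || echo "$mc_name"
  done
}

# Usage on the control plane node:
#   ps -eo comm= | missing_components kubelet kube-apiserver kube-scheduler
```

No output means all the named components were found.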
If you're not sure about the Kubernetes control plane architecture, my awesome colleague, @Didicodes, has you covered: blog.getambassador.io/ckad-tips-kube…
Checking the kubelet logs will generally also point you to underlying infra issues, e.g. on a machine with systemd:

$ journalctl -xeu kubelet.service
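The kubelet can be chatty, so filtering for errors can save some scrolling. A rough sketch, assuming the kubelet writes klog-style text lines where the severity is the first letter of a token like `E0210 12:34:56.789012`; `kubelet_errors` is a hypothetical helper name, and it won't help if your kubelet is configured for a different log format.

```shell
# Hypothetical helper: keeps only error (E) and fatal (F) lines, assuming
# klog-style text such as "E0210 12:34:56.789012 ..." in each log line.
kubelet_errors() {
  grep -E '(^| )[EF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}'
}

# Usage on a machine with systemd:
#   journalctl -u kubelet.service --no-pager --since "10 minutes ago" | kubelet_errors
```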
In addition, you can typically find the logs for core components under /var/log/kube-apiserver.log or, on kubeadm-initialized clusters, under /var/log/containers/kube-...

Tail or less is your friend here:

$ tail /var/log/kube-apiserver.log
$ less /var/log/containers/kube-apiserver-...log
The @containerd command-line tool can also be useful to see if the control plane containers are up and running e.g.

$ ctr -n k8s.io containers ls

This blog post by @iximiuz is super useful: iximiuz.com/en/posts/conta…
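One thing to keep in mind: in containerd, `containers ls` lists container records, while `tasks ls` lists the processes actually running inside them. Here is a small sketch for spotting tasks that aren't running. `stopped_tasks` is a hypothetical helper, and the TASK / PID / STATUS column layout is an assumption about the `ctr` output on your version.

```shell
# Hypothetical helper: reads `ctr -n k8s.io tasks ls` output on stdin
# (assumed columns: TASK, PID, STATUS), skips the header row, and prints
# every task whose status is not RUNNING.
stopped_tasks() {
  awk 'NR > 1 && $3 != "RUNNING" { print $1, $3 }'
}

# Usage on the node:
#   ctr -n k8s.io tasks ls | stopped_tasks
```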
Once you've got the control plane up and running, I like to look at everything running in the cluster (if my cluster workloads are small enough to not overwhelm my terminal). Look for pods in CrashLoopBackOff or pods not starting:

$ kubectl get pods -A
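When the pod list does overwhelm your terminal, a filter helps. This is a minimal sketch rather than a complete health check: `unhealthy_pods` is a hypothetical helper, and it assumes the default `-A` column order of NAMESPACE, NAME, READY, STATUS.

```shell
# Hypothetical helper: reads `kubectl get pods -A --no-headers` on stdin
# (assumed columns: NAMESPACE, NAME, READY, STATUS, ...) and prints pods that
# are neither Completed nor Running with all of their containers ready.
unhealthy_pods() {
  awk '{
    split($3, ready, "/")   # READY looks like "1/2": ready count / total
    healthy = ($4 == "Completed") || ($4 == "Running" && ready[1] == ready[2])
    if (!healthy) print $1 "/" $2, $3, $4
  }'
}

# Usage:
#   kubectl get pods -A --no-headers | unhealthy_pods
```

Pods that are Running but only partially ready (e.g. 1/2) show up too, since those are easy to miss in the full list.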
If you find anything, then it's time to go spelunking in the associated Pod or Deployment config and/or logs.

$ kubectl describe pod my-pod -n my-namespace
$ kubectl logs my-pod -c my-container -n my-namespace
$ kubectl describe deploy my-deploy -n my-namespace
Getting a list of events can also be useful (and you will have seen some events when using the describe commands above)

$ kubectl get events --sort-by=.metadata.creationTimestamp

Look for obvious issues (image missing, resource issues e.g. OOM kills, network connectivity)
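To see which kinds of problems dominate, you can tally the Warning events by reason. A rough sketch: `warning_reasons` is a hypothetical helper, and it assumes the `-A` column order of NAMESPACE, LAST SEEN, TYPE, REASON, with LAST SEEN printed as a single token such as `5m`.

```shell
# Hypothetical helper: reads `kubectl get events -A --no-headers` on stdin
# (assumed columns: NAMESPACE, LAST SEEN, TYPE, REASON, ...) and prints a
# count per reason for Warning events, most frequent first.
warning_reasons() {
  awk '$3 == "Warning" { count[$4]++ }
       END { for (reason in count) print count[reason], reason }' | sort -rn
}

# Usage:
#   kubectl get events -A --no-headers | warning_reasons
```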
If you have access to the original manifests, the diff command can be super useful to see if someone has been tampering with your config in the cluster (accidentally or otherwise)

$ kubectl diff -f ./my-manifest.yaml
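The diff command is also handy in scripts, because its exit status tells you what happened: kubectl documents 0 as no differences, 1 as differences found, and anything above 1 as kubectl or diff failing. A small sketch of using that, where `drift_status` and its messages are hypothetical names of mine:

```shell
# Hypothetical helper: turns a `kubectl diff` exit status into a message.
# 0 = no differences, 1 = differences found, >1 = kubectl or diff failed.
drift_status() {
  case "$1" in
    0) echo "in sync" ;;
    1) echo "drift detected" ;;
    *) echo "diff failed" ;;
  esac
}

# Usage:
#   kubectl diff -f ./my-manifest.yaml > /dev/null; drift_status "$?"
```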
I could write an entire other thread on debugging networking and storage issues, but I'll save this for another day
As for general reference, the Kubernetes docs have some very useful additional guidelines for debugging clusters, too: kubernetes.io/docs/tasks/deb…
And I found Ioannis Moustakis's Certified Kubernetes Administrator (CKA) exam cheatsheet super useful for commands (even if you're not studying for the exam!): faun.pub/cka-kubernetes…
A lot of my Kubernetes infrastructure knowledge comes from following @kelseyhightower's "Kubernetes The Hard Way" github.com/kelseyhightowe…

I learned the basics of K8s here when it was first created back in 2016, and also used it as a teaching tool when working as a consultant:
If you are an app developer looking for guidance on debugging apps running in K8s, check out my other thread:
And if you're studying for the CKAD exam, @Didicodes has created this series for you: blog.getambassador.io/ckad-cka-exam-…
I hope this list was helpful to you! If it was, let me know! And please share your tips, too! 🙏
And I almost forgot! Here is the link to the YouTube recording of the #Klustered live stream where @Didicodes, @alex_gervais, and I take on the @fairwindsops team
