Charity Majors (@mipsytipsy), 21 tweets, 5 min read

Another day, another article written about the @honeycombio-shaped hole in the world of operational tooling, without -- somehow -- ever mentioning honeycomb. 🤨 sumologic.com/blog/business-…

I am done grinding my teeth (didn't take long, loads of practice) and will instead recap it for funsies.

First off, we marvel at the growth of kubecon and nod to its intense (lol) complexity. Traditional monitoring won't work, this sounds like a job for OBSERVABILITYMAN 🦸‍♂️!!

Then the article (that doesn't mention us) links to a terrific talk (that doesn't mention us), on why the old school "three pillars" model for o11y is fatally flawed.

It's a great talk, you should read the slides. schd.ws/hosted_files/k…

Then we mention ballooning costs (yup, esp since you are paying half a dozen vendors to understand the same events slightly differently) and that tracing is a life saver in distributed systems like k8s (yup).

Now brace yourself for the greatest non sequitur leap of the new year:

... Prometheus is king! Winning everywhere!

"But wait," I hear you asking. "Is Prometheus going to help with any of those problems, of complexity and end to end tracing and the request path? Does Prometheus even do observability? Isn't it metrics and preaggregated dashboards?"

You, dear reader, have been paying attention. If you buy the highly technical, control theory-derived definition of observability that I do, then tools based on metrics (the technical definition: a number with appended tags) will never be observability tools.

Why? Because they have torn up and discarded all the connective tissue of the event before they ever write to disk.

That connective tissue is exactly what you needed to reason about the internal workings of your system as a developer. It tracks your user experience too.
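To make the "connective tissue" point concrete, here is a minimal sketch, not anyone's real pipeline. The field names (`request_id`, `user_id`, `endpoint`, `status`, `duration_ms`) and the helper names are assumptions made up for illustration. It shows the same three requests kept as wide events and then squashed into a tagged counter, and which questions survive the squashing.

```python
from collections import Counter

# Hypothetical request events; every field name here is invented for illustration.
events = [
    {"request_id": "r1", "user_id": "u42", "endpoint": "/checkout", "status": 500, "duration_ms": 912},
    {"request_id": "r2", "user_id": "u7", "endpoint": "/checkout", "status": 200, "duration_ms": 88},
    {"request_id": "r3", "user_id": "u42", "endpoint": "/cart", "status": 200, "duration_ms": 41},
]


def to_metrics(events):
    """Pre-aggregate: keep one count per (endpoint, status) tag pair, drop every other field."""
    return Counter((e["endpoint"], e["status"]) for e in events)


def failing_users(events):
    """A per-request question that only the raw events can answer: which users hit a 5xx?"""
    return sorted({e["user_id"] for e in events if e["status"] >= 500})


metrics = to_metrics(events)
print(metrics[("/checkout", 500)])  # the aggregate still knows how many checkouts failed
print(failing_users(events))        # only the events know who was on the other end
```

Once `to_metrics` has run, `user_id`, `request_id` and `duration_ms` are gone, so there is nothing left to tie a failure back to a specific request.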

I am not saying metrics/aggregate tools have no value! They can have loads of value. Primarily for ops use cases, like provisioning and system health.

Developers don't give a shit about system health. They care about the health of *each individual request*. Events.

This is the most prominent difference between monitoring (ops uses, aggregates) and observability (dev use cases, unique events).

I mean, ops doesn't give a shit about each and every request either, as long as the system is healthy and errors are below SLOs.

Prometheus is a monitoring tool. It is a *great* one. It represents the peak of time series dashboard software in the wild.

My problem with it is that they claim it's more than that. Which leads to a very bad experience for users with more honeycomb-ish shaped problems.

Case in point.

https://twitter.com/buildchimp/status/1090460058075496448?s=21

Case in point.

https://twitter.com/chrissinjo/status/1089906244016586752?s=21

Case in point.

https://twitter.com/aberoham/status/1090434907111870465?s=21

Case in point.

https://twitter.com/gilligan128/status/1090219777371631617?s=21

Those are just from today. EVERYBODY has bitten the dust on this.

And yes, I am aware that some proprietary implementations of metrics-based systems do not have the same cardinality limitations, but that doesn't invalidate my point -- they still aren't oriented around the event.
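For anyone who hasn't hit the cardinality wall yet, here is a back-of-the-envelope sketch. It assumes the usual tag-based model, where each distinct combination of tag values becomes its own time series. The label names and counts are invented, and `series_count` is a hypothetical helper, not part of any tool.

```python
from math import prod

# Hypothetical label cardinalities for a single metric; the numbers are invented.
label_cardinality = {"endpoint": 50, "status": 5, "host": 200}


def series_count(cardinalities):
    """Upper bound on distinct time series: one per combination of label values."""
    return prod(cardinalities.values())


base = series_count(label_cardinality)
# Now try to tag the same metric with something per-user.
with_user = series_count({**label_cardinality, "user_id": 100_000})

print(base)       # 50 * 5 * 200
print(with_user)  # the same, times 100,000 users
```

The upper bound multiplies with every tag you add, which is why high-cardinality fields like user IDs tend to be the first thing people are told to leave out.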

Because once you are gathering these events, it is *trivial* to then add a field with ordering, and 💥boom💥 now you get tracing for free.

It's even better than it sounds. You don't have to double your spend. Don't have to hop from tool to tool. It just works.

Tracing is just a fancy visual overlay on the rich events if you gather them correctly.
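A rough sketch of what "add a field with ordering" can look like, under assumptions: the fields `trace_id`, `span_id`, `parent_id` and `start_ms` are illustrative names, not Honeycomb's schema, and `render_trace` is a made-up helper. Given events that carry those fields, the trace view is just a grouping and a sort.

```python
from collections import defaultdict

# Hypothetical events. trace_id, span_id, parent_id and start_ms are
# illustrative field names, not any vendor's schema.
events = [
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "name": "auth", "start_ms": 5},
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "name": "GET /checkout", "start_ms": 0},
    {"trace_id": "t1", "span_id": "c", "parent_id": "a", "name": "db.query", "start_ms": 20},
    {"trace_id": "t2", "span_id": "x", "parent_id": None, "name": "GET /cart", "start_ms": 3},
]


def render_trace(events, trace_id):
    """Return indented lines showing one trace's call tree, siblings ordered by start time."""
    children = defaultdict(list)
    for e in events:
        if e["trace_id"] == trace_id:
            children[e["parent_id"]].append(e)

    lines = []

    def walk(parent_id, depth):
        # Sort a copy of the sibling list so arrival order of events doesn't matter.
        for e in sorted(children[parent_id], key=lambda e: e["start_ms"]):
            lines.append("  " * depth + e["name"])
            walk(e["span_id"], depth + 1)

    walk(None, 0)  # root spans are the ones with no parent
    return lines


print("\n".join(render_trace(events, "t1")))
```

Nothing new was collected to draw that tree. The events are the same ones you would query for anything else, with a few extra fields attached.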

Which means you can flip back and forth between exploring ("find me an example of this bug"), tracing ("now trace it"), and exploring ("who else was affected?").
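That loop can be sketched in a few lines, assuming the events carry a `trace_id`, a `user_id` and an `error` field. All three names, the `build` field and the helper functions are hypothetical, chosen only to mirror the three steps above.

```python
# Hypothetical events; field names are illustrative only.
events = [
    {"trace_id": "t1", "user_id": "u42", "build": "v2", "error": "timeout"},
    {"trace_id": "t2", "user_id": "u7", "build": "v1", "error": None},
    {"trace_id": "t3", "user_id": "u9", "build": "v2", "error": "timeout"},
    {"trace_id": "t1", "user_id": "u42", "build": "v2", "error": None},
]


def find_example(events, error):
    """Explore: return the first event showing the bug, or None if there is none."""
    return next((e for e in events if e["error"] == error), None)


def events_in_trace(events, trace_id):
    """Trace: pull every event that shares the example's trace_id."""
    return [e for e in events if e["trace_id"] == trace_id]


def affected_users(events, error):
    """Explore again: which users saw the same error?"""
    return sorted({e["user_id"] for e in events if e["error"] == error})


example = find_example(events, "timeout")
trace = events_in_trace(events, example["trace_id"])
others = affected_users(events, "timeout")
print(example["trace_id"], len(trace), others)
```

All three steps read from the same list of events, which is the point: no export, no second tool, no second bill.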

Holy grail ✋

Guess I need to do a better job of sucking up to the CNCF club for them to start acknowledging our existence. Sigh.

I'd expect better from our fellow vendors, except.. never mind, I guess this all makes sense. 🖕 just keep muddying the waters, bros

(But k8s is our third largest integration, and it works exactly as advertised. Just sayin.)

Final case in point:

https://twitter.com/tyler_treat/status/1090433901712019456?s=21

P.S. @honeycombio has a free tier now. ICYMI
