went rummaging through slack history, realized it's been three years since i first googled the definition of 'observability' and found to my shock that:
a) it *had* a technical definition
b) it was everything i had been struggling and flailing to describe for the past six months
so, i wrote up a three year retrospective of how the term has been used and misused, claimed by vendors both good and evil, as well as the technical requirements for systems observability ... and why high cardinality is at the root of everything. thenewstack.io/observability-…
warning, it's long; and possibly of interest to only like me and five other people. but the important thing to note here is this.
the reason we got attention, in a NOTORIOUSLY packed space, is because we reflected the real pain and anger that engineers like us are feeling.
so many engineers have been shelling out ungodly sums of money to newrelic and datadog etc, despite the fact that those tools increasingly do not solve their problems, and solve a little less and less every month.
we live in a high cardinality world.
we live in a world where canned answers are increasingly ineffective.
we live in a world where you must instrument with intent, not rely on magic.
we live in a world where user experience is the only metric that matters.
we live in a world of cheap hardware and expensive time.
the metrics, logs and APM providers got their start back when they did a pretty good job of solving your problems. they've been printing money left and right ever since.
they weren't designed for a distributed world, but they've been making too much money to care.
but the undercurrent of dissatisfied folks who knew they were wasting their cash, knew their problems weren't being solved, were dubious of "high cardinality is impossible" -- has grown rapidly.
you people are why we are still around today, and gaining steam fast. 💙
turns out people don't particularly care for being gaslit by their software vendors. nor told that what they want they can't have, but how would they like thousands of dashboards instead??
vendors keep papering this gap by releasing features named e.g. "Infinite Cardinality"
instead of releasing, say, actual support for high cardinality. (you are not fooling anyone)
this smug stasis seems to be what happens when a category is raking in so much money, they don't notice or care when the world has moved on.
in conclusion, thanks to everybody who has given us a try, given us your feedback.
i too aspire to print so much money i don't have to care about anyone's feelings, but until then... we'll be over here trying to build the shit you need to understand your systems. 🧸🐝
• • •
The question is, how can you interview and screen for engineers who care about the business and want to help build it, engineers who respect sales, marketing and other functions as their peers and equals?
It's a great question!! I have ideas, but would love to hear from others.
I said "question", but there are actually two: 1) how to hire engineers who are motivated by solving business problems, and 2) how to hire engineers who aren't engineering supremacists.
Pro tip: any time you see someone confidently opining on what all good CTOs know or do, it is ✨bullshit✨
There is no stock template for CTO, or default set of expectations or responsibilities. It stands alone among the C-levels in that good ones are all over the freaking map.
This may not hold true for publicly traded companies. But in my experience, a good CTO can be:
* over all of R&D
* over engineering, like a VP eng
* like a principal eng or architect
* team lead for special projects
* a great senior programmer
(continued... 👉)
A CTO can also be:
* a great communicator and popularizer
* on the road as a devrel
* a field CTO, whose authority opens doors to big customers
* a product visionary who sweats the details
* more of a cofounder than technical contributor, sharing "company-running" duties w/CEO
Yeah, this is a fair caveat. If you're already a decent senior engineer and manager, it's kind of possible to split your attention between managing a small team and writing code.
But you aren't going to improve at either skill set. Those cycles get devoured by context switching.
Tech lead managers ("TLMs") are a mistake we make over and over in this industry. I've written about this a bit, but the definitive post was written by @Lethain.
My coworker @suchwinston wrote a terrific piece on burnout before the break:
There's a reason why burnout and work/life balance are such evergreen topics, and it's not actually because the world is so terribly harsh and everyone is criminally overworked. honeycomb.io/blog/product-m…
Just to be clear: some places *are* awful, and some people *are* criminally overworked. But burnout and work/life balance are an issue for everyone, not just those people.
I think this is bc there is no real "solution". Each of us has to find and maintain our own equilibrium.
It takes a lot of hard work to become good at technology, and a lot more hard work to maintain your edge in a fast-changing industry.
I don't know of anyone for whom this is _easy_. None of this is remotely natural, from an evolutionary perspective. 🐒
This is an astute point. For all the ink that has been spilled about what observability is or is not, or how generation 2.0 differs from 1.0, it's actually quite simple.
"Observability" was coined to denote the emergence of ✨high cardinality✨ support in telemetry and tooling.
Cardinality, for those new to my feed 🤣 refers to the number of unique items in a set. Gender drop-down with three options? Low cardinality. Gender field you can write to? Much, much higher cardinality.
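To make that concrete, here is a minimal sketch that counts the unique values per field. The event list and the field names (`plan`, `request_id`) are made up for illustration and don't come from any real tool or dataset.

```python
def cardinality(events, field):
    """Return the number of distinct values `field` takes across events."""
    return len({event[field] for event in events if field in event})


# Hypothetical telemetry: 1000 events, one per request.
PLANS = ["free", "pro", "enterprise"]
events = [
    {"plan": PLANS[i % len(PLANS)], "request_id": f"req-{i:04d}"}
    for i in range(1000)
]

# A drop-down style field has a handful of possible values...
print("plan:", cardinality(events, "plan"))
# ...while a per-request identifier is unique on every event.
print("request_id:", cardinality(events, "request_id"))
```

In this toy data `plan` has a cardinality of 3 and `request_id` has a cardinality of 1000, one distinct value per event.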
Metrics can't do high cardinality data. A metric can only be a number.
Logs *can* handle high cardinality data, which is why logs have always been so much more powerful than metrics.
The most useful debugging data is always the high cardinality shit. Request IDs, uncommon strings, whatever. It reduces the search space fast.
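A rough sketch of the "reduces the search space" point, again using invented data: the `status` and `request_id` fields and the `matching` helper are assumptions for illustration, not any vendor's API. Filtering on a low cardinality field still leaves a pile of candidates; filtering on a high cardinality field leaves one.

```python
def matching(events, **filters):
    """Return the events whose fields equal every given filter value."""
    return [
        event for event in events
        if all(event.get(key) == value for key, value in filters.items())
    ]


# Hypothetical events: every tenth request fails with a 500.
sample_events = [
    {"status": 500 if i % 10 == 0 else 200, "request_id": f"req-{i:04d}"}
    for i in range(1000)
]

# Low cardinality filter: narrows 1000 events down to every failed request.
by_status = matching(sample_events, status=500)
# High cardinality filter: narrows 1000 events down to the one you care about.
by_request = matching(sample_events, request_id="req-0420")

print(len(sample_events), len(by_status), len(by_request))
```

With this data the status filter leaves 100 events to wade through, while the request ID filter leaves exactly 1.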