A good question! Monitoring checks and unit tests perform exactly the same function: they regularly and automatedly check that the code or system is operating within "normal" bounds.
So do we still need all these tests and checks, in a post-o11y world? The answer is yes...and no.
Yes, you still want to write tests and monitoring checks: to catch regressions, to catch or rule out all the dumb problems before you waste your precious curiosity on them.
But here's where tests and monitoring diverge. Tests don't (usually) wake you up when they fail, whereas the whole raison d'etre of monitoring is alerts, those every-alert-must-be-actionable fucking alerts.
So there's a cost to be borne. Is it worth it? 🤔
Here is where I would argue that in the absence of o11y tooling, teams have been horribly overloading their usage of monitoring tools and alerts.
Instead of just a few top-level service and e2e alerts that clearly reflect user pain, many shops have accumulated decades of sedimentary layers of warnings and alerts and monitoring notifications. Not just to alert a human to investigate, but to *try to debug for them.*
They don't have tools to follow the bread crumbs. So they set off fireworks and town criers shouting clues on every affected block.
In a densely interconnected system, it's nearly impossible to issue a single, clean alert that is also correct about the root cause. (First of all, there is rarely "a root cause").
Instead what you get is a few hundred things squalling about getting slower -- none of which are the cause. However, your experienced sysadmin will roll over in bed, groan, skim a handful of the alerts at random; pronounce "redis again" and go back to sleep.
These squalling alerts -- that tell you details about the things you shouldn't have to care about, but you leave them up because it's the only heuristic you have for diagnosing complex system states -- these monitoring checks can and should die off once you have observability.
With extreme prejudice. They burn you out, make you reactive, and they make you a worse engineer.
Use o11y for what it's great at -- swiftly understanding and diagnosing complex systems, from the perspective of your users.
Use monitoring for what it's great at -- errors, latency, req/sec, and e2e checks.
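To make "a few top-level alerts" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `WindowStats` shape, the `user_pain_alerts` function, and the threshold values are hypothetical, not taken from any particular monitoring tool. The point is only that the whole paging surface can be four checks that map to user pain.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WindowStats:
    """Aggregated service stats for one evaluation window (hypothetical shape)."""
    requests: int            # total requests seen in the window
    errors: int              # requests that failed from the user's point of view
    p99_latency_ms: float    # 99th percentile latency over the window
    window_seconds: int      # length of the window, assumed to be > 0
    e2e_check_passed: bool   # result of the most recent end-to-end probe


# Illustrative thresholds only; real values depend on the service.
MAX_ERROR_RATE = 0.01
MAX_P99_LATENCY_MS = 500.0
MIN_REQ_PER_SEC = 1.0


def user_pain_alerts(stats: WindowStats) -> List[str]:
    """Return the short list of alerts worth waking a human for."""
    alerts: List[str] = []
    error_rate = stats.errors / stats.requests if stats.requests else 0.0
    req_per_sec = stats.requests / stats.window_seconds

    if error_rate > MAX_ERROR_RATE:
        alerts.append("error rate above threshold")
    if stats.p99_latency_ms > MAX_P99_LATENCY_MS:
        alerts.append("p99 latency above threshold")
    if req_per_sec < MIN_REQ_PER_SEC:
        alerts.append("traffic below expected floor")
    if not stats.e2e_check_passed:
        alerts.append("end-to-end check failing")
    return alerts
```

Nothing here tries to name a cause. Each check says "users are hurting" and leaves the diagnosis to whatever you use for o11y.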
• • •
Well, I for one am not past this bullshit by now. ☺️ EMs who do some hands-on engineering are better EMs.
Forbidding EMs from touching code at all is almost as silly and counterproductive as telling EMs that writing and shipping code is a core function of their role.
I say "almost" because if I had to choose one or the other, I would choose the clarity of "EMs responsible for team outcomes, SWEs responsible for technical outcomes" over the muddle of holding EMs responsible for everything and splitting their focus between people and code.
But I don't have to choose! EMs who keep a hand in the code are better EMs. They have more empathy and understanding for their team. They are better equipped to evaluate their engineers, they have more credibility and context. Everyone wins.
I have a new piece up. It's a bit of a rant, even for me, so buckle in.
A lot of "thought leaders" have been making their mortgages lately off of bits on how AI is going to replace software engineers, particularly entry-level engineers.
This is a dumb idea. It bespeaks a wealth of misunderstanding about what it means to be an engineer and write code, and what is valuable and hard about software systems.
But even really dumb, damaging ideas can weasel into people's heads if you repeat them blindly enough times.
Generative AI has made it easier than ever to generate lots of code. @kentquirk says it's "like a junior engineer who types really fast". 🤣
But writing code has always been the easiest part of software engineering -- *always*. And it's getting easier by the day.
It felt, to me, like those participating were stepping very cautiously around a few of the third rails Jaana just tripped over. (💜)
"Work-life balance"
"Working hard vs working smart"
"Meritocracy"
The intersection of company tech cultures and expectations and performance.
These are hard, complicated topics, and there are some very good reasons for speaking carefully. People can pick up a sentence and run in the wrong direction with it, and do a lot of damage.
I have abandoned god only knows how many drafts on this topic, for that reason.
The question is, how can you interview and screen for engineers who care about the business and want to help build it, engineers who respect sales, marketing and other functions as their peers and equals?
It's a great question!! I have ideas, but would love to hear from others.
I said "question", but there are actually two: 1) how to hire engineers who are motivated by solving business problems and 2) how to hire engineers who aren't engineering supremacists.