i have a theory, which is that we struggle to get the time allocated to pay down technical debt (or improve deploys, etc) because to biz types we basically sound like the underpants gnomes.

step 1 ... pay down technical debt
step 2 ...
step 3 ... PROFIT
we spend a lot of time thinking about this because it's the same story for instrumenting properly, or learning a new tool, or investing in observability.

i.e. the engineers are usually dying to do it, even their managers may be on board, yet it can get bumped indefinitely.
which is why you should read this new blog post by @lizthegrey, in which she connects the dots from start to finish: how infrastructure work translates directly into dollars, and every hop between.

honeycomb.io/blog/treading-…
another thing i keep thinking about is how observability unlocks this whole new universe of data-driven thinking. it's a utility tool you can apply not just to production, but to your processes, pipeline, capacity, users, etc.

if you're trying to win an argument, bring graphs.
(i know this sounds like such a cliche. but watching our users turn honeycomb on problem after problem that we never could have predicted is *so cool*.)

you know your own problems better than any vendor. trust the vendors who aren't trying to take you out of the driver's seat.
so much of the current enthusiasm for AI, ML, etc seems to be predicated on this desire to remove humans from the equation, to replace them with automation and machines that know better.

allspaw said it best: kitchensoap.com/2015/05/01/ope…
my favorite part will always be this. "anomaly detection in software is, AND ALWAYS WILL BE, an unsolved problem". 🛎🛎🛎

furthermore, problems are solved by teams, not individuals, and history is of untold value because while history may not repeat, it definitely rhymes.
this may be self-interested advice, but that doesn't mean it's not true:

if you're trying to modernize your architecture, make on call suck less, put software engineers on call, or make your systems more resilient: your very first step should be observability.
no, that doesn't mean toss some tracing onto your logs and metrics. there are no three fucking pillars, that's nonsense talk: those are just three rando data types. (sorry.. two data types and whatever "logs" is).

@el_bhs refutes it best: lightstep.com/blog/three-pil…
it means you need the ability to ask any question about what's happening on the inside of your systems, whether you've seen it before or not, and understand what's happening. just by using your tools from the outside.

this means read-time querying over raw events, yada yada.
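
a rough sketch of what "read-time querying over raw events" can look like, in python. this is only an illustration under assumptions: the event fields, the `EVENTS` list and the `query` helper are all made up for this example, and it is not how honeycomb or any other tool actually works. the point is just that nothing gets pre-aggregated, so you can ask a question nobody planned for.

```python
from collections import defaultdict

# hypothetical raw events: one wide record per request, stored as-is.
# every field name here is an assumption made for illustration.
EVENTS = [
    {"endpoint": "/export", "user_id": "u1", "build": "a1", "duration_ms": 100, "status": 200},
    {"endpoint": "/export", "user_id": "u2", "build": "b2", "duration_ms": 900, "status": 200},
    {"endpoint": "/export", "user_id": "u2", "build": "b2", "duration_ms": 1100, "status": 500},
    {"endpoint": "/home", "user_id": "u3", "build": "b2", "duration_ms": 50, "status": 200},
    {"endpoint": "/home", "user_id": "u1", "build": "a1", "duration_ms": 40, "status": 200},
]


def query(events, where=lambda e: True, group_by=None, field="duration_ms"):
    """filter, group and aggregate at read time, over the raw events."""
    groups = defaultdict(list)
    for event in events:
        if where(event):
            # group on any field the events happen to carry, or lump everything together
            key = event.get(group_by) if group_by else "all"
            groups[key].append(event[field])
    return {
        key: {"count": len(vals), "max": max(vals), "avg": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }


# a question nobody planned for: is /export slower on one build than the other?
by_build = query(EVENTS, where=lambda e: e["endpoint"] == "/export", group_by="build")

# and a follow-up: which users are hitting the 5xx errors?
errors_by_user = query(EVENTS, where=lambda e: e["status"] >= 500, group_by="user_id")
```

because the filter and the group-by are chosen when you ask, not when you write the data, the follow-up question costs nothing extra. a pre-aggregated metric like "average latency per endpoint" could not answer either of these.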
the reason o11y comes first is because it's like turning the light on in the room.

before you get all fancy with your chaos engineering, or your microservices, or your orchestration and meshes, whatever -- wouldn't it be cool if you could just, like, see what you're doing?
to wrap this around to my original point: we (engineering teams) need to get better at explaining ourselves to other departments.

this means translating our needs and priorities into other languages, specifically the universal corporate language of dollars and cents.
engineering managers, in particular, need to get much better at translating their team's inputs and dependencies into financial terms.

need proof? look at how much easier it is to get a headcount than sign a $200k vendor bill, despite this being flamingly irrational.
other teams aren't malicious; they want what's best for us too. we all have a shared goal -- a whole boatload of them, presumably -- they just don't know how to evaluate statements like "we need to stop shipping features to containerize our build system".
and when your justification for doing so is basically "i have a hunch it will be better", the normal inclination will be to say "sorry, please keep shipping features."

data wins arguments. graphs win arguments. observability *gets you* the data and graphs to win the arguments.
if you haven't watched this talk by @lyddonb yet, you absolutely must. threadreaderapp.com/thread/1024608…

it's about the catastrophic consequences of our industry's failure to communicate about what it is we do and why it matters.
we need to do better. we *must* do better. but we cannot explain what we do not ourselves understand.

and that, my friends, is why you should stop putting it off.
stop thinking monitoring is the same, or good enough. and make observability a priority for your org.
oh p.s. here is a thing i have learned working w/ other teams:

stop taking numbers so seriously. other teams treat corporate-money numbers like the handwavey guesstimates that they are; engineers get like biblical literalists, all hung up on how many animals the Ark could hold.
thread by Charity Majors