If you're scared of pushing to production on Fridays, I recommend reassigning all your developer cycles off of feature development and onto your CI/CD process and observability tooling for as long as it takes to ✨fix that✨.
Fear of deploys is the ultimate technical debt. How much time do you waste
* waiting until it's "safe" to deploy,
* batching up changes into big ones that are decidedly unsafe to deploy,
* waiting to get paged and fretting over deploys,
* cleaning up terrible catastrophucks?
"Don't push to production" is a code smell -- like a three week old corpse kind of smell -- and a sign of deep wrongness. Just because you're used to it doesn't make it any less noxious.
Like, would you go around casually commenting "there's blood in my pee" or "i get a stabbing pain when i breathe"?
Because that's how it looks when you go around nonchalantly telling people to halt the lifeblood of their systems based on the day of the week.
ok this seems to have touched a nerve, so rather than reply to every single person who is objecting "hey what's the harm in skipping fridays, it can't hurt anything and may help" or "why do you want to take away weekends" or "stop shaming people!!" i will say it here once:
The inability to push to prod (or fear of pushing to prod) is the symptom.
The fact that recovering from a deploy gone sideways is often a harrowing, multi-day event is the problem.
The way you address the problem is by shipping changes not more slowly, but more QUICKLY. Fewer, smaller changesets turn into debuggable, understandable deploys.
The delta from "writes code" to "code in prod" is the primary metric of a team's maturity and efficiency.
(Don't ask me -- ask the DORA report.)
So by slowing down or batching up or pausing your deploys, you are materially contributing to the worsening of your own overall state.
"Don't push on Fridays" isn't the problem, it's just a symptom of underlying pathology. Sometimes there are legit reasons not to push on Fridays. You have to use your own judgment! But most of what I'm hearing in this thread are lame excuses. 😉
Like I said, shipping software should not be scary. There are feature flags, event-level observability, and other modern developer tools that protect users from breaking changes. charity.wtf/2018/08/19/shi…
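To make that concrete, here's a minimal sketch of what a feature flag gate can look like (Python; the flag name, env var, and rollout mechanism are all invented for illustration -- in practice you'd usually lean on a flag service or config store rather than an env var):

```python
import os
import zlib

def new_checkout_enabled(user_id: str) -> bool:
    """Percentage rollout: a deterministic hash keeps each user on the same
    side of the flag across requests. Driven by an env var here for
    simplicity; a real flag service lets you flip it without a deploy."""
    rollout_pct = int(os.environ.get("FF_NEW_CHECKOUT_PCT", "0"))
    return (zlib.crc32(user_id.encode()) % 100) < rollout_pct

def checkout(user_id: str) -> str:
    if new_checkout_enabled(user_id):
        return "new checkout flow"   # ship dark on Friday, ramp up Monday
    return "old checkout flow"       # "rollback" = set the flag to 0, no redeploy
```

The point is that deploy and release stop being the same event: the code can go out any day of the week, and turning it on (or off) is a config change, not a redeploy.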
If a failed deploy means somebody's *weekend* is impacted ... y'all have much bigger problems, and somebody should FIX THAT.
Where does it stop? If pausing Friday deploys means Sunday is safe, shouldn't you maybe pause Thursday deploys just to be extra safe???
Finally, I am not shaming anyone! I am simply pointing out that it is a sign of poor hygiene that usually masks a heap of other problems. Which is what I am suggesting you might want to take a look at, because this strategy is incredibly costly in the long run.
It's just that the costs of pausing deploys for a day tend to be hidden ones that you experience as sluggish ship rates and a general culture of fear and avoidance (and a broad acceptance of truly fucked up situations as being "normal").
Which is truly hard to change. But we can start by pointing out that it is not normal, it is unnecessary and can be fixed. You can live in a better world.
🎶 Imagine all the people ... 🎶
Before you reply to this thread, ask yourself: is the thing you're about to "well actually" me about in fact:
* learned helplessness
* an adaptive response to something shitty
* the result of your org not valuing people's time off
* something else supporting my core point
Let's talk about OpenTelemetry, or "OTel", as the kids like to call it.
I remember emitting sooo many frustrated twitter rants back in 2017-2018 about how *behind* we were as an industry when it comes to standards for instrumentation and logging.
Then OTel shows up.
For those of you who have been living under a rock, OTel is an open standard for generating, collecting, and exporting telemetry in a vendor agnostic way.
Before OTel, every vendor had its own libraries, and switching (or trying out) new vendors was a *bitch*.
Yeah, it's a bit more complicated to set up than your standard printf or logging library, but it also adds more discipline and convenience around things like tracing and the sort of arbitrarily-wide structured data blobs (bundled per request, per event) that o11y requires.
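Here's roughly what that looks like with the Python SDK -- a minimal sketch with made-up span and attribute names, using a console exporter where you'd normally point an OTLP exporter at a collector or vendor backend (that swap being the whole point):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Wire up a tracer provider. Swapping ConsoleSpanExporter for an OTLP
# exporter is how you change backends without touching instrumentation.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

def handle_checkout(cart_id: str, user_id: str) -> None:
    # One span per unit of work, decorated with whatever context you might
    # want to query on later: IDs, flags, sizes, timings.
    with tracer.start_as_current_span("handle_checkout") as span:
        span.set_attribute("cart.id", cart_id)
        span.set_attribute("user.id", user_id)
        span.set_attribute("cart.item_count", 3)
        # ... do the actual work ...
```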
It's hard to formulate career goals in your first decade or so as an engineer; there is just SO MUCH to learn. Most of us just kinda wing it.
But this is a goal that I think will serve you well: do a tour of duty at a startup and another at a bigco, in your first 10y as an eng.
Besides the obvious benefits of knowing how to operate in two domains, it also prevents you from reaching premature seniority. (charity.wtf/2020/11/01/que…)
The best gift you can give your future self is the habit of regularly returning to the well to learn, feeling like a beginner.
Several people asked this. It's a good question! I will share my thoughts, but I am certainly not religious about this. You should do what works for you and your teams and their workflows. 📈🥂☺️
1) "assuming you have good deduplication"... can a pretty big assumption. You never want to be in a situation where you spend more time tweaking dupe, retry, re-alert thresholds than fixing the problem.
2) having to remember to go futz with a ticket after every little thing feels like a lot of busywork. You've already committed some code, mentioned it in #ops or wherever, and now you have to go paste all that information into a task (or many tasks) too?
@beajammingh the title particularly caught my eye. for the past month or two i've been sitting on a rant about how i no longer associate the term "devops"** with modern problems, but with fighting the last war.
** infinitely malleable as it may be
yes, if you have massive software engineering teams and operations teams and they are all siloed off from each other, then you should be breaking down (i can't even say it, the phrase is so annoying) ... stuff.
but this is a temporary stage, right? a bridge to a better world.
I've done a lot of yowling about high cardinality -- what it is, why you can't have observability without it.
I haven't made nearly as much noise about ✨high dimensionality✨. Which is unfortunate, because it is every bit as fundamental to true observability. Let's fix this!
If you accept my definition of observability (the ability to understand any unknown system state just by asking questions from the outside; it's all about the unknown-unknowns) then you understand why o11y is built on building blocks of arbitrarily-wide structured data blobs.
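In practice, one of those blobs is just a single wide event per request: every field you might conceivably want to slice by later, all in one place. A tiny sketch (the field names are invented for illustration):

```python
import json
import time

# One event per request, as wide as you can make it: request metadata,
# user context, feature flags, timings, build info. High dimensionality
# means lots of fields; high cardinality means fields like user.id can
# take on a huge number of distinct values.
event = {
    "timestamp": time.time(),
    "service": "checkout",
    "build_id": "a1b2c3d",
    "request.path": "/cart/checkout",
    "request.method": "POST",
    "user.id": "user_8271",
    "user.plan": "enterprise",
    "feature_flags.new_pricing": True,
    "db.query_count": 14,
    "db.total_ms": 87.2,
    "duration_ms": 312.5,
    "status_code": 200,
}
print(json.dumps(event))  # one wide blob per request, not N scattered log lines
```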
If you want to brush up on any of this, here are some links on observability:
Close! "If you're considering replacing $(working tool) with $(different tool for same function), don't do it unless you expect a 10x productivity improvement"
cvs to git? ✅
mysql to postgres? ❌
puppet to chef? ❌
redhat to ubuntu? ❌
The costs of ripping and replacing, training humans, updating references and docs, the overhead of managing two systems in the meantime, etc. -- are so high that you are otherwise likely better off investing that time in making the existing solution work for you.
Of course, every situation is unique. And the interesting conversations are usually around where that 10x break-even point will be.
The big one of the past half-decade has been when to move from virtualization to containerization.