And at least one useful point I don't think I've ever thought to articulate before, on the subject of testing in production ...
You really want the ability to do both, as someone shipping code.
Are you frustrated by too many failed deploys, or days elapsing before bugs are noticed and reverted? Do you struggle with people getting paged and losing time debugging changes that weren't theirs?
What might help: assume your team already wants to do a good job, and build tooling that boosts visibility, creates feedback loops, and gives fine-grained control.
Fix your deploys so that each deploy contains a single changeset. Generate a new, tested artifact after each merge, and deploy those artifacts in order.
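A rough sketch of that loop, assuming Python glue; `build_and_test` and `deploy` here are hypothetical stand-ins for whatever your CI and deploy tooling actually do:

```python
# Sketch: one artifact per merge, deployed strictly in merge order.
# build_and_test() and deploy() are hypothetical stand-ins for your CI/CD tooling.
from collections import deque
from dataclasses import dataclass

@dataclass
class Artifact:
    sha: str       # the single merged changeset this artifact was built from
    version: str   # e.g. "build-abc1234"

def build_and_test(sha: str) -> Artifact:
    """Pretend CI step: build and test an artifact for exactly one merge."""
    return Artifact(sha=sha, version=f"build-{sha[:7]}")

def deploy(artifact: Artifact) -> None:
    """Pretend deploy step: ship one artifact to production."""
    print(f"deploying {artifact.version} (sha {artifact.sha})")

def run_pipeline(merged_shas: list[str]) -> None:
    queue: deque[Artifact] = deque()
    for sha in merged_shas:          # one merge -> one artifact
        queue.append(build_and_test(sha))
    while queue:                     # deploy in the order things merged
        deploy(queue.popleft())

if __name__ == "__main__":
    run_pipeline(["a1b2c3d4e5", "f6a7b8c9d0"])
```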
✨YOU CAN DO IT✨
Struggle with ownership? Modify your paging alerts so that if an alert fires within an hour of a deploy to the complaining service, it pages whoever wrote and merged the diff that just rolled out.
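The routing rule is simple enough to sketch; the names and the `Deploy` record below are illustrative, not any particular pager's API:

```python
# Sketch of the routing rule: alerts that land within an hour of a deploy go to
# the person who wrote and merged the changeset that just shipped; everything
# else goes to the regular on-call.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Deploy:
    service: str
    author: str            # whoever merged the changeset in this deploy
    deployed_at: datetime

RECENT = timedelta(hours=1)

def who_gets_paged(alert_service: str, alert_time: datetime,
                   recent_deploys: list[Deploy], oncall: str) -> str:
    # Most recent deploy first, so the freshest change wins
    for d in sorted(recent_deploys, key=lambda d: d.deployed_at, reverse=True):
        if d.service == alert_service and timedelta(0) <= alert_time - d.deployed_at <= RECENT:
            return d.author    # your change just rolled out -> you get the page
    return oncall              # otherwise, normal on-call rotation

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    deploys = [Deploy("billing", "sam", now - timedelta(minutes=20))]
    print(who_gets_paged("billing", now, deploys, oncall="alex"))  # -> sam
    print(who_gets_paged("search", now, deploys, oncall="alex"))   # -> alex
```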
Likely your team is pretty weak at instrumentation, for starters.
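To make that concrete: the most basic version of the habit is tagging every event your service emits with the build that's currently serving traffic, so problems line up against deploys. The field names and the `BUILD_ID` source below are illustrative:

```python
# Sketch: tag every event with the build that's serving traffic, so issues can
# be correlated with deploys. Field names and BUILD_ID source are illustrative.
import json
import os
import sys
import time

BUILD_ID = os.environ.get("BUILD_ID", "unknown")   # stamped in at deploy time

def emit(event: str, **fields) -> None:
    """Write one structured event per unit of work."""
    record = {"ts": time.time(), "event": event, "build_id": BUILD_ID, **fields}
    sys.stdout.write(json.dumps(record) + "\n")

# e.g. inside a request handler:
emit("handle_request", route="/checkout", status=200, duration_ms=42)
```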
Like, make it the norm for devs to watch their own diffs roll out, in real time.
Maybe it deploys to a canary or 10% of hosts and then requires confirmation to proceed.
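Something like this, as a sketch; `deploy_to` stands in for whatever actually pushes the artifact:

```python
# Sketch: a deploy that stops at the canary stage and waits for the person who
# shipped the change to confirm before it goes any wider. deploy_to() is a
# hypothetical stand-in for the thing that actually pushes the artifact.
def deploy_to(target: str, version: str) -> None:
    print(f"deployed {version} to {target}")

def staged_deploy(version: str) -> None:
    deploy_to("canary (10% of hosts)", version)
    answer = input(f"{version} is live on the canary -- promote to 100%? [y/N] ")
    if answer.strip().lower() == "y":
        deploy_to("remaining 90% of hosts", version)
    else:
        print(f"holding at canary; roll back {version} if it misbehaves")

if __name__ == "__main__":
    staged_deploy("build-abc1234")
```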
Or maybe you have the opposite problem: people ship too recklessly, prod goes down, and deploys fail or get rolled back all day.
* automate the process of deploying each CI/CD artifact to a single canary host
* then monitor a number of health checks and thresholds over the next 30 min
* if ok, promote 10% of hosts at a time to the new version (see the sketch just below)
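Here's a sketch of that whole loop; `deploy_to`, `check_health`, and `rollback` are stand-ins for your deploy tooling and monitoring, and the 30-minute soak and 10% steps come straight from the list above:

```python
# Sketch of the rollout loop: canary first, soak for 30 minutes against health
# checks, then promote 10% of hosts at a time. check_health(), deploy_to() and
# rollback() are hypothetical stand-ins for monitoring and deploy tooling.
import time

SOAK_SECONDS = 30 * 60    # watch the canary for 30 minutes
STEP_PERCENT = 10         # then widen the rollout in 10% increments

def deploy_to(percent: int, version: str) -> None:
    print(f"{version}: now on {percent}% of hosts")

def check_health(version: str) -> bool:
    """Stand-in: query error rate / latency / saturation against thresholds."""
    return True

def rollback(version: str) -> None:
    print(f"{version}: health checks failed, rolling back")

def progressive_rollout(version: str) -> bool:
    deploy_to(1, version)               # a single canary host (~1% here)
    time.sleep(SOAK_SECONDS)            # soak period on the canary
    if not check_health(version):
        rollback(version)
        return False
    for percent in range(STEP_PERCENT, 101, STEP_PERCENT):
        deploy_to(percent, version)
        if not check_health(version):
            rollback(version)
            return False
    return True

# progressive_rollout("build-abc1234") would walk the whole sequence end to end.
```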
Carefully considered changes to deploys can improve the overall function of the team in one fell swoop.
- if you merge to master, your changes will automatically go live in the next 15 minutes
- deploys are almost entirely non-events, because the behavioral changes are gated behind feature flags anyway (sketched below)
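For flavor, a toy sketch of what "gated behind feature flags" looks like at the call site; the in-memory `FLAGS` dict is just a stand-in, any real flag service works the same way:

```python
# Sketch: the new code path ships dark, and the deploy itself changes nothing.
# Behavior only changes when the flag flips. The in-memory flag store here is a
# toy stand-in for whatever flag service you actually use.
FLAGS = {"new_checkout_flow": False}    # flipped at runtime, not at deploy time

def flag_enabled(name: str) -> bool:
    return FLAGS.get(name, False)

def old_checkout(cart) -> str:
    return "old flow"

def new_checkout(cart) -> str:
    return "new flow"

def checkout(cart) -> str:
    if flag_enabled("new_checkout_flow"):
        return new_checkout(cart)       # new code: deployed, but dormant
    return old_checkout(cart)           # current behavior, unchanged by the deploy

if __name__ == "__main__":
    print(checkout(cart=[]))            # "old flow" -- the deploy was a non-event
    FLAGS["new_checkout_flow"] = True   # flip the flag when you're ready
    print(checkout(cart=[]))            # "new flow"
```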