Several reporters are asking me for insight on the FB outage. I have none, other than that outages on massive distributed systems can sometimes follow this pattern:
1) A human engineer makes a small config change. It goes into the test environment and everything is OK, so the change makes its way to production.
2) The config change has unintended side effects that only express in production, perhaps due to scale or a mismatch with the test environment. Things start to go haywire, and failures compound as critical services time out and queues lengthen.
3) Fortunately, this company is staffed by real adult engineers who considered this possibility, so there is an automated or semi-automated process to roll back to the last known good state. This automated agent is dispatched to deal with the anomaly.
4) Unfortunately, the automated system doesn't know how to handle the problem, and gets stuck in some kind of loop that causes more damage. Humans have to step in, stop it, and restart a complex web of interdependent services on hundreds of thousands of systems.
5) Humans win, but after paying a significant cost. The system, now rebooted into its new incarnation, is safe for now. But how long can the peace between man and machine hold?
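To make steps 3 and 4 concrete, here is a minimal sketch of a rollback agent with a hard cap on retries, so that it hands off to humans instead of looping. Everything in it is hypothetical and for illustration only: `RollbackController`, `apply_config`, `is_healthy`, and `max_attempts` are made-up names, and nothing here describes how any real company's tooling works.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RollbackController:
    """Hypothetical automated agent that restores the last known good config."""

    apply_config: Callable[[dict], None]  # pushes a config to production (assumed hook)
    is_healthy: Callable[[], bool]        # health probe for the system (assumed hook)
    max_attempts: int = 3                 # hard cap so the agent cannot loop forever
    history: List[dict] = field(default_factory=list)  # configs that passed the probe

    def deploy(self, new_config: dict) -> str:
        """Apply a change; keep it if healthy, otherwise try to roll back."""
        self.apply_config(new_config)
        if self.is_healthy():
            self.history.append(new_config)
            return "deployed"
        return self.roll_back()

    def roll_back(self) -> str:
        """Re-apply the last known good config a bounded number of times."""
        if not self.history:
            return "escalate_to_humans"  # nothing known good to return to
        last_good = self.history[-1]
        for _ in range(self.max_attempts):
            self.apply_config(last_good)
            if self.is_healthy():
                return "rolled_back"
        # Rolling back did not restore health: stop and page humans
        # rather than retrying indefinitely and causing more damage.
        return "escalate_to_humans"
```

The retry cap is the point of the sketch: the agent in step 4 has no such exit and keeps acting on a problem it does not understand. A real system would also need backoff, dependency ordering across services, and a health probe it can actually trust, none of which is modeled here.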
Thread by Alex Stamos