Learning from incidents, let’s talk specifics! A thread.
Those of us who advocate learning from incidents tend to be a bit vague about what, specifically, you might learn. I’m going to try to provide some examples, to make things more concrete.
I’m going to talk about three categories. These aren’t exhaustive, but I find them useful:

* gaps
* skill transfer
* shared understanding
“Gaps” are probably what most people think of getting out of incidents: they are deficiencies that we notice and can potentially address.
One common gap is identifying some usability issue with a tool involved in operations. Some user interface is confusing or error-prone, or the “right” (safer) way to perform an action is too hard, so people tend to work around this by doing it an unsafe way.
Another gap is an operational expertise gap. You might learn that a team is using an unsafe method to perform some action, unaware that it was unsafe and that a better way existed. This can present opportunities for improved education & training.
Incidents can also reveal resource gaps. For example, an incident may reveal that a team is at risk of overload. This is an area that @nora_js worked on when she was at Netflix.
“Skill transfer” is learning tacit knowledge from others. Incidents provide a great opportunity to watch experts in action.
One thing I personally learned from an incident came from watching how an expert used an internal request tracing tool to effectively diagnose an ongoing problem.

(The particular tool is mentioned in this tech blog: link.medium.com/DyhEhzsG6V )
You can also learn how experts diagnose problems by drawing from their experience. For example, I saw an expert diagnose a problem as being related to the default timeout being too low on a particular service because he had hit this problem before.
Reading narratives of how experts diagnosed a particular problem by drawing on their past experience is a great way to learn from them. Try to capture what those previous experiences were as well!
“Shared understanding” is when the organization does a better job of spreading knowledge about how the system itself works. It’s the least “actionable” form of learning, but it can be extremely valuable, in the same way that having years of experience is very valuable.
For example, you might learn that a particular service involved in an incident was initially implemented as a prototype, and was never intended for long-term production use.
Similarly, you might also learn that a service was designed for one particular use case, but later on it evolved to accommodate a new use case.
These kinds of learnings may not lead to any changes in the services you are learning about, but when people work on new services, these experiences will inform their judgment.
“Do we really need to spend the effort to harden this and add metrics, it’s just a prototype?” Remember what happened before?
Another example in the “shared understanding” category is learning about the rationale behind changes, or internal customer asks.
“Oh, if I had known he was asking for this feature because of X, I would have...” is one of my favorite responses in the wake of an operational surprise.
You can also learn about the constraints that other teams in your org are working under, as these are often invisible.
Learning doesn’t always involve an explicit action or outcome. You never know when in the future you will draw upon what you’ve learned to make a better decision. Fin

This content may be removed anytime!

Twitter may remove this content at anytime, convert it as a PDF, save and print for later use!

Try unrolling a thread yourself!

how to unroll video

1) Follow Thread Reader App on Twitter so you can easily mention us!

2) Go to a Twitter thread (series of Tweets by the same owner) and mention us with a keyword "unroll" @threadreaderapp unroll

You can practice here first or read more on our help page!

Follow Us on Twitter!

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just three indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3.00/month or $30.00/year) and get exclusive features!

Become Premium

Too expensive? Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal Become our Patreon

Thank you for your support!