How should you decide if an incident merits a post-incident review?

The answer isn't "if it's a P1 or P2, forget P3s". In fact, it's the wrong question to be asking... #Operability 1/n
Many orgs I work with have a policy of "do a review if the incident was a P1 or a P2". Lower priority incidents don't get a review.

This might be due to a high volume of P3s from untuned alerts, friction in the incident review process, lack of emphasis on improvement, etc. 2/n
At best, the incident review process involves team members working together to uncover a shared timeline and improvement actions.

I'd call this a "shallow analysis" that pre-dates an understanding of resilience engineering, operability, the work of @AdaptiveCLabs etc. 3/n
In contrast, a "deep analysis" would emphasise incident analysis prior to an incident review meeting, to obtain richer information on the socio-technical factors involved. This is the work @allspaw specialises in 👑

I don't see deep analyses often. Shallow is still rare :( 4/n
The question shouldn't be

How should you decide if an incident merits a post-incident review?

It should be

Given mandatory incident reviews, how should you decide if an incident merits deep or shallow analysis?

#Operability 5/n
I don't pretend to know _how_ to do a deep incident analysis. Like others, I learn from @AdaptiveCLabs 🙇‍♂️

I do know that alert priority is *not* a good way to decide on shallow/deep incident analysis, or yes/no incident review 6/n
The idea of alert priority is deeply subjective. One person's P1 is another person's P2.

An org might say "only review P1s and P2s, not P3s" because they are drowning in P3s and want to save review time/money... but a P3 can still cost you revenue 7/n
Production support is revenue insurance

If a P2 alert is linked to an expected max loss of £500K and a P3 is linked to £100K... if the P3 keeps occurring with no reviews and no learnings, it can become as costly as the P2, or more so

(And that's before reputational damage)

8/n
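
To make that arithmetic concrete, here is a minimal sketch. The £500K and £100K figures are the ones above; the function names are hypothetical, and it assumes every recurrence costs the full expected max loss, which is a simplification.

```python
P2_MAX_LOSS_GBP = 500_000  # figure from the thread
P3_MAX_LOSS_GBP = 100_000  # figure from the thread


def cumulative_max_loss(loss_per_incident: int, occurrences: int) -> int:
    """Worst-case total loss, assuming each recurrence costs the same amount."""
    return loss_per_incident * occurrences


def recurrences_to_match(small_loss: int, large_loss: int) -> int:
    """How many times the smaller incident must recur to cost at least as much
    as one occurrence of the larger incident (integer ceiling division)."""
    return -(-large_loss // small_loss)


print(recurrences_to_match(P3_MAX_LOSS_GBP, P2_MAX_LOSS_GBP))  # 5
print(cumulative_max_loss(P3_MAX_LOSS_GBP, 6))                 # 600000
```

Under those assumptions, five unreviewed repeats of the P3 match a single P2, and the sixth overtakes it.
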
A P1, P2, or P3 incident should have an incident review
A near-miss should have an incident review
A Chaos Day should have an incident review

There needs to be a relentless focus on improvement, on learning, on removing friction from the post-incident process 9/n

#Operability
A deep analysis of an incident, a near-miss, or a Chaos Day should happen if a substantial revenue loss has happened, or is predicted in the future

A shallow analysis of an incident should happen if a low revenue loss has happened, or is predicted 10/n
Incident revenue loss (incurred or forecast), not incident priority, should govern the post-incident process

#Operability is about reliability, which is about revenue protection 11/n
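
As a rough sketch of that rule: every event gets a review, and revenue loss only decides how deep the analysis goes. The `Event` type, the field names, and the £250K cut-off for "substantial" are all hypothetical; the thread gives no figure, so each org would need to pick its own.

```python
from dataclasses import dataclass

# Hypothetical cut-off for "substantial" revenue loss; the thread gives no figure.
DEEP_ANALYSIS_THRESHOLD_GBP = 250_000


@dataclass
class Event:
    kind: str                 # "incident", "near-miss", or "chaos-day"
    priority: str             # e.g. "P3"; recorded, but deliberately unused below
    incurred_loss_gbp: float  # revenue already lost
    forecast_loss_gbp: float  # revenue predicted to be lost in future


def analysis_depth(event: Event) -> str:
    """Return "deep" or "shallow". There is no "no review" outcome."""
    worst_case = max(event.incurred_loss_gbp, event.forecast_loss_gbp)
    return "deep" if worst_case >= DEEP_ANALYSIS_THRESHOLD_GBP else "shallow"


recurring_p3 = Event("incident", "P3", incurred_loss_gbp=20_000, forecast_loss_gbp=400_000)
one_off_p1 = Event("incident", "P1", incurred_loss_gbp=10_000, forecast_loss_gbp=0)
print(analysis_depth(recurring_p3))  # deep
print(analysis_depth(one_off_p1))    # shallow
```

Note that `priority` never enters the decision, so a recurring P3 with a large forecast loss gets the deep analysis while a cheap one-off P1 gets the shallow one.
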
One consequence of this is that a revenue impact calculator must be available *during* an incident, not afterwards.

I've seen too many orgs where revenue impact is considered during a post-incident review, or not at all

It is an input, not an output /end
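
For illustration only, a calculator that responders could run mid-incident might be as simple as the sketch below. The model (normal revenue per minute, times the fraction of traffic affected, times minutes elapsed) and every name in it are assumptions; a real calculator would depend on how the org actually earns its revenue.

```python
from datetime import datetime, timedelta, timezone


def estimate_revenue_impact(
    revenue_per_minute_gbp: float,
    affected_fraction: float,
    started_at: datetime,
    now: datetime,
) -> float:
    """Estimate revenue lost so far, assuming loss accrues linearly over time."""
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    # Clamp to zero so a clock skew never produces a negative loss.
    elapsed_minutes = max((now - started_at).total_seconds() / 60.0, 0.0)
    return revenue_per_minute_gbp * affected_fraction * elapsed_minutes


# Example: £1,000/minute of normal revenue, half of traffic failing, 30 minutes in.
started = datetime(2020, 8, 1, 12, 0, tzinfo=timezone.utc)
thirty_minutes_in = started + timedelta(minutes=30)
print(estimate_revenue_impact(1_000.0, 0.5, started, thirty_minutes_in))  # 15000.0
```

The point is that the number exists while the incident is still open, so it can feed the deep/shallow decision instead of being worked out after the review.
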
Thanks for all the comments on incident reviews! Keep them coming

And a reminder I'm available for #ContinuousDelivery and #Operability work from 31 Aug. Get in touch!
Keep Current with Steve Smith