Really excited for this next talk by @LauraMDMaguire on “Lowering costs of coordination during service outages: A multiple case analysis”
#VelocityConf
@LauraMDMaguire will go into detail on the research done as part of the SNAFU Catchers Consortium (snafucatchers.com)
#VelocityConf
@LauraMDMaguire’s engagement with SNAFU actually starts with the tricam, a device used in rock climbing.
vdiffclimbing.com/tricams/
#VelocityConf
I love it when speakers make things real by telling personal stories and tying them to the subject. @LauraMDMaguire's talk covers a rock climbing experience involving failed assumptions, mental and physical exhaustion, cognitive overload, and pressure
#VelocityConf
Similar demands are placed on engineers managing incidents and operations every single day.
Needless to say, this is not a topic that's new to #VelocityConf...
@allspaw, @ddwoods2, @ri_cook and others have all come to #VelocityConf in the past to share their expertise in cognitive systems and safety.
The second cycle of SNAFU Catchers has involved contributions from:
- IBM
- New Relic
- KeyBank
- Salesforce
The focus this year has been on controlling the costs of coordination during incident response #VelocityConf
We'll be covering some early insights from this second round (the official report will be out later this summer).
#VelocityConf
Nearly all forms of meaningful work are 'joint activity'
- There's effort necessary to maintain what's known as 'Common Ground' in this activity
- As the speed and scale of systems increase, the demands of this maintenance become greater
#VelocityConf
Coordination is not limited to human-to-human interaction; it also includes human-to-machine interaction.
A lot of automation is used in incident response. It aids coordination, but using it can also place cognitive demands on operators.
#VelocityConf
All of these demands come on top of the other cognitive demands of managing the incident. Time can compound this pressure: it can increase the intensity of effort and also limit actionable opportunities.
#VelocityConf
In complex activities and systems, human activity ebbs and flows over time. In busy, high-tempo operations, task performance is more critical and consequences can escalate.
#VelocityConf
"Wait, this is why we have incident command, right?"
There's evidence that ICs alone are insufficient for smooth communication. Everyone has to invest in coordination in some way.
#VelocityConf
When one focuses on a single task and cuts away from maintaining common ground, coordination does not come back for free. A debt of sorts is generated, and effort must be re-invested to re-enter communications and coordination.
#VelocityConf
For joint activity to succeed, local goals need to be deferred in favor of those of the common group.
For instance: teams dropping planned work to jump on to an incident Slack or conference line to coordinate resolution.
#VelocityConf
Because this is cognitive and psychological, there are direct parallels to this type of example in other industries: think aviation, aerospace, and medicine.
One thing these industries generally don't deal with is distributed teams.
#VelocityConf
This picture from Apollo 13 demonstrates a lot of what goes into the coordination of joint activity:
People are pulling data from screens and papers, digesting it internally, performing side work, and listening to concurrent audio on headsets, etc.
#VelocityConf
Being co-located in the same environment, as in the previous picture, provides a lot of context 'for free.' Cues, both data-based and behavioral, are easy to derive. This is not the case with distributed teams...
#VelocityConf
Multiple platforms are used to pull in context. If the tools do not automatically support coordination, that weight falls on the shoulders of the operators themselves.
#VelocityConf
This can cause communication and coordination breakdowns.
#VelocityConf
While there’s a generally linear pattern for incident resolution in unchanging systems, a linear model is woefully insufficient for highly complex distributed systems.
#VelocityConf
Models and tools need to be based on reality and the complexity of this type of coordination.
#VelocityConf
Escalation - when normal work breaks down and becomes exceptional.
#VelocityConf
As an incident escalates, cognitive and coordination demands increase, making it more difficult to accomplish joint activity.
#VelocityConf
For all of these reasons, simplistic measures of incident response effectiveness such as Mean Time To Recover (MTTR) are insufficient.
#VelocityConf
Cognitive demands during an incident are cyclical in nature: as we take a more expansive, hypothesis-driven approach, coordination increases, and as we observe and validate these hypotheses, we lower the cost.
#VelocityConf
Some quick takeaways:
- Tech can help or hinder coordination
- Coordination must be designed to integrate expertise, bring in new resources, and get people up to speed
#VelocityConf
