Jessica Joy Kerr

@jessitron

Oct 26 • 19 tweets • 10 min read

@DivineOps

SRE teams try to keep toil under 50%.

Only 50% of work that has no enduring value...

@DivineOps #QConSF

SREs in the audience? (Dozens of hands)
Experienced SREs? (Like 2.5 hands)

We @RedHat used to ship products. Build a thing, package it, send to customers. Then it was their problem. Customer hires a consultant or figures it out.

Now we mostly ship services. Now reliability, uptime, and so on are our headache. It’s different.

@DivineOps #QConSF

SRE: “hopefully people get paid more for having this title”

The innovative part of SRE is: explicit agreements that align incentives. Between dev, ops, business.

@DivineOps #QConSF

SLA = financially-backed availability.
The contract has a % of cost that is refunded if availability is lower than advertised.

This aligns incentives between vendor and customer.
(As much as a single metric can.)

@DivineOps #QConSF
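
A rough Python sketch of the refund mechanic described above. The tier thresholds, refund fractions, and the `sla_credit` name are all hypothetical, invented for illustration; real contracts define their own terms.

```python
# Hypothetical SLA tiers: (availability floor, fraction of the fee refunded).
# If measured availability falls below a floor, that refund fraction applies.
HYPOTHETICAL_TIERS = [
    (0.999, 0.10),
    (0.99, 0.25),
    (0.95, 1.00),
]


def sla_credit(monthly_fee, measured_availability, tiers=HYPOTHETICAL_TIERS):
    """Return the refund owed for one billing period under the assumed tiers."""
    # Take the largest refund among all floors the measurement fell below.
    fraction = max(
        (refund for floor, refund in tiers if measured_availability < floor),
        default=0.0,
    )
    return monthly_fee * fraction


if __name__ == "__main__":
    for availability in (0.9995, 0.995, 0.98, 0.90):
        print(availability, sla_credit(1000.0, availability))
```

The vendor pays more the further availability drops, which is the incentive alignment the tweet refers to.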

SLO = targeted reliability. What do we care about?
(Service level objective)

Example: availability from a customer’s perspective

@DivineOps #QConSF

SLI = actual reliability.
Without good monitoring, you don’t know whether the service does what the user expects it to do.

Monitoring improves as internal users catch problems and move the checks into automation (which only covers problems seen before).

@DivineOps #QConSF
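
One common way to turn "actual reliability" into a number is the share of requests that succeeded, compared against the SLO target. A minimal sketch, assuming a request-based availability SLI; the function names and counts are made up for illustration:

```python
def availability_sli(good_requests, total_requests):
    """Fraction of requests that did what the user expected, or None if no traffic."""
    if total_requests == 0:
        return None  # nothing measured, so there is no SLI for this window
    return good_requests / total_requests


def meets_slo(sli, slo_target):
    """True when the measured SLI is at or above the target."""
    return sli is not None and sli >= slo_target


if __name__ == "__main__":
    sli = availability_sli(good_requests=9_990, total_requests=10_000)
    print(sli, meets_slo(sli, slo_target=0.999))
```

What counts as a "good" request is the hard part, and it is exactly what the monitoring has to get right.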

SLO = business approved reliability.

Explicitly aligns incentives between Business & Engineering.

@DivineOps #QConSF

Error budget = acceptable level of unreliability

When it’s gone, developers shift focus from delivering features to improving reliability.
… which they never had the incentive to do before!

@DivineOps #QConSF
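
The error budget falls out of the SLO as simple arithmetic: whatever the target leaves over is what you may spend on failures. A sketch under the assumption of a request-based SLO; the names and the "features vs reliability" switch are illustrative, not a prescribed policy:

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative when overspent)."""
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures == 0:
        return 0.0  # a 100% target or no traffic leaves no budget to spend
    return 1 - failed_requests / allowed_failures


def next_focus(budget_remaining):
    """Assumed policy: once the budget is gone, work shifts to reliability."""
    return "reliability" if budget_remaining <= 0 else "features"


if __name__ == "__main__":
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(round(remaining, 3), next_focus(remaining))
```

With a 99.9% target over a million requests, roughly 1,000 failures are allowed, so 250 failures leaves about three quarters of the budget.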

100% availability… no. Impossible, unnecessary, extremely expensive.
99.999% … will your users even notice?
The background error rate of the internet is 0.01-1% (depends on ISP).

@DivineOps #QConSF

Things we got wrong about SRE:

The book says “it’s what you get when a software engineer designs an operations team” … no.

This led to:
Hire developers to do ops things, and get effective SRE? …no.

@DivineOps #QConSF

Why didn’t Ops automate themselves out of a job?
Because they didn’t have a software engineer? No.

They didn’t have APIs! The only way to update a registry was a human clicking.

@DivineOps #QConSF

Kudos to Jeffrey Snover, who fought for PowerShell automation for Windows administration.

Google explicitly built Borg to be automatable.

Then people got the message: infrastructure as code.
Puppet, Chef, etc.

@DivineOps #QConSF

As an industry, we worked really hard to make the tools to make this automation happen.
THAT is what makes SRE possible.

Consistent APIs and reliable monitoring are prerequisites to automation.

@DivineOps #QConSF

Second thing we got wrong:

Toil is bad, it’s useless, eliminate it.

Are we striving for a human-less system?

@DivineOps #QConSF

Humanless systems don’t maintain themselves. No matter how automated, how well structured — they don’t maintain their structure.

@DivineOps #QConSF

People do troubleshooting, responding, adapting, noticing
to keep the system functioning: it looks a lot like toil.
Are people rewarded for that?

Some SREs do more automation; others do more on-call work, keeping the system running and learning from it.

@DivineOps #QConSF

Cloud provides an industry standard for consistent infrastructure-level APIs
❤️☁️❤️

Also Kubernetes 🤩

@DivineOps #QConSF

Align your toil where your business value is.

Below that, call a PaaS API.

@DivineOps
