Discover and read the best of Twitter Threads about #hugops

Most recent (12)

It's been nearly a week since ~400 companies lost the ability to use any @Atlassian products like JIRA and Confluence. I've talked to several impacted teams and they are upset at how poorly Atlassian is handling the biggest outage these teams have experienced.

A thread on what Atlassian needs to fix and why:

Atlassian system status: all red.
1. Outages happen, no matter how you try to avoid them. No one should be upset about this incident, nor search for who is to blame (apparently, a maintenance script)

What matters is what happens AFTER the incident is discovered.

2. Initially, Atlassian did just fine in notifying about something being wrong. They posted updates after the incident started.
Read 21 tweets
ELI5: Today's Facebook outage (based on no inside info). The Internet is designed in layers, like a house built on a foundation. No coordination between layers - but there's dependency. 1st meaningful part is "layer 3" - the Internet Protocol (IP) network layer. [thread] /1
The DNS generally (not always) depends on what is known as the User Datagram Protocol (UDP) for transport - which is layer 4 on top of IP at layer 3. Then the DNS protocol itself happens at layer 7 on top of IP addressing/routing and UDP transport. /2
Specifically most DNS runs on UDP port 53 (aka UDP/53). There are different "port" numbers - unique numbers - assigned for specific uses. This makes operations/troubleshooting/interop easier. Find 'em all at iana.org/assignments/se… /3
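To make the layering concrete, here is a minimal sketch of what a DNS question looks like before it is handed to UDP/53. It is not from the thread: the helper names (`encode_qname`, `build_query`, `send_query`) and the fixed transaction ID are assumptions for illustration, and it only builds a basic A-record query with recursion requested.

```python
import socket
import struct

DNS_PORT = 53  # well-known port for DNS, as mentioned in the thread


def encode_qname(name: str) -> bytes:
    """Encode 'example.com' as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, txid: int = 0x1234) -> bytes:
    """Build the layer-7 payload: a 12-byte header plus one question."""
    # Header fields: ID, flags (0x0100 = recursion desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # Question: QNAME, QTYPE=1 (A record), QCLASS=1 (IN)
    question = encode_qname(name) + struct.pack("!HH", 1, 1)
    return header + question


def send_query(name: str, server: str, timeout: float = 2.0) -> bytes:
    """Hand the payload to UDP (layer 4); the OS routes it over IP (layer 3)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, DNS_PORT))
        data, _ = sock.recvfrom(512)  # classic DNS-over-UDP message size limit
        return data
```

Calling `send_query("example.com", server)` with the address of a resolver you are allowed to use would return the raw response bytes; if the IP layer underneath cannot route to that resolver, the call simply times out, which is the dependency the thread describes.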
Read 26 tweets
It's geometry homework! It's a SWOT analysis! No, it's the 2021 @Gartner_inc Magic Quadrant® for Cloud Infrastructure and Platform Services! Let's make fun of it.
It goes alphabetically and thus begins with @alibaba_cloud. "Strong in China," "actually makes money" and whatever the hell this is count as their strengths.
Cautions for @alibaba_cloud are "weak in not-China," "local competitors and also geopolitics," and "lack of transparency" that includes both discounting and technical details of service implementations.
Read 20 tweets
Ooh the Internet just melted.
#hugops to Fastly, who has a clear gift for disaster understatement.
“Flight slightly delayed.”
Read 4 tweets
For every five (5) retweets this tweet gets, I will add an Uncomfortable Cloud Industry Truth to this thread. Let's begin...
"We're fully redundant across AZs, regions, and even cloud providers!" crows the engineer with a single corporate credit card backing the entire house of cards.
Sudoku is a fun logic puzzle that stretches your mental faculties, but it doesn't solve any real issues; nobody NEEDS the numbers in the grid correctly. This is of course an allegory for Kubernetes.
Read 110 tweets
There’s a lot of buzz right now about a “massive DDoS attack” targeting the US, complete with scary-looking graphs (see Tweet below). While it makes for a good headline in these already dramatic times, it’s not accurate. The reality is far more boring. 1/X
It starts with T-Mobile. They were making some changes to their network configurations today. Unfortunately, it went badly. The result has been a series of cascading failures for their users over roughly the last 6 hours, impacting both their voice and data networks. 2/X
That caused a lot of T-Mobile users to complain on Twitter and other forums that they weren’t able to reach popular services. Then services like Down Detector scrape Twitter and report those services as being offline. 3/X
Read 8 tweets
I’m not sure where #hugops started; the general theme is that nobody enjoys downtime.
It doesn’t matter if it’s your fiercest competitor; the folks responsible for keeping the site up (read as: the ops folks) know that today it’s your downtime, but tomorrow it’s theirs.
Earlier today Amazon.com took an outage. Folks from Target, Walmart, and basically every other company in the world weren’t dunking on them. This stuff is hard.
Read 5 tweets
Every technology professional who pays attention to politics is apoplectic about what happened in Iowa last night. My take(s), in thread form...
1. This was an unnecessary use of technology. While I have a self-interested motive to see more tech in more places (future job security), you don't need an app for everything. Especially when you have an effective and resilient process in place.
In the language of Agile, there was no BUSINESS VALUE in replacing humans making phone calls to report results with an app. If they were 10x'ing the caucus locations, you could make the case, but that was never happening. The old system worked fine.
Read 15 tweets
And now, @KoltonAndrus takes the #ChaosConf stage to talk about how the venerable "I meant to do that" excuse became an entire respected industry.
Talking about empathy for outages. YES. @blamelesshq is in the audience; I love the ethos. #ChaosConf
"The practice of intentionally breaking things started at Netflix, a streaming movie company. Obviously this maps directly to other fields like banking and autonomous vehicles." #ChaosConf
Read 7 tweets
All settled in at the #oow19 keynote, to be delivered by @FakeOracleLarry. Keynotes suspend my "don't be a snarky ass when you're an invited guest" rule. Oracle PR is already aware and drinking to forget in advance.

Join me in this thread.
"Wait, why is he starting the keynote snark livetweet thread with a disclaimer?"

Because #oow19 is an Oracle property and I'm nothing if not on-brand.
Holy crap the #oow19 keynote starts with an actual freaking disclaimer. They're reading this whole thing aloud!

I'm not kidding.
Read 51 tweets
A quick reminder and a thread regarding the @MarriottBonvoy power outage at #VelocityConf:

1) Just because Marriott isn’t a traditional tech company doesn’t mean they don’t deserve #hugops. This has to be just as rough for the employees as the guests

1/n
2) #VelocityConf attendees: think of this as an awesome opportunity to think through how you would approach a post-incident review for the event. Who would you talk to? What questions would you ask? How would you maintain a blameless culture?

My take continues...

2/n
Who would you talk to regarding the power outage?

There are a few main stakeholding entities involved (that I’ve been able to identify so far):

- Marriott employees and staff
- Guests of the Marriott
- Local power company
- #VelocityConf staff

3/n
Read 7 tweets
I was summoned in a thread about the 2017 S3 failure. You guessed it--it's threadin' time! (1/13)
First, Karan's great. I'm not trying to say he's being obnoxious, insulting, etc. He isn't. But he is wrong. The internet didn't turn into me. The internet turned into a bunch of spoiled childish shitheads. (2/13)
I saw a lot of crappy behavior when S3 went down. People implying that AWS was incompetent. People berating the @awssupport and @awscloud accounts on social media. (3/13)
Read 13 tweets
