Henry Robinson
Infrastructure @SlackHQ. Distributed systems engineer. I have of late - but wherefore I know not - lost all my data.
Sep 17, 2020 8 tweets 3 min read
A few things I think were done right in that Snowflake paper in 2016 (reporting on 4-year-old work!):

1. They got the ergonomics of the cloud early (for the space). Elasticity is a core feature. Storage is no longer a 'you' problem. Cost is compute-as-you-go.

2. They had ACID out of the box. This was a huge step forward over the state-of-the-art MPP analytical engines. It helps to have some of the best database engineers in the world on your founding team.
Apr 22, 2019 4 tweets 1 min read
Buried in the "deploy on Friday!!" argument is an interesting generalisation of 'crash-only' software; the idea that you should design features to fail *fast* so that you can detect and react close to deploy time.

I don't think this is actually possible in the general case. Infrastructure software can go wrong sloooowwwwly, in ways that are triggered only by prod traffic. Without any data, I suggest that for this class of bugs maybe 80% of them could be caught within three days, whereas only 30% of them might show up in the first 8-10 hours.
Feb 9, 2019 8 tweets 3 min read
This turned into a great discussion. I was looking for resources I could give to someone designing their first few systems; advice and suggestions about how to structure their thoughts, critique their designs, and iterate.

Some of the highlights below.

Several people mentioned the, new to me, discipline of Non-Abstract Large System Design as practiced at Google.

Is it possible?
Can we do better?

Then

Is it feasible?
Is it resilient?

landing.google.com/sre/workbook/c…

Dec 11, 2017 10 tweets 2 min read
Ok, perhaps let's pump the brakes for a second. Systems and database engineering isn't dead; this paper didn't just replace Postgres with a GPU and a copy of the deep learning book. (1/n)

arxiv-vanity.com/papers/1712.01…

I mean, what's been done here is fascinating (partly in its simplicity). Replacing B-Trees - which are functions from keys to ranges on disks - with a learned function is kind of cool.
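
To make "a learned function from keys to positions" concrete, here is a toy sketch. It is not the paper's model: a single least-squares line stands in for whatever learned function you like, and the `LearnedIndex` class and its method names are made up for illustration. The model guesses where a key sits in a sorted array, and the worst error seen while fitting bounds how far a lookup has to search around that guess.

```python
import bisect


class LearnedIndex:
    """Toy learned index: a linear model predicts a key's position in a sorted list.

    Hypothetical illustration only; assumes numeric keys held in memory.
    """

    def __init__(self, keys):
        if not keys:
            raise ValueError("need at least one key")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the stored keys; lookups use it
        # to bound the range they search.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return round(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return an index of key in the sorted list, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the search window [lo, hi) to the array bounds.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None
```

The B-Tree analogy is that both structures narrow a key down to a small range that then gets searched; here the range comes from a model plus an error bound rather than from walking tree nodes. How tight that bound is depends entirely on how well the model fits the key distribution.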
Oct 20, 2017 7 tweets 2 min read
Great questions! Let me see if I can answer without making a hash of things. See thread below.

1. For CAP, we're considering only one object. In that case, serializability is a weaker property than linearizability (Not 'real-time').
Feb 14, 2017 10 tweets 2 min read
0. Just a reminder for anyone seeing claims that Cloud Spanner beats the CAP theorem:

1. CAP has always said only one thing: that there is always a particular network failure that forces you to give up either C or A.