Andy Ellis

25 Apr, 11 tweets, 5 min read

@dakami

The first time I crossed paths with @dakami, let’s just say I wasn’t pleased. But it’s a little bit of a long story, and we did share a laugh at the end of it, so bear with me. #InMemoryOfDakami

The story starts in 2007, maybe. One of my biggest worries at Akamai was hardening our DNS infrastructure. If you take out a CDN’s top level DNS, that’s … pretty much it. 2/ #InMemoryOfDakami

You’re limited in how many top level name servers you can use for a given domain (by convention, 8 IPs, although really, 13). That gives you a weak spot that an attacker can go after, and I really wanted to solve it. 3/ #InMemoryOfDakami

We already had interesting ways to detect which resolvers were used to send HTTP traffic to edge servers, and, together with some brilliant architects in Mapping, we stumbled on a solution for protecting top level name servers. 4/ #InMemoryOfDakami

Imagine if every resolver on the planet, instead of using the same 8 top level name servers, used a unique set of 8? Especially if those 8 were close to them? It would be hard to take out the whole CDN, and if 8 went down, we’d know the targeting resolver. 5/ #InMemoryOfDakami
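
To make the idea concrete, here is a minimal sketch of one way to give each resolver its own stable set of 8 name servers, and to work backwards from a set that went down to the resolvers that were handed it. This is not Akamai's actual design: the rendezvous-hashing approach, the function names, and the placeholder addresses are all assumptions for illustration, and the sketch ignores the "close to them" requirement entirely.

```python
import hashlib

def pick_nameservers(resolver_ip, pool, count=8):
    """Rank every server by a hash of (resolver, server); keep the top `count`.

    Rendezvous hashing is an assumption here, chosen because it gives each
    resolver a stable, resolver-specific subset of the pool.
    """
    def score(server):
        digest = hashlib.sha256(f"{resolver_ip}|{server}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(pool, key=score, reverse=True)[:count]

def suspects(down, known_resolvers, pool):
    """Return resolvers whose assigned set is exactly the set that went down."""
    down = set(down)
    return [r for r in known_resolvers if set(pick_nameservers(r, pool)) == down]

# Hypothetical pool of 200 name server addresses (documentation ranges).
POOL = [f"198.51.100.{i}" for i in range(1, 201)]
RESOLVERS = ["192.0.2.10", "192.0.2.77"]

set_a = pick_nameservers("192.0.2.10", POOL)
likely_source = suspects(set_a, RESOLVERS, POOL)
```

If exactly the servers in `set_a` come under attack, `likely_source` points at the resolver (or resolvers) that had been given that set.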

The TLDs didn’t give us a way to glue our name servers differently based on resolver, but we stumbled on a clever hack in DNS responses we could use.

The Additional section. #InMemoryOfDakami

DNS allowed you to send extra information in answers, as long as you stayed in your domain. We used two stage DNS - a first query to our top level NS, which sent you to low level NS, with a short TTL, and handled a second query to get you to a web server. 6?/ #InMemoryOfDakami

We could use the additional section to give you a new answer for where to find our top level name servers.

It would be unique to you, and, continually refreshed so a resolver would never go back to the TLDs!
We were so excited. #InMemoryOfDakami
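
A rough sketch of what such a response might have looked like, under stated assumptions: the record and response classes, the `cdn.example` names, the addresses, and the TTL values are all hypothetical, and the real system may have differed. The first stage delegates to low level name servers with a short TTL, the additional section carries fresh addresses for the top level name servers, and a resolver keeps extra records only if they stay inside the domain it asked about.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    name: str
    rtype: str
    ttl: int
    data: str

@dataclass
class Response:
    authority: list = field(default_factory=list)
    additional: list = field(default_factory=list)

def in_bailiwick(name, zone):
    """True if `name` is the zone itself or a name underneath it."""
    name, zone = name.rstrip(".").lower(), zone.rstrip(".").lower()
    return name == zone or name.endswith("." + zone)

def build_referral(zone, low_level, top_level):
    """First-stage reply: delegate to low level NS, refresh top level NS addresses."""
    resp = Response()
    for host, ip in low_level.items():
        # Short TTL (30s is an assumed value) so resolvers come back often.
        resp.authority.append(Record(f"w.{zone}", "NS", 30, host))
        resp.additional.append(Record(host, "A", 30, ip))
    for host, ip in top_level.items():
        # Resolver-specific addresses for the top level name servers.
        resp.additional.append(Record(host, "A", 3600, ip))
    return resp

def accepted_additional(resp, zone):
    """What a resolver applying the in-domain rule would keep."""
    return [r for r in resp.additional if in_bailiwick(r.name, zone)]

resp = build_referral(
    "cdn.example",
    low_level={"n1.w.cdn.example": "203.0.113.5"},
    top_level={"a1.cdn.example": "198.51.100.17", "a2.cdn.example": "198.51.100.42"},
)
# An out-of-domain record, added only to show that the check drops it.
resp.additional.append(Record("ns.other.example", "A", 3600, "203.0.113.99"))
kept = accepted_additional(resp, "cdn.example")
```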

Of course, by this time, we’re just starting to build the system, and plan the safety checks, when a new bug gets announced in DNS.

A bug about using the additional section to send unasked for information.
@dakami had also figured out how to exploit it.
#InMemoryOfDakami

Everyone raced to patch The Kaminsky Bug.
And we were left with a lot of wasted design effort. Let’s just say I uttered some choice words in @dakami’s direction.
9/ #InMemoryOfDakami

I ran into him a few years later, and told him the story. I told him I wished we’d gotten our system out, and how annoyed I was.

He told me it was a clever design, and bought me a drink. FIN/ #InMemoryOfDakami


