Andy Ellis
25 Apr, 11 tweets, 5 min read
The first time I crossed paths with @dakami, let’s just say I wasn’t pleased. But it’s a little bit of a long story, and we did share a laugh at the end of it, so bear with me. #InMemoryOfDakami
The story starts in 2007, maybe. One of my biggest worries at Akamai was hardening our DNS infrastructure. If you take out a CDN’s top level DNS, that’s … pretty much it. 2/ #InMemoryOfDakami
You’re limited in how many top level name servers you can use for a given domain (by convention, 8 IPs, although really, 13). That gives you a weak spot that an attacker can go after, and I really wanted to solve it. 3/ #InMemoryOfDakami
We already had interesting ways to detect which resolvers were used to send HTTP traffic to edge servers, and, together with some brilliant architects in Mapping, we stumbled on a solution for protecting top level name servers. 4/ #InMemoryOfDakami
Imagine if every resolver on the planet, instead of using the same 8 top level name servers, used a unique set of 8? Especially if those 8 were close to them? It would be hard to take out the whole CDN, and if 8 went down, we’d know the targeting resolver. 5/ #InMemoryOfDakami
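As an illustration only (the thread does not say how the per-resolver sets were chosen), here is a minimal Python sketch of one way to hand each resolver its own stable set of 8 name servers from a larger pool. It uses rendezvous hashing, which is a swapped-in technique and not something the thread describes. The pool, the host names, and the function `pick_name_servers` are all hypothetical, and the sketch ignores the "close to them" part entirely.

```python
import hashlib


def pick_name_servers(resolver_ip: str, pool: list[str], k: int = 8) -> list[str]:
    """Return a deterministic, resolver-specific subset of k servers.

    Rendezvous (highest-random-weight) hashing: score every server in the
    pool against the resolver's address and keep the k highest scores.
    """
    def score(server: str) -> int:
        digest = hashlib.sha256(f"{resolver_ip}|{server}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(pool, key=score, reverse=True)[:k]


# Hypothetical pool of 200 name servers; the names are placeholders.
pool = [f"ns{i}.example.net" for i in range(200)]

set_a = pick_name_servers("192.0.2.53", pool)    # one resolver
set_b = pick_name_servers("198.51.100.7", pool)  # a different resolver
```

With a scheme like this, the same resolver always gets the same 8, so if exactly that set comes under attack, the set itself points back at the resolver it was issued to.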
The TLDs didn’t give us a way to glue our name servers differently based on resolver, but we stumbled on a clever hack in DNS responses we could use.

The Additional section. #InMemoryOfDakami
DNS allowed you to send extra information in answers, as long as you stayed in your domain. We used two stage DNS - a first query to our top level NS, which sent you to a low level NS with a short TTL, and that low level NS handled a second query to get you to a web server. 6?/ #InMemoryOfDakami
We could use the additional section to give you a new answer for where to find our top level name servers.

It would be unique to you, and continually refreshed, so a resolver would never go back to the TLDs!
We were so excited. #InMemoryOfDakami
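To make the idea concrete, here is a rough Python sketch of what such a response might have looked like. It is a toy model, not the system described in the thread and not a real DNS library: the `Record` and `Response` types, the zone `cdn.example.`, the host names, the addresses, and the TTL values are all assumptions made up for illustration. It shows a referral to a low level name server with a short TTL, plus extra records in the additional section that re-advertise a top level name server set, with everything kept inside the zone.

```python
from dataclasses import dataclass, field

ZONE = "cdn.example."  # hypothetical CDN zone
SHORT_TTL = 30         # assumed TTL for the low level referral
LONG_TTL = 3600        # assumed TTL for the refreshed top level set


@dataclass
class Record:
    name: str
    rtype: str
    ttl: int
    data: str


@dataclass
class Response:
    authority: list = field(default_factory=list)
    additional: list = field(default_factory=list)


def in_bailiwick(name: str, zone: str) -> bool:
    """True if name is the zone itself or a name underneath it."""
    name = name.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    return name == zone or name.endswith("." + zone)


def build_response(low_level: dict, top_level: dict) -> Response:
    """Both arguments map a name server host name to an IPv4 address."""
    resp = Response()
    # Stage one: refer the resolver to a low level name server, short TTL.
    for host, ip in low_level.items():
        resp.authority.append(Record("g." + ZONE, "NS", SHORT_TTL, host))
        resp.additional.append(Record(host, "A", SHORT_TTL, ip))
    # The hack: also hand back a resolver-specific top level NS set.
    for host, ip in top_level.items():
        resp.additional.append(Record(ZONE, "NS", LONG_TTL, host))
        resp.additional.append(Record(host, "A", LONG_TTL, ip))
    # Stay in your own domain: drop any record whose owner name is outside it.
    resp.additional = [r for r in resp.additional if in_bailiwick(r.name, ZONE)]
    return resp


resp = build_response(
    low_level={"n1.g.cdn.example.": "192.0.2.10"},
    top_level={"a1.cdn.example.": "198.51.100.1"},
)
```

Each time a resolver asked, the top level records would be handed out again with a fresh TTL, which is what would keep it from ever needing to return to the TLD servers.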
Of course, by this time, we’re just starting to build the system, and plan the safety checks, when a new bug gets announced in DNS.

A bug about using the additional section to send unasked for information.
@dakami had also figured out how to exploit it.
#InMemoryOfDakami
Everyone raced to patch The Kaminsky Bug.
And we were left with a lot of wasted design effort. Let’s just say I uttered some choice words in @dakami’s direction.
9/ #InMemoryOfDakami
I ran into him a few years later, and told him the story. I told him I wished we’d gotten our system out, and how annoyed I was.

He told me it was a clever design, and bought me a drink. FIN/ #InMemoryOfDakami
