Andrey Meshkov
CTO & Co-Founder at @AdGuard

Oct 4, 2021, 9 tweets

We just had a serious outage of @AdGuard DNS, but it was actually caused by @Facebook. What happened, and how on earth can @AdGuard depend on FB? Let me try to explain. (1/9)

Everything started with Facebook name servers going down today. AdGuard DNS connects to them in order to find out the addresses of Facebook domains. So, once they went down, AdGuard DNS was responding with an error to every request for FB domains’ addresses. (2/9)

This caused a considerable spike in the overall number of requests. What happened? Every app, every device was now repeatedly requesting FB domains as if they couldn’t live without them. (3/9)

The high number of requests was not much of a problem for us: we’re ready for higher load, so this went almost unnoticed. Everything was working well until one crucial moment when Facebook engineers decided to null-route their name servers. (4/9)

What does this mean? From that point on, requests to FB name servers did not just fail, they TIMED OUT. Now we could not respond quickly with an error and had to wait a few seconds until we were sure there would be no response. (5/9)
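To see why a timeout hurts so much more than a fast error, Little's law is a handy rule of thumb: the average number of queries held open equals the arrival rate times the time each one stays open. The sketch below applies it with made-up numbers; the query rate and both durations are illustrative assumptions, not measurements from AdGuard DNS.

```python
def in_flight_queries(arrival_rate_qps: float, seconds_per_query: float) -> float:
    """Little's law: average number of queries held open at once.

    arrival_rate_qps  -- incoming queries per second (hypothetical value)
    seconds_per_query -- how long each query stays open before it is answered
    """
    return arrival_rate_qps * seconds_per_query


if __name__ == "__main__":
    # Hypothetical share of traffic asking for the unreachable domains.
    rate = 10_000

    # Upstream answers with an error almost immediately (assumed 10 ms).
    fast_error = in_flight_queries(rate, 0.01)

    # Upstream is null-routed, so every query waits for a timeout (assumed 3 s).
    timed_out = in_flight_queries(rate, 3.0)

    print(f"fast error: about {fast_error:.0f} queries in flight")
    print(f"timeout:    about {timed_out:.0f} queries in flight")
```

With the same incoming rate, holding each query for seconds instead of milliseconds multiplies the number of open queries by the same factor, and that is the kind of growth that eats memory and connection slots.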

The worst part is that we weren’t doing any negative caching. This means that when we could not resolve a domain, we kept retrying again and again until it finally succeeded (it never did) instead of caching the negative result for at least a few seconds. (6/9)
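For readers who have not met the term, here is a minimal sketch of negative caching. It is not AdGuard DNS code: the class name, the `upstream` callable, the `ResolveError` exception and the 5-second TTL are all hypothetical choices made for illustration.

```python
import time
from typing import Callable, Dict


class ResolveError(Exception):
    """Raised when a lookup fails or times out (hypothetical error type)."""


class NegativeCachingResolver:
    """Wraps an upstream lookup and remembers recent failures for a short time."""

    def __init__(
        self,
        upstream: Callable[[str], str],
        negative_ttl: float = 5.0,  # assumed value, "a few seconds"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._upstream = upstream
        self._negative_ttl = negative_ttl
        self._clock = clock
        self._failures: Dict[str, float] = {}  # domain -> time the entry expires

    def resolve(self, domain: str) -> str:
        expires_at = self._failures.get(domain)
        if expires_at is not None:
            if self._clock() < expires_at:
                # Recent failure is still cached: answer at once, skip upstream.
                raise ResolveError(f"{domain}: cached failure")
            # Cached failure has expired, so allow a fresh attempt.
            del self._failures[domain]

        try:
            return self._upstream(domain)
        except ResolveError:
            # Remember the failure so repeated queries do not wait on upstream.
            self._failures[domain] = self._clock() + self._negative_ttl
            raise
```

With this in place, only the first query for a broken domain in each TTL window pays for the slow upstream attempt; every other query in that window gets an immediate error. A production resolver would also bound the cache size and cache successful answers, which this sketch leaves out.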

So we had an overwhelming number of incoming queries that timed out and simply exhausted the servers’ resources. This all led to one of the worst outages we have ever had with AG DNS. (7/9)

At some point we almost hit 1M queries per second (our normal load is about 250-300k). Most of the queries are encrypted (DoT/DoH/DoQ), so this is like 10x the load of regular DNS. (8/9)

It took us about an hour to figure all that out, implement a fix (negative caching it is) and deploy it to every AG DNS server. Everything works well now, but we learned a couple of very useful lessons. Thanks, FB! (9/9)
