Jan Schaumann · Jul 18
Pop quiz: what is the maximum size of a DNS response?
Everybody Knows(tm) that your DNS response MUST fit into 512 bytes, because that's the size of a UDP packet. Right?

Let's pretend that's true. How many A records can you put into a round-robin?

Here's a name that will return a bunch of A records and still fits into 512 bytes:
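The query itself is just a plain dig(1) lookup; the exact invocation below is my reconstruction, with the name being the test record dissected further down:

$ dig 512.size.dns.netmeister.org A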
This returned 28 A records. So far, so good. But 28 IPv4 addresses only amount to 28 * 4 bytes = 112 bytes. Shouldn't we have been able to squeeze in a whole bunch more IPv4 addresses?

Let's take a look at what the packets actually look like, using tcpdump(1):
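A minimal capture invocation along these lines will do; the flags here are a sketch, not necessarily the exact ones used (-n skips name resolution, -v prints the IP/UDP length details):

$ sudo tcpdump -n -v udp port 53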
We can easily see that everything fits into a single UDP packet, which gives us the above-mentioned 28 A records stuffed into 504 bytes.

Wait, 504 bytes? What happened to the 28 * 4 = 112 bytes calculation? Let's dig deeper, this time using Wireshark:
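If you prefer to stay on the command line, tshark(1) gives the same per-record breakdown; a sketch, assuming the traffic was first saved to a capture file (dns.pcap is just a placeholder name):

$ tshark -r dns.pcap -V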
Ok, so there's a bit more overhead in the DNS packet: every answer record carries at least 12 additional bytes.

Compare to the wire format from RFC1035, DNS message (identical for query and response) on the left and the RR format on the right:
That is, every complete DNS response has:

- 12 bytes DNS header
- a few bytes for the query
- for every A record:
  - 2 bytes name
  - 2 bytes type
  - 2 bytes class
  - 4 bytes ttl
  - 2 bytes rdlength
  - 4 bytes IPv4 address
- possibly additional bytes for additional records
The query for 512.size.dns.netmeister.org comes out to strlen("512.size.dns.netmeister.org") + 1 byte for the terminating null label + 1 byte for the leading label length + 2 bytes type + 2 bytes class = 33 bytes.

Add another 11 bytes for the additional records section and we're at 12 + 33 + (28 * 16) + 11 = 504 bytes.
So that's why we got 28 A records: one more A record would add 16 more bytes, yielding 520 bytes and pushing us over the 512-byte limit.
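We can sanity-check that arithmetic right in the shell:

$ echo $(( 12 + 33 + 28 * 16 + 11 ))
504
$ echo $(( 12 + 33 + 29 * 16 + 11 ))
520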

But what if we do add more A records? Let's try to use a record that'd be just under 1024 bytes:
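Assuming the same naming pattern as the record above (the "1024" prefix is a guess on my part), that'd be something like this, with dig's footer reporting the message size:

$ dig 1024.size.dns.netmeister.org A
[...]
;; MSG SIZE  rcvd: 1017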
Okay, so with 60 A records (1017 bytes) returned here, clearly we can have a response > 512 bytes and still get it back in a single UDP packet.

But doesn't that contradict just about every standard answer on this here internet? How did we do that?
Enter EDNS(0) (RFC6891).

512 bytes is really not much, and adding e.g., DNSSEC to a response may increase the DNS result size notably, so it's useful for clients to be able to tell a DNS resolver that it can actually accept more than 512 octets.

en.wikipedia.org/wiki/Extension…
In your query, this is done via a pseudo-RR of type "OPT".

We can set this explicitly via the "+bufsize=4096" option to dig(1); the result looks like this:
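A sketch of such a query against the same test name; the OPT pseudosection shown is standard dig(1) output:

$ dig +bufsize=4096 512.size.dns.netmeister.org A
[...]
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096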
We should now be able to get much larger results, right?

Let's query for a response that fits into 2048 bytes, clearly well below the 4096-byte UDP payload size we advertised. Since we expect this to fit into a single UDP packet, we can pass "+ignore" to dig(1).

Alas:
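Something along these lines, again assuming the same naming scheme for the test record:

$ dig +bufsize=4096 +ignore 2048.size.dns.netmeister.org A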
We are not getting any results, and instead notice that our response has the "tc" flag set, indicating it was truncated.

If we then repeat the same command without "+ignore", we see dig(1) dutifully retry the query over TCP when it encounters the truncated (TC) bit:
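That retry is visible right in dig(1)'s output; a sketch:

$ dig +bufsize=4096 2048.size.dns.netmeister.org A
;; Truncated, retrying in TCP mode.
[...]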
But why do we get a truncated response when we had asked for 4096 bytes payload size via EDNS(0)?

Looking at the Additional RR, it seems the server had _its_ UDP payload buffer size set to 1232 bytes. But what kind of whacky number is that?
Well, turns out it's not quite as arbitrary as it may seem.

First, let's note that we want to avoid packet fragmentation, because that enables certain DNS cache poisoning vectors:

web.archive.org/web/2019122507…
So we want to get everything into the Maximum Transmission Unit (MTU), which is most commonly Ethernet's 1500 bytes.

But IPv6 only mandates a minimum MTU of 1280 bytes, so let's be conservative: subtract the 40-byte IPv6 header and the 8-byte UDP header => 1232 bytes
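Spelled out:

$ echo $(( 1280 - 40 - 8 ))
1232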
Btw, here are some measurements and observations from DNS-OARC in 2020:

indico.dns-oarc.net/event/36/contr…

And so the DNS community has basically agreed on a 1232-byte EDNS(0) buffer size since at least DNS Flag Day 2020:
dnsflagday.net/2020/
But suppose you _really_ want to bump up that size? In BIND:

options {
    edns-udp-size 4096;
    max-udp-size 4096;
};

...and then we can indeed stuff the entire response into a single UDP packet:
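It's also worth keeping an eye on the capture for IPv4 fragments here; a filter sketch (the second clause is needed because trailing fragments carry no UDP header):

$ sudo tcpdump -n 'udp port 53 or (ip[6:2] & 0x3fff) != 0'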
But now we see that our packet _did_ get fragmented, leading to possible spoofing attacks - no bueno!

blog.powerdns.com/2018/09/10/spo…

blog.apnic.net/2019/07/12/its…
Ok, so let's stick with 1232 bytes. That's our UDP limit, then.

What if we retry using TCP? How many records can we return?

We saw that the previous name returned 123 A records. Let's try incrementing the number of A records we return and see how many bytes we can transmit:
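One rough way to probe this (the names below are purely illustrative placeholders following the pattern assumed earlier; +tcp forces TCP from the start, and dig's footer reports the received message size):

$ for n in 1024 2048 4096; do dig +tcp ${n}.size.dns.netmeister.org A | grep 'MSG SIZE'; done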
Ok, so with all the overhead per record as well as for the response and additional records, we can at least get 2048 A records returned in just over 32K bytes.

But why can't we do 4096 A records? Let's see what happens in the packet capture:
The first response via UDP is, no surprise, truncated, so we retry via TCP. But now the DNS result delivered via TCP is also truncated!

That is, the DNS server has determined that the result will not fit into the maximum response size. Why is that?
Our payload is 4096 * 16 = 65536 bytes of record data, which, given the per-record 2-byte RDLENGTH field, you might expect to still fit into a DNS response.

But we also need to again account for the overhead noted above:

(4096 * 16) + 12 header + 36 query + 11 additional = 65595 bytes in total.
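Or, spelled out once more:

$ echo $(( 4096 * 16 + 12 + 36 + 11 ))
65595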
But the maximum size of an *IP* packet is constrained by a 16-bit length field (the IPv4 total length / IPv6 payload length).

So our limiting factor is now that *everything*, including the DNS payload and the IP headers etc., has to come in under 65536 bytes.
So the A records we can stuff into a response, plus all that overhead, must total less than 65536 bytes. How many records is that?

With 4092 A records + overhead, this gives us a DNS response in 65530 bytes, with a cushy 6 bytes left to spare.
But can we max out the full 64K theoretical size? Maybe using a TXT record? Let's try.

RFC1035 limits the length of a single character-string in a TXT record to 255 octets, but a single TXT record can actually carry multiple strings.

That is different from having multiple RRs of the same type. Compare:
That is, we can create larger TXT records by using multiple strings.

How large? Let's see:
Doubling the payload at every step, things make sense up until 32640.

Doubling again would yield 65280 bytes payload => too much.

But we can trim the last string and create a TXT record delivering 65211 bytes of payload.
But what is the size limitation on _multiple_ TXT records?

If each TXT record is 255 bytes in length, and we account for the 13 bytes of overhead per TXT record, then we can have 244 TXT records with 255 characters each, plus 1 TXT record with 72 characters:
And how many TXT records could we have if we put only a few bytes into each?

Each TXT record must be unique, so for simplicity's sake I picked four bytes per TXT record, which then leads me to 3851 records:
Now in all of the commands here, I ran dig(1) against the authoritative server in question. You might see different results with different resolvers.

Getting different results depending on some aspects of the network or the resolver in question - isn't debugging the DNS fun?
Suppose a resolver uses DNSSEC - in that case, the additional RRSIG record will blow up my carefully crafted response, and it looks like Google's resolver (which does query for DNSSEC) will then return SERVFAIL to the client:
Lol, Cloudflare, on the other hand, seems to have decided that it doesn't like my silly RRs at all and uses RFC8914 Extended Error code 15 - "blocked":
And of course you may also observe enterprise software attempting to mangle your DNS lookups, making flawed assumptions about DNS over TCP, monkeying around with EDNS(0), and otherwise interfering in uneducated ways.

Yay middle-boxes and firewalls, too! Good times, good times.
Anyway, all that just to say that the correct answer to the original question is, of course: "it depends". :-)

As always, the DNS is more fun than a barrel of monkeys. Although that depends on your definition of "fun". And your monkeys.
Peace out - and remember, when in doubt: pcap or it didn't happen!

This thread about DNS response sizes as a blog post:
netmeister.org/blog/dns-size.…
