[Dnsmasq-discuss] Corrupted query causing FORMERR?
Simon Kelley
simon at thekelleys.org.uk
Sun Aug 20 22:11:23 UTC 2023
On 17/08/2023 18:08, John Horne wrote:
> Hello,
>
> We have for some time had reports of intermittent DNS query failures. For the
> servers concerned, a client on the server causes a query to be sent (via
> resolv.conf) to 127.0.0.1 which is the dnsmasq process. If the query is not in
> the cache, then it is forwarded to a DNS resolver server running Unbound.
>
> I have been running a short script which runs 'dig' every 10 seconds on a name.
>
> The Unbound server shows entries such as:
>
> ============
> Aug 16 21:08:48 unbound[1837198:1] query: 10.121.16.84
> sauopprdwebsite1.blob.core.windows.net. A IN
> Aug 16 21:08:48 unbound[1837198:1] reply: 10.121.16.84 - - - FORMERR - - -
> ============
>
> The script/dig output shows
>
> ============
> ; <<>> DiG 9.11.36-RedHat-9.11.36-8.el8_8.1 <<>>
> sauopprdwebsite1.blob.core.windows.net
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 59752
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 1232
> ;; QUESTION SECTION:
> ;sauopprdwebsite1.blob.core.windows.net. IN A
>
> ;; Query time: 1 msec
> ;; SERVER: 127.0.0.1#53(127.0.0.1)
> ;; WHEN: Wed Aug 16 21:08:48 BST 2023
> ;; MSG SIZE rcvd: 67
> ============
>
>
> Running tcpdump (at a different time) showed that the query seemed to become
> corrupted. It was still seen as a DNS query, but as can be seen it contains
> part of a CNAME at the end, which in itself seems to be part of a reply to the
> query. That is, the CNAME 'blob.dub08' is actually part of a CNAME that is part
> of the reply for the DNS name 'sauopprdwebsite1.blob.core.windows.net'.
> Very odd!
>
> The tcpdump output was:
>
> ===========
> 17:30:19.344158 IP (tos 0x0, ttl 64, id 53147, offset 0, flags [DF], proto UDP
> (17), length 107)
> 10.121.16.84.38190 > 10.120.16.9.domain: [bad udp cksum 0x35b6 -> 0x96f9!]
> 32966+ [1au] A? sauopprdwebsite1.blob.core.windows.net. ar:
> sauopprdwebsite1.blob.core.windows.net. CNAME[|domain]
> 0x0000: 4500 006b cf9b 4000 4011 3599 0a79 1054 E..k..@.@.5..y.T
> 0x0010: 0a78 1009 952e 0035 0057 35b6 80c6 0120 .x.....5.W5.....
> 0x0020: 0001 0000 0000 0001 1073 6175 6f70 7072 .........sauoppr
> 0x0030: 6477 6562 7369 7465 3104 626c 6f62 0463 dwebsite1.blob.c
> 0x0040: 6f72 6507 7769 6e64 6f77 7303 6e65 7400 ore.windows.net.
> 0x0050: 0001 0001 c00c 0005 0001 0000 0012 002d ...............-
> 0x0060: 0462 6c6f 620f 6475 6230 38 .blob.dub08
>
> 17:30:29.357617 IP (tos 0x0, ttl 64, id 54580, offset 0, flags [DF], proto UDP
> (17), length 107)
> 10.121.16.84.37068 > 10.120.16.9.domain: [bad udp cksum 0x35b6 -> 0xfd34!]
> 62988+ [1au] A? sauopprdwebsite1.blob.core.windows.net. ar: . OPT UDPsize=4096
> (79)
> 0x0000: 4500 006b d534 4000 4011 3000 0a79 1054 E..k.4@.@.0..y.T
> 0x0010: 0a78 1009 90cc 0035 0057 35b6 f60c 0120 .x.....5.W5.....
> 0x0020: 0001 0000 0000 0001 1073 6175 6f70 7072 .........sauoppr
> 0x0030: 6477 6562 7369 7465 3104 626c 6f62 0463 dwebsite1.blob.c
> 0x0040: 6f72 6507 7769 6e64 6f77 7303 6e65 7400 ore.windows.net.
> 0x0050: 0001 0001 0000 2910 0000 0000 0000 0c00 ......).........
> 0x0060: 0a00 0820 2475 1b72 25a1 0e ....$u.r%..
> ===========
>
> The second tcpdump query output above correctly shows the 'OPT' record, and
> resolves the query with no problems.
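
The two hex dumps above can be checked by hand. What follows is a minimal Python sketch, not part of the original report, that strips the IP and UDP headers (20 and 8 bytes in these dumps) and decodes the single additional-section record of each packet. The helper names (unhex, dns_payload, skip_name, additional_record) are made up for illustration, and the parser assumes, as holds for both sample packets, that the answer and authority sections are empty.

```python
import struct

def unhex(text):
    """Turn a whitespace-separated hex dump into bytes."""
    return bytes.fromhex("".join(text.split()))

# Full packets (IP + UDP + DNS) copied from the two tcpdump hex dumps.
CORRUPT = unhex("""
    4500 006b cf9b 4000 4011 3599 0a79 1054
    0a78 1009 952e 0035 0057 35b6 80c6 0120
    0001 0000 0000 0001 1073 6175 6f70 7072
    6477 6562 7369 7465 3104 626c 6f62 0463
    6f72 6507 7769 6e64 6f77 7303 6e65 7400
    0001 0001 c00c 0005 0001 0000 0012 002d
    0462 6c6f 620f 6475 6230 38
""")
CLEAN = unhex("""
    4500 006b d534 4000 4011 3000 0a79 1054
    0a78 1009 90cc 0035 0057 35b6 f60c 0120
    0001 0000 0000 0001 1073 6175 6f70 7072
    6477 6562 7369 7465 3104 626c 6f62 0463
    6f72 6507 7769 6e64 6f77 7303 6e65 7400
    0001 0001 0000 2910 0000 0000 0000 0c00
    0a00 0820 2475 1b72 25a1 0e
""")

def dns_payload(packet):
    """Drop the IPv4 header (IHL * 4 bytes) and the 8-byte UDP header."""
    ip_header_len = (packet[0] & 0x0F) * 4
    return packet[ip_header_len + 8:]

def skip_name(msg, pos):
    """Return the offset just past the domain name that starts at pos."""
    while True:
        length = msg[pos]
        if length == 0:                 # root label ends the name
            return pos + 1
        if length & 0xC0 == 0xC0:       # 2-byte compression pointer ends the name
            return pos + 2
        pos += 1 + length               # ordinary label

def additional_record(msg):
    """Decode the fixed fields of the first record after the question section."""
    ident, _flags, qdcount, _an, _ns, arcount = struct.unpack_from("!6H", msg, 0)
    pos = 12
    for _ in range(qdcount):
        pos = skip_name(msg, pos) + 4   # QNAME, then QTYPE and QCLASS
    # Assumption: answer and authority sections are empty, so the next
    # record is the first (and only) additional record.
    pos = skip_name(msg, pos)
    rtype, rclass, ttl, rdlength = struct.unpack_from("!HHIH", msg, pos)
    pos += 10
    return {"id": ident, "arcount": arcount, "type": rtype, "class": rclass,
            "ttl": ttl, "rdlength": rdlength, "rdata_present": len(msg) - pos}

for label, packet in (("corrupt", CORRUPT), ("clean", CLEAN)):
    print(label, additional_record(dns_payload(packet)))
```

Decoded this way, the first packet's additional record has type 5 (CNAME), TTL 18 and a declared RDLENGTH of 45, although only 11 bytes of rdata remain in the datagram. The second has type 41 (OPT), a class field of 4096 (the advertised UDP size) and an RDLENGTH of 12 with 12 bytes present. Both DNS payloads are 79 bytes long and, apart from the query ID, identical up to the end of the question, so in the bad packet the bytes where the OPT record should sit hold CNAME data instead.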
>
>
> So, for some reason we are seeing corrupted DNS queries coming from dnsmasq to
> the Unbound server. Anyone any ideas as to what could be causing this or what
> could be checked?
>
> For additional info, the client servers are typically running Rocky 8 Linux in
> Azure with dnsmasq version 2.79. The Unbound server is a Rocky 9 Linux server
> in Azure running Unbound version 1.16.
> I have run the test script on Azure servers, a local VMware server and a local
> physical server. The Azure servers show many FORMERR failures, the VMware has
> only shown a few, and so far we have had none from the physical server.
>
> Things tried include: disabling the 'edns0' option in the /etc/resolv.conf
> file; setting the max UDP packet size to 1232; setting the max UDP packet size
> to 512; using dnsmasq version 2.89. These all failed in that we still received
> FORMERR replies.
> The only option so far that has worked is to disable the use of dnsmasq via
> 127.0.0.1, and let each server send queries direct to the Unbound server. This
> has caused no FORMERR replies.
>
>
To recap:

Dnsmasq is somehow mangling a query before sending it to the upstream
server.
This behaviour is not consistent: it doesn't always happen.
I'm not clear whether this is just for the particular query in the example,
or whether it happens for others.
It seems to be associated with virtual servers.

What would be useful is to get packet dumps of the relevant packets: the
query going into dnsmasq, the query going upstream from dnsmasq to
unbound, the reply from unbound and the reply from dnsmasq to the
original requestor. Possibly the easiest way to do that would be to use the
packet-dump facility built in to dnsmasq

  --dumpfile=<path/to/file>
  --dumpmask=0x000F

and send me the resulting file.
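
For reference, the mask 0x000F selects exactly the four packet kinds listed above. A small Python sketch of the bit layout follows. The bit meanings are those given for --dumpmask in the dnsmasq man page (worth checking against the man page for the version in use), and the names DUMP_BITS and describe_mask are made up for illustration.

```python
# Low four --dumpmask bits as described in the dnsmasq man page.
# Higher bits (DNSSEC-related dumps) are deliberately left out of this sketch.
DUMP_BITS = {
    0x0001: "DNS queries from clients",
    0x0002: "DNS replies to clients",
    0x0004: "DNS queries to upstream",
    0x0008: "DNS replies from upstream",
}

def describe_mask(mask):
    """List the packet kinds selected by the given mask, lowest bit first."""
    return [name for bit, name in sorted(DUMP_BITS.items()) if mask & bit]

for line in describe_mask(0x000F):
    print(line)
```

A narrower mask such as 0x0004 would capture only the queries dnsmasq sends upstream, which is where the mangled packet was seen, but the full set makes it possible to compare each upstream query with the client query that triggered it.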
Cheers,
Simon.
>
> Thanks,
>
> John.
>
> --
> John Horne | Senior Operations Analyst | Technology and Information Services
> University of Plymouth | Drake Circus | Plymouth | Devon | PL4 8AA | UK