[Dnsmasq-discuss] Incorrect response for DNAME'd records in dnsmasq 2.80+

Dominick C. Pastore dominickpastore at dcpx.org
Mon Sep 14 05:45:02 BST 2020


This caught my eye because it's similar to a bug I noticed in 2.80. See the message below (ignore its first half, about CNAMEs; that was an unrelated issue):
http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2019q4/013483.html

It sounds like that was essentially the same issue, but without DNAMEs. It turned out it had already been fixed but the fix hadn't been released yet at the time:
http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=162e5e0062ce923c494cc64282f293f0ed64fc10

That fix eventually shipped in 2.81, but it looks like it misses the cases where the NXDOMAIN reply contains a CNAME or DNAME.
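
To illustrate why the rcode matters for the search list in James's report quoted below, here is a toy model in Python. It is not glibc's actual algorithm: the stopping rule (anything other than NXDOMAIN ends the walk) is an assumption taken from the behaviour described in the report, and the helper names and the 192.0.2.10 address are made up for the example.

```python
def resolve_with_search(name, search_domains, lookup):
    """Try `name` under each search domain; return (fqdn, addresses)."""
    for domain in search_domains:
        fqdn = f"{name}.{domain}"
        rcode, addresses = lookup(fqdn)
        if rcode == "NXDOMAIN":
            continue  # name does not exist here: try the next search domain
        # Assumed rule: any other rcode ends the walk, even with no addresses.
        return fqdn, addresses
    return None, []


def make_lookup(rcode_for_local):
    """Build a fake upstream where only foo.prod.mycompany.net has an address."""
    def lookup(fqdn):
        if fqdn == "foo.prod.mycompany.net":
            return "NOERROR", ["192.0.2.10"]  # documentation address, made up
        if fqdn.endswith(".local.mycompany.net"):
            return rcode_for_local, []  # the rcode under test
        return "NXDOMAIN", []
    return lookup


SEARCH = ["local.mycompany.net", "prod.mycompany.net"]
```

With make_lookup("NXDOMAIN") the walk falls through to foo.prod.mycompany.net; with make_lookup("NOERROR") it stops at foo.local.mycompany.net with no addresses, which matches the symptom in the report.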

I've attached a patch that hopefully fixes this, but a word of warning: I've only been able to verify that it fixes the CNAME case, not the DNAME case. I don't think it breaks the intended functionality from b6f926f, but I admit I'm not familiar enough with the inner workings of Dnsmasq to verify that myself.
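
In case it helps anyone check the DNAME case, here is a minimal sketch of a probe that sends the same query several times and reports the rcode of each reply, so a first-reply/cached-reply difference is easy to see. It uses only the Python standard library; the function names are my own, and the server address and query name in the note afterwards are assumptions to be replaced with your own.

```python
import socket
import struct

QTYPE = {"A": 1, "AAAA": 28}
RCODE = {0: "NOERROR", 3: "NXDOMAIN"}


def build_query(name, qtype, txid=0x1234):
    """Encode a minimal recursive DNS query for `name`."""
    # Header: id, flags (RD=1), one question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Root label, then QTYPE and QCLASS=IN.
    return header + labels + b"\x00" + struct.pack("!HH", QTYPE[qtype], 1)


def parse_reply(packet):
    """Return (rcode name, answer count) from a DNS reply header."""
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    rcode = flags & 0x000F  # low four bits of the flags word
    return RCODE.get(rcode, str(rcode)), ancount


def probe(server, name, qtype, port=53, timeout=2.0):
    """Send one query over UDP and summarise the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, port))
        packet, _addr = sock.recvfrom(4096)
    return parse_reply(packet)
```

For example, calling probe("127.0.0.1", "foo.local.mycompany.net", qtype) for A, AAAA, A, AAAA in turn against a freshly started dnsmasq should show whether the later replies switch from NXDOMAIN to NOERROR.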

Regards,
Dominick

On Fri, Sep 11, 2020, at 7:53 PM, James Brown wrote:
> Just wanted to bump this thread since this is still kind of a show-stopper for anyone that uses DNAMEs heavily. Any thoughts on how to fix?
> 
> On Wed, Jul 29, 2020 at 12:16 PM James Brown <jbrown at easypost.com> wrote:
>> Indeed, that's the commit that did it.
>> 
>> I'm not sure why that change has any effect for DNAMEs, though (which are not being generated internally to dnsmasq)...
>> 
>> On Wed, Jul 29, 2020 at 12:07 PM Geert Stappers <stappers at stappers.nl> wrote:
>>> On Wed, Jul 29, 2020 at 11:23:17AM -0700, James Brown wrote:
>>> > I'm upgrading some test nodes in my employer's cluster from 2.78 to 2.82
>>> > and handling of DNAMEs in the new version seems different (and wrong).
>>> > 
>>> > The setup:
>>> > 
>>> > local.mycompany.net is a DNAME to local-<dcname>.mycompany.net, with
>>> > authoritative resolvers in each datacenter serving a different DNAME record
>>> > prod.mycompany.net is an unrelated domain
>>> > 
>>> > /etc/resolv.conf contains the line
>>> > 
>>> > search local.mycompany.net prod.mycompany.net
>>> > 
>>> > Imagine searching for the bare-word "foo", which is defined in
>>> > prod.mycompany.net but nowhere else.
>>> > 
>>> > Under dnsmasq 2.78, querying for the bare name "foo" using the system
>>> > resolver will correctly first attempt to query for "foo.local.mycompany.net",
>>> > get back a DNAME to foo.local-dcname.mycompany.net, then get an empty
>>> > response with the NXDOMAIN code; that will fail, and glibc will then query
>>> > "foo.prod.mycompany.net", which is the correct record.
>>> > 
>>> > Under dnsmasq 2.82, querying for the bare name "foo" using the system
>>> > resolver will correctly first attempt to query for "foo.local.mycompany.net",
>>> > get back a DNAME to foo.local-dcname.mycompany.net, then get back an empty
>>> > response with the NOERROR code. This causes the system resolver to stop
>>> > trying new search domains. This behavior seems to be dependent on caching;
>>> > the first request correctly returns NXDOMAIN but subsequent requests return
>>> > NOERROR. There's actually something more confusing to it than this; if the
>>> > first request is for A, then subsequent AAAA requests return NOERROR but
>>> > subsequent A requests return NXDOMAIN. Some kind of weird cache poisoning
>>> > between record types?
>>> > 
>>> > I bisected this in git and this behavioral change was introduced in
>>> > commit b6f926fbefcd2471699599e44f32b8d25b87b471.
>>> 
>>> $ git log b6f926fbe...b6f926fbe^1
>>> commit b6f926fbefcd2471699599e44f32b8d25b87b471
>>> Author: Simon Kelley <simon at thekelleys.org.uk>
>>> Date:   Tue Aug 21 17:46:52 2018 +0100
>>> 
>>>     Don't return NXDOMAIN to empty non-terminals.
>>> 
>>>     When a record is defined locally, eg an A record for one.two.example then
>>>     we already know that if we forward, eg an AAAA query for one.two.example,
>>>     and get back NXDOMAIN, then we need to alter that to NODATA. This is handled
>>>     by  check_for_local_domain(). But, if we forward two.example, because
>>>     one.two.example exists, then the answer to two.example should also be
>>>     a NODATA.
>>> 
>>>     For most local records this is easy, just to substring matching.
>>>     for A, AAAA and CNAME records that are in the cache, it's more difficult.
>>>     The cache has no efficient way to find such records. The fix is to
>>>     insert empty (none of F_IPV4, F_IPV6 F_CNAME set) records for each
>>>     non-terminal.
>>> 
>>>     The same considerations apply in auth mode, and the same basic mechanism
>>>     is used there too.
>>> 
>>> 
>>> Regards
>>> Geert Stappers
>>> -- 
>>> Silence is hard to parse
>>> 
>>> _______________________________________________
>>> Dnsmasq-discuss mailing list
>>> Dnsmasq-discuss at lists.thekelleys.org.uk
>>> http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
>> 
>> 
>> -- 
>> James Brown
>> Engineer
> 
> 
> -- 
> James Brown
> Engineer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-bug-where-cached-NXDOMAIN-CNAMEs-return-NODATA.patch
Type: text/x-patch
Size: 1016 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20200914/74c2c39c/attachment.bin>

