[Dnsmasq-discuss] Incorrect response for DNAME'd records in dnsmasq 2.80+
Geert Stappers
stappers at stappers.nl
Wed Jul 29 19:50:09 BST 2020
On Wed, Jul 29, 2020 at 11:23:17AM -0700, James Brown wrote:
> I'm upgrading some test nodes in my employer's cluster from 2.78 to 2.82
> and handling of DNAMEs in the new version seems different (and wrong).
>
> The setup:
>
> - local.mycompany.net is a DNAME to local-<dcname>.mycompany.net, with
>   authoritative resolvers in each datacenter serving a different DNAME record
> - prod.mycompany.net is an unrelated domain
>
> /etc/resolv.conf contains the line
>
> search local.mycompany.net prod.mycompany.net
>
> Imagine searching for the bare-word "foo", which is defined in
> prod.mycompany.net but nowhere else.
>
> Under dnsmasq 2.78, querying for the bare name "foo" using the system
> resolver will correctly first attempt to query for "foo.local.mycompany.net",
> get back a DNAME to foo.local-dcname.mycompany.net, then get an empty
> response with the NXDOMAIN code; that will fail, and glibc will then query
> "foo.prod.mycompany.net", which is the correct record.
>
> Under dnsmasq 2.82, querying for the bare name "foo" using the system
> resolver will correctly first attempt to query for "foo.local.mycompany.net",
> get back a DNAME to foo.local-dcname.mycompany.net, then get back an empty
> response with the NOERROR code. This causes the system resolver to stop
> trying new search domains. This behavior seems to be dependent on caching;
> the first request correctly returns NXDOMAIN but subsequent requests return
> NOERROR. There's actually something more confusing to it than this; if the
> first request is for A, then subsequent AAAA requests return NOERROR but
> subsequent A requests return NXDOMAIN. Some kind of weird cache poisoning
> between record types?
>
> I bisected this in git and this behavioral change was introduced in
> commit b6f926fbefcd2471699599e44f32b8d25b87b471.
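
To make the reported difference concrete, here is a toy model of a
search-list walk in Python. It is only a sketch of the behaviour as
described in the report, not glibc's actual code: the rule "NXDOMAIN moves
on to the next suffix, NOERROR ends the search" is taken from the report,
and search(), make_lookup() and the address 192.0.2.10 are made up for
illustration.

```python
SEARCH = ["local.mycompany.net", "prod.mycompany.net"]


def search(name, search_list, lookup):
    """Try each search suffix in order and return (fqdn, answers).

    Assumption taken from the report: an NXDOMAIN reply moves on to the
    next suffix, while a NOERROR reply ends the search even if it carries
    no answers.
    """
    for suffix in search_list:
        fqdn = f"{name}.{suffix}"
        rcode, answers = lookup(fqdn)
        if rcode == "NXDOMAIN":
            continue  # name does not exist under this suffix, try the next
        return fqdn, answers
    return None, []


def make_lookup(local_rcode):
    """Build a fake upstream; local_rcode is what foo.local... returns."""
    def lookup(fqdn):
        if fqdn == "foo.prod.mycompany.net":
            return "NOERROR", ["192.0.2.10"]  # hypothetical address
        if fqdn == "foo.local.mycompany.net":
            return local_rcode, []  # empty answer either way
        return "NXDOMAIN", []
    return lookup


# Reply codes as reported for 2.78 (NXDOMAIN) and for 2.82 (NOERROR).
print(search("foo", SEARCH, make_lookup("NXDOMAIN")))
print(search("foo", SEARCH, make_lookup("NOERROR")))
```

With the NXDOMAIN reply the walk reaches foo.prod.mycompany.net and finds
the record; with the empty NOERROR reply it stops at
foo.local.mycompany.net with no answers.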
$ git log b6f926fbe...b6f926fbe^1
commit b6f926fbefcd2471699599e44f32b8d25b87b471
Author: Simon Kelley <simon at thekelleys.org.uk>
Date:   Tue Aug 21 17:46:52 2018 +0100

    Don't return NXDOMAIN to empty non-terminals.

    When a record is defined locally, eg an A record for one.two.example then
    we already know that if we forward, eg an AAAA query for one.two.example,
    and get back NXDOMAIN, then we need to alter that to NODATA. This is handled
    by check_for_local_domain(). But, if we forward two.example, because
    one.two.example exists, then the answer to two.example should also be
    a NODATA.

    For most local records this is easy, just to substring matching.
    for A, AAAA and CNAME records that are in the cache, it's more difficult.
    The cache has no efficient way to find such records. The fix is to
    insert empty (none of F_IPV4, F_IPV6 F_CNAME set) records for each
    non-terminal.

    The same considerations apply in auth mode, and the same basic mechanism
    is used there too.

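
The mechanism in that commit message can be sketched roughly as follows.
This is a toy model in Python, not dnsmasq's C code: ToyCache, parents()
and fix_rcode() are hypothetical names, and which ancestor names get an
empty entry is an assumption (here, every proper parent of the inserted
name).

```python
def parents(name):
    """Return every proper parent of a dotted name, nearest first."""
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels))]


class ToyCache:
    def __init__(self):
        # name -> set of type flags; an empty set marks an empty non-terminal
        self.records = {}

    def insert(self, name, flag):
        """Store a record and add flagless entries for its parents."""
        self.records.setdefault(name, set()).add(flag)
        for parent in parents(name):
            self.records.setdefault(parent, set())

    def fix_rcode(self, name, upstream_rcode):
        """Turn an upstream NXDOMAIN into NODATA for names known to exist.

        NODATA is not a separate code on the wire: it is a NOERROR reply
        with an empty answer section.
        """
        if upstream_rcode == "NXDOMAIN" and name in self.records:
            return "NOERROR"
        return upstream_rcode


cache = ToyCache()
cache.insert("one.two.example", "F_IPV4")
print(cache.fix_rcode("two.example", "NXDOMAIN"))
print(cache.fix_rcode("three.example", "NXDOMAIN"))
```

In this sketch two.example has an empty entry because one.two.example was
cached, so its NXDOMAIN is rewritten to NOERROR, while three.example is
unknown and keeps its NXDOMAIN.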
Regards
Geert Stappers
--
Silence is hard to parse