[Dnsmasq-discuss] NXDOMAIN on existing A record

Sasha Litvak alexander.v.litvak at gmail.com
Wed Jul 10 14:18:21 BST 2019


Petr,

Thank you very much for your help. I will follow your advice and report
my findings to the list.

On Wed, Jul 10, 2019, 4:47 AM Petr Mensik <pemensik at redhat.com> wrote:

> Hello Alex,
>
> I would try removing the all-servers and clear-on-reload statements. I
> would use just one server at a time for testing, then retest with each of
> the others to check for the same behaviour. When you do not know which
> server was used, it is hard to debug.
>
> I think dots in server=/.X/ are not necessary and maybe even misleading.
> Try it without them, just server=/X/ip
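
A minimal sketch of the rewrite suggested above, assuming the config is
processed line by line. The helper name strip_leading_dots is hypothetical
and not part of dnsmasq; it only drops the leading dot from each domain in
a server=/domain/ip line and leaves every other line untouched.

```python
def strip_leading_dots(line: str) -> str:
    """Rewrite 'server=/.X/ip' as 'server=/X/ip'; return other lines unchanged."""
    parts = line.strip().split("/")
    # A domain-specific server line looks like: server=/<domain>[/<domain>...]/<ip>
    if len(parts) < 3 or parts[0] != "server=":
        return line
    domains = [d.lstrip(".") for d in parts[1:-1]]
    return "/".join([parts[0], *domains, parts[-1]])


if __name__ == "__main__":
    for original in ("server=/.la.consul/10.0.73.43", "server=10.0.48.12"):
        print(strip_leading_dots(original))
```

Whether the leading dot changes matching in dnsmasq 2.80 is not something
this sketch can tell; it only produces the dot-free form for testing.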
>
> I think a one-second timeout is too short. Use only localhost in
> /etc/resolv.conf and debug what happens with dnsmasq. Record what
> queries are sent to dnsmasq and what dnsmasq forwards to the configured
> servers.
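
To see what "only localhost" would remove, here is a rough sketch that
parses the resolver config posted further down and lists the nameservers a
client could fall back to instead of dnsmasq. The function names are
hypothetical, and the parser handles only "nameserver" and "options"
lines, which is an assumption made to keep it short.

```python
POSTED_RESOLV_CONF = """\
search ''
options  timeout:1 attempts:1
nameserver 127.0.0.1
nameserver 10.0.48.11
nameserver 10.0.48.12
nameserver 10.0.21.63
"""


def parse_resolv_conf(text):
    """Return (nameservers, options) from resolv.conf-style text."""
    nameservers = []
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                key, _, value = opt.partition(":")
                options[key] = int(value) if value.isdigit() else True
    return nameservers, options


def non_local(nameservers):
    """Nameservers that are not a loopback address, i.e. not the local dnsmasq."""
    return [ns for ns in nameservers if not ns.startswith("127.") and ns != "::1"]


if __name__ == "__main__":
    servers, opts = parse_resolv_conf(POSTED_RESOLV_CONF)
    print("options:", opts)
    print("fallbacks that bypass dnsmasq:", non_local(servers))
```

With the posted file this reports three non-local nameservers, so while
testing it may be unclear whether an answer came from dnsmasq at all.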
>
> Note that I have already discovered that requests without the
> recursion-desired bit set are always forwarded and are not answered from
> local records. That should not be the issue here, though. Try dig +rec
> and dig +norec to rule it out.
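
Where dig is not at hand, the same comparison can be approximated with a
hand-built query. This is a sketch only: build_query and rcode are
hypothetical names, the query ID is arbitrary, and it sends a plain UDP
A query without EDNS, which is an assumption rather than what dig does
by default.

```python
import socket
import struct


def build_query(name: str, recursion_desired: bool, query_id: int = 0x1234) -> bytes:
    """Build a DNS A/IN query; RD is the 0x0100 bit of the header flags word."""
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def rcode(server: str, name: str, recursion_desired: bool, timeout: float = 3.0) -> int:
    """Send the query over UDP and return the RCODE (0 = NOERROR, 3 = NXDOMAIN)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, recursion_desired), (server, 53))
        reply, _ = sock.recvfrom(4096)
    return reply[3] & 0x0F  # RCODE is the low four bits of the fourth header byte


# Example use against the local dnsmasq (requires it to be running):
#   rcode("127.0.0.1", "xyz.service.consul", True)
#   rcode("127.0.0.1", "xyz.service.consul", False)
```

If the two calls return different codes for the same name, the
recursion-desired bit is worth a closer look; if both return 3, it is
probably not the cause.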
>
> Regards,
> Petr
>
> On 7/7/19 10:28 PM, Alex Litvak wrote:
> > (lack of sleep, fixing some mistakes in text)
> >
> > Hello everyone,
> >
> > I run consul services on my network where services are registered with
> > <xyz>.service.consul when they start.  All containers and bare metal
> > hosts are running dnsmasq 2.80.
> > I noticed that if I restart one of the containers, one of the hosts
> > continues to fail to resolve the service name.  I assume that dnsmasq is
> > the culprit because:
> >
> > 1. I can resolve service xyz.service.consul against standard dns servers
> > with dig.
> > 2. Dnsmasq listening on 127.0.0.1 is the first line in resolv.conf,
> > and when I run tcpdump against port 53 on interface lo I see it return
> > NXDOMAIN on each A record query for the service in question.
> > 3. If I restart dnsmasq, everything is back to normal again.  Even
> > weirder, if I send SIGHUP to dnsmasq, which only causes a reread of
> > the /etc/hosts file, everything is back to normal as far as service
> > resolution goes.
> >
> > The problem happens only on some hosts, with no pattern that I can
> > recognize.  For example, I have two nodes with the same config, OS,
> > kernel version, dnsmasq version, etc., and one of them has the problem
> > 100% of the time after xyz.service.consul restarts while the other
> > does not.
> >
> > Where do I start troubleshooting? Any ideas are welcome.
> >
> > Here is the standard dnsmasq configuration.
> >
> > port=53
> > domain-needed
> > bogus-priv
> > interface=lo
> > listen-address=127.0.0.1
> > no-dhcp-interface=127.0.0.1
> > #bind-interfaces
> > no-resolv
> > all-servers
> > dns-forward-max=500
> >
> > # If you don't want dnsmasq to read /etc/hosts, uncomment the
> > # following line.
> > #no-hosts
> > # or if you want it to read another file, as well as /etc/hosts, use
> > # this.
> > #addn-hosts=/etc/banner_add_hosts
> >
> > #log-queries=extra
> > #log-facility=/var/log/dnsmasq.log
> > log-async=25
> >
> > # Set the cachesize here.
> > cache-size=10000
> > min-cache-ttl=5
> > #neg-ttl=3600
> >
> > # If you want to disable negative caching, uncomment this.
> > #no-negcache
> >
> > # For debugging purposes, log each DNS query as it passes through
> > # dnsmasq.
> > #log-queries
> > clear-on-reload
> >
> > server=10.0.48.12
> > server=10.0.48.11
> > server=10.0.21.63
> > server=10.0.21.61
> >
> > server=/.la.consul/10.0.73.43
> > server=/.la.consul/10.0.73.40
> > server=/.la.consul/10.0.73.28
> > server=/.chi-pbx.consul/10.1.73.1
> > server=/.chi-pbx.consul/10.1.73.2
> > server=/.chi-pbx.consul/10.1.73.3
> > server=/.consul/10.0.73.43
> > server=/.consul/10.0.73.40
> > server=/.consul/10.0.73.28
> >
> > Resolver config
> >
> > search ''
> > options  timeout:1 attempts:1
> > nameserver 127.0.0.1
> > nameserver 10.0.48.11
> > nameserver 10.0.48.12
> > nameserver 10.0.21.63
> >
> >
> >
> > _______________________________________________
> > Dnsmasq-discuss mailing list
> > Dnsmasq-discuss at lists.thekelleys.org.uk
> > http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
>
> --
> Petr Menšík
> Software Engineer
> Red Hat, http://www.redhat.com/
> email: pemensik at redhat.com  PGP: 65C6C973