[Dnsmasq-discuss] dnsmasq 2.86 crash

Eloy Paris peloy at chapus.net
Tue Oct 19 17:13:46 UTC 2021


Hi Simon,

On Tue, Oct 19, 2021 at 03:51:28PM +0100, Simon Kelley wrote:

> On 15/10/2021 04:45, Eloy Paris wrote:
> > Hi Simon,
> > 
> > I am running 2.87test4, which has the commit you mention below.
> > 
> > I've tested bringing down and up the external interfaces of the machine
> > (the ones that dnsmasq uses to reach the recursive DNS servers to
> > fulfill DNS requests it receives) and have not been able to reproduce a
> > crash anymore.
> > 
> > However, after bringing down an interface and a few seconds later
> > bringing it back up, DNS resolution stops working.
> > 
> > I see this in the system log right after I re-enable the interface that
> > I previously disabled:
> > 
> > Oct 14 23:28:14 chapilu dnsmasq[79367]: reading /etc/resolv.conf
> > 
> > and the packet capture shows:
> > 
> > 23:29:24.737039 IP 192.168.122.165.60261 > 192.168.122.1.53: 18552+ A? google.com. (28)
> > 23:29:24.737126 IP 192.168.122.1.53 > 192.168.122.165.60261: 18552 Refused 0/0/0 (28)
> > 
> > Under what conditions does dnsmasq respond to a resolution request with
> > REFUSED; no servers in /etc/resolv.conf?
> > 
> > I guess there might be a race condition here because I just sent SIGHUP
> > to the dnsmasq process and the system log shows this:
> > 
> 
> 
> As Petr says, the REFUSED reply is when there are no suitable servers to
> forward a query to.
> 
> Based on the previous crash, which is triggered by there being no
> configured servers, this is expected. Reading /etc/resolv.conf is
> inherently racy - but the code should keep trying in the case that it
> finds an empty /etc/resolv.conf to mitigate this.
> 
> You're not setting --no-poll are you?

Sorry for the lack of updates...

First of all, the crashes are gone with 2.87test4.

Second, my apologies but it must have been a late night or early morning
when I wrote:

----------------------------------------------------------------------
Oct 14 23:28:14 chapilu dnsmasq[79367]: reading /etc/resolv.conf

[...]

I guess there might be a race condition here because I just sent SIGHUP
to the dnsmasq process and the system log shows this:

Oct 14 23:38:34 chapilu dnsmasq[79367]: read /etc/hosts - 4 addresses

and now DNS resolution works again!
----------------------------------------------------------------------

Obviously, /etc/resolv.conf and /etc/hosts are different files that
serve very different purposes. I misread "/etc/hosts" as
"/etc/resolv.conf" and thought that, after the SIGHUP, dnsmasq had
found "4 DNS servers". It may still be that the SIGHUP made dnsmasq
re-read /etc/resolv.conf and that this is what restored resolution,
but I misread the filename and may have confused everyone.

Third, I am running with "log-debug" and I do not see the dump of
upstream servers in the log that Petr mentioned. Perhaps I should
instrument the code so that I can see what dnsmasq actually finds when
it logs "reading /etc/resolv.conf".
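
Short of instrumenting dnsmasq itself, the contents of /etc/resolv.conf
can be checked from outside at the moment the log line appears. The
following is a minimal sketch, not dnsmasq's own parser: the function
name parse_nameservers is hypothetical, and the comment handling is a
simplification (anything after "#" or ";" is dropped).

```python
def parse_nameservers(resolv_conf_text):
    """Return the addresses found on 'nameserver' lines, in file order.

    Hypothetical helper for troubleshooting; it approximates, but does
    not reproduce, how a resolver or dnsmasq reads the file.
    """
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        # Simplification: treat '#' and ';' as comment starts anywhere.
        line = raw_line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def read_nameservers(path="/etc/resolv.conf"):
    """Read the file at 'path' and return its nameserver addresses."""
    with open(path) as handle:
        return parse_nameservers(handle.read())
```

Calling read_nameservers() right after the interface comes back up and
getting an empty list would be consistent with dnsmasq having read the
file while it contained no servers.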

The new issue happens very often: when the machine wakes from sleep,
dnsmasq appears to have no upstream servers to forward requests to, so
the virtual machine that relies on dnsmasq for DNS resolution cannot
resolve anything.
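
To make the "keep trying when /etc/resolv.conf is empty" mitigation
quoted above concrete, here is a rough sketch of that idea. It is an
illustration only and says nothing about how dnsmasq implements it;
read_until_nonempty, its parameters and the retry counts are all
assumptions made for the example.

```python
import time


def read_until_nonempty(read_servers, attempts=5, delay_seconds=1.0,
                        sleep=time.sleep):
    """Call read_servers() until it returns a non-empty list.

    read_servers is any callable returning a list of upstream servers.
    Gives up and returns [] after 'attempts' tries. Hypothetical sketch,
    not dnsmasq code.
    """
    for attempt in range(attempts):
        servers = read_servers()
        if servers:
            return servers
        # Wait before the next try, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return []
```

The point of the retry is that a file caught mid-rewrite looks empty
for a moment, so a single read at the wrong time should not be taken as
the final answer.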

Finally, no, I am not running with --no-poll. dnsmasq is being invoked
by libvirtd as:

/usr/bin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper

and, if it helps, /var/lib/libvirt/dnsmasq/default.conf contains:

----------------------------------------------------------------------
strict-order
pid-file=/run/libvirt/network/default.pid
except-interface=lo
bind-dynamic
interface=virbr0
dhcp-range=192.168.122.2,192.168.122.254,255.255.255.0
dhcp-no-override
dhcp-authoritative
dhcp-lease-max=253
dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile
addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
log-debug
----------------------------------------------------------------------
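
To confirm that an option such as no-poll is not set through the
configuration file either, the option names in a file like the one
above can be listed with a few lines of Python. This is a minimal
sketch: config_option_names is a hypothetical helper, and it does not
follow further conf-file= or conf-dir= includes.

```python
def config_option_names(config_text):
    """Return the set of option names in dnsmasq-style config text.

    Each non-blank, non-comment line is taken as 'name' or 'name=value'.
    Hypothetical helper; further included config files are not followed.
    """
    names = set()
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        names.add(line.split("=", 1)[0].strip())
    return names
```

For the file shown above, "no-poll" would not be in the returned set,
while "strict-order" and "log-debug" would be.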

Cheers,

Eloy Paris.-

> 
> 
> > Oct 14 23:38:34 chapilu dnsmasq[79367]: read /etc/hosts - 4 addresses
> > 
> > and now DNS resolution works again!
> > 
> > No idea why dnsmasq is automatically detecting one change in
> > /etc/resolv.conf, and it apparently is one that does not contain any
> > servers.
> > 
> > Cheers,
> > 
> > Eloy Paris.-
> > 
> > On Wed, Oct 13, 2021 at 09:44:56AM +0100, Simon Kelley wrote:
> > 
> >> Based on the location of the crash, and the circumstances that cause it,
> >> my guess is that this will be fixed by
> >>
> >> https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=d290630d31f4517ab26392d00753d1397f9a4114
> >>
> >> Please could you try that, and get back to us if it doesn't sort the
> >> problem?
> >>
> >>
> >> Cheers,
> >>
> >> Simon.


