[Dnsmasq-discuss] No servers after coming back from sleep (was dnsmasq 2.86 crash)

Eloy Paris peloy at chapus.net
Wed Oct 20 15:20:35 UTC 2021


The behavior I am observing is triggered not by putting the machine to
sleep but by shutting down the interface that connects the machine to
the upstream DNS servers (which coincidentally also happens when the
machine goes to sleep). After I shut down the machine's upstream
interface and re-enable it, daemon->servers remains NULL regardless of
the contents of /etc/resolv.conf.  Sending SIGHUP, or touching
/etc/resolv.conf, causes dnsmasq to re-read /etc/resolv.conf, but
daemon->servers remains NULL through all of this.

I've added instrumentation all over the place because I don't know the
code, and this is what I am seeing:

External interface is disabled:

Oct 20 10:39:39 chapilu dnsmasq[198384]: /etc/resolv.conf: # Generated by NetworkManager
Oct 20 10:39:39 chapilu dnsmasq[198384]: cleanup_servers(): on entry daemon->servers = 0x563729effa20
Oct 20 10:39:39 chapilu dnsmasq[198384]: cleanup_servers(): on exit daemon->servers = (nil)
Oct 20 10:39:39 chapilu dnsmasq[198384]: no servers found in /etc/resolv.conf, will retry

Expected. Note daemon->servers = (nil) because there are no servers in
resolv.conf.

Then I enabled the interface:

Oct 20 10:39:51 chapilu dnsmasq[198384]: /etc/resolv.conf: # Generated by NetworkManager
Oct 20 10:39:51 chapilu dnsmasq[198384]: /etc/resolv.conf: search example.com
Oct 20 10:39:51 chapilu dnsmasq[198384]: /etc/resolv.conf: nameserver 64.102.6.247
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): adding server 64.102.6.247
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): adding server via add_update_server()
Oct 20 10:39:51 chapilu dnsmasq[198384]: add_update_server(): flags = 2048, daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: add_update_server(): added server to tail.
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): add_update_server() returned 1; daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: /etc/resolv.conf: nameserver 173.37.137.85
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): adding server 173.37.137.85
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): adding server via add_update_server()
Oct 20 10:39:51 chapilu dnsmasq[198384]: add_update_server(): flags = 2048, daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: add_update_server(): added server to tail.
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): add_update_server() returned 1; daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: /etc/resolv.conf: nameserver 173.37.142.73
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): adding server 173.37.142.73
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): adding server via add_update_server()
Oct 20 10:39:51 chapilu dnsmasq[198384]: add_update_server(): flags = 2048, daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: add_update_server(): added server to tail.
Oct 20 10:39:51 chapilu dnsmasq[198384]: reload_servers(): add_update_server() returned 1; daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: cleanup_servers(): on entry daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: cleanup_servers(): on exit daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: reading /etc/resolv.conf
Oct 20 10:39:51 chapilu dnsmasq[198384]: check_servers(): daemon->servers is NULL!
Oct 20 10:39:51 chapilu dnsmasq[198384]: check_servers(): 0 servers in daemon->servers
Oct 20 10:39:51 chapilu dnsmasq[198384]: check_servers(): daemon->local_domains is NULL!
Oct 20 10:39:51 chapilu dnsmasq[198384]: check_servers(): 0 servers in daemon->local_domains
Oct 20 10:39:51 chapilu dnsmasq[198384]: cleanup_servers(): on entry daemon->servers = (nil)
Oct 20 10:39:51 chapilu dnsmasq[198384]: cleanup_servers(): on exit daemon->servers = (nil)

So, there are valid servers to be added to the daemon->servers linked
list, but upon return from add_update_server(), daemon->servers is NULL.
That doesn't seem right.

Looking at recent changes to add_update_server() I found:

https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commitdiff;h=eb88eed1fc8ed246e9355531c2715fa2f7738afc

I have reverted that commit, and now bouncing the external interface no
longer causes add_update_server() to leave daemon->servers NULL, and
things work.

I believe there is something wrong with the daemon->servers_tail logic
introduced by the above commit. I'll try to determine what is wrong with
the logic but please feel free to beat me to it because it might take me
a little while.
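
As an illustration only, here is a minimal sketch of one way a cached
tail pointer can produce this symptom: the list is emptied, the tail
pointer is left stale, and later appends go through the stale tail
without ever setting the head. This is an assumption about the class of
bug, not a diagnosis of the commit above. The names struct node,
struct list, list_append(), list_clear_buggy(), list_clear_fixed() and
list_count() are hypothetical and are not taken from the dnsmasq source.

```c
#include <stddef.h>

/* Hypothetical stand-in for a server record; not the dnsmasq struct. */
struct node {
  struct node *next;
  int id;
};

struct list {
  struct node *head;  /* plays the role of daemon->servers */
  struct node *tail;  /* plays the role of daemon->servers_tail */
};

/* Append via the cached tail. This is only correct if tail is NULL
   whenever head is NULL. */
static void list_append(struct list *l, struct node *n)
{
  n->next = NULL;
  if (l->tail)
    l->tail->next = n;  /* a stale tail sends the node here; head is untouched */
  else
    l->head = n;
  l->tail = n;
}

/* Buggy clear: empties the list but leaves the tail pointer stale.
   Nodes are caller-owned in this sketch, so nothing is freed. */
static void list_clear_buggy(struct list *l)
{
  l->head = NULL;
}

/* Fixed clear: head and tail are reset together. */
static void list_clear_fixed(struct list *l)
{
  l->head = NULL;
  l->tail = NULL;
}

/* Count the nodes reachable from head. */
static int list_count(const struct list *l)
{
  int n = 0;
  for (const struct node *p = l->head; p; p = p->next)
    n++;
  return n;
}
```

With the buggy clear, an append after the list has been emptied reports
success but leaves head NULL, which would match the "added server to
tail" lines followed by daemon->servers = (nil) in the log. With the
fixed clear, the same append sets head as expected.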

Cheers,

Eloy Paris.-

On Wed, Oct 20, 2021 at 05:35:47AM -0400, Eloy Paris wrote:
> On Tue, Oct 19, 2021 at 01:13:46PM -0400, Eloy Paris wrote:
> 
> > I am seeing the new issue happen very often -- the machine goes to sleep
> > and when it comes back it seems like dnsmasq does not have upstream
> > servers to forward requests to, so the virtual machine that relies on
> > dnsmasq for DNS resolution cannot resolve anything.
> 
> I've done some troubleshooting of this and my /etc/resolv.conf seems
> stable when the machine comes back from sleep and dnsmasq reads it.
> However, for some reason the servers there don't seem to be added to
> the daemon->servers linked list.
> 
> dnsmasq.c:poll_resolv() has:
> 
> ----------------------------------------------------------------------
>   if (latest)
>     {
>       static int warned = 0;
>       if (reload_servers(latest->name))
>         {
>           my_syslog(LOG_INFO, _("reading %s"), latest->name);
>           warned = 0;
>           check_servers(0);
> ----------------------------------------------------------------------
> 
> I instrumented check_servers(), as that is what logs "using nameserver
> xyz", and my syslog has this in the working case (before I put the
> machine to sleep):
> 
> Oct 20 05:16:32 chapilu dnsmasq[167055]: /etc/resolv.conf: search example.com
> Oct 20 05:16:32 chapilu dnsmasq[167055]: /etc/resolv.conf: nameserver 1.2.3.4
> Oct 20 05:16:32 chapilu dnsmasq[167055]: /etc/resolv.conf: nameserver 1.2.3.5
> Oct 20 05:16:32 chapilu dnsmasq[167055]: /etc/resolv.conf: nameserver 1.2.3.6
> Oct 20 05:16:32 chapilu dnsmasq[167055]: reading /etc/resolv.conf
> Oct 20 05:16:32 chapilu dnsmasq[167055]: check_servers(): Server #1: domain = , interface =
> Oct 20 05:16:32 chapilu dnsmasq[167055]: using nameserver 1.2.3.4#53
> Oct 20 05:16:32 chapilu dnsmasq[167055]: check_servers(): Server #2: domain = , interface =
> Oct 20 05:16:32 chapilu dnsmasq[167055]: using nameserver 1.2.3.5#53
> Oct 20 05:16:32 chapilu dnsmasq[167055]: check_servers(): Server #3: domain = , interface =
> Oct 20 05:16:32 chapilu dnsmasq[167055]: using nameserver 1.2.3.6#53
> Oct 20 05:16:32 chapilu dnsmasq[167055]: check_servers(): 3 servers in daemon->servers
> Oct 20 05:16:32 chapilu dnsmasq[167055]: check_servers(): 0 servers in daemon->local_domains
> 
> (Doing "touch /etc/resolv.conf" when things are working [before I put
> the machine to sleep], produces the above as well.)
> 
> However, when I put the machine to sleep, and later resume, I get this:
> 
> Oct 20 05:17:26 chapilu dnsmasq[167055]: /etc/resolv.conf: # Generated by NetworkManager
> Oct 20 05:17:26 chapilu dnsmasq[167055]: /etc/resolv.conf: search example.com
> Oct 20 05:17:26 chapilu dnsmasq[167055]: /etc/resolv.conf: nameserver 1.2.3.4
> Oct 20 05:17:26 chapilu dnsmasq[167055]: /etc/resolv.conf: nameserver 1.2.3.5
> Oct 20 05:17:26 chapilu dnsmasq[167055]: /etc/resolv.conf: nameserver 1.2.3.6
> Oct 20 05:17:26 chapilu dnsmasq[167055]: reading /etc/resolv.conf
> Oct 20 05:17:26 chapilu dnsmasq[167055]: check_servers(): 0 servers in daemon->servers
> Oct 20 05:17:26 chapilu dnsmasq[167055]: check_servers(): 0 servers in daemon->local_domains
> 
> So it seems like after the resume from sleep, /etc/resolv.conf is
> stable, has the correct upstream DNS servers, but somehow
> daemon->servers ends up with nothing; what could cause this?
> 
> Cheers,
> 
> Eloy Paris.-
> 


