[Dnsmasq-discuss] Dnsmasq not resolving addresses for an hour

John Knight John.Knight at belkin.com
Tue Oct 18 23:36:07 BST 2016


Hi All,

I am still trying to figure this one out. At this point, I don't think the issue is caused by time going backwards, although I am convinced that there are issues with dnsmasq when time does go backwards. There is a lot of previous discussion on the backwards time issue.

What I am pursuing at the moment is the "retry later" that is supposed to occur after reload_servers() fails because /etc/resolv.conf is being updated at the same time as dnsmasq is trying to read it.

One thing I have noticed is that dnsmasq does NOT call poll_resolv() very often. I would expect that if a "retry later" condition occurs, dnsmasq should be calling poll_resolv() frequently, as nothing can really be done until the upstream DNS server is identified.

Now for some questions:

The main while(1) loop uses select() to determine if it has work to do. In most cases, it appears to use a timeout of 0, which I believe means just wait indefinitely for work on the file descriptors (strictly, select() waits indefinitely only when its timeout pointer is NULL; a zeroed timeval makes it return immediately). Other times, it appears that the timeout is set to a quarter second when doing a tftp transfer or polling the dbus.
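
To pin down the timeout cases, here is a minimal standalone sketch. It is not dnsmasq code: wait_readable() and demo_select_timeouts() are hypothetical helpers written only to show how select() treats a NULL timeout pointer, a zeroed timeval, and a positive timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, not part of dnsmasq.
   timeout_ms < 0  : pass a NULL timeout, block until fd is readable
   timeout_ms == 0 : pass a zeroed timeval, check and return immediately
   timeout_ms > 0  : wait at most that many milliseconds
   Returns select()'s result: 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
  fd_set rset;
  struct timeval tv, *tp = NULL;

  assert(fd >= 0 && fd < FD_SETSIZE);

  FD_ZERO(&rset);
  FD_SET(fd, &rset);

  if (timeout_ms >= 0)
    {
      tv.tv_sec = timeout_ms / 1000;
      tv.tv_usec = (timeout_ms % 1000) * 1000;
      tp = &tv;
    }

  return select(fd + 1, &rset, NULL, NULL, tp);
}

/* Usage sketch: returns 0 if the cases behave as described above. */
int demo_select_timeouts(void)
{
  int fds[2], idle, pending;

  if (pipe(fds) != 0)
    return -1;

  /* Empty pipe with a zeroed timeval: returns at once, nothing readable. */
  idle = wait_readable(fds[0], 0);

  if (write(fds[1], "x", 1) != 1)
    {
      close(fds[0]);
      close(fds[1]);
      return -1;
    }

  /* Data is waiting, so even the NULL (indefinite) timeout returns at once. */
  pending = wait_readable(fds[0], -1);

  close(fds[0]);
  close(fds[1]);
  return (idle == 0 && pending == 1) ? 0 : 1;
}
```

With a NULL timeout and no traffic on any watched descriptor, the call simply does not return, which is the situation I am worried about.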

Now what concerns me is that when a "retry later" condition occurs, we may get stuck on the select() for a long period of time. Alas, I do not know how frequently one might expect to see work arrive on the file descriptors that select() is watching, so I don't really know if this is a long time or not. It seems though that in this failure scenario, the poll_resolv() function does NOT get called very often at all.

My gut feeling is that there always needs to be a timeout on the select() call, as poll_resolv() should be called fairly frequently. The existing code in this loop, from which poll_resolv() is normally called, suggests a poll rate of about once a second. This definitely does not happen today. By just adding a my_syslog() message to the top of poll_resolv(), it is very clear from the logfile that it is not called often, and way too infrequently to resolve the "retry later" condition in a timely manner.

Going forward, as the next thing for me to try, I am going to add a timeout for the select... perhaps a modest once a second or two.
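
Roughly what I have in mind, as a standalone sketch rather than a patch: loop_iteration(), poll_resolv_stub() and POLL_INTERVAL are names made up for illustration, the stub only counts calls instead of re-reading resolv.conf, and the real dnsmasq loop watches many descriptors, not one.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

#define POLL_INTERVAL 1 /* seconds; assumed value for this sketch */

static int poll_count; /* stands in for the my_syslog() trace */

/* Stand-in for poll_resolv(); the real function checks the resolv file. */
static void poll_resolv_stub(void)
{
  poll_count++;
}

/* One pass of a simplified event loop. select() is always given a
   timeout, so the resolv check runs even when no descriptor has work. */
int loop_iteration(int fd, time_t *last_poll)
{
  fd_set rset;
  struct timeval tv;
  time_t now;
  int ready;

  FD_ZERO(&rset);
  FD_SET(fd, &rset);

  tv.tv_sec = POLL_INTERVAL;
  tv.tv_usec = 0;
  ready = select(fd + 1, &rset, NULL, NULL, &tv);

  now = time(NULL);

  /* Poll if the interval has elapsed, or if the clock stepped backwards
     (otherwise the next poll would be deferred until time caught up). */
  if (now < *last_poll || now - *last_poll >= POLL_INTERVAL)
    {
      poll_resolv_stub();
      *last_poll = now;
    }

  return ready;
}

/* Usage sketch: an idle pipe still gets exactly one poll per pass.
   Returns 0 on the expected behaviour; takes about POLL_INTERVAL seconds. */
int demo_idle_loop(void)
{
  int fds[2], ready;
  time_t last_poll = 0;

  if (pipe(fds) != 0)
    return -1;

  poll_count = 0;
  ready = loop_iteration(fds[0], &last_poll); /* no traffic: times out */

  close(fds[0]);
  close(fds[1]);
  return (ready == 0 && poll_count == 1) ? 0 : 1;
}
```

The cost is one wakeup per interval on an otherwise idle system, which seems a reasonable trade for bounding how long a "retry later" can sit unserviced.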

But I would like to know what you all think of this... does this make sense to do? Is there ever a case where we might not get any work on the file descriptors select() is monitoring for nearly an hour? I am trying to make sense of this issue.

Thanks,

John Knight
