[Dnsmasq-discuss] Dnsmasq not resolving addresses for an hour
John Knight
John.Knight at belkin.com
Tue Oct 18 23:36:07 BST 2016
Hi All,
Still trying to figure this one out. At this point, I don't think the issue is caused by time going backwards, although I am convinced that dnsmasq has problems when time does go backwards. There has been a lot of previous discussion of the backwards-time issue.
What I am pursuing at the moment is the "retry later" that is supposed to occur after reload_servers() fails because /etc/resolv.conf is being updated at the same time as dnsmasq is trying to read it.
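To make the "retry later" condition concrete, here is a minimal sketch of the idea as I understand it. It is not dnsmasq's code: count_nameservers(), refresh_servers() and struct resolv_state are hypothetical names made up for illustration. A read that finds no nameserver lines (for example because the file was caught mid-rewrite) leaves the old server list alone and records that a retry is owed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT dnsmasq's reload_servers(): count the
   "nameserver <address>" lines in a resolv.conf-style file. Returns 0
   when the file is missing, empty, or caught in the middle of a rewrite. */
static int count_nameservers(const char *path)
{
  char line[256];
  int found = 0;
  FILE *f = fopen(path, "r");

  if (!f)
    return 0;

  while (fgets(line, sizeof(line), f))
    {
      char *tok = strtok(line, " \t\r\n");

      /* First word must be "nameserver" and an address must follow it. */
      if (tok && strcmp(tok, "nameserver") == 0 && strtok(NULL, " \t\r\n"))
        found++;
    }

  fclose(f);
  return found;
}

/* Hypothetical state for the "retry later" condition. */
struct resolv_state {
  int retry_pending;   /* set when the last read found no servers */
  int servers;         /* count from the last successful read */
};

/* Re-read the file. If nothing usable was found, keep the previous
   server count and note that another attempt is needed soon. */
static void refresh_servers(struct resolv_state *st, const char *path)
{
  int n;

  assert(st != NULL && path != NULL);
  n = count_nameservers(path);

  if (n > 0)
    {
      st->servers = n;
      st->retry_pending = 0;
    }
  else
    st->retry_pending = 1;
}
```

The point of the sketch is that retry_pending only helps if something calls refresh_servers() again soon afterwards, which is exactly the part I suspect is not happening.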
One thing I have noticed is that dnsmasq does NOT call poll_resolv() very often. I would expect that if a "retry later" condition occurs, dnsmasq should call poll_resolv() frequently, as nothing can really be done until the upstream DNS server is identified.
Now for some questions:
The main while(1) loop uses select() to determine whether it has work to do. In most cases it appears to pass a null (0) timeout pointer, which I believe means wait indefinitely for activity on the file descriptors. (A timeval whose fields are zero would instead make select() return immediately.) At other times the timeout appears to be set to a quarter of a second, when doing a TFTP transfer or polling D-Bus.
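For reference, the three timeout behaviours of select() can be sketched as below. wait_readable() is a hypothetical wrapper written for this illustration, not a dnsmasq function; it maps a millisecond value onto the timeout argument that select() expects.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait for fd to become readable.
     timeout_ms <  0 : pass NULL to select(), i.e. block indefinitely
     timeout_ms == 0 : pass a zeroed timeval, i.e. check and return at once
     timeout_ms >  0 : wait at most that many milliseconds
   Returns select()'s result: 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
  fd_set rset;
  struct timeval tv, *tp = NULL;

  assert(fd >= 0 && fd < FD_SETSIZE);

  FD_ZERO(&rset);
  FD_SET(fd, &rset);

  if (timeout_ms >= 0)
    {
      tv.tv_sec = timeout_ms / 1000;
      tv.tv_usec = (timeout_ms % 1000) * 1000;
      tp = &tv;
    }

  return select(fd + 1, &rset, NULL, NULL, tp);
}
```

With an idle descriptor, wait_readable(fd, 250) comes back after about a quarter of a second, while wait_readable(fd, -1) does not come back until something arrives on fd, however long that takes.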
Now what concerns me is that when a "retry later" condition occurs, we may get stuck in select() for a long period of time. Alas, I do not know how frequently one might expect work to arrive on the file descriptors that select() is watching, so I don't really know whether this is a long time or not. It seems, though, that in this failure scenario the poll_resolv() function does NOT get called very often at all.
My gut feeling is that there always needs to be a timeout on the select() call, as poll_resolv() should be called fairly frequently. The existing code that calls poll_resolv() from this loop suggests an intended poll rate of about once a second. This definitely does not happen today. Just by adding a my_syslog() message at the top of poll_resolv(), it is very clear from the logfile that it is not called often, and far too infrequently to resolve the "retry later" condition in a timely manner.
Going forward, the next thing I am going to try is adding a timeout to the select() call, perhaps a modest one or two seconds.
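Roughly what I have in mind is sketched below. This is only an illustration under my own assumptions, not a patch against dnsmasq: event_loop(), periodic_check() and now_ms() are hypothetical names, periodic_check() stands in for whatever housekeeping needs to run (poll_resolv() in the real code), and the use of CLOCK_MONOTONIC is my choice so that the interval test is not upset by wall-clock changes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static int checks_run;   /* how many times the stand-in below has run */

/* Stand-in for periodic housekeeping such as re-reading resolv.conf. */
static void periodic_check(void)
{
  checks_run++;
}

/* Milliseconds from a clock that does not jump when the date is changed. */
static long long now_ms(void)
{
  struct timespec ts;

  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Run `wakeups` passes of a select() loop on fd. Every select() call is
   bounded by interval_ms, so the loop wakes up even when fd stays idle,
   and periodic_check() runs once at least interval_ms has elapsed. */
static void event_loop(int fd, int wakeups, int interval_ms)
{
  long long last = now_ms();

  assert(fd >= 0 && fd < FD_SETSIZE && interval_ms > 0);

  while (wakeups-- > 0)
    {
      fd_set rset;
      struct timeval tv;
      char buf[64];

      FD_ZERO(&rset);
      FD_SET(fd, &rset);
      tv.tv_sec = interval_ms / 1000;
      tv.tv_usec = (interval_ms % 1000) * 1000;

      if (select(fd + 1, &rset, NULL, NULL, &tv) > 0)
        {
          /* Real work would be dispatched here; the sketch just drains. */
          ssize_t n = read(fd, buf, sizeof(buf));

          if (n < 0)
            break;
        }

      /* Time-based test, so a busy descriptor cannot starve the check
         and a quiet one cannot postpone it indefinitely. */
      if (now_ms() - last >= interval_ms)
        {
          periodic_check();
          last = now_ms();
        }
    }
}
```

Calling event_loop(fd, n, 1000) would give a check roughly once a second regardless of traffic. The cost is one extra wakeup per second on an otherwise idle system, which seems modest to me.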
But I would like to know what you all think of this. Does it make sense to do? Is there ever a case where we might not get any work on the file descriptors select() is monitoring for nearly an hour? I am trying to make sense of this issue.
Thanks,
John Knight