[Dnsmasq-discuss] Strange behavior when making the nameserver machine use dnsmasq

Simon Kelley simon at thekelleys.org.uk
Fri Mar 27 10:15:56 GMT 2009


Zack Little wrote:
> Hello.  I have searched the archive and cannot find any 
> information about this.
>  
> I followed this statement when setting up dnsmasq:
>  
> "Making the nameserver machine use dnsmasq.
> In the simple configuration described above, processes local to the 
> machine will not use dnsmasq, since they get their information about 
> which nameservers to use from /etc/resolv.conf, which is set to the 
> upstream nameservers. To fix this, simply replace the nameserver in 
> /etc/resolv.conf with the local address 127.0.0.1 and give the 
> address(es) of the upstream nameserver(s) to dnsmasq directly. You can 
> do this using either the server option, or by putting them into another 
> file, and telling dnsmasq about its location with the resolv-file option."
> 
> dnsmasq is launched as follows:
>  
> /bin/dnsmasq -q -k -c 0 -N -o -r /ramdisk/resolv.conf
> 
> The /ramdisk/resolv.conf contains the following:
> nameserver 10.102.1.25    
> nameserver 135.54.66.1    
> nameserver 209.183.35.23
> nameserver 209.183.35.24
> 
> When dnsmasq starts it logs the following:
> <30>Mar 26 16:42:57 dnsmasq[2889]: started, version 2.43 cache disabled
> <30>Mar 26 16:42:57 dnsmasq[2889]: compile time options: IPv6 GNU-getopt no-ISC-leasefile no-DBus no-I18N TFTP
> <30>Mar 26 16:42:57 dnsmasq[2889]: reading /ramdisk/resolv.conf
> <30>Mar 26 16:42:57 dnsmasq[2889]: using nameserver 209.183.35.24#53
> <30>Mar 26 16:42:57 dnsmasq[2889]: using nameserver 209.183.35.23#53
> <30>Mar 26 16:42:57 dnsmasq[2889]: using nameserver 135.54.66.1#53
> <30>Mar 26 16:42:57 dnsmasq[2889]: using nameserver 10.102.1.25#53
> <30>Mar 26 16:42:57 dnsmasq[2889]: read /etc/hosts - 6 addresses
> 
> From a Windows Vista PC *behind* the device running dnsmasq, if I send 
> one ping (ping -n 1), dnsmasq logs the following:
>  
> <31>Mar 26 16:43:06 dnsmasq[2889]: query[A] www.google.com from 10.77.7.5
> <31>Mar 26 16:43:06 dnsmasq[2889]: forwarded www.google.com to 10.102.1.25
> <31>Mar 26 16:43:07 dnsmasq[2889]: query[A] www.google.com from 10.77.7.5
> <31>Mar 26 16:43:07 dnsmasq[2889]: forwarded www.google.com to 135.54.66.1
> <31>Mar 26 16:43:08 dnsmasq[2889]: query[A] www.google.com from 10.77.7.5
> <31>Mar 26 16:43:08 dnsmasq[2889]: forwarded www.google.com to 209.183.35.23
> <31>Mar 26 16:43:08 dnsmasq[2889]: reply www.google.com is <CNAME>
> <31>Mar 26 16:43:08 dnsmasq[2889]: reply www.l.google.com is 66.102.7.104
> <31>Mar 26 16:43:08 dnsmasq[2889]: reply www.l.google.com is 66.102.7.99
>  
> The first DNS, 10.102.1.25, does eventually respond with a 'standard 
> query response, server failure', but not until 16:43:21.98.  The second 
> dns, 135.54.66.1, doesn't respond.  The third DNS, 209.183.35.23, 
> responds with an answer.  This all works properly and the name is 
> resolved.  dnsmasq appears to be waiting approximately one second before 
> trying the next DNS.
>  
> On the device dnsmasq is running on the /etc/resolv.conf only contains 
> "nameserver 127.0.0.1".
>  
> The problem occurs when a process on the same device dnsmasq is running 
> on tries to resolve a name.  A 'ping -c 1 www.google.com' from the 
> command line on the device causes dnsmasq to log the following:
>  
> <31>Mar 26 16:43:29 dnsmasq[2889]: query[A] www.google.com from 127.0.0.1
> <31>Mar 26 16:43:29 dnsmasq[2889]: forwarded www.google.com to 10.102.1.25
> <31>Mar 26 16:43:39 dnsmasq[2889]: query[A] www.google.com from 127.0.0.1
> <31>Mar 26 16:43:39 dnsmasq[2889]: forwarded www.google.com to 10.102.1.25
>  
> As you can see the request was sent to loopback and dnsmasq handled it.  
> That is what is expected.  However, dnsmasq is acting very differently.  
> After sending a request to the first DNS it waits 10 seconds instead of 
> 1 second for the second attempt.  Instead of using the second DNS for 
> the next attempt it again uses the first DNS.  The first DNS request 
> eventually is answered at 16:43:45.06.
>  
> Because only the first DNS is used (and also because the retries are so 
> far apart) the device that is running dnsmasq is not able to resolve 
> names. 
>  
> Is this a known bug?
>

No, but it provides me with a perfect opportunity for a public service 
announcement, since this information needs to go to a wider audience.

Sorry about the shouting:

DON'T USE --STRICT-ORDER

Strict-order almost never does what people expect/want it to do, which 
is to put a priority order on the list of servers in /etc/resolv.conf. 
It mainly just disrupts dnsmasq's mechanism for dealing with broken or 
down servers. If I could, I'd remove it. If there is ever dnsmasq-3, it 
will go.


If you remove --strict-order (the -o flag on your command line), then 
dnsmasq will send the first query, in parallel, to all the name servers. 
It will note the first one which provides a good answer, and use just 
that until a query times out, when it will "run the race" over all the 
servers again.
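
To make the race concrete, here is a rough Python sketch of the behaviour 
described above. It is a toy model, not dnsmasq's actual code: the class 
names, the one-second timeout and every latency figure are invented for 
illustration, and only the server addresses come from the thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Upstream:
    address: str
    # Simulated seconds until a good answer arrives. None models a server
    # that never gives a usable answer (no reply, or a server failure).
    latency: Optional[float]


class RacingForwarder:
    """Toy model of the race described above; not dnsmasq's real code."""

    def __init__(self, upstreams: List[Upstream], timeout: float = 1.0):
        self.upstreams = upstreams
        self.timeout = timeout  # hypothetical value, chosen for illustration
        self.preferred: Optional[Upstream] = None  # winner of the last race

    def _answers_in_time(self, server: Upstream) -> bool:
        return server.latency is not None and server.latency <= self.timeout

    def query(self) -> Optional[str]:
        """Return the address of the server whose answer is used, if any."""
        if self.preferred is not None and self._answers_in_time(self.preferred):
            return self.preferred.address
        # No winner yet, or the current winner timed out: race all servers.
        # Simplification: the re-race happens within the same call.
        candidates = [s for s in self.upstreams if self._answers_in_time(s)]
        self.preferred = min(candidates, key=lambda s: s.latency, default=None)
        return self.preferred.address if self.preferred else None


# Addresses come from the thread; every latency below is invented.
servers = [
    Upstream("10.102.1.25", None),   # answered late, with a server failure
    Upstream("135.54.66.1", None),   # never answered
    Upstream("209.183.35.23", 0.2),
    Upstream("209.183.35.24", 0.3),
]
forwarder = RacingForwarder(servers)
first = forwarder.query()
servers[2].latency = None  # pretend the current winner stops answering
second = forwarder.query()
print(first, second)
```

In this model the two bad servers at the top of the list cost nothing 
once the race has been run: queries go straight to the winner, and a 
timeout simply triggers a new race.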

BTW, my guess is that the behaviour difference you are seeing in how the 
queries are handled is because the repeated query from 127.0.0.1 doesn't 
have the same transaction-id as the first query, so dnsmasq doesn't 
recognise it as a retry.
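
A small Python model of that guess, for illustration only: it assumes a 
strict-order forwarder that moves to the next server only when it sees 
the same client, transaction-id and name again. The class, the matching 
key and the transaction-id values are all hypothetical and are not taken 
from dnsmasq's source.

```python
from typing import Dict, List, Tuple


class StrictOrderModel:
    """Hypothetical model of retry detection; not dnsmasq's real code."""

    def __init__(self, servers: List[str]):
        self.servers = servers
        # (client, transaction id, name) -> index of the server last tried
        self.pending: Dict[Tuple[str, int, str], int] = {}

    def forward(self, client: str, txid: int, name: str) -> str:
        key = (client, txid, name)
        if key in self.pending:
            # Recognised as a retry: step to the next server in the list.
            index = (self.pending[key] + 1) % len(self.servers)
        else:
            # Looks like a brand-new query: start from the first server.
            index = 0
        self.pending[key] = index
        return self.servers[index]


servers = ["10.102.1.25", "135.54.66.1", "209.183.35.23", "209.183.35.24"]
model = StrictOrderModel(servers)

# A client that retries with the same transaction id (invented value).
same_id = [model.forward("10.77.7.5", 4242, "www.google.com") for _ in range(3)]

# A client whose retry carries a fresh transaction id (invented values).
new_id = [model.forward("127.0.0.1", txid, "www.google.com") for txid in (1, 2)]

print(same_id)
print(new_id)
```

Under these assumptions the first client walks down the server list, as 
in the log of the Vista PC, while the second keeps hitting 10.102.1.25, 
as in the log of the queries from 127.0.0.1.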


Cheers,

Simon.




