[Dnsmasq-discuss] Intermittent DNSMASQ resolution failures

Simon Kelley simon at thekelleys.org.uk
Fri Feb 15 07:43:24 GMT 2013


On 15/02/13 06:32, Stuart Wilson wrote:
> Hi,
>
> I have noticed lately on several machines using the local cable ISP in
> my neck of the woods, that using the DHCP supplied DNS servers from my
> ISP is significantly slower than using a global DNS service like
> OpenDNS. With that in mind I configured the Linux box I use at home to
> use OpenDNS. It was working great, until I decided to fix it by adding
> DNSMASQ as a local caching server to lighten the load. It works fine
> most of the time, but sometimes I get intermittent failures to resolve
> names. At first I just noticed delays getting to some websites, and
> occasionally it would fail entirely. At times though it became
> unacceptable and failed a lot. So, I started testing name resolution in
> a shell using the "host" command, and found that it did indeed sometimes
> give me a ";; connection timed out; no servers could be reached" error.
> When I specifically ask the host command to query the OpenDNS server
> directly, bypassing DNSMASQ, it never fails and is always very fast.
>
> I got really curious about this and captured some packets with
> Wireshark. First of a host query going through DNSMASQ that failed, and
> then one going directly to the DNS server. I did indeed get no reply
> back on the query that failed. The only difference I could find between
> the packets being sent to OpenDNS by DNSMASQ, and those going directly
> from the OS to OpenDNS, is that the queries that failed from DNSMASQ had
> the DF (don't fragment) bit set. Now it is quite possible I'm missing
> something here, but it occurs to me that my using DNS servers half way
> across the internet, rather than right down the street at the local ISP,
> could be causing packets with the DF bit set to get dropped. Is there
> any way to tell DNSMASQ to not set the DF bit? Can anyone think of
> another reason why this is failing for me?
> --
>

Dnsmasq doesn't explicitly do anything to affect the value of the DF 
flag in outgoing datagrams. My guess is that it's being set by the 
kernel as part of path-MTU discovery.
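
As a rough illustration of the kernel behaviour described above (a sketch 
only: it is Linux-specific, it is not taken from dnsmasq or glibc, and the 
function names are made up for the example), the path-MTU discovery mode 
of a UDP socket can be read and changed with the IP_MTU_DISCOVER socket 
option. With IP_PMTUDISC_DONT the kernel sends datagrams with DF clear.

```c
/* Sketch only: Linux-specific, not code from dnsmasq or glibc.
   get_pmtu_mode() and disable_pmtu_discovery() are hypothetical helpers. */
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return the socket's current path-MTU discovery mode
   (IP_PMTUDISC_DONT, IP_PMTUDISC_WANT, IP_PMTUDISC_DO, ...),
   or -1 if getsockopt() fails. */
int get_pmtu_mode(int fd)
{
    int mode = 0;
    socklen_t len = sizeof(mode);

    if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
        return -1;
    return mode;
}

/* Ask the kernel not to do path-MTU discovery on this IPv4 socket,
   so that outgoing datagrams are sent with the DF flag clear.
   Returns 0 on success, -1 on error (errno set by setsockopt). */
int disable_pmtu_discovery(int fd)
{
    int mode = IP_PMTUDISC_DONT;

    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));
}
```

A caller would create a socket with socket(AF_INET, SOCK_DGRAM, 0), call 
disable_pmtu_discovery() on it, and could then confirm the change with 
get_pmtu_mode(). The system-wide default for new sockets is governed by 
the net.ipv4.ip_no_pmtu_disc sysctl.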

Your average DNS query UDP packet is unlikely to be big enough to get 
fragmented over even the smallest link.
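
To put a rough number on that, the size of a plain query can be worked 
out by hand. The sketch below assumes IPv4 with no IP options, a single 
question and no EDNS0 record, and its function names are invented for 
the example. A query for www.example.com comes to 61 bytes on the wire, 
and even a maximum-length name (255 octets encoded) gives 
12 + 255 + 4 + 28 = 299 bytes, well below the 576 bytes that every IPv4 
host must be able to accept.

```c
/* Sketch only: size arithmetic for a single-question DNS query over
   UDP/IPv4, assuming no IP options and no EDNS0 OPT record.
   The function names are hypothetical. */
#include <stddef.h>
#include <string.h>

#define DNS_HEADER_LEN   12  /* fixed DNS message header */
#define DNS_QFIXED_LEN    4  /* QTYPE (2) + QCLASS (2) */
#define UDP_HEADER_LEN    8
#define IPV4_HEADER_LEN  20  /* without IP options */

/* Length of the DNS message for one question about `name`.
   On the wire each label is prefixed by a length byte and the name
   ends with a zero byte, so "a.b" (3 chars) encodes to 5 bytes. */
size_t dns_query_payload_len(const char *name)
{
    size_t n = strlen(name);
    size_t qname;

    if (n == 0 || (n == 1 && name[0] == '.'))
        qname = 1;          /* the root name is a single zero byte */
    else if (name[n - 1] == '.')
        qname = n + 1;      /* trailing dot already stands for the root */
    else
        qname = n + 2;      /* leading length byte + terminating zero */

    return DNS_HEADER_LEN + qname + DNS_QFIXED_LEN;
}

/* Total IPv4 datagram length carrying that query. */
size_t dns_query_datagram_len(const char *name)
{
    return IPV4_HEADER_LEN + UDP_HEADER_LEN + dns_query_payload_len(name);
}
```

An EDNS0 OPT record, if the sender adds one, contributes a further 11 
bytes when it carries no options, which does not change the conclusion.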

It would be instructive to look at the resolver code in glibc to see if 
that turns off MTU discovery or fragmentation explicitly.


Beware when doing your tests that it is easy to get confused: a query fails 
because it takes too long somewhere down the line, so you repeat it by a 
different route that ends up at the same recursive resolver. By the time 
you do the query again, all the relevant information has arrived and 
been cached, and, surprise, surprise, the query succeeds.



Cheers,

Simon.




