[Dnsmasq-discuss] following RFC6106 triggers bug in network-manager

Gui Iribarren gui at altermundi.net
Tue Nov 5 08:21:09 GMT 2013


Hello,
so, we started suffering frequent, periodic disconnects on clients since 
upgrading dnsmasq 2.62 -> 2.66

tracking down the issue, it came down to a network-manager bug while 
maintaining the RDNSS list, where an unhandled expiring RDNSS lifetime 
results in a full reconnection

problem is, the kernel only understands the *router* lifetime and 
ignores the RDNSS lifetime entirely; if the latter is 
shorter than the former, the RDNSS expires before the kernel sends 
an RS to renew the expiring *router* lifetime.

in dnsmasq 2.62, router lifetime was equal to RDNSS lifetime, as shown 
below:

# rdisc6 wlan0
Soliciting ff02::2 (ff02::2) on wlan0...

Hop limit                 :           64 (      0x40)
Stateful address conf.    :           No
Stateful other conf.      :           No
Mobile home agent         :           No
Router preference         :       medium
Neighbor discovery proxy  :           No
Router lifetime           :         1800 (0x00000708) seconds
[...]
  Recursive DNS server     : fe80::fad1:11ff:fe54:3381
   DNS server lifetime     :         1800 (0x00000708) seconds
  from fe80::fad1:11ff:fe54:3381

this prevented the situation where the network-manager bug would happen: 
as the kernel would issue an RS to renew the router lifetime, the RDNSS 
was renewed as well, just in time

in network-manager 0.9.6 the bug is fixed (NM sends an RS by itself, 
before the RDNSS expires, independent of RtrAdvLifetime),
but notably debian squeeze still ships 0.9.4, which reconnects to the 
network every 20 minutes when talking to dnsmasq v2.66 (it worked well 
against v2.62):

Router lifetime           :         1800 (0x00000708) seconds
   DNS server lifetime     :         1200 (0x000004b0) seconds
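
to put numbers on that, here is a rough sketch in python. the helper name 
is made up for illustration, and it assumes nothing refreshes the RDNSS 
before the router lifetime is renewed (the NM 0.9.4 behaviour described 
above):

```python
def rdnss_gap(router_lifetime, rdnss_lifetime):
    """Seconds during which the RDNSS entry has already expired while the
    router lifetime is still valid (0 if the RDNSS does not expire first).

    Hypothetical helper, not part of dnsmasq or network-manager.
    """
    return max(0, router_lifetime - rdnss_lifetime)


# (router lifetime, RDNSS lifetime) as reported by rdisc6 in this message
observed = {"2.62": (1800, 1800), "2.66": (1800, 1200)}

for version, (router, rdnss) in observed.items():
    gap = rdnss_gap(router, rdnss)
    print(f"dnsmasq {version}: RDNSS expires after {rdnss // 60} min, gap {gap} s")
```

with 2.62 the gap is 0 s; with 2.66 the RDNSS expires after 20 minutes, 
600 s before the router lifetime does, which matches the 20-minute 
reconnect period.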

then, even though it's the debian/etc maintainers who should fix their 
packages...

   https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/993571

can we anyway consider going back to the old behaviour in dnsmasq, to 
help mitigation?
(RtrAdvLifetime = RDNSSLifetime)

i understand v2.66 follows RFC6106

      Lifetime      32-bit unsigned integer.  The maximum time, in
                    seconds (relative to the time the packet is sent),
                    over which this RDNSS address MAY be used for name
                    resolution.  Hosts MAY send a Router Solicitation to
                    ensure the RDNSS information is fresh before the
                    interval expires.  In order to provide fixed hosts
                    with stable DNS service and allow mobile hosts to
                    prefer local RDNSSes to remote RDNSSes, the value of
                    Lifetime SHOULD be bounded as
                    MaxRtrAdvInterval <= Lifetime <= 2*MaxRtrAdvInterval
                    where MaxRtrAdvInterval is the Maximum RA Interval
                    defined in [RFC4861].  A value of all one bits
                    (0xffffffff) represents infinity.  A value of zero
                    means that the RDNSS address MUST no longer be used.
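
as a quick check of that bound, a minimal sketch in python. the function 
name is hypothetical, and the MaxRtrAdvInterval of 600 s is an assumption 
(it is the value for which the observed 1200 s sits exactly at the upper 
bound); it is not stated anywhere in this message:

```python
def within_rfc6106_bounds(lifetime, max_rtr_adv_interval):
    """True if MaxRtrAdvInterval <= Lifetime <= 2*MaxRtrAdvInterval,
    i.e. the SHOULD from the RFC 6106 text quoted above."""
    return max_rtr_adv_interval <= lifetime <= 2 * max_rtr_adv_interval


ASSUMED_MAX_RTR_ADV_INTERVAL = 600  # seconds; assumption, see lead-in

for lifetime in (1200, 1800):
    ok = within_rfc6106_bounds(lifetime, ASSUMED_MAX_RTR_ADV_INTERVAL)
    print(f"RDNSS lifetime {lifetime} s within the SHOULD bounds: {ok}")
```

under that assumption 1200 s is inside the bounds and 1800 s is outside 
them, so going back to RtrAdvLifetime = RDNSSLifetime would step outside 
a SHOULD, not a MUST.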

but this RFC has been criticised already[1] (since it creates a fragile 
situation, where losing one or two RA packets - common in wifi 
scenarios - is enough to lose the race against the RDNSS expiry)

     [1]: https://bugzilla.redhat.com/show_bug.cgi?id=753482#c38
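
to illustrate how thin that margin is, a back-of-the-envelope sketch in 
python. it assumes the worst case where unsolicited RAs come exactly 
every MaxRtrAdvInterval seconds and the host sends no RS of its own; the 
function name and the 600 s interval are assumptions for illustration:

```python
def refreshes_before_expiry(lifetime, max_rtr_adv_interval):
    """Number of later RAs that arrive strictly before the RDNSS lifetime
    set by the RA at t=0 runs out, with RAs sent at t = k * interval."""
    return (lifetime - 1) // max_rtr_adv_interval


ASSUMED_MAX_RTR_ADV_INTERVAL = 600  # seconds; assumption, not from dnsmasq

for lifetime in (1200, 1800):
    n = refreshes_before_expiry(lifetime, ASSUMED_MAX_RTR_ADV_INTERVAL)
    print(f"lifetime {lifetime} s: {n} RA(s) arrive before expiry")
```

with Lifetime = 2*MaxRtrAdvInterval only one RA is guaranteed to arrive 
before expiry, so losing that single packet leaves the next RA racing the 
expiry at the same instant; a lifetime of 1800 s would leave two.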

and using RtrAdvLifetime = RDNSSLifetime only defies the "SHOULD" 
keyword used in the RFC, strictly speaking.
in addition, dnsmasq (contrary to radvd) actually provides the RDNSS 
service itself, so it shouldn't be much of an issue to announce a 
longer lifetime for that?

just a thought :)

Cheers!

gui


