[Dnsmasq-discuss] following RFC6106 triggers bug in network-manager
Gui Iribarren
gui at altermundi.net
Tue Nov 5 08:21:09 GMT 2013
Hello,
we started suffering frequent, periodic disconnects on clients after
upgrading dnsmasq 2.62 -> 2.66.
tracking it down, the cause turned out to be a network-manager bug in
maintaining the RDNSS list: an expiring RDNSS lifetime is not handled
and results in a full reconnection.
the problem is that the kernel only understands the *router* lifetime and
ignores the RDNSS lifetime entirely; if the latter is
shorter than the former, the RDNSS expires before the kernel sends
an RS to renew the expiring *router* lifetime.
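to make the timing concrete, here is a minimal sketch in Python. it assumes a simplified model in which both lifetimes start counting at the same RA and nothing refreshes them before the router lifetime runs out; the function name is made up for illustration and is not taken from the kernel or network-manager.

```python
def rdnss_gap_seconds(router_lifetime: int, rdnss_lifetime: int) -> int:
    """Seconds during which the RDNSS entry has already expired while the
    router entry is still valid (0 if the RDNSS does not expire first).

    Simplified model (an assumption): both lifetimes start at the same RA
    and no other RA arrives before the router lifetime ends.
    """
    return max(0, router_lifetime - rdnss_lifetime)


# Equal lifetimes: the RDNSS is renewed together with the router entry.
print(rdnss_gap_seconds(1800, 1800))
# RDNSS lifetime shorter than router lifetime: the RDNSS expires first.
print(rdnss_gap_seconds(1800, 1200))
```

any non-zero result is a window in which a client that does not solicit on its own is left without a valid RDNSS.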
in dnsmasq 2.62, router lifetime was equal to RDNSS lifetime, as shown
below:
    # rdisc6 wlan0
    Soliciting ff02::2 (ff02::2) on wlan0...
    Hop limit                 : 64 ( 0x40)
    Stateful address conf.    : No
    Stateful other conf.      : No
    Mobile home agent         : No
    Router preference         : medium
    Neighbor discovery proxy  : No
    Router lifetime           : 1800 (0x00000708) seconds
    [...]
    Recursive DNS server      : fe80::fad1:11ff:fe54:3381
    DNS server lifetime       : 1800 (0x00000708) seconds
    from fe80::fad1:11ff:fe54:3381
this prevented the network-manager bug from triggering:
as the kernel would issue an RS to renew the router lifetime, the RDNSS
was renewed as well, just in time.
in network-manager 0.9.6 the bug is fixed (NM sends an RS by itself,
before the RDNSS expires, independent of RtrAdvLifetime),
but notably debian squeeze still ships 0.9.4, which reconnects to the
network every 20 minutes when talking to dnsmasq v2.66 (it worked well
against v2.62), because the RDNSS lifetime is now shorter:
    Router lifetime           : 1800 (0x00000708) seconds
    DNS server lifetime       : 1200 (0x000004b0) seconds
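for anyone who wants to check their own network, a rough sketch of pulling the two values out of rdisc6 output in Python. the sample text is copied from the output quoted in this mail; the helper name is hypothetical, and the regexes only assume the "label : number" layout seen here (a non-numeric lifetime would not be matched).

```python
import re
from typing import List, Tuple

# Sample copied from the rdisc6 output quoted in this mail.
SAMPLE = """\
Router lifetime : 1800 (0x00000708) seconds
DNS server lifetime : 1200 (0x000004b0) seconds
"""


def parse_lifetimes(rdisc6_output: str) -> Tuple[int, List[int]]:
    """Return (router lifetime, list of DNS server lifetimes) in seconds."""
    router = re.search(r"Router lifetime\s*:\s*(\d+)", rdisc6_output)
    if router is None:
        raise ValueError("no 'Router lifetime' line found")
    dns = re.findall(r"DNS server lifetime\s*:\s*(\d+)", rdisc6_output)
    return int(router.group(1)), [int(value) for value in dns]


router_lifetime, dns_lifetimes = parse_lifetimes(SAMPLE)
for lifetime in dns_lifetimes:
    if lifetime < router_lifetime:
        # This is the combination that triggers the reconnects described here.
        print(f"RDNSS lifetime {lifetime}s < router lifetime {router_lifetime}s")
```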
then, even though it's the debian/etc maintainers who should fix their
packages...
https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/993571
can we nevertheless consider going back to the old behaviour in dnsmasq,
as a mitigation?
(RtrAdvLifetime = RDNSSLifetime)
i understand v2.66 follows RFC6106:

    Lifetime      32-bit unsigned integer.  The maximum time, in
                  seconds (relative to the time the packet is sent),
                  over which this RDNSS address MAY be used for name
                  resolution.  Hosts MAY send a Router Solicitation to
                  ensure the RDNSS information is fresh before the
                  interval expires.  In order to provide fixed hosts
                  with stable DNS service and allow mobile hosts to
                  prefer local RDNSSes to remote RDNSSes, the value of
                  Lifetime SHOULD be bounded as
                  MaxRtrAdvInterval <= Lifetime <= 2*MaxRtrAdvInterval
                  where MaxRtrAdvInterval is the Maximum RA Interval
                  defined in [RFC4861].  A value of all one bits
                  (0xffffffff) represents infinity.  A value of zero
                  means that the RDNSS address MUST no longer be used.
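as a quick illustration of that bound, a small Python sketch. the function name is made up, and the 600 second MaxRtrAdvInterval is the RFC 4861 default; whether dnsmasq actually uses that value is an assumption here. the special values 0 and 0xffffffff are not treated specially.

```python
def within_rfc6106_bounds(lifetime: int, max_rtr_adv_interval: int) -> bool:
    """True if MaxRtrAdvInterval <= Lifetime <= 2*MaxRtrAdvInterval."""
    return max_rtr_adv_interval <= lifetime <= 2 * max_rtr_adv_interval


# RFC 4861 default for MaxRtrAdvInterval; assumed here, not read from dnsmasq.
MAX_RTR_ADV_INTERVAL = 600

# An RDNSS lifetime of 1200 s sits at the upper edge of the recommended range.
print(within_rfc6106_bounds(1200, MAX_RTR_ADV_INTERVAL))
# An RDNSS lifetime equal to an 1800 s router lifetime falls outside it.
print(within_rfc6106_bounds(1800, MAX_RTR_ADV_INTERVAL))
```

under that assumption, setting the RDNSS lifetime equal to the router lifetime steps outside the recommended range, which is what the "SHOULD" is about.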
but this RFC has already been criticised[1], since it creates a fragile
situation where losing one or two RA packets (common in wifi
scenarios) is enough to lose the race
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=753482#c38
and, strictly speaking, using RtrAdvLifetime = RDNSSLifetime only goes
against a "SHOULD" in the RFC.
in addition, dnsmasq (contrary to radvd) actually provides the RDNSS
service itself, so it shouldn't be much of an issue to announce a
longer lifetime for that?
just a thought :)
Cheers!
gui