[Dnsmasq-discuss] [BUG] [PATCH] RA are sent too fast and slows down the machine

Petr Mensik pemensik at redhat.com
Wed Aug 28 20:38:00 BST 2019


I have found what is going on.

The RA code seems to be switching back and forth between the dynamically
assigned address and the manually assigned address. It is simply wrong to
assume there is only one address on a physical interface, especially in
the IPv6 world.

My patch (attached) compares only the subnet and ignores the exact
address inside it, and that seems to fix the advertisement floods. But I
am not sure whether it also suppresses announcements for new dynamic
addresses that should still be sent. It might help to use the valid
parameter to distinguish between static and dynamic addresses. I am
unsure whether the announcement is required for both or just for the
dynamic one.

I am sure it would still send once for a newly created interface. I think
that should be enough, right?
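
To make the "check just the subnet" idea concrete, here is a minimal
sketch in C. It is not the attached patch and not dnsmasq code; the
helper names same_subnet6 and same_subnet6_str are hypothetical and exist
only for illustration. It compares the first prefix_len bits of two IPv6
addresses and ignores the rest.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper (not dnsmasq's own code): returns 1 when a and b
   share the same first prefix_len bits, 0 otherwise. */
static int same_subnet6(const struct in6_addr *a, const struct in6_addr *b,
                        int prefix_len)
{
  assert(prefix_len >= 0 && prefix_len <= 128);

  int full_bytes = prefix_len / 8;  /* bytes compared in full */
  int rem_bits = prefix_len % 8;    /* leftover bits in the next byte */

  if (memcmp(a->s6_addr, b->s6_addr, (size_t)full_bytes) != 0)
    return 0;

  if (rem_bits != 0)
    {
      unsigned char mask = (unsigned char)(0xff << (8 - rem_bits));
      if ((a->s6_addr[full_bytes] & mask) != (b->s6_addr[full_bytes] & mask))
        return 0;
    }

  return 1;
}

/* Convenience wrapper for textual addresses; returns -1 on parse error. */
static int same_subnet6_str(const char *a, const char *b, int prefix_len)
{
  struct in6_addr ia, ib;

  if (inet_pton(AF_INET6, a, &ia) != 1 || inet_pton(AF_INET6, b, &ib) != 1)
    return -1;

  return same_subnet6(&ia, &ib, prefix_len);
}
```

With prefix 64, the two global addresses from my debugging notes
(fc58:a22:180d:7800:821:d1ff:fe74:ec2a and fc58:a22:180d:7800::1) compare
as the same subnet, while an exact comparison (prefix 128) treats them as
different.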

Some notes from debugging:

Breakpoint 1, construct_worker (scope=<optimized out>, flags=<optimized out>, preferred=<optimized out>, valid=1800,
    vparam=0x7ffc9afc2b60, if_index=2, prefix=64, local=0xa6dda4) at
2: /x *local = {__in6_u = {__u6_addr8 = {0xfc, 0x58, 0xa, 0x22, 0x18, 0xd, 0x78, 0x0, 0x8, 0x21, 0xd1, 0xff, 0xfe, 0x74, 0xec, 0x2a},
      __u6_addr16 = {0x58fc, 0x220a, 0xd18, 0x78, 0x2108, 0xffd1, 0x74fe, 0x2aec},
      __u6_addr32 = {0x220a58fc, 0x780d18, 0xffd12108, 0x2aec74fe}}}

Breakpoint 1, construct_worker (scope=<optimized out>, flags=<optimized out>, preferred=<optimized out>, valid=-1,
    vparam=0x7ffc9afc2b60, if_index=2, prefix=64, local=0xa6ddec) at
685			ra_start_unsolicited(param->now, template);
2: /x *local = {__in6_u = {__u6_addr8 = {0xfc, 0x58, 0xa, 0x22, 0x18, 0xd, 0x78, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1},
    __u6_addr16 = {0x58fc, 0x220a, 0xd18, 0x78, 0x0, 0x0, 0x0, 0x100},
    __u6_addr32 = {0x220a58fc, 0x780d18, 0x0, 0x1000000}}}
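
The two breakpoint hits above show the callback being handed two
different addresses on the same interface. The following toy model in C
illustrates why that can cause a flood. It is only a sketch under the
assumption that unsolicited RAs are restarted whenever the remembered
address differs from the reported one; struct toy_context, toy_callback
and toy_run are hypothetical names, not dnsmasq code.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Toy model only: hypothetical state remembered per interface. */
struct toy_context {
  struct in6_addr last;  /* address seen on the previous callback */
  int have_last;         /* 0 until the first callback */
  int restarts;          /* how often unsolicited RAs would restart */
};

/* compare_bytes = 16 compares the exact address,
   compare_bytes = 8 compares only the /64 prefix. */
static void toy_callback(struct toy_context *ctx,
                         const struct in6_addr *local, size_t compare_bytes)
{
  assert(compare_bytes <= sizeof local->s6_addr);

  if (!ctx->have_last ||
      memcmp(ctx->last.s6_addr, local->s6_addr, compare_bytes) != 0)
    {
      ctx->last = *local;
      ctx->have_last = 1;
      ctx->restarts++;
    }
}

/* Feed the dynamic and the static address alternately, as in the trace. */
static int toy_run(size_t compare_bytes, int rounds)
{
  struct toy_context ctx;
  struct in6_addr dyn, stat;

  memset(&ctx, 0, sizeof ctx);
  inet_pton(AF_INET6, "fc58:a22:180d:7800:821:d1ff:fe74:ec2a", &dyn);
  inet_pton(AF_INET6, "fc58:a22:180d:7800::1", &stat);

  for (int i = 0; i < rounds; i++)
    {
      toy_callback(&ctx, &dyn, compare_bytes);
      toy_callback(&ctx, &stat, compare_bytes);
    }

  return ctx.restarts;
}
```

In this model, toy_run(16, 5) restarts on every callback because the
remembered address keeps flipping, while toy_run(8, 5) restarts only once,
on the first callback for the interface.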

Corresponding ip addr output:
2: simbr: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0a:21:d1:74:ec:2a brd ff:ff:ff:ff:ff:ff
    inet scope global simbr
       valid_lft forever preferred_lft forever
    inet6 fc58:a22:180d:7800:821:d1ff:fe74:ec2a/64 scope global dynamic
       valid_lft 1699sec preferred_lft 1699sec
    inet6 fc58:a22:180d:7800::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::821:d1ff:fe74:ec2a/64 scope link
       valid_lft forever preferred_lft forever
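
The gdb trace shows valid=1800 for the address that ip lists as dynamic
and valid=-1 for the one listed with valid_lft forever. A minimal sketch
of using that value to tell them apart could look like the following. The
names addr_kind and classify_by_valid are hypothetical, and treating
"forever" as "static" is an assumption that holds for this interface but
not in general, since a manually configured address can also be given a
finite lifetime.

```c
#include <assert.h>

enum addr_kind { ADDR_STATIC, ADDR_DYNAMIC };

/* Assumption taken from the trace: valid == -1 means "forever".
   Any finite lifetime is treated as a dynamically assigned address. */
static enum addr_kind classify_by_valid(int valid)
{
  assert(valid >= -1);
  return valid == -1 ? ADDR_STATIC : ADDR_DYNAMIC;
}
```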


On 8/27/19 10:42 PM, Maarten de Vries wrote:
> Hey,
> I haven't dug very deep yet, but I can comment on the intent of the
> particular commit: without it, dnsmasq didn't do any unsolicited RAs on
> interfaces that are created after dnsmasq was started. It definitely
> should do unsolicited RAs on those interfaces too, although obviously
> not quite so many so often. I'm not sure why that happens. Note that the
> commit didn't introduce the fast RAs, it only enabled unsolicited RAs
> (including fast) for newly created interfaces too.
> I wonder why this happens in those test cases and on at least one Raspberry
> Pi, but not on my server. Is there any information you could provide to
> pinpoint when exactly this bug triggers and when not? For example: what
> happens if the virtual interface is created before dnsmasq starts? Does
> it also trigger on bridge interfaces (which is what I personally tested
> the commit with) for you?
> I will attempt to investigate too, but I'm somewhat swamped for time so
> I can't promise fast results.
> Kind regards,
> Maarten
> On 27-08-2019 10:45, Iain Lane wrote:
>> On Wed, Aug 21, 2019 at 08:59:07PM +0200, Petr Mensik wrote:
>>> Hi Simon and Maarten,
>>> we discovered, while playing with NetworkManager-ci [1], that the latest
>>> release is somehow broken. Tests running dnsmasq are quite slow on the
>>> latest release.
>>> I have created a script that reproduces it repeatably, then used
>>> git bisect to find when it broke. It seems the fast sending was
>>> intentional in commit 0a496f059c1e9 [2], but maybe the way it affects
>>> the system was underestimated. It is significant for systems that hit
>>> this issue. I think it has to be fixed so that fast sending is limited
>>> to a short time interval, not an endless loop. Reported as Fedora bug [3].
>> Thanks for this Petr. Would you be able to share the script you've used,
>> so that perhaps an upstream developer could recreate the bug?
>> Mainly I wanted to chime in and say that (in addition to the other
>> instance referenced), we found this in the NetworkManager testsuite in
>> Ubuntu. I didn't come up with a nice reproducer at the time, but we did
>> identify the same commit and we've reverted it in Ubuntu. I posted on
>> the ML back then but we didn't get much traction and I didn't follow up
>> very aggressively.
>> http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2018q4/012709.html
>> https://launchpadlibrarian.net/405377161/dnsmasq_2.80-1_2.80-1ubuntu1.diff.gz
>>    (the commit ID referenced in the changelog there seems to be from
>>    somewhere else, but it's the same patch)
>> Cheers,
>> _______________________________________________
>> Dnsmasq-discuss mailing list
>> Dnsmasq-discuss at lists.thekelleys.org.uk
>> http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss

Petr Menšík
Software Engineer
Red Hat, http://www.redhat.com/
email: pemensik at redhat.com  PGP: 65C6C973
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-RA-unsolicited-spam.patch
Type: text/x-patch
Size: 1086 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20190828/2291c4bd/attachment.bin>
