[Dnsmasq-discuss] Infinite(?) RTR-ADVERTs being sent out [in Ubuntu NetworkManager testsuite]

Iain Lane laney at ubuntu.com
Thu Dec 20 17:21:36 GMT 2018


On Thu, Dec 20, 2018 at 01:06:30PM +0000, Iain Lane wrote:
>   dnsmasq-dhcp[2010]: RTR-ADVERT(veth42) 2600::
> 
> repeating over and over. You can view a log file including all the
> dnsmasq log entries at [2] - it's huge though, so I suggest downloading
> it and using a real text editor instead of your browser.
> 
> This test starts dnsmasq in ra-only mode, and that seems to be the case
> that's broken. Since this started happening in a new version, I was able
> to bisect 2.79 to 2.80, and I found that commit
> 0a496f059c1e9d75c33cce4c1211d58422ba4f62 is the first bad commit.
> Indeed, reverting that on master causes the NM testsuite to start
> passing again.

I attached gdb to dnsmasq while it was doing this. Here's the stack
trace of what was going on at the time:

(gdb) bt
#0  0x00007fb1b5a7cfd4 in __GI___libc_write (fd=10, buf=0x55c255462ce8, nbytes=62) at ../sysdeps/unix/sysv/linux/write.c:26
#1  0x000055c253c4fb94 in log_write () at log.c:179
#2  0x000055c253c502fd in my_syslog (priority=6, format=0x55c253c77e39 "RTR-ADVERT(%s) %s") at log.c:389
#3  0x000055c253c5de2e in add_prefixes (local=0x55c25545d65c, prefix=64, scope=0, if_index=13, flags=4, preferred=3600, valid=3600, 
    vparam=0x7ffcd1e77ce0) at radv.c:725
#4  0x000055c253c493a0 in iface_enumerate (family=10, parm=0x7ffcd1e77ce0, callback=0x55c253c5d829 <add_prefixes>) at netlink.c:268
#5  0x000055c253c5cad4 in send_ra_alias (now=1545314006, iface=13, iface_name=0x7ffcd1e77e1c "veth42", dest=0x0, send_iface=13)
    at radv.c:291
#6  0x000055c253c5d826 in send_ra (now=1545314006, iface=13, iface_name=0x7ffcd1e77e1c "veth42", dest=0x0) at radv.c:553
#7  0x000055c253c5e0f8 in periodic_ra (now=1545314006) at radv.c:808
#8  0x000055c253c3e256 in lease_update_file (now=1545314006) at lease.c:355
#9  0x000055c253c52e7e in dhcp_construct_contexts (now=1545314006) at dhcp6.c:789
#10 0x000055c253c36812 in newaddress (now=1545314006) at network.c:1721
#11 0x000055c253c39902 in async_event (pipe=8, now=1545314006) at dnsmasq.c:1413
#12 0x000055c253c38e71 in main (argc=11, argv=0x7ffcd1e78258) at dnsmasq.c:1084

There are a few places you can get an async_event of EVENT_NEWADDR. Some
further grubbing around shows that this one is coming from
iface_enumerate() -> nl_async() -> async_event(EVENT_NEWADDR).

You'll see that iface_enumerate() is frame #4 in that trace too, so this
smells like it might be a source of infinite recursion to me: every time
we call add_prefixes() we also queue another call of the same thing. The
new code is to blame: dhcp_construct_contexts() calls construct_worker(),
which enters the new block and sets param->newone = 1, and that flag is
then checked in dhcp_construct_options().

What's not clear to me is how best to cut this off. If we only called
the new code one time that would probably solve it...
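
To make the suspected loop concrete, here is a toy model of it. This is
only a sketch and not dnsmasq code: struct sim, send_advert(),
handle_newaddr(), simulate(), the "seen" flag and MAX_EVENTS are all
made-up names for illustration. It assumes that handling a new-address
event sends an advert, and that sending an advert queues another
new-address event as a side effect of the enumeration.

```c
#include <assert.h>

enum { MAX_EVENTS = 10 };   /* demo-only cap so the unguarded run terminates */

struct sim {
    int pending;   /* queued new-address events */
    int adverts;   /* simulated RTR-ADVERT log lines */
    int seen;      /* hypothetical "address already handled" flag */
};

/* Stand-in for send_ra() -> iface_enumerate(): logs one advert and, as a
   side effect of the enumeration, queues another new-address event. */
static void send_advert(struct sim *s)
{
    s->adverts++;
    s->pending++;
}

/* Stand-in for the EVENT_NEWADDR handler. With use_guard set, an address
   counts as new only the first time it is seen. */
static void handle_newaddr(struct sim *s, int use_guard)
{
    int newone = !(use_guard && s->seen);

    assert(s->pending > 0);
    s->pending--;
    s->seen = 1;
    if (newone)
        send_advert(s);
}

/* Runs the event loop starting from one queued event and returns the
   number of adverts sent. */
int simulate(int use_guard)
{
    struct sim s = { 1, 0, 0 };
    int handled = 0;

    while (s.pending > 0 && handled < MAX_EVENTS) {
        handle_newaddr(&s, use_guard);
        handled++;
    }
    return s.adverts;
}
```

In this model simulate(0) keeps going until it hits the MAX_EVENTS cap,
while simulate(1) sends a single advert and then drains the queue. Whether
a "seen" flag like this is the right place to cut the loop in dnsmasq
itself is exactly the open question.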

Cheers,

-- 
Iain Lane                                  [ iain at orangesquash.org.uk ]
Debian Developer                                   [ laney at debian.org ]
Ubuntu Developer                                   [ laney at ubuntu.com ]