[Dnsmasq-discuss] RA support in dnsmasq

Gene Czarcinski gene at czarc.net
Fri Nov 30 21:03:03 GMT 2012


On 11/30/2012 12:45 PM, Simon Kelley wrote:
> On 30/11/12 17:20, Gene Czarcinski wrote:
>> On 11/30/2012 11:32 AM, Simon Kelley wrote:
>>> On 30/11/12 15:54, Gene Czarcinski wrote:
>>>> On 11/29/2012 04:18 PM, Simon Kelley wrote:
>>>>> On 29/11/12 20:31, Gene Czarcinski wrote:
>>>>>
>>>>>> I spoke too quickly.
>>>>>>
>>>>>> The cause of the problem is libvirt related but I am not sure what
>>>>>> just yet.
>>>>>>
>>>>>> I was running a libvirt that had a lot of "stuff" on it but seemed to
>>>>>> work OK. Then, earlier today I updated to a point that appears to be
>>>>>> somewhat beyond the leading edge and, although I was not getting any
>>>>>> RTR-ADVERT messages, it turned out that there were/are big-time problems
>>>>>> running qemu-kvm. So, back off/downgrade to the previous version.
>>>>>> Qemu-kvm now works but the RTR-ADVERT messages are back.
>>>>>>
>>>>>> This may be a bit time-consuming to debug!
>>>>>>
>>>>> Are you seeing the new log message in netlink.c?
>>>>>
>>>>>
>>>> The good news is that libvirt is working again (I must have done a
>>>> git-pull in the middle of an update).  Thus, I am not seeing the large
>>>> numbers of RTR-ADVERT.
>>>>
>>>> Yes, I am seeing the new log message and I have a question about that.
>>>> Every time a new virtual network interface is started, something must be
>>>> doing some type of broadcast because all of the dnsmasq instances (the
>>>> new one and all the "old" ones) suddenly wake up and issue a flurry of
>>>> RA packets and related syslog messages.  To kick the flurry off, there
>>>> is one of the new "unsolicited" syslog messages from each dnsmasq instance.
>>>>
>>>> Is this something you would expect?  Is this "normal?"  The libvirt
>>>> folks say they are not doing it.
>>> I'd expect it. The code you instrumented gets run whenever a "new
>>> address" event happens, which is whenever an address is added to an
>>> interface. "Every time a new virtual network interface is started" is a
>>> good proxy for that.
>>>
>>> The dnsmasq code isn't very discriminating, it updates its idea of
>>> which interfaces have which addresses, and then does a minute of fast
>>> advertisements on all of them. It might be possible to only do the fast
>>> advertisements on new interfaces, but implementing that isn't totally
>>> trivial.
>>>
>>>
>> Yes, I doubt very much if it would be trivial.  However, I do not
>> believe that this is the basic problem.
>>
>> When the problem occurs, one of the networks "suddenly" attempts to work
>> with the real NIC rather than the virtual one defined in its config
>> file.  I slightly changed the IPv4 and IPv6 addresses defined for this
>> network and the problem went away.  I have also "just" seen the problem
>> happen on another system which also had that virtual address defined.
>>
>> BTW, these configurations all use interface= and bind-dynamic rather
>> than the "old" bind-interfaces with listen-address= specified for each
>> specified IPv4 and IPv6 address.  I had not noticed the problem
>> previously.  Why it occurs at all with just this specific address is
>> puzzling.
>>
>> The configuration which causes problems is:
>> ------------------------------------------
>> # dnsmasq conf file created by libvirt
>> strict-order
>> domain-needed
>> domain=net6
>> expand-hosts
>> local=/net6/
>> pid-file=/var/run/libvirt/network/net6.pid
>> bind-dynamic
>> interface=virbr11
>> dhcp-range=192.168.6.128,192.168.6.254
>> dhcp-no-override
>> dhcp-leasefile=/var/lib/libvirt/dnsmasq/net6.leases
>> dhcp-lease-max=127
>> dhcp-hostsfile=/var/lib/libvirt/dnsmasq/net6.hostsfile
>> addn-hosts=/var/lib/libvirt/dnsmasq/net6.addnhosts
>> dhcp-range=fd00:beef:10:6::1,ra-only
>> -------------------------------------------------
>>
>> When I changed all the "6" to "160", the problem disappeared. And
>> there is another network defined almost the same with "8" instead of "6"
>> and I have had no problems with it.
>>
>> The real NIC is configured as a DHCP client for both IPv4 and IPv6. It
>> is assigned "nailed" addresses of 192.168.17.2/24 and fd00:dead:beef:17::2.
>>
>> And I just discovered why crazy stuff is happening (but I do not know
>> what causes it) ... the p33p1 NIC has:
>>    inet6 fd00:beef:10:6:3285:a9ff:fe8f:e982/64 scope global dynamic
>
> Is that the "real NIC"?
>
Yes, p33p1 is the real NIC.  This is going to be a real PITA to debug 
because I believe part of the problem is a race condition. 
NetworkManager has this really long dance it goes through to bring up 
the IPv6 interface.
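
For reference, the address reported on p33p1 has the shape of a SLAAC address formed from the prefix that the net6 dnsmasq instance advertises, which would be consistent with the real NIC having accepted a router advertisement meant for virbr11. The sketch below is only an illustration, not anything taken from dnsmasq or libvirt: the helper names in_prefix and eui64_to_mac are hypothetical, and the /64 prefix length for the net6 range is an assumption based on the "/64" shown on the address. It checks whether an address falls inside a prefix and, if the interface identifier is a modified EUI-64, recovers the MAC address it was built from so it can be compared against the NIC's hardware address.

```python
import ipaddress


def in_prefix(addr: str, prefix: str) -> bool:
    """Return True if addr lies inside the given network prefix."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(prefix)


def eui64_to_mac(addr: str):
    """Return the MAC implied by a modified EUI-64 interface ID, or None.

    A modified EUI-64 identifier has ff:fe in the middle of the low
    64 bits and the universal/local bit of the first octet inverted.
    """
    iid = ipaddress.IPv6Address(addr).packed[8:]  # low 64 bits
    if iid[3:5] != b"\xff\xfe":
        return None  # not EUI-64 derived (e.g. static or privacy address)
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)


if __name__ == "__main__":
    seen_on_p33p1 = "fd00:beef:10:6:3285:a9ff:fe8f:e982"
    net6_prefix = "fd00:beef:10:6::/64"  # assumed /64 for the net6 range

    print(in_prefix(seen_on_p33p1, net6_prefix))
    print(eui64_to_mac(seen_on_p33p1))
```

If the MAC printed here matches the link/ether address that "ip link show p33p1" reports, the address was autoconfigured by the NIC itself from an RA carrying the net6 prefix rather than being handed out by DHCPv6.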

But, I do not have any proof of that and as I just proved to myself, 
getting things to repeat is going to be difficult.

At this point I am not sure that bind-dynamic was related.  I went 
through the syslogs I still have and the first occurrence was on 8 
November.  That is well before bind-dynamic was integrated.

Attached are some limited copies of syslogs that I thought you might 
find of interest.  The "strangeness" seems to happen right after I 
update libvirt and libvirtd is restarted, which then gets dnsmasq 
started.

If I cannot get this figured out and "fixed", I will need to disable use 
of dnsmasq for RA service and fall back on radvd.

Frustrating .. so close and yet so far!

Gene
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RTR-ADVERT-condor-p32p1.log
Type: text/x-log
Size: 14020 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20121130/5454baba/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RTR-ADVERT-falcon-p33p1.log
Type: text/x-log
Size: 16325 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20121130/5454baba/attachment-0003.bin>

