[Dnsmasq-discuss] RA support in dnsmasq

Simon Kelley simon at thekelleys.org.uk
Fri Nov 30 21:18:37 GMT 2012


On 30/11/12 21:03, Gene Czarcinski wrote:
> On 11/30/2012 12:45 PM, Simon Kelley wrote:
>> On 30/11/12 17:20, Gene Czarcinski wrote:
>>> On 11/30/2012 11:32 AM, Simon Kelley wrote:
>>>> On 30/11/12 15:54, Gene Czarcinski wrote:
>>>>> On 11/29/2012 04:18 PM, Simon Kelley wrote:
>>>>>> On 29/11/12 20:31, Gene Czarcinski wrote:
>>>>>>
>>>>>>> I spoke too quickly.
>>>>>>>
>>>>>>> The cause of the problem is libvirt related but I am not sure what
>>>>>>> just yet.
>>>>>>>
>>>>>>> I was running a libvirt that had a lot of "stuff" on it but seemed
>>>>>>> to work OK. Then, earlier today, I updated to a point that appears
>>>>>>> to be somewhat beyond the leading edge and, although I was not
>>>>>>> getting any RTR-ADVERT messages, it turned out that there were/are
>>>>>>> big-time problems running qemu-kvm. So, I backed off/downgraded to
>>>>>>> the previous version. Qemu-kvm now works but the RTR-ADVERT messages
>>>>>>> are back.
>>>>>>>
>>>>>>> This may be a bit time-consuming to debug!
>>>>>>>
>>>>>> Are you seeing the new log message in netlink.c?
>>>>>>
>>>>>>
>>>>> The good news is that libvirt is working again (I must have done a
>>>>> git-pull in the middle of an update).  Thus, I am not seeing the large
>>>>> numbers of RTR-ADVERT.
>>>>>
>>>>> Yes, I am seeing the new log message and I have a question about that.
>>>>> Every time a new virtual network interface is started, something must
>>>>> be doing some type of broadcast because all of the dnsmasq instances
>>>>> (the new one and all the "old" ones) suddenly wake up and issue a
>>>>> flurry of RA packets and related syslog messages.  To kick the flurry
>>>>> off, there is one of the new "unsolicited" syslog messages from each
>>>>> dnsmasq instance.
>>>>>
>>>>> Is this something you would expect?  Is this "normal"?  The libvirt
>>>>> folks say they are not doing it.
>>>> I'd expect it. The code you instrumented gets run whenever a "new
>>>> address" event happens, which is whenever an address is added to an
>>>> interface. "Every time a new virtual network interface is started" is a
>>>> good proxy for that.
>>>>
>>>> The dnsmasq code isn't very discriminating: it updates its idea of
>>>> which interfaces have which addresses, and then does a minute of fast
>>>> advertisements on all of them. It might be possible to only do the fast
>>>> advertisements on new interfaces, but implementing that isn't totally
>>>> trivial.
>>>>
>>>>
>>> Yes, I doubt very much if it would be trivial.  However, I do not
>>> believe that this is the basic problem.
>>>
>>> When the problem occurs, one of the networks "suddenly" attempts to work
>>> with the real NIC rather than the virtual one defined in its config
>>> file.  I slightly changed the IPv4 and IPv6 addresses defined for this
>>> network and the problem went away.  I have also "just" seen the problem
>>> happen on another system which also had that virtual address defined.
>>>
>>> BTW, these configurations all use interface= and bind-dynamic rather
>>> than the "old" bind-interfaces with listen-address= given for each
>>> IPv4 and IPv6 address.  I had not noticed the problem
>>> previously.  Why it occurs at all with just this specific address is
>>> puzzling.
>>>
>>> The configuration which causes problems is:
>>> ------------------------------------------
>>> # dnsmasq conf file created by libvirt
>>> strict-order
>>> domain-needed
>>> domain=net6
>>> expand-hosts
>>> local=/net6/
>>> pid-file=/var/run/libvirt/network/net6.pid
>>> bind-dynamic
>>> interface=virbr11
>>> dhcp-range=192.168.6.128,192.168.6.254
>>> dhcp-no-override
>>> dhcp-leasefile=/var/lib/libvirt/dnsmasq/net6.leases
>>> dhcp-lease-max=127
>>> dhcp-hostsfile=/var/lib/libvirt/dnsmasq/net6.hostsfile
>>> addn-hosts=/var/lib/libvirt/dnsmasq/net6.addnhosts
>>> dhcp-range=fd00:beef:10:6::1,ra-only
>>> -------------------------------------------------
>>>
>>> When I changed all the "6" to "160", the problem disappeared. And
>>> there is another network defined almost the same with "8" instead of "6"
>>> and I have had no problems with it.
>>>
>>> The real NIC is configured as a DHCP client  for both IPv4 and IPv6. It
>>> is assigned "nailed" addresses of 192.168.17.2/24 and
>>> fd00:dead:beef:17::2.
>>>
>>> And I just discovered why crazy stuff is happening (but I do not know
>>> what causes it) ... the p33p1 NIC has:
>>>    inet6 fd00:beef:10:6:3285:a9ff:fe8f:e982/64 scope global dynamic
>>
>> Is that the "real NIC"?
>>
> Yes, p33p1 is the real NIC.  This is going to be a real PITA to debug
> because I believe part of the problem is a race condition.
> NetworkManager has this really long dance it goes through to bring up
> the IPv6 interface.
>
> But, I do not have any proof of that and, as I just proved to myself,
> getting things to repeat is going to be difficult.
>
> At this point I am not sure that bind-dynamic was related.  I went
> through the syslogs I still have and the first occurrence was on 8
> November.  That is well before bind-dynamic was integrated.
>
> Attached are some limited copies of syslogs that I thought you might
> find of interest.  It seems like the "strangeness" happens right
> after I update libvirt and libvirtd is restarted, which then gets dnsmasq
> started.
>
> If I cannot get this figured out and "fixed", I will need to disable use
> of dnsmasq for RA service and fall back on radvd.
>
> Frustrating .. so close and yet so far!
>

I wonder if the virbr* interfaces are bridged to the "real" NICs, such 
that when a prefix is advertised on the virbr interface, it causes the 
real interface to add an address for that prefix. Because dnsmasq is 
configured to advertise the prefix, that then causes the advertisements 
via the real NIC.

Just a thought.
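
A minimal sketch of how that guess could be checked, in Python with only
the standard library. The function names are mine and purely illustrative,
and the /64 is taken from the address shown on p33p1 above. It tests
whether the address seen on the real NIC falls inside the prefix from the
dhcp-range line, and whether its interface identifier has the modified
EUI-64 form that SLAAC typically derives from a MAC address:

```python
import ipaddress


def in_advertised_prefix(addr: str, prefix: str) -> bool:
    """Return True if addr lies inside prefix."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(prefix, strict=False)


def eui64_to_mac(addr: str):
    """Return the MAC implied by a modified-EUI-64 interface ID, or None.

    A modified EUI-64 interface ID has ff:fe in its middle two bytes and
    the universal/local bit (0x02 of the first byte) flipped.
    """
    iid = ipaddress.IPv6Address(addr).packed[8:]  # low 64 bits
    if iid[3:5] != b"\xff\xfe":
        return None
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)


# Prefix assumed from "dhcp-range=fd00:beef:10:6::1,ra-only" plus the /64
# shown on the interface; address copied from the p33p1 output.
ADVERTISED = "fd00:beef:10:6::/64"
SEEN_ON_P33P1 = "fd00:beef:10:6:3285:a9ff:fe8f:e982"

print(in_advertised_prefix(SEEN_ON_P33P1, ADVERTISED))
print(eui64_to_mac(SEEN_ON_P33P1))
```

If the MAC it prints matches p33p1's hardware address, that would be
consistent with the address having been autoconfigured from an RA for the
virbr11 prefix. It would not by itself show how the RA reached the real
NIC.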

Simon.




