[Dnsmasq-discuss] possible Bug: DHCPDISCOVER no address available

Simon Kelley simon at thekelleys.org.uk
Tue May 28 15:57:19 BST 2013


On 25/05/13 11:54, Thomas Kärgel wrote:
> Hi Simon,
>
>
> Am 24.05.2013 17:53, schrieb Simon Kelley:
>>
>>
>> Thanks for that. I've not found anything obvious yet that could cause
>> this. What mechanism adds stuff to the hostsfile and SIGHUPs dnsmasq?
>> We've not seen the logs for that procedure, AFAIK.
>>
>> Cheers,
>>
>> Simon.
>
> The mechanism works like this:
>
> 1. a new machine is created with openstack nova
>
> 2. nova asks quantum for network ports in specific networks required for
> the new machine.
>
> 3. quantum creates the port and assigns an IP address out of the
> IP range for this network. A MAC address is generated by quantum for
> the port. At this point the port is nothing more than an entry in the
> quantum database.
>
> 4. quantum triggers quantum-dhcp-agent over RPC to reload the
> allocations for the network in which the new port was created.
>
> 5. quantum-dhcp-agent then sigHUPs the corresponding dnsmasq process,
> recreates the hostsfile (completely, AFAIK) and sigHUPs dnsmasq again.
> As I now notice, this (recreating the hostsfile and sigHUP) is also done twice.
> I don't know why the programmer chose to sigHUP dnsmasq before and
> after recreating the hostsfile. IMHO it is only necessary to sigHUP
> dnsmasq once, after recreating the hostsfile. Do you think this might
> cause a problem?

Given that the whole thing is a bit racy, it could be a problem. AFAIK
there's no guarantee that two signals sent to a process in quick
succession won't be merged, so the effect of this could be to start
dnsmasq reading the old version of the file (or even a truncated version
of it) and for it _not_ to read the new version.
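
To illustrate the merging, here is a minimal Python sketch (not dnsmasq code). It assumes a POSIX system and a single-threaded process. The function name count_delivered_hups is made up for this example. It blocks SIGHUP, sends it several times while blocked, then unblocks it and counts how often the handler runs. Standard signals are not queued, so the pending copies collapse into one delivery.

```python
import os
import signal


def count_delivered_hups(sends: int) -> int:
    """Send SIGHUP to ourselves `sends` times while it is blocked,
    then unblock it and return how many times the handler ran."""
    delivered = 0

    def on_hup(signum, frame):
        nonlocal delivered
        delivered += 1

    previous = signal.signal(signal.SIGHUP, on_hup)
    if previous is None:
        # The old handler was not installed from Python; fall back to default.
        previous = signal.SIG_DFL
    try:
        # While blocked, the kernel only records "SIGHUP is pending",
        # not how many times it was sent.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGHUP})
        for _ in range(sends):
            os.kill(os.getpid(), signal.SIGHUP)
        # Unblocking delivers the single pending SIGHUP.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGHUP})
    finally:
        signal.signal(signal.SIGHUP, previous)
    return delivered


if __name__ == "__main__":
    print("sent 2, handler ran", count_delivered_hups(2), "time(s)")
```

The blocking here only makes the effect reproducible. In the real case the window is whatever time passes before dnsmasq gets to handle the first signal, so whether the two sigHUPs merge depends on timing.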

> And why is PID 12926 sigHUPed twice? Shouldn't it be sigHUPing 12927 the
> second time?

No, that's fine. 12926 is the main process and 12927 is a helper process 
used to run the DHCP script as root. You should always signal the main 
process.

>
> 6. In parallel with the two previous steps, nova-compute triggers the
> hypervisor to create the new machine. In most cases the hypervisor also
> creates the VIFs with the MAC address generated in step 3 (that depends
> on which hypervisor is used; in my case libvirt/XEN).
>
> 7. A quantum plugin (quantum-linuxbridge-agent, for example) is notified
> about the new port. The task of this plugin is to add the port to the
> corresponding bridge (or ovs-switch, if quantum-openvswitch-agent is used).
>
>
> Here are some logs I fetched this morning from the environment I'm
> working on.
>
>
> ps aux |grep dnsmasq
> dnsmasq  12926  0.1  0.0   9172   716 ?        S    May23   3:01
> /usr/sbin/dnsmasq --no-hosts --no-resolv --strict-order
> --bind-interfaces --interface=ns-806c6c7b-2b --except-interface=lo
> --domain=openstacklocal
> --pid-file=/var/lib/quantum/dhcp/d041e640-c37f-4b8f-9878-ecbc1b26d12f/pid --dhcp-hostsfile=/var/lib/quantum/dhcp/d041e640-c37f-4b8f-9878-ecbc1b26d12f/host
> --dhcp-optsfile=/var/lib/quantum/dhcp/d041e640-c37f-4b8f-9878-ecbc1b26d12f/opts
> --dhcp-script=/usr/bin/quantum-dhcp-agent-dnsmasq-lease-update
> --leasefile-ro --dhcp-range=set:tag0,10.70.149.0,static,120s
> root     12927  0.0  0.0   9172   320 ?        S    May23   0:27
> /usr/sbin/dnsmasq --no-hosts --no-resolv --strict-order
> --bind-interfaces --interface=ns-806c6c7b-2b --except-interface=lo
> --domain=openstacklocal
> --pid-file=/var/lib/quantum/dhcp/d041e640-c37f-4b8f-9878-ecbc1b26d12f/pid --dhcp-hostsfile=/var/lib/quantum/dhcp/d041e640-c37f-4b8f-9878-ecbc1b26d12f/host
> --dhcp-optsfile=/var/lib/quantum/dhcp/d041e640-c37f-4b8f-9878-ecbc1b26d12f/opts
> --dhcp-script=/usr/bin/quantum-dhcp-agent-dnsmasq-lease-update
> --leasefile-ro --dhcp-range=set:tag0,10.70.149.0,static,120s
>
>
> And the log of quantum-dhcp-agent:
>
> 2013-05-25 12:38:24    DEBUG [quantum.openstack.common.rpc.amqp] Making
> asynchronous cast on q-plugin...
> 2013-05-25 12:38:24    DEBUG [amqplib] Closed channel #1
> 2013-05-25 12:38:24    DEBUG [amqplib] using channel_id: 1
> 2013-05-25 12:38:24    DEBUG [amqplib] Channel open
> 2013-05-25 12:38:26    DEBUG [quantum.openstack.common.rpc.amqp] Making
> asynchronous cast on q-plugin...
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.utils] Running
> command: sudo quantum-rootwrap /etc/quantum/rootwrap.conf kill -HUP 12926
> 2013-05-25 12:38:26    DEBUG [amqplib] Closed channel #1
> 2013-05-25 12:38:26    DEBUG [amqplib] using channel_id: 1
> 2013-05-25 12:38:26    DEBUG [amqplib] Channel open
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.utils]
> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf',
> 'kill', '-HUP', '12926']
> Exit code: 0
> Stdout: ''
> Stderr: ''
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.dhcp] Reloading
> allocations for network: d041e640-c37f-4b8f-9878-ecbc1b26d12f
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.utils] Running
> command: sudo quantum-rootwrap /etc/quantum/rootwrap.conf kill -HUP 12926
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.utils]
> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf',
> 'kill', '-HUP', '12926']
> Exit code: 0
> Stdout: ''
> Stderr: ''
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.dhcp] Reloading
> allocations for network: d041e640-c37f-4b8f-9878-ecbc1b26d12f
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.utils] Running
> command: sudo quantum-rootwrap /etc/quantum/rootwrap.conf kill -HUP 12926
> 2013-05-25 12:38:26    DEBUG [quantum.openstack.common.rpc.amqp] Making
> asynchronous cast on q-plugin...
> 2013-05-25 12:38:26    DEBUG [amqplib] Closed channel #1
> 2013-05-25 12:38:26    DEBUG [amqplib] using channel_id: 1
> 2013-05-25 12:38:26    DEBUG [amqplib] Channel open
> 2013-05-25 12:38:26    DEBUG [quantum.openstack.common.rpc.amqp] Making
> asynchronous cast on q-plugin...
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.utils]
> Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf',
> 'kill', '-HUP', '12926']
> Exit code: 0
> Stdout: ''
> Stderr: ''
> 2013-05-25 12:38:26    DEBUG [quantum.agent.linux.dhcp] Reloading
> allocations for network: d041e640-c37f-4b8f-9878-ecbc1b26d12f
> 2013-05-25 12:38:26    DEBUG [amqplib] Closed channel #1
> 2013-05-25 12:38:26    DEBUG [amqplib] using channel_id: 1
> 2013-05-25 12:38:26    DEBUG [amqplib] Channel open
> 2013-05-25 12:38:26    DEBUG [quantum.openstack.common.rpc.amqp] Making
> asynchronous cast on q-plugin...
>
>
>

Try changing the dhcp-agent to only send sigHUP _after_ updating the 
file. If that doesn't work, the next stage is to alter dnsmasq to log a 
line count or checksum or similar, so we can see it's actually reading 
the file in the state we expect.
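
As a rough sketch of what "update, then signal once" could look like on the agent side (this is not the actual quantum-dhcp-agent code; the function names are invented for illustration): write the new hostsfile to a temporary file in the same directory, rename it over the old one, and only then send a single sigHUP to the PID from the pid-file. The rename step is my assumption about how to avoid dnsmasq ever seeing a truncated file. The function also returns a line count and SHA-256 checksum of what was written, which could be compared with whatever dnsmasq ends up logging.

```python
import hashlib
import os
import signal
import tempfile


def write_hosts_atomically(path, lines):
    """Replace `path` with `lines` in one rename, so a reader sees either
    the old file or the complete new one, never a partial write.
    Returns (line_count, sha256_hex) of the new contents."""
    data = "".join(line + "\n" for line in lines).encode()
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".hosts-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # mkstemp creates the file with mode 0600; dnsmasq runs as its own
        # user and must still be able to read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return len(lines), hashlib.sha256(data).hexdigest()


def reload_dnsmasq(pid_file):
    """Send exactly one SIGHUP to the main dnsmasq process named in pid_file."""
    with open(pid_file) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGHUP)
```

The intended order is write_hosts_atomically() first and reload_dnsmasq() afterwards, with no signal sent before the file is in place.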


Cheers,

Simon.


