[Dnsmasq-discuss] [PATCH] Offered IPv4 DHCP multiple times

Simon Kelley simon at thekelleys.org.uk
Thu Dec 16 13:51:42 UTC 2021



On 13/12/2021 23:04, Petr Menšík wrote:
> Hello Simon and others.
> 
> In certain situations, dnsmasq DHCP will offer a single IP address to
> multiple different clients. Later it ACKs the first client and NACKs the
> second. It relies on the ability of those clients to retry, but it seems
> netbooting software often cannot recover from such behaviour.

The netbooting software is broken. This is not a surprise.
> 
> I am attaching the script I use to reproduce this issue. Just create a
> local bridge with a limited IP pool in its dhcp-range option, only a few
> addresses more than the actual number of instances. The instances start
> at roughly the same time.
> 
> There is also a pcap file in the linked bug, along with some other
> reports. A good summary is in comment 85 [3].
> 
> In the attached patches, I introduce something I call temporary leases.
> Those leases are never saved to the lease file. They have a short
> duration, set to 30 s, the same as the ping timeout. This ensures that,
> even with dhcp-sequential-ip, different clients have reservations for
> different addresses. It helps especially when --no-ping is used. Without
> this change it takes quite long to retry the discover-offer-request
> exchange multiple times. The pings act as a sort of workaround for this
> deficiency, but cover only 6 different addresses in the default
> configuration. After that the server switches to overload, which
> behaves like no-ping: it offers multiple clients the same address, but
> when the 2nd client requests the lease, it denies it again.
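
The temporary-lease idea quoted above can be sketched roughly as follows.
This is only an illustration in Python, not the attached patch: the class
name, method name and in-memory table are hypothetical, and only the 30 s
duration comes from the quoted text.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

# From the quoted text: temporary leases last 30 s, same as the ping timeout.
TEMP_LEASE_SECONDS = 30.0


class TemporaryLeases:
    """Hypothetical in-memory table of short-lived address reservations."""

    def __init__(self, pool: Iterable[str],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._pool = list(pool)
        self._clock = clock
        # address -> (client id, expiry time); never written to a lease file
        self._held: Dict[str, Tuple[str, float]] = {}

    def offer(self, client_id: str) -> Optional[str]:
        now = self._clock()
        # Forget reservations whose short lifetime has run out.
        self._held = {addr: (cid, expiry)
                      for addr, (cid, expiry) in self._held.items()
                      if expiry > now}
        # A client that retries gets the address it was already offered.
        chosen = next((addr for addr, (cid, _) in self._held.items()
                       if cid == client_id), None)
        if chosen is None:
            # Otherwise take the first address nobody holds (sequential order).
            chosen = next((addr for addr in self._pool
                           if addr not in self._held), None)
        if chosen is not None:
            self._held[chosen] = (client_id, now + TEMP_LEASE_SECONDS)
        return chosen
```

With such a table, two clients that send DISCOVER at nearly the same time
receive different offers, and an offer that is never followed by a
REQUEST frees its address again once the reservation expires.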

The ping-cache was never intended to fix this problem, not least because
it can be disabled with --no-ping. The way it's intended to work with a
busy server is that clients are offered addresses based on a hash of
their hardware address, and the pool of addresses is large enough to
avoid most collisions. This clearly doesn't work when sequential
addresses are enabled, since the offered addresses are not randomised,
so the ping-cache is used as a band-aid. It doesn't make things work at
high loads.
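
To illustrate the difference, here is a minimal sketch in Python. It is
not dnsmasq's code: the function names are made up for the example and
SHA-256 merely stands in for whatever hash the server really uses. It
only shows why sequential allocation hands the same address to every
client that asks before a lease is recorded, while a hash of the
hardware address spreads the offers over the pool.

```python
import hashlib
import ipaddress
from typing import Optional, Set


def hashed_offer(mac: str, pool_start: str, pool_size: int) -> str:
    """Pick an offset into the pool from a hash of the hardware address."""
    digest = hashlib.sha256(mac.lower().encode("ascii")).digest()
    offset = int.from_bytes(digest[:4], "big") % pool_size
    return str(ipaddress.IPv4Address(pool_start) + offset)


def sequential_offer(pool_start: str, pool_size: int,
                     leased: Set[str]) -> Optional[str]:
    """Pick the lowest address in the pool that has no lease yet."""
    first = ipaddress.IPv4Address(pool_start)
    for offset in range(pool_size):
        candidate = str(first + offset)
        if candidate not in leased:
            return candidate
    return None


# Two clients send DISCOVER before either lease has been committed.
leased: Set[str] = set()
seq_a = sequential_offer("192.168.0.10", 50, leased)
seq_b = sequential_offer("192.168.0.10", 50, leased)  # same address as seq_a

hash_a = hashed_offer("52:54:00:00:00:01", "192.168.0.10", 50)
hash_b = hashed_offer("52:54:00:00:00:02", "192.168.0.10", 50)
```

In the sequential case both offers are identical by construction. In the
hashed case they differ unless the two hardware addresses happen to hash
to the same offset, which becomes less likely as the pool grows.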

If dnsmasq is configured suitably for the requirements, ie --no-ping
set, --dhcp-sequential-ip NOT set and an address pool significantly
larger than the expected number of clients, is there still a problem?
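
How much larger "significantly larger" needs to be can be estimated with
the usual birthday-problem arithmetic. The sketch below is Python with a
hypothetical helper name; it assumes the hash spreads clients uniformly
over the pool and ignores addresses that are already leased, so treat the
result as a rough guide only.

```python
def collision_probability(clients: int, pool_size: int) -> float:
    """Chance that at least two of `clients` hash to the same pool address."""
    if clients > pool_size:
        return 1.0  # more clients than addresses: a clash is certain
    p_all_distinct = 1.0
    for already_placed in range(clients):
        # The next client must miss every address already picked.
        p_all_distinct *= (pool_size - already_placed) / pool_size
    return 1.0 - p_all_distinct


# Example: 10 clients booting together, with pools of increasing size.
for size in (12, 50, 250, 1000):
    print(size, round(collision_probability(10, size), 3))
```

A pool only a few addresses larger than the number of clients makes a
clash very likely under these assumptions, and the probability falls as
the pool grows.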
> 
> I think I was able to find a relatively simple algorithm. I think IPv6
> should receive a similar approach. We side-stepped this by offering a
> different address in the ACK in thread [4]. While that seems to work, I
> think it would be better not to offer an address that the server later
> rejects itself. We test DHCP clients' abilities for no good reason.
> 
> With these patches, even multiple clients without ping boot fast enough,
> even when they start at a similar time. Starting at a similar time is a
> common thing on boot of cloud hosting instances, which may use dnsmasq
> for local caching. OpenStack is an example where it was recorded, but it
> can happen even with normal machines, for example in a classroom with 10
> computers.
> 
> Would you look at it or test it, to see whether any issues with those
> changes can be found?
> 

I'll take a look.


Cheers,

Simon.

> Cheers,
> Petr
> 
> 3. https://bugzilla.redhat.com/show_bug.cgi?id=2028704#c85
> 4.
> https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2021q3/015585.html
> 
> On 12/8/21 01:18, Petr Menšík wrote:
>>
>> Hi Simon and others,
>>
>> I am debugging a strange issue, which happens inside OpenStack in
>> certain situations. It seems that under not precisely defined
>> conditions dnsmasq returns a "no address available" error even when
>> not all leases are used yet.
>>
>> It seems do_icmp_ping is responsible for ruling out recently tried IP
>> addresses. It seems a bit weird that address allocation happens only
>> for addresses not recently pinged. I have found another place which
>> calls do_icmp_ping but does not use the hash value computed from the
>> hardware address, even though it is already known at that time. The
>> first attached patch adds the hash to that second place as well. That
>> should mean a single address would use a shared ping. The second patch
>> slightly simplifies do_icmp_ping and its return value. Instead of
>> artificially ensuring the hash would match, it just returns the
>> correct value when the hash matches. The second change is just an
>> optional optimization.
>>
>> A few details are at RH bug #2028704 [1]. The originally tested version
>> 2.79 did not contain commit 0669ee7a69a
>> <http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=0669ee7a69a004ce34fed41e50aa575f8e04427b>
>> [2], which improves the situation. But I think there remain cases where
>> a ping is not accepted when it should be. According to the report,
>> testing with the latest release did not work. I think the first patch
>> may fix the part that is still missing.
>>
>> Cheers,
>> Petr
>>
>> 1. https://bugzilla.redhat.com/show_bug.cgi?id=2028704
>> 2.
>> http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=0669ee7a69a004ce34fed41e50aa575f8e04427b
>>
>> -- 
>> Petr Menšík
>> Software Engineer
>> Red Hat, http://www.redhat.com/
>> email: pemensik at redhat.com
>> PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
> 
> -- 
> Petr Menšík
> Software Engineer
> Red Hat, http://www.redhat.com/
> email: pemensik at redhat.com
> PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
> 
