[Dnsmasq-discuss] dnsmasq using 100% of cpu
Geert Stappers
geert.stappers at hendrikx-itc.nl
Mon Feb 17 14:37:22 GMT 2020
On 17-02-2020 14:31, Donald Sharp wrote:
> Running:
>
> sharpd at eva:~/dnsmasq$ /sbin/dnsmasq --version
> Dnsmasq version 2.80 Copyright (c) 2000-2018 Simon Kelley
2018, and no short git hashes or similar indicators of the source version.
> Compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua
> TFTP conntrack ipset auth DNSSEC loop-detect inotify dumpfile
> ----
>
> When I install several hundred thousand routes into the kernel and
> remove them( or some variation thereof ), dnsmasq eventually ends up
> running 100% cpu:
>
> top - 18:45:18 up 1 day, 7:44, 1 user, load average: 2.70, 2.65, 2.34
> Tasks: 424 total, 3 running, 421 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 12.1 us, 6.9 sy, 0.0 ni, 80.2 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
> MiB Mem : 32131.3 total, 19483.6 free, 6620.3 used, 6027.4 buff/cache
> MiB Swap: 32718.0 total, 31693.0 free, 1025.0 used. 24698.2 avail Mem
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 293183 nobody 20 0 11040 2040 1688 R 99.7 0.0 148:48.40 dnsmasq
The "CPU 100%" prompted me to search the `git log` output for 'CPU'. I found:
commit df6636bff61aa53ed7ad4b34d940805193c0bc74
Author: Florent Fourcot <florent.fourcot at wifirst.fr>
Date: Mon Feb 11 17:04:44 2019 +0100
lease: prune lease as soon as expired
We detected a performance issue on a dnsmasq running many dhcp sessions
(more than 10 000). At the end of the day, the server was only releasing
old DHCP leases but was consuming a lot of CPU.
It looks like the current dhcp pruning works as follows:
1) it prunes old sessions (iterating over all current leases). It's
important to note that it only prunes sessions that expired more
than one second ago
2) it looks for the next lease to expire (iterating over all current leases
again)
3) it launches an alarm to catch the next expiration found in step 2). This
value can be zero for leases that have just expired (but were not pruned).
So, for a second, dnsmasq could fall into a "prune loop" by:
* Not pruning anything, since difftime() is not > 0
* Running the alarm again with zero as argument
On a server with a very large number of leases that releases sessions
often, that can waste a lot of CPU time.
Signed-off-by: Florent Fourcot <florent.fourcot at wifirst.fr>
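To make the loop in that commit message concrete, here is a minimal sketch. It is not dnsmasq's code: `struct lease`, `prune_old`, `prune_new`, `next_delay`, `wakeup` and `spins` are hypothetical names, and the `>= 0` comparison in `prune_new` is only my reading of the commit title ("prune lease as soon as expired"). Time is frozen at `now` to model the one second in which a lease has expired but is not yet strictly in the past.

```c
#include <time.h>

/* Hypothetical lease record, not dnsmasq's struct. */
struct lease { time_t expires; int active; };

typedef void (*prune_fn)(struct lease *l, int n, time_t now);

/* Behaviour described in the commit message: prune only when the
   lease is strictly past its expiry (difftime() > 0). */
static void prune_old(struct lease *l, int n, time_t now) {
    for (int i = 0; i < n; i++)
        if (l[i].active && difftime(now, l[i].expires) > 0)
            l[i].active = 0;
}

/* Assumed fix: prune as soon as the expiry time is reached. */
static void prune_new(struct lease *l, int n, time_t now) {
    for (int i = 0; i < n; i++)
        if (l[i].active && difftime(now, l[i].expires) >= 0)
            l[i].active = 0;
}

/* Seconds until the next active lease expires, clamped at 0;
   -1 when no active lease is left. */
static long next_delay(const struct lease *l, int n, time_t now) {
    long best = -1;
    for (int i = 0; i < n; i++) {
        if (!l[i].active)
            continue;
        long d = (long)difftime(l[i].expires, now);
        if (d < 0)
            d = 0;
        if (best < 0 || d < best)
            best = d;
    }
    return best;
}

/* One wakeup: steps 1) and 2) of the commit message, returning the
   delay that step 3) would hand to the alarm. */
static long wakeup(prune_fn prune, struct lease *l, int n, time_t now) {
    prune(l, n, now);
    return next_delay(l, n, now);
}

/* Count consecutive wakeups that re-arm the alarm with a zero delay
   while the clock still reads 'now'; stop after 'limit' rounds. */
static int spins(prune_fn prune, time_t now, int limit) {
    struct lease l[3] = { { now, 1 }, { now + 60, 1 }, { now + 120, 1 } };
    int count = 0;
    while (count < limit && wakeup(prune, l, 3, now) == 0)
        count++;
    return count;
}
```

Calling `spins(prune_old, now, 50)` hits the limit, because the lease that expires exactly at `now` is never pruned and the delay stays zero; `spins(prune_new, now, 50)` returns 0, because the first wakeup prunes that lease and the next expiry is 60 seconds away.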
>
> strace output:
>
> poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5,
> events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8,
> events=POLLIN}], 6, -1) = 1 ([{fd=4, revents=POLLERR}])
> ....
> poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5,
> events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8,
> events=POLLIN}], 6, -1) = 1 ([{fd=4, revents=POLLERR}])
> poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5,
> events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8,
> events=POLLIN}], 6, -1) = 1 ([{fd=4, revents=PO^Cstrace: Process
> 293183 detached
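Note what the strace shows: every poll() call returns at once with POLLERR on fd 4, although only POLLIN was requested. poll() reports POLLERR regardless of the requested events, and it keeps reporting it for as long as the descriptor stays in that state, so a loop that only acts on POLLIN never blocks. The sketch below reproduces that pattern on Linux with a pipe whose read end is closed. It is an illustration under that assumption, not dnsmasq's code, and it says nothing about what fd 4 is in your process; `count_pollerr_wakeups` is a hypothetical helper.

```c
#include <poll.h>
#include <unistd.h>

/* Poll 'rounds' times on a descriptor that is in an error state and
   count how often poll() returns with POLLERR set and POLLIN clear.
   Returns -1 if the pipe cannot be created. */
static int count_pollerr_wakeups(int rounds) {
    int p[2];
    if (pipe(p) != 0)
        return -1;
    close(p[0]); /* no reader left: on Linux the write end now reports POLLERR */

    struct pollfd pfd = { .fd = p[1], .events = POLLIN };
    int hits = 0;
    for (int i = 0; i < rounds; i++) {
        pfd.revents = 0;
        /* 1 second timeout, so a missing POLLERR would be visible as a stall */
        int rc = poll(&pfd, 1, 1000);
        if (rc == 1 && (pfd.revents & POLLERR) && !(pfd.revents & POLLIN))
            hits++;
        /* A loop that only handles POLLIN does nothing here and polls again. */
    }
    close(p[1]);
    return hits;
}
```

On Linux, `count_pollerr_wakeups(5)` should return 5 without any noticeable delay, which is the same shape as the repeated poll() lines in the trace.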
>
> I can pretty much make this happen at will. What can I provide to
> help debug this?
Start by stating how recent the source you are using is.
>
> As a side note, I was not placing these routes into the default linux
> routing table. Does dnsmasq need to be paying attention to these routes?
Side notes in a separate thread please.
>
> donald
>
Regards
Geert Stappers