[Dnsmasq-discuss] dnsmasq using 100% of cpu

Geert Stappers geert.stappers at hendrikx-itc.nl
Mon Feb 17 14:37:22 GMT 2020


On 17-02-2020 14:31, Donald Sharp wrote:

> Running:
>
> sharpd@eva:~/dnsmasq$ /sbin/dnsmasq --version
> Dnsmasq version 2.80  Copyright (c) 2000-2018 Simon Kelley

Copyright 2018, and no short git hash or similar indicator of the source version.


> Compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua
> TFTP conntrack ipset auth DNSSEC loop-detect inotify dumpfile
> ----
>
> When I install several hundred thousand routes into the kernel and
> remove them( or some variation thereof ), dnsmasq eventually ends up
> running 100% cpu:
>
> top - 18:45:18 up 1 day,  7:44,  1 user,  load average: 2.70, 2.65, 2.34
> Tasks: 424 total,   3 running, 421 sleeping,   0 stopped,   0 zombie
> %Cpu(s): 12.1 us,  6.9 sy,  0.0 ni, 80.2 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
> MiB Mem :  32131.3 total,  19483.6 free,   6620.3 used,   6027.4 buff/cache
> MiB Swap:  32718.0 total,  31693.0 free,   1025.0 used.  24698.2 avail Mem
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>  293183 nobody    20   0   11040   2040   1688 R  99.7   0.0 148:48.40 dnsmasq


The "100% CPU" made me run `git log` and search it for 'CPU'.  I found:


commit df6636bff61aa53ed7ad4b34d940805193c0bc74
Author: Florent Fourcot <florent.fourcot at wifirst.fr>
Date:   Mon Feb 11 17:04:44 2019 +0100

    lease: prune lease as soon as expired
   
    We detected a performance issue on a dnsmasq running many dhcp sessions
    (more than 10 000). At the end of the day, the server was only releasing
    old DHCP leases but was consuming a lot of CPU.
   
    It looks like curent dhcp pruning:
     1) it's pruning old sessions (iterate on all current leases). It's
     important to note that it's only pruning session expired since more
     than one second
     2) it's looking for next lease to expire (iterate on all current leases
     again)
     3) it launchs an alarm to catch next expiration found in step 2). This
     value can be zero for leases just expired (but not pruned).
   
    So, for a second, dnsmasq could fall in a "prune loop" by doing:
     * Not pruning anything, since difftime() is not > 0
     * Run alarm again with zero as argument
   
    On a server with very large number of leases and releasing often
    sessions, that can waste a very big CPU time.
   
    Signed-off-by: Florent Fourcot <florent.fourcot at wifirst.fr>
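
To make the loop described in that commit message concrete, here is a
minimal sketch in C. It is not dnsmasq code: struct lease, prune(),
next_alarm() and spin_rounds() are names made up for illustration, and
the non-strict comparison is only my assumption about what "prune as
soon as expired" amounts to; I have not quoted the actual diff.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a lease record; not dnsmasq's own struct. */
struct lease {
    time_t expires;
    int    active;
};

/* One prune pass over all leases.
 * strict != 0: drop only leases with difftime(now, expires) > 0
 *              (the behaviour the commit message describes).
 * strict == 0: also drop leases expiring exactly at 'now'
 *              (assumed reading of "prune as soon as expired").
 * Returns the number of leases pruned. */
int prune(struct lease *l, size_t n, time_t now, int strict)
{
    int pruned = 0;
    assert(l != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!l[i].active)
            continue;
        double d = difftime(now, l[i].expires);
        if (strict ? d > 0 : d >= 0) {
            l[i].active = 0;
            pruned++;
        }
    }
    return pruned;
}

/* Seconds until the next expiry, clamped at 0; -1 if no lease is left. */
long next_alarm(const struct lease *l, size_t n, time_t now)
{
    long best = -1;
    assert(l != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!l[i].active)
            continue;
        double d = difftime(l[i].expires, now);
        long wait = d > 0 ? (long)d : 0;
        if (best < 0 || wait < best)
            best = wait;
    }
    return best;
}

/* Count prune/alarm rounds that make no progress while the clock
 * stays at 'now'. 'cap' bounds the loop so the sketch terminates. */
int spin_rounds(time_t now, int strict, int cap)
{
    struct lease l[] = { { now, 1 }, { now + 60, 1 } };
    int rounds = 0;
    while (rounds < cap) {
        prune(l, 2, now, strict);
        if (next_alarm(l, 2, now) != 0)
            break;              /* alarm is in the future: no spin */
        rounds++;               /* alarm(0) again, nothing pruned  */
    }
    return rounds;
}
```

With the strict test, spin_rounds() runs into its cap because the lease
that expires exactly at 'now' is never pruned and the alarm stays at
zero. With the non-strict test the first pass prunes it and the next
alarm is 60 seconds away.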




>
> strace output:
>
> poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5,
> events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8,
> events=POLLIN}], 6, -1) = 1 ([{fd=4, revents=POLLERR}])
>     ....
> poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5,
> events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8,
> events=POLLIN}], 6, -1) = 1 ([{fd=4, revents=POLLERR}])
> poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=5,
> events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8,
> events=POLLIN}], 6, -1) = 1 ([{fd=4, revents=PO^Cstrace: Process
> 293183 detached
>
> I can pretty much make this happen at will.  What can I provide to
> help debug this?

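Start by stating how recent the source is that you are using. The commit
above is from February 2019, later than the 2018 copyright in your
version output, so it may not be in your build.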
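
About the strace itself: every poll() has timeout -1 and asks only for
POLLIN, yet it returns at once with POLLERR on fd 4. POLLERR is reported
whether or not it was requested, so as long as the error condition on
that descriptor is not cleared, each poll() returns immediately. The
sketch below reproduces that pattern with a pipe on Linux. It says
nothing about what fd 4 is in dnsmasq; count_pollerr_wakeups() is a
made-up name and the pipe is only a convenient way to get a descriptor
in an error state.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Poll the write end of a pipe whose read end has been closed, asking
 * only for POLLIN. Returns how many of 'rounds' polls reported POLLERR,
 * or -1 if the pipe could not be created.
 * Assumption: Linux semantics, where a pipe write end without readers
 * polls as POLLERR. */
int count_pollerr_wakeups(int rounds)
{
    int fds[2];
    int hits = 0;

    assert(rounds >= 0);
    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);                  /* no reader left */

    for (int i = 0; i < rounds; i++) {
        struct pollfd p = { .fd = fds[1], .events = POLLIN, .revents = 0 };
        /* Timeout 0 so the sketch can never block; the strace used -1. */
        if (poll(&p, 1, 0) == 1 && (p.revents & POLLERR))
            hits++;
    }

    close(fds[1]);
    return hits;
}
```

Every round reports POLLERR because nothing in the loop changes the
state of the descriptor, which is the same shape as the repeated lines
in the strace.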


>
> As a side note, I was not placing these routes into the default linux
> routing table.  Does dnsmasq need to be paying attention to these routes?

Side notes in a separate thread, please.


>
> donald
>

Regards

Geert Stappers




