[Dnsmasq-discuss] dnsmasq using 100% cpu on router

David Joslin davidj at nkcc.org.uk
Fri Apr 25 08:37:16 UTC 2014


Hi Kevin and thanks for the help.

Is it possible to upgrade the dnsmasq version on the router without waiting
for the author of the tomato firmware to include a later version in a
release of his firmware (and you mentioned that dnsmasq in tomato isn't a
clean pull of Simon's release)?

Why would changing the location of the leasefile to a usb stick make a
difference? If the issue, as Simon suggests, is caused by the constant
rewriting of the lease database, then wouldn't its current location (which
on a router would be RAM) be a faster/better option than a usb stick? Or is
there another possible issue here that I've missed?
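
As a rough illustration of why the rewriting might matter far more at 100
clients than at 50, here is a minimal Python sketch. It is not dnsmasq's code.
It assumes, following Simon's description, that every lease change rewrites the
whole lease file, that the file holds one line per lease, and that each client
causes exactly one change (renewals are ignored). The name `lines_written` is
made up for the example.

```python
def lines_written(num_clients: int) -> int:
    """Total lease lines written if every new lease rewrites the whole file.

    Simplified model (an assumption, not dnsmasq's real behaviour):
    one line per lease, one full rewrite per new lease, no renewals.
    """
    total = 0
    leases = 0
    for _ in range(num_clients):
        leases += 1      # one more client obtains a lease
        total += leases  # the whole file is rewritten, one line per lease
    return total


if __name__ == "__main__":
    for n in (50, 100):
        print(n, lines_written(n))
```

Under these assumptions the total is n(n+1)/2, so doubling the client count
roughly quadruples the write work. Whether that is enough to explain 100% cpu
on this router is something only the logs or strace output can confirm.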

The only recent change I've made to the router was the addition of a usb
stick as the location for the writing of system logs and bandwidth and IP
traffic usage logs (so that they weren't lost on a reboot). I had wondered
if the cause of the problem was related to the speed of writing this stuff
(which obviously includes dnsmasq logging) to the usb stick rather than
RAM. That's why I turned off dnsmasq logging at one point but it didn't
seem to make any difference.

Thanks again for your help and I'll wait for your comments on the above.

Cheers

David




On 24 April 2014 21:13, Kevin Darbyshire-Bryant <
kevin at darbyshire-bryant.me.uk> wrote:

> On 24/04/2014 20:49, Simon Kelley wrote:
> > On 24/04/14 20:41, David Joslin wrote:
> >> Thanks for the reply, Simon.
> >>
> >> DNSSEC isn't enabled.
> >>
> >> I wonder if the pattern of the problem gives any clues...
> >>
> >> As I said, on a normal day with around 40-50 clients on the network there
> >> is no problem at all with dnsmasq managing to use barely 0 - 2% of the CPU.
> >> When the problem occurred there were a little over 100 clients. Running top
> >> showed dnsmasq using 100% cpu so I restarted dnsmasq and kept an eye on
> >> top. For maybe 5 or 10 minutes there was no problem, with dnsmasq using
> >> very little cpu. Then dnsmasq would start to peak at maybe 20-30% for a
> >> couple of seconds before dropping back. Then it would start peaking at
> >> higher and higher levels before dropping back. Eventually, after running
> >> for maybe half an hour it would start peaking at over 90% and staying there
> >> for longer before dropping back. At this point dns requests would become
> >> very slow (and maybe time out). And then dnsmasq would hit 100% cpu and
> >> would stay there. Dns requests would time out and only restarting dnsmasq
> >> would fix the problem. The pattern would then start over again.
> >>
> >> I may be wrong but it doesn't seem that dnsmasq is hitting a bug that
> >> suddenly causes it to loop and hog the cpu until it's killed. It seems to
> >> gradually show more and more of the problem before it eventually hogs 100%
> >> cpu and has to be killed.
> >>
> >> If the problem was caused by dnsmasq being overloaded with requests, is it
> >> likely or possible that 50 clients could put very little load on it but 100
> >> clients could swamp it? Also, would the problem not show itself as soon as
> >> dnsmasq was restarted rather than showing the gradual increase in peak
> >> usage until it hits 100%?
> >
> > Logs would help. The pattern doesn't look familiar, but if I had to
> > guess, I'd say that the problem is DHCP, not DNS. Every change to the
> > DHCP lease database causes the file storing it to be re-written, and I
> > suspect that's what's eating CPU, in disk wait.
> >
> > Version of dnsmasq in use would be useful, and a copy of your config (to
> > me privately, if you prefer.)
> >
> > When dnsmasq is running at 100%, try running
> >
> > strace -p <pid of dnsmasq process>
> >
> > that will run forever, printing what syscalls are being made; you can
> > ctrl-c it after a short while, which will stop strace, but not dnsmasq.
> >
> >
> > Cheers,
> >
> >
> > Simon
> >
> >
>
> Chaps,
>
> Please be aware that the dnsmasq included in tomato is not a clean
> 'pull' out of Simon's release but includes some tweaks, mainly to the
> lease writing code (where it outputs 'remaining leasetime' rather than
> expiry time). There's also a 'helper' function that, upon receipt of
> SIGUSR1 (or it may be 2, I can't remember), dumps the leasefile in a
> tomato-specific format so that it may be read & parsed into the 'dhcp
> status' page.
>
> Those changes were 'formalised' by me into IFDEF conditional compilation
> flags when I first investigated updating dnsmasq from v2.61 to something
> slightly newer which fixed the IPv6 RA flags.  The original changes by
> Jon Zarate were identified and re-inserted after a few false starts.  I
> am no 'C' coder!
>
> My suggestion for a start is to upgrade to dnsmasq 2.70 rather than a
> test release of 2.69.  Also try changing the location of the leasefile
> to somewhere else, e.g. a USB stick if your router supports it.
>
> I've not encountered anything like this but then I don't have 100 clients.
>
> Kevin
>
>
>
> _______________________________________________
> Dnsmasq-discuss mailing list
> Dnsmasq-discuss at lists.thekelleys.org.uk
> http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
>
>