[Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

Simon Kelley simon at thekelleys.org.uk
Tue Apr 13 23:34:03 UTC 2021


Tagging onto the end of the thread just to report the results of my
research.

This started because of problems with the OOM killer in a
resource-constrained system that was prompting OOM death when it spawned
sub-processes to handle TCP connections. I proposed a trick of putting
the large in-memory dataset in a sub-process, so that the main dnsmasq
process, which forks to handle TCP requests, stays small. Reading around,
that probably won't work: the OOM killer weighs the size of all children
when looking for a victim. I think that trying to outsmart the OOM
killer is probably a hiding to nothing. It's possible for the OS to
protect critical daemons using oom_adj and friends, and supporting that
in OpenWRT looks like a better way to go.
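
For illustration only, here is a minimal sketch of the kind of protection
I mean. On current Linux kernels the knob is /proc/<pid>/oom_score_adj
(range -1000 to 1000, where -1000 exempts the process from the OOM
killer); oom_adj is the older, deprecated interface. The helper name and
the proc_root parameter are made up for the example, lowering the value
needs CAP_SYS_RESOURCE (in practice, root), and none of this is code from
dnsmasq or OpenWRT.

```python
import os

# Limits documented for /proc/<pid>/oom_score_adj on Linux.
OOM_SCORE_ADJ_MIN = -1000  # never selected by the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # first in line to be killed


def set_oom_score_adj(value, pid="self", proc_root="/proc"):
    """Write an OOM score adjustment for a process and return the path used.

    Hypothetical helper: 'proc_root' exists only so the function can be
    exercised against a scratch directory instead of the real /proc.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return path


# Example (run as root, early in daemon start-up):
#   set_oom_score_adj(-1000)
```

An init script could presumably achieve the same by echoing the value into
that file for the daemon's pid once it has started.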

We then moved on to the fact that adblocking involves thousands of lines of

local=/example.com/

and the code supporting that didn't really envisage such large numbers.
I think improvements can be made there, and I'll look at doing that in
more detail.
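
To make the scaling concern concrete, here is a rough sketch of one way
lookups could be made independent of the list size: keep the blocked
domains in a hash set and test each suffix of the queried name, so the
cost depends on the number of labels in the name rather than the number
of local= lines. It's Python rather than C, the function names are made
up, it only handles the bare local=/domain/ form shown above, and it is
not taken from the dnsmasq source or a commitment to any particular
design.

```python
def parse_local_lines(lines):
    """Collect domains from 'local=/domain/' lines into a set.

    Assumption: only the address-less form ending in '/' is handled;
    anything else (comments, other options) is ignored.
    """
    blocked = set()
    for line in lines:
        line = line.strip()
        if not (line.startswith("local=/") and line.endswith("/")):
            continue
        # Several domains may appear between the slashes.
        for domain in line[len("local="):].split("/"):
            if domain:
                blocked.add(domain.lower().rstrip("."))
    return blocked


def is_blocked(name, blocked):
    """Return True if 'name' or any parent domain of it is in 'blocked'."""
    labels = name.lower().rstrip(".").split(".")
    # Try the full name first, then strip one leading label at a time.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocked:
            return True
    return False
```

With this layout, a query for www.example.com costs at most three set
probes (www.example.com, example.com, com) however many entries are
loaded.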


Cheers,
Simon.




On 09/04/2021 23:28, Neal P. Murphy wrote:
> On Fri, 9 Apr 2021 18:24:47 +0100
> Simon Kelley <simon at thekelleys.org.uk> wrote:
> 
>> On 05/04/2021 16:16, Gordon Shawn wrote:
>>>     Date: Thu, 1 Apr 2021 22:11:17 -0400
>>>     From: "Neal P. Murphy" <neal.p.murphy at alum.wpi.edu>
>>>     To: dnsmasq-discuss at lists.thekelleys.org.uk
>>>     Subject: Re: [Dnsmasq-discuss] Partial denial of service with dnsmasq
>>>             on resource constrained systems
>>>     Message-ID: <20210401221117.47313352 at playground>
>>>     Content-Type: text/plain; charset=US-ASCII
>>>
>>>     On Thu, 1 Apr 2021 23:55:08 +0100
>>>     Simon Kelley <simon at thekelleys.org.uk> wrote:
>>>   
>>>     > >
>>>     > > One other thing I saw while testing with large blocklists was a
>>>     > > noticeable latency increase, likely related to lookup times. I
>>>     > > recall some discussion on the ML where you mentioned work on a
>>>     > > hash/tree solution was in progress. Were those changes completed?
>>>     > >
>>>     >
>>>     >
>>>     > This seems to be the crucial aspect here: large blocklists. If we move
>>>     > the large blocklists to a subsystem designed to handle them, then the
>>>     > problem goes away.
>>>     >
>>>     > I could do with a handle on exactly how people are configuring dnsmasq
>>>     > to do ad blocking. It's not something I have much experience of.  
>>>
>>>     On Smoothwall Express, I've conf'ed dnsmasq to 'undefine' a large
>>>     number of FQDNs using the form 'local=/8teenporno.com/'. I pull
>>>     the Shalla data and use the ads, pron, warez, and a few other
>>>     categories.
>>>
>>>     768,000 FQDNs makes dnsmasq use around 100MiB of RAM. On an Atom
>>>     N270 running SWE, response time is generally in the range of 75 ms
>>>     to 100 ms when there's no traffic. With the DL saturated (using
>>>     speedtest.net), response times range from 500 ms to 2 s. Saturated
>>>     UL doesn't seem to affect response time much.
>>>
>>>     I've been satisfied with its operation; I see almost no ads and
>>>     pretty much nothing in the other categories I use.
>>>
>>>     N
>>>
>>> This is still about 130 bytes per FQDN, which sounds like a lot of RAM.
>>>
>>> The key issue to me with large lists is that addn-hosts etc. can be
>>> reloaded by SIGHUP, so dnsmasq picks up changes very quickly, which is
>>> fine. However, when you have large conf files with many --local or
>>> --address lines, a full dnsmasq restart is required, and it can take
>>> seconds or minutes, which is very bad when you need to update the
>>> blocklists.
>>>
>>> It would be nice for local/address/cname lines to be parsed just like
>>> the addn-hosts files, i.e. with a SIGHUP they can be quickly parsed and
>>> used, instead of being part of the full-blown dnsmasq.conf with its
>>> time-consuming parsing.
>>>   
>>
>> If the time spent in (re)starting dnsmasq is almost all taken in parsing
>> enormous blocklists, then adding the ability to re-read the blocklists
>> without re-starting dnsmasq isn't going to help: the existing dnsmasq
>> process will just go away for minutes parsing the blocklist instead.
> 
> On my 1.6GHz Atom N270, it takes dnsmasq seven seconds to be killed and restarted and to read 768,000 'local' entries (that is, its RAM usage has peaked and it has daemonized itself); it takes another 7 seconds for dnsmasq's CPU usage to drop down to 5%. For most SOHO uses (and others where DNS use isn't really high), that's not too unreasonable considering the garbage that can be kept out of one's network. At a site where there is very high utilization of DNS, seven seconds would be far too long; but then, they likely wouldn't be using dnsmasq anyway.
> 
> It might be nice if dnsmasq had ipset-like functionality (probably optional) to add/change/delete local domains on the fly. But saving seven seconds once a day (or once a week) might not be worth the development effort.
> 
> FWIW, I use the following Shalla categories in my blocklist:
>   - adv  
>   - aggressive
>   - costtraps
>   - fortunetelling
>   - hacking
>   - porn 
>   - redirector
>   - sex_lingerie
>   - spyware
>   - tracker
>   - violence
> 
> One of these days, I'll put the 112,000 IPv4 addrs that are in those categories into a netfilter set and block them as well.
> 
> N
> 
> _______________________________________________
> Dnsmasq-discuss mailing list
> Dnsmasq-discuss at lists.thekelleys.org.uk
> https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
> 



