[Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

Kevin 'ldir' Darbyshire-Bryant ldir at darbyshire-bryant.me.uk
Fri Apr 16 09:04:52 UTC 2021



> On 14 Apr 2021, at 00:34, Simon Kelley <simon at thekelleys.org.uk> wrote:
> 
> Tagging onto the end of the thread just to report the results of my
> research.
> 
> This started because of problems with the OOM killer in a
> resource-constrained system that was prompting OOM death when it spawned
> sub-processes to handle TCP connections. I proposed a trick of putting
> the large in-memory dataset in sub-process, so that the main dnsmasq
> process which forks to handle TCP requests stays small. Reading around,
> that probably won't work: the OOM killer weights size of all children
> when looking for a victim. I think that trying to outsmart the OOM
> killer is probably a hiding to nothing. It's possible for the OS to
> protect critical daemons using oom_adj and friends, and supporting that
> in OpenWRT looks like a better way to go.
> 
> We then moved onto the fact that adblocking involves thousands of lines of
> 
> local=/example.com/
> 
> and the code supporting that didn't really envisage such large numbers.
> I think improvements can be made there, and I'll look at doing that in
> more detail.

Hi Simon,

Thanks for picking this up again.  There are multiple interlinked problems:

(Ab)use of local=/example.com/ or similar for adblock lists, which leads to memory usage and (apparently) slower response times due to the linear search of the ‘servers’ handling the local domains.  I’m sure the ‘local’ list could be put into some sort of tree/hash structure to speed that up, but the memory consumption will inherently be there to some (hopefully optimised) extent.  The list has to exist :-)
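To illustrate the tree/hash idea, here is a minimal sketch of a hash-based suffix lookup.  It is not how dnsmasq stores or searches its list; the function names are made up for illustration, and it assumes one domain per local=/…/ line.

```python
def build_blocklist(lines):
    """Parse 'local=/domain/' lines into a set of lowercase domains.

    Assumption: each line carries exactly one domain; other forms are ignored.
    """
    blocked = set()
    prefix = "local=/"
    for line in lines:
        line = line.strip()
        if line.startswith(prefix) and line.endswith("/"):
            domain = line[len(prefix):-1].lower().rstrip(".")
            if domain:
                blocked.add(domain)
    return blocked


def is_blocked(name, blocked):
    """Return True if name, or any parent domain of name, is in the set.

    Cost is one hash probe per label in the query name, independent of
    how many entries the list holds.
    """
    labels = name.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocked:
            return True
    return False


blocked = build_blocklist(["local=/example.com/", "local=/ads.test/"])
print(is_blocked("www.example.com", blocked))   # matches via parent example.com
print(is_blocked("notexample.com", blocked))    # different domain, no match
```

The lookup time no longer grows with the list length, but as noted the set itself still has to live in memory.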

TCP requests causing a fork of dnsmasq with all that ‘local/server’ list memory usage, up to 21 times.  I’m looking at my APU2 running OpenWrt at the moment with a 46000-line (small) adblock ‘address=/foo.bar/’ list: dnsmasq is consuming circa 12MB.  21*12MB = 252MB isn’t going to cause my APU to sweat memory-wise, but a lesser device could very well ask ‘Where am I going to find this extra 240MB of memory from, then?’.

I don’t agree that your ‘large dataset sub-process with small TCP handling children’ won’t work.  Yes, the Linux OOM killer takes into account all the children, but the TCP children are going to be much (much!) smaller, presumably the size of a basic dnsmasq instance, most of which will be program text.  A base dnsmasq on said APU2 takes 2.5MB (the above example would then total 12MB + 20*2.5MB = 62MB).  That’s quite a difference on a constrained system, asking for lumps of 2.5MB rather than 12MB.  Whether this approach makes sense from a latency perspective I don’t know…  I’m assuming that the large process is acting as an ‘address validator’ and hence the requesting children will need to wait for its ‘yes/no’ answer, subject to process scheduling etc.
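For anyone wanting to plug in their own numbers, the arithmetic above can be sketched as follows.  The 12MB and 2.5MB figures and the limit of 20 TCP children are simply the ones quoted in this mail; the function name is made up.  It is a deliberately pessimistic estimate that treats every child as needing its full size in real memory and ignores any pages shared between parent and children.

```python
def worst_case_mb(main_mb, child_mb, max_children=20):
    """Main process plus max_children TCP handlers, all counted at full size."""
    return main_mb + max_children * child_mb


# Today: every forked TCP child is a full copy of the 12MB main process.
fork_full = worst_case_mb(main_mb=12, child_mb=12)

# Proposed: one 12MB process holding the list, children at base dnsmasq size.
fork_small = worst_case_mb(main_mb=12, child_mb=2.5)

print(fork_full, fork_small, fork_full - fork_small)
```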

Tony mentioned another possible solution which avoids the sub-process malarkey: marking the ‘server list’ memory in some way so that the OS knows it’s shared and therefore doesn’t need ‘real memory’ for all those TCP sub-processes.  I’m sure he’ll be along in a minute to explain further.

Cheers,

Kevin D-B

gpg: 012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A
