[Dnsmasq-discuss] Partial denial of service with dnsmasq on resource constrained systems

Thu Apr 1 02:43:04 UTC 2021

From: "Tony Ambardar" <Tony.Ambardar at gmail.com>

On 27/03/2021 17:21, Simon wrote:
>> On 24/03/2021 19:55, Ian wrote:
>>
>> It seems that on resource constrained routers, it’s possible to execute
>> a non-critical denial of service attack against the router simply by
>> opening multiple tcp queries to dnsmasq, which then forks for each tcp
>> connection up to MAX_PROCS times, resulting in oom-killer being invoked
>> after the router runs out of memory.
>>
>> One could imagine a malicious app or shell script constantly spawning
>> new tcp connections and keeping the router out of memory as a result.
>>
>> This problem came to light on the OpenWrt forum as a user had a taxi
>> booking app that opened multiple tcp connections to dnsmasq.
>>
>> A simple patch to add a long form configuration option
>> “--max-procs=<number>” to dnsmasq that allows MAX_PROCS to be overridden
>> at runtime fixed the user’s problem.
>>
>> Not sure if this is the best way of dealing with the problem, but wanted
>> to bring this to the list’s attention.
>
>
> The default value of MAX_PROCS is 20, which doesn't seem excessive, my
> point being that systems which run out of memory when dnsmasq forks 20
> times are likely a pretty small niche, and reducing the default is not a
> good idea on most systems. Note that dnsmasq doesn't exec() after
> forking, so the text segments will be shared, and I'd expect that not a
> lot of the memory in the TCP-handling process would be written, so
> copy-on-write will share data pages too. I don't have figures, but it's
> certainly not the case that each fork doubles the memory footprint.
>
> Adding this as a run-time option is only useful if people know that
> their system is vulnerable, and use the option, or a distribution always
> sets the option. But if OpenWrt determines that this is a general
> problem on OpenWrt systems, the best solution would be for OpenWrt
> packages to be compiled with MAX_PROCS set to a lower value. Carrying a
> single-line patch to src/config.h is a sensible way to do that.
>
>
> Any look at the dnsmasq man page shows that we're not averse to adding
> configurability, but the configurability has to have real-world uses,
> and options which have to be set in ill-defined circumstances to avoid
> catastrophic problems are not good options. It's a judgement call, but
> my judgement whenever this was written (at least a decade ago) was that
> this wasn't a useful parameter for a user to tweak. I can't help
> thinking that changing that now isn't really solving the underlying
> problem, but I can't offer a better solution.
>
> Comments? How do we fix this?
>
>
> Simon.

Hi Simon,

I also hit the OOM issue a few years back, while evaluating dnsmasq
performance on OpenWrt with large server lists loaded, and spent some time
investigating. I could be hazy on some details but the following covers
the basics.

You're right that text segments are fairly small and shared; memory usage
was dominated by storage for blocklists read from file. This makes the
problem more general than just tiny systems, since people tend to size
their blocklists proportional to system memory size.

You're also right that actual memory footprint increases only minimally
with each fork() thanks to copy-on-write; I'm certain these OOM systems
aren't really exhausting memory. But I do think there's confusion between
memory usage optimizations like COW and the memory accounting used for OOM.

I recall looking at dnsmasq process statistics on OOM invocation and
noticing that their virtual memory sizes were usually close to total system
memory, i.e. COW sharing wasn't reflected there. And from a dnsmasq proc
memory map, the large segment storing the blocklist was marked read-write.
I suspect that despite COW, since that memory is *potentially* writable
it's being accounted for at fork() time.
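
A minimal sketch for checking this kind of observation on a Linux system,
assuming the VmSize and VmRSS fields of /proc/self/status are available. The
helper name status_field_kb() is hypothetical and is not part of dnsmasq. It
only reports the gap between address-space size and resident pages; it does
not show how the kernel accounts for that gap at fork() time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not dnsmasq code): read one "Key:   <n> kB" field
   from /proc/self/status. Returns the value in kB, or -1 if the file or
   the field is unavailable. */
long status_field_kb(const char *key)
{
    FILE *fp;
    char line[256];
    long value = -1;
    size_t keylen;

    assert(key != NULL);
    keylen = strlen(key);

    fp = fopen("/proc/self/status", "r");
    if (!fp)
        return -1;

    while (fgets(line, sizeof line, fp)) {
        /* Match "Key:" exactly so "VmRSS" does not match a longer name. */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            if (sscanf(line + keylen + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(fp);
    return value;
}
```

Calling status_field_kb("VmSize") and status_field_kb("VmRSS") in a forked
child and comparing the two gives a rough view of how much of the address
space is actually resident.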

A possible fix I'd suggest is to update dnsmasq's memory handling. IIRC,
we use the same cache structure and memory allocation for both DNS cache
and storing static server lists read from file. Perhaps use a separate,
page-aligned memory pool to store these lists, then after initialization
(and before forking) use mprotect() to set the region as read-only.

Assuming it works, this would have the advantage of being a no-knobs
solution vs. setting kludgey process or connection limits.
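
The idea above could be sketched roughly as follows. This is an illustration
only: struct ro_pool and the ro_pool_*() functions are hypothetical names, not
existing dnsmasq code, and the pool is a simple bump allocator over an
anonymous mmap() region. Whether sealing the region actually changes how the
kernel accounts for it at fork() is the open question, so this is not a
confirmed fix.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical pool for static lists loaded at startup. */
struct ro_pool {
    void  *base; /* page-aligned start of the mapping */
    size_t size; /* mapping length, a whole number of pages */
    size_t used; /* bytes handed out so far */
};

/* Map a private, writable region rounded up to whole pages.
   Returns 0 on success, -1 on failure (including bytes == 0). */
int ro_pool_init(struct ro_pool *p, size_t bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t size = (bytes + page - 1) / page * page;
    void *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return -1;
    p->base = base;
    p->size = size;
    p->used = 0;
    return 0;
}

/* Bump-allocate 16-byte-aligned chunks; returns NULL when the pool is full. */
void *ro_pool_alloc(struct ro_pool *p, size_t bytes)
{
    size_t aligned = (bytes + 15) & ~(size_t)15;
    void *out;

    if (aligned < bytes || aligned > p->size - p->used)
        return NULL;
    out = (char *)p->base + p->used;
    p->used += aligned;
    return out;
}

/* After initialization and before any fork(): make the whole pool
   read-only. Later writes to it will fault. */
int ro_pool_seal(struct ro_pool *p)
{
    return mprotect(p->base, p->size, PROT_READ);
}
```

The intended order is ro_pool_init(), then ro_pool_alloc() for each list entry
while reading the files, then ro_pool_seal() once before the main loop starts
forking TCP handlers. Reloading the lists would need the region made writable
again or a fresh pool, which the sketch leaves out.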

One other thing I saw while testing with large blocklists was a noticeable
latency increase, likely related to lookup times. I recall some discussion
on the ML where you mentioned work on a hash/tree solution was in
progress. Were those changes completed?

Best regards,
Tony