[Dnsmasq-discuss] dnsmasq 2.86 seems to stop reading from one of its dns sockets after a period of time under load

Fri May 13 19:48:12 UTC 2022

On 10/05/2022 16:40, Tom Keddie via Dnsmasq-discuss wrote:
> Hi All,
> 
>     I think you're saying that it's not surprising that dnsmasq is not
>     reading from the socket because the send queue is also full.
> 
> 
> As per this thread on netdev 
> (https://lore.kernel.org/netdev/CABUuw65R3or9HeHsMT_isVx1f-7B6eCPPdr+bNR6f6wbKPnHOQ@mail.gmail.com/) 
> it seems we were consuming the socket send buffer with pending packets 
> waiting for ARP responses that were never coming.  This was causing 
> failures sending to devices that were still live.
> 
> As per that thread we increased the /proc/sys/net/core/wmem_default 
> value so all sockets will have larger send buffers (the device has very 
> few sockets in use). It might be useful to add dnsmasq config options to 
> increase SO_SNDBUF on the dhcp and dns sockets to allow more granular 
> control.
> 
> Thanks, Tom Keddie

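For reference, a per-socket alternative to raising wmem_default could look 
roughly like the sketch below. It is only an illustration: set_send_buffer() 
is a hypothetical helper, not an existing dnsmasq function or config option. 
It requests a send buffer size with SO_SNDBUF and reads back what the kernel 
actually granted, since Linux may adjust the requested value and caps it at 
net.core.wmem_max.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, not part of dnsmasq: ask for a send buffer of
   `bytes` on socket `fd` and return the size the kernel reports
   afterwards, or -1 on error. */
int set_send_buffer(int fd, int bytes)
{
  int granted = 0;
  socklen_t len = sizeof(granted);

  if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1)
    return -1;

  /* The kernel may round, double or cap the request, so read it back. */
  if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) == -1)
    return -1;

  return granted;
}
```
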
So queries are being received, and answered, but the reply is being 
dropped by the kernel because the send queue is full of replies to dead 
hosts? If the hosts are dead, where are the queries coming from to 
generate these blocked replies?

It might be sensible to automatically increase the send queue length 
when a packet send gets EAGAIN, at least the first time, but I'd like to 
understand exactly what's going on first.
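
A minimal sketch of that idea, assuming a non-blocking UDP socket and a 
per-socket flag kept by the caller. sendto_grow_once() and the `grown` flag 
are hypothetical names for illustration, not existing dnsmasq code. On the 
first EAGAIN it asks for twice the current SO_SNDBUF and retries the send 
once; later EAGAINs are returned to the caller unchanged.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: send one datagram. If the send fails with
   EAGAIN/EWOULDBLOCK and this socket's buffer has not been enlarged
   before (*grown == 0), request a larger SO_SNDBUF and retry once. */
ssize_t sendto_grow_once(int fd, const void *buf, size_t n,
                         const struct sockaddr *to, socklen_t tolen,
                         int *grown)
{
  int size = 0;
  socklen_t len = sizeof(size);
  ssize_t rc = sendto(fd, buf, n, 0, to, tolen);

  /* Success, an unrelated error, or already enlarged: nothing to do. */
  if (rc != -1 || (errno != EAGAIN && errno != EWOULDBLOCK) || *grown)
    return rc;

  *grown = 1; /* only try this once per socket */

  /* Ask for twice the current size. The kernel may round or cap the
     request (on Linux the cap is net.core.wmem_max). */
  if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, &len) == -1 ||
      setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                 &(int){ size * 2 }, sizeof(int)) == -1)
    {
      errno = EAGAIN; /* report the original failure */
      return -1;
    }

  return sendto(fd, buf, n, 0, to, tolen);
}
```

Note that a larger buffer only delays the problem if the queue keeps 
filling with packets that can never be sent.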

Simon.
