<div dir="ltr"><div dir="ltr">Hi Simon,<div><br></div><div>Thanks for your response. I don't have the detailed logs, but it's a noisy QA wireless environment where clients are coming and going a lot. E.g. in syslog I could see instances where we would get a DHCP request and then an L2 wireless disassociate message would appear immediately afterwards. That response isn't going to be deliverable as unicast (although for DHCP it might fall back to broadcast eventually). </div><div><br></div><div>As we know, DNS isn't logged in such a manner, but you could see the same scenario unfolding: we get a bunch of DNS requests, the client drops off immediately afterwards, and the responses can't be delivered. When there are a lot of requests or a lot of clients you can see how the socket buffer would fill.</div><div><br></div><div>Increasing the socket buffers as I described below allowed the test to run for the required 96 hours; without it we weren't making it past the 48 hour mark.</div><div><br></div><div>A dynamic solution might work provided it was carefully bounded to prevent DoS. If you have something you'd like us to test I can probably arrange a time slot, though it's a busy setup that needs lots of hardware.</div><div><br></div><div>Thanks,<br>Tom Keddie</div><div><br></div><div>PS. This is a controlled environment (as much as you can control wifi); there are no malicious actors or malicious intent in this scenario. It's a soak test with a large variety of clients all doing busy work like video streaming etc.</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, May 13, 2022 at 12:48 PM Simon Kelley &lt;<a href="mailto:simon@thekelleys.org.uk">simon@thekelleys.org.uk</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
On 10/05/2022 16:40, Tom Keddie via Dnsmasq-discuss wrote:<br>
> Hi All,<br>
> <br>
> I think you're saying that it's not surprising that dnsmasq is not<br>
> reading from the socket because the send queue is also full.<br>
> <br>
> <br>
> As per this thread on netdev <br>
> (<a href="https://lore.kernel.org/netdev/CABUuw65R3or9HeHsMT_isVx1f-7B6eCPPdr+bNR6f6wbKPnHOQ@mail.gmail.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/netdev/CABUuw65R3or9HeHsMT_isVx1f-7B6eCPPdr+bNR6f6wbKPnHOQ@mail.gmail.com/</a>) <br>
> it seems we were consuming the socket send buffer with pending packets <br>
> waiting for ARP responses that were never coming. This was causing <br>
> failures sending to devices that were still live.<br>
> <br>
> As per that thread we increased the /proc/sys/net/core/wmem_default <br>
> value so all sockets will have larger send buffers (the device has very <br>
> few sockets in use). It might be useful to add dnsmasq config options to <br>
> increase SO_SNDBUF on the dhcp and dns sockets to allow more granular <br>
> control.<br>
> <br>
> Thanks, Tom Keddie<br>
<br>
So queries are being received, and answered, but the reply is being <br>
dropped by the kernel because the send queue is full of replies to dead <br>
hosts? If the hosts are dead, where are the queries coming from to <br>
generate these blocked replies?<br>
<br>
It might be sensible to automatically increase the send queue length <br>
when a packet send gets EAGAIN, at least the first time, but I'd like to <br>
understand exactly what's going on first.<br>
<br>
<br>
Simon.<br>
<br>
</blockquote></div></div>
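<div><br></div><div>For illustration, the idea floated in this thread (grow the send buffer when a send hits EAGAIN, but only up to a fixed ceiling so it cannot be abused for DoS) could be sketched roughly as below. This is a minimal sketch in Python rather than dnsmasq's C, and it is not dnsmasq code: the names MAX_SNDBUF, grow_sndbuf and send_reply, and the 1 MiB ceiling, are assumptions made up for the example. It assumes a non-blocking UDP socket.</div><div><br></div>

```python
import socket

# Hypothetical ceiling; the thread only says growth must be carefully bounded.
MAX_SNDBUF = 1024 * 1024


def grow_sndbuf(sock, cap=MAX_SNDBUF):
    """Request a doubled SO_SNDBUF, up to cap. Return True if the buffer grew."""
    current = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    if current >= cap:
        return False  # already at or above the ceiling: refuse to grow
    # Note: Linux doubles the requested value for bookkeeping overhead and
    # clamps the request to net.core.wmem_max, so the value read back can
    # differ from the value requested (and can exceed cap by up to 2x).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, min(current * 2, cap))
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF) > current


def send_reply(sock, payload, addr, cap=MAX_SNDBUF):
    """Send one datagram; on EAGAIN try one bounded buffer increase and retry.

    Returns the number of bytes sent, or None if the reply was dropped.
    """
    try:
        return sock.sendto(payload, addr)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: the send queue is full.
        if not grow_sndbuf(sock, cap):
            return None  # at the ceiling (or kernel refused): drop the reply
        try:
            return sock.sendto(payload, addr)
        except BlockingIOError:
            return None  # still full after growing: drop the reply
```

<div>A caller would make the socket non-blocking with sock.setblocking(False) and use send_reply(sock, data, client_addr) in place of a bare sendto. Whether one increase per EAGAIN is the right policy, and what the ceiling should be, are open questions that the sketch does not settle.</div>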