[Dnsmasq-discuss] [PATCH] TCP client timeout setting

Wed May 31 11:40:30 UTC 2023

Attached is my bigger attempt to solve this problem, split into 7 
separate patches.

Because in rare circumstances the servers might have changed since the 
TCP connection process started, I serialize the domain, the server 
address and its last position in the server array. I hacked in a 
redirection with a special -2 value in the cache inserting code, where 
it always reads just one server. It then tries the server at the 
reported position first, then iterates over all servers for this domain.
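
As a rough illustration of that lookup, here is a minimal Python sketch. 
It is not the code from the patches (dnsmasq is written in C); the Server 
class, the find_reported_server name and the exact matching rule are 
assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    domain: str  # domain this forwarder serves ("" stands for the default)
    addr: str    # upstream address as text, e.g. "127.0.0.1"


def find_reported_server(servers: List[Server], domain: str,
                         addr: str, position: int) -> Optional[int]:
    """Return the index of the server a TCP child reported, or None.

    The reported position is checked first. If the array changed in the
    meantime, every server is scanned for the same domain and address.
    """
    def matches(server: Server) -> bool:
        return server.domain == domain and server.addr == addr

    if 0 <= position < len(servers) and matches(servers[position]):
        return position
    for index, server in enumerate(servers):
        if matches(server):
            return index
    return None  # the server is gone, so the report is ignored
```

Checking the position first keeps the common case cheap, while the scan 
covers a server array that was reordered after the child was forked.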

In rare cases there might be a problem, because it does not send the 
source address or interface, which might be needed to identify the 
correct server. But I doubt those would make a difference in any 
real-world setup. We have no simple way to identify changed servers. It 
works well in basic testing.

I also added separation of the TCP and UDP last servers. It should now 
be able to forward UDP to a server responding only over UDP and TCP to a 
server responding only over TCP. That should be quite a rare case, more 
theoretical than real-world. Maybe a change of the UDP server should 
also change the TCP one, because the UDP test can be done in parallel.
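
The separation can be pictured with a small Python sketch. This is only 
an illustration under assumptions: the Transport and LastServers names 
are hypothetical, and the real state lives in dnsmasq's C structures.

```python
from enum import Enum
from typing import Dict, Optional


class Transport(Enum):
    UDP = "udp"
    TCP = "tcp"


class LastServers:
    """Remember the last responsive forwarder index per transport."""

    def __init__(self) -> None:
        self._last: Dict[Transport, Optional[int]] = {
            Transport.UDP: None,
            Transport.TCP: None,
        }

    def update(self, transport: Transport, index: int) -> None:
        # Only the transport that actually got an answer is touched.
        self._last[transport] = index

    def first_to_try(self, transport: Transport, default: int = 0) -> int:
        last = self._last[transport]
        return default if last is None else last
```

With this layout an update from a UDP reply leaves the TCP choice alone, 
which is exactly the behaviour that might need revisiting if a UDP 
change should also move the TCP server.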

I have also found an unwanted difference from UDP queries. Even a 
REFUSED response was accepted as a valid last_server response. Now it 
sets the TCP last_server only after a non-REFUSED response, not after a 
merely successful connection.
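
The rule can be sketched in a few lines of Python. The function names 
are hypothetical and the reply is assumed to be the DNS message with the 
2-byte TCP length prefix already stripped; only the RCODE position and 
the REFUSED value (5) come from RFC 1035.

```python
REFUSED = 5  # DNS RCODE value defined in RFC 1035


def rcode_of(reply: bytes) -> int:
    """Extract the 4-bit RCODE from a DNS message header."""
    if len(reply) < 12:
        raise ValueError("truncated DNS header")
    # The RCODE is the low nibble of the fourth header byte.
    return reply[3] & 0x0F


def should_update_last_server(reply: bytes) -> bool:
    """A server qualifies only if it answered with something other
    than REFUSED; a successful connect alone is not enough."""
    return rcode_of(reply) != REFUSED
```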

I have looked into glibc, which does not seem to set any timeout for 
TCP (vc) queries. The default timeout in the dig tool is 10 seconds, and 
it does not seem to tweak the number of SYN packets sent; I think it 
just measures the time until a reply arrives. Ideally we should be able 
to spawn another TCP connection to the other server if the first one 
did not respond within a few seconds, and then wait for the fastest 
response from any of them. But that would require a quite significant 
rework of the current code.
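
To make the idea concrete, here is a minimal asyncio sketch of such a 
staggered race. It is an assumption-laden illustration, not a proposal 
for the dnsmasq code: race_servers, the query callback and the stagger 
parameter are all hypothetical names, and the actual network I/O is left 
to the caller-supplied coroutine.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, Tuple

QueryFn = Callable[[str], Awaitable[bytes]]


async def race_servers(servers: Sequence[str], query: QueryFn,
                       stagger: float) -> Tuple[str, bytes]:
    """Query servers in order, starting the next one after `stagger`
    seconds without an answer; return the first (server, reply)."""
    async def attempt(server: str) -> Tuple[str, bytes]:
        return server, await query(server)

    pending = set()
    remaining = list(servers)
    try:
        while remaining or pending:
            if remaining:
                pending.add(asyncio.ensure_future(attempt(remaining.pop(0))))
            # Wait without a limit only once every server has been started.
            timeout = stagger if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout,
                return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
        raise ConnectionError("no server answered")
    finally:
        # Drop the attempts that lost the race.
        for task in pending:
            task.cancel()
```

A failed attempt starts the next server immediately, while a silent one 
only delays it by the stagger interval, so a blackholed first server 
costs a few seconds instead of the full TCP connect timeout.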

I did just basic testing, but these changes improve the tested situation.

What do you think about it?

Cheers,
Petr

On 26. 05. 23 18:19, Simon Kelley wrote:
>
>
> On 25/05/2023 20:32, Petr Menšík wrote:
>> This problem is best demonstrated by an example, taken from [2] and 
>> slightly modified.
>>
>> Let's create a hypothetical network issue with one forwarder, which 
>> worked fine a while ago.
>>
>> $ sudo iptables -I INPUT -i lo -d 127.0.0.255 -j DROP
>>
>> Now start dnsmasq and send a TCP query to it:
>>
>> $ dnsmasq -d --log-queries --port 2053 --no-resolv 
>> --conf-file=/dev/null --server=127.0.0.255 --server=127.0.0.1
>> $ dig +tcp @localhost -p 2053 test
>>
>> ;; communications error to ::1#2053: timed out
>> ;; communications error to ::1#2053: timed out
>> ;; communications error to ::1#2053: timed out
>> ;; communications error to 127.0.0.1#2053: timed out
>>
>> ; <<>> DiG 9.18.15 <<>> +tcp @localhost -p 2053 test
>> ; (2 servers found)
>> ;; global options: +cmd
>> ;; no servers could be reached
>>
>> Because dig waits a much shorter time than dnsmasq does, it never 
>> receives any reply, even when the other server is responding just 
>> fine. That is the main advantage of having a local cache running, 
>> isn't it? It should improve things!
>>
>> Now let's be persistent and keep trying:
>>
>> $ time for TRY in {1..6}; do dig +tcp @localhost -p 2053 test; done
>>
>> After a few timeouts, it will finally notice something is wrong and 
>> also try the second server, which will answer fast. However, this 
>> works only with dnsmasq -d, which is not used in production. If I 
>> replace it with dnsmasq -k, it will not answer at all!
>>
>> $ dnsmasq -k --log-queries --port 2053 --no-resolv 
>> --conf-file=/dev/null --server=127.0.0.255 --server=127.0.0.1
>> $ time for TRY in {1..8}; do dig +tcp @localhost -p 2053 test; done
>>
>> ...
>> ;; communications error to ::1#2053: timed out
>> ;; communications error to ::1#2053: timed out
>> ;; communications error to ::1#2053: timed out
>> ;; communications error to 127.0.0.1#2053: timed out
>>
>> ; <<>> DiG 9.18.15 <<>> +tcp @localhost -p 2053 test
>> ; (2 servers found)
>> ;; global options: +cmd
>> ;; no servers could be reached
>>
>>
>> real    5m20,602s
>> user    0m0,094s
>> sys    0m0,115s
>>
>> This is because with -k it spawns TCP workers, which always start 
>> with whatever last_server was set by the last UDP query. Until a UDP 
>> query arrives to save the day, it will stubbornly try the 
>> non-responding server first, even when the other one answers in 
>> milliseconds. Notice it has been trying for 5 minutes without success.
>>
>> I think this has to be fixed somehow. This is a corner case, because 
>> TCP queries are usually caused by UDP queries with the TC bit set. 
>> But there are real-world examples where a TCP-only query makes sense, 
>> and dnsmasq does not handle them well. I summarized this at [3].
>>
>> My proposal would be to send a UDP query with an EDNS0 header to the 
>> main process when forwarding the TCP query fails. The main process 
>> can then trigger a forwarder responsiveness check and change 
>> last_server to a working one, so subsequent attempts do not fall 
>> into the black hole again and again. The EDNS0 header would be there 
>> to increase the chance of a positive reply from upstream, which can 
>> be cached.
>>
>> Do you have other ideas on how to solve this problem?
>>
>> Cheers,
>> Petr
>>
>
>
> The long delay awaiting a connection from a non-responsive server may 
> be improved by reducing the value of the TCP_SYNCNT socket option, at 
> least on Linux.
>
>
> I think it's pretty easy to pass back the identity of a server which 
> is responding to TCP connections to the main process using the same 
> mechanism that passes back cache entries. The only wrinkle is if the 
> list of servers changes between forking the child process and the 
> child sending back data about which server worked, for instance if the 
> server list gets reconfigured. Detecting that just needs an "epoch" 
> counter to be included. It's rare, so just rejecting a "use this 
> server" update from a child that was spawned in a different epoch from 
> the current one should avoid problems. Provided the epoch is the same, 
> indices into the server[] array are valid to send across the pipe.
>
> I like the idea of using a different valid server for TCP and UDP.
>
> Note that the TCP code does try to pick a good server. It's not 
> currently much good with long connection delays, but it does cope with 
> ignoring a server which accepts connections and then immediately 
> closes them. I guess that must have been a real-world problem sometime.
>
> Cheers,
>
> Simon.
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=2160466#c6
>> [3] https://bugzilla.redhat.com/show_bug.cgi?id=2160466#c13
>>
>> On 19. 05. 23 13:40, Petr Menšík wrote:
>>> When analysing report [1] for non-responding queries over TCP, I 
>>> found that forwarded TCP connections have quite a high timeout. If, 
>>> for whatever reason, the forwarder currently set as the last used 
>>> forwarder is dropping packets without a reject, the TCP connection 
>>> will take about 120 seconds to time out on my system. That is way 
>>> too much; I think any TCP client will give up long before that. This 
>>> is just a quick workaround to improve the situation, not a final fix.
>>>
>>> ...
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=2160466
>>>
>>>
>>> _______________________________________________
>>> Dnsmasq-discuss mailing list
>>> Dnsmasq-discuss at lists.thekelleys.org.uk
>>> https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss 
>>>

-- 
Petr Menšík
Software Engineer, RHEL
Red Hat, http://www.redhat.com/
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-dns-tcp-timeout-option.patch
Type: text/x-patch
Size: 4874 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20230531/13f76c57/attachment-0007.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Reduce-few-TCP-related-repeated-code.patch
Type: text/x-patch
Size: 17089 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20230531/13f76c57/attachment-0008.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Report-changed-TCP-servers-to-master-process.patch
Type: text/x-patch
Size: 9089 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20230531/13f76c57/attachment-0009.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Add-logging-of-TCP-server-changes.patch
Type: text/x-patch
Size: 1082 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20230531/13f76c57/attachment-0010.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-Initialize-last_server-right-after-allocation.patch
Type: text/x-patch
Size: 813 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20230531/13f76c57/attachment-0011.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0006-Set-last_server-for-TCP-similar-way-to-UDP.patch
Type: text/x-patch
Size: 2330 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20230531/13f76c57/attachment-0012.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0007-Make-last-used-DNS-server-separate-for-UDP-and-TCP.patch
Type: text/x-patch
Size: 5015 bytes
Desc: not available
URL: <http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/attachments/20230531/13f76c57/attachment-0013.bin>