[Dnsmasq-discuss] dnsmasq 2.91test9 + TCP + stale cache
Simon Kelley
simon at thekelleys.org.uk
Fri Jan 31 23:20:27 UTC 2025
On 1/31/25 16:19, Dominik Derigs wrote:
> Hey Simon,
>
> we have found another (small) thing. The requirements for reproducing
> it are:
>
> 1. Using --use-stale-cache
> 2. A query is received via TCP
> 3. The cached record is stale
>
> Querying the stale record causes the query to be "refreshed". However,
> when, at the same time, the client disconnects (and the TCP fork exits
> accordingly), the upstream reply will never be received and so never
> finds its way into the mother process's cache.
>
> Could we postpone the shutdown of TCP forks in case a refreshment query
> is still ongoing?
>
That's not what happens, or at least that's not what is supposed to happen.
The control flow in tcp_request() when stale data is found in the cache
goes as follows.
1) Read the query from the TCP client connection.
2) Look it up in the cache and get a stale answer.
3) Return the stale answer to the requestor over the TCP connection.
4) Close the client TCP connection server-side. This forces the client to
open another connection, and so create a new process, if it has more
queries. The reason for this is that the existing process now
5) sends the query upstream and blocks awaiting the answer,
6) receives the answer and caches it in the local process, which also
serialises the answer into the pipe to the parent process, and
7) returns from tcp_request(), after which the process exits.
At this point the data has either been read from the pipe by the parent
process and inserted into its cache, or it is still in the pipe buffer
and will shortly be read. This is why the process-management code in
dnsmasq.c doesn't free a process slot until _both_ the process has gone
and the pipe has returned EOF and been closed.
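To make that two-condition rule concrete, here is a minimal sketch of the
bookkeeping. It is not the code in dnsmasq.c; the class and method names are
hypothetical and exist only to illustrate that a slot is reusable only once
both events have been seen, in whichever order they arrive.

```python
from dataclasses import dataclass


@dataclass
class ChildSlot:
    """Hypothetical bookkeeping for one TCP child (not dnsmasq's real struct)."""

    child_exited: bool = False  # set when the parent has reaped the child
    pipe_eof: bool = False      # set when reading the pipe returned EOF

    def on_child_exit(self) -> None:
        self.child_exited = True

    def on_pipe_eof(self) -> None:
        self.pipe_eof = True

    def reusable(self) -> bool:
        # The slot is only free when BOTH conditions hold, so data still
        # queued in the pipe is never abandoned just because the child died.
        return self.child_exited and self.pipe_eof


if __name__ == "__main__":
    slot = ChildSlot()
    slot.on_child_exit()
    print(slot.reusable())  # child gone, pipe not yet drained
    slot.on_pipe_eof()
    print(slot.reusable())  # both conditions met
```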
Pipes don't disappear and lose data until both ends have been closed. If
the write end is closed but there is queued data, then the read end can
still read the queued data.
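This is easy to check outside dnsmasq. The sketch below (POSIX only, and
nothing to do with dnsmasq's own code; the function name and payload are made
up for the demonstration) forks a child which writes into a pipe and exits.
The parent deliberately waits until the child is gone before reading, and
still gets the data, followed by EOF.

```python
import os


def read_after_writer_exit(payload: bytes = b"fresh answer"):
    """Return (data, eof_marker) read by the parent after the child has exited."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: write the payload into the pipe and exit immediately.
        os.close(read_fd)
        os.write(write_fd, payload)
        os._exit(0)

    # Parent: close our copy of the write end so EOF can be seen later.
    os.close(write_fd)
    # Wait until the child has really gone before touching the pipe.
    os.waitpid(pid, 0)
    data = os.read(read_fd, 4096)  # the queued data is still there
    eof = os.read(read_fd, 4096)   # b"" means EOF: all write ends are closed
    os.close(read_fd)
    return data, eof


if __name__ == "__main__":
    print(read_after_writer_exit())
```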
The control flow in tcp_request() is hard to follow, since it repeats
the loop both to read another query and then get an answer to it, and to
get an answer to a query which has already been answered and send that
back to mother.
The pseudocode which describes what a TCP child process does looks like
this.
do_stale = FALSE
while (1)
  {
    have_answer = FALSE
    if (!do_stale)
      {
        read_query()
        if (no_query_client_connection_closed())
          exit();
        if (query_in_cache())
          have_answer = TRUE
      }
    if (!have_answer)
      {
        send_query_upstream()
        get_answer_from_upstream()
        insert_answer_into_cache_and_pipe_to_parent()
      }
    if (do_stale)
      exit();
    return_answer()
    if (answer_was_stale())
      {
        do_stale = TRUE;
        close_client_connection()
      }
  }
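For anyone who wants to step through that loop, here is a runnable toy
version of it. It is only a simulation under simplifying assumptions: the
client, cache and upstream server are plain Python lists and dicts, the
function and event names are invented for the example, and nothing here is
taken from the real tcp_request(). It records the order of events so you can
see that the answer goes into the pipe before the process exits.

```python
def run_tcp_child(queries, cache, upstream):
    """Simulate one TCP child and return the list of events in order.

    queries:  names the client sends; running out means the client closed.
    cache:    dict name -> (answer, is_stale), the child's local cache.
    upstream: dict name -> fresh answer, standing in for the upstream server.
    """
    events = []
    pending = list(queries)
    do_stale = False
    name = answer = None
    stale = False
    while True:
        have_answer = False
        if not do_stale:
            if not pending:
                events.append("exit: client closed")
                return events
            name = pending.pop(0)
            if name in cache:
                answer, stale = cache[name]
                have_answer = True
        if not have_answer:
            # Ask upstream, cache locally, and serialise to the parent.
            answer, stale = upstream[name], False
            cache[name] = (answer, False)
            events.append(f"pipe to parent: {name}={answer}")
        if do_stale:
            events.append("exit: refresh done")
            return events
        events.append(f"reply to client: {name}={answer}")
        if stale:
            do_stale = True
            events.append("close client connection")


if __name__ == "__main__":
    log = run_tcp_child(
        ["example.com"],
        {"example.com": ("192.0.2.1", True)},  # stale cached record
        {"example.com": "192.0.2.2"},          # what upstream would return
    )
    print("\n".join(log))
```

With a stale cache entry, the log shows the stale reply and the connection
close first, then the refreshed answer being piped to the parent, and only
then the exit.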
The client disconnecting doesn't cause the process to exit before the
new data has been pushed into the pipe. If you can demonstrate that it
does, that's a bug.
Cheers,
Simon.