[Dnsmasq-discuss] [PATCH] Fix potential untracked tcp_request sub-processes
Tijs Van Buggenhout
tijs.van.buggenhout at axsguard.com
Wed Apr 14 10:24:29 UTC 2021
On vrijdag 9 april 2021 17:26:42 CEST Simon Kelley wrote:
> On 09/04/2021 11:04, Tijs Van Buggenhout wrote:
> > Hi all,
> >
> > We are using dnsmasq as part of OpenWrt for a managed VPN solution. As
> > soon as the VPN is up, all DNS requests are forwarded to an upstream DNS
> > server over the VPN.
> >
> > In case of VPN transmission problems there is a delay before it is
> > finally decided that the tunnel is effectively down; during that delay,
> > DNS requests are still accepted, but never answered.
> >
> > In those circumstances, clients switch to the TCP protocol rather than
> > UDP, doing so for all IP families configured on their device. The same
> > queries then arrive on both the dnsmasq TCP IPv4 and IPv6 sockets. The
> > queries are accepted, and tcp_request sub-processes are spawned, which
> > are tracked in tcp_pids[MAX_PROCS].
> >
> > However, an overcommit may happen on tcp_pids, in the sense that there
> > is an implicit 1-to-1 relationship between the TCP listen sockets
> > selected for POLLIN and the tcp_request processes spawned and accounted
> > for in tcp_pids. This means a free spot in tcp_pids can only be
> > guaranteed if at most one (new) TCP connection is accepted per poll
> > cycle. If more than one is accepted, e.g. one IPv4 and one IPv6, while
> > tcp_pids has only one spot free, both connections result in tcp_request
> > sub-processes, the last of which is not accounted for in tcp_pids.
> >
> > In our situation, when the VPN tunnel eventually went down, dnsmasq was
> > restarted: all pids accounted for in tcp_pids were signalled (ALRM) to
> > finish, and the main process waited for them to stop. However, those
> > not accounted for continued to exist for minutes after the main process
> > had terminated, and hence dnsmasq failed to (re)start due to a bind
> > failure (address already in use).
> >
> > Two problems thus arise:
> >
> > * untracked tcp_request sub-processes may occur
> > * tcp_request sub-processes do not close inherited listen sockets
> >
> > For the first problem I could think of two solutions: ensure that only
> > one tcp_request is accepted per poll cycle (the downsides being delay
> > and a preference for IPv4 over IPv6), or introduce a new
> > tcp6_pids[MAX_PROCS] to track IPv6 connections (in the latest code that
> > also means an additional tcp6_pipes).
> >
> > Closing the listening sockets in the child process seems like the right
> > thing to do, but with so many of them it may be cumbersome to implement.
> > The problem is less apparent if all outstanding connections are waited
> > for before termination of the main process. Still, what if the main
> > process receives a SIGKILL rather than a SIGTERM?
> >
> > Any thoughts/impressions? Below is the (simplest) patch, enforcing one
> > tcp_request per poll cycle.
>
> Thanks for the comprehensive analysis.
>
> To deal with the second problem first, I think this is not a problem, as
> long as the first problem is fixed: if we manage not to lose track of
> forked TCP connection handlers, then none of the problems arise.
>
> Now the first problem: there's a better solution. The code is, basically:
>
> if (there is at least one free process slot)
>   listen on all sockets for incoming tcp connections()
>
> for each listener with an incoming tcp connection {
>   accept the connection()
>   fork a process to handle it()
>   if (there is a free process slot)
>     store the process in the slot()
> }
>
>
> As you note, if there are more incoming connections in a poll cycle than
> free process slots, then processes get made, and then lost track of.
>
>
> This is fixed by changing the code to
>
> if (there is at least one free process slot)
>   listen on all sockets for incoming tcp connections()
>
> for each listener with an incoming tcp connection {
>   if (there is a free process slot) {
>     accept the connection()
>     fork a process to handle it()
>     store the process in the slot()
>   }
> }
>
> If we run out of process slots, it's fine to leave some pending
> connections not accept()ed. They'll be queued in the kernel and picked
> up later when there are slots.
>
> This should handle all incoming requests eventually, without starvation
> of any particular interface or address type.
>
> The fix at
>
> https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=ad90eb075dfeeb1936e8bc0f323fcc23f89364d4
>
> should do the trick. If you could try it out, I'd be very grateful.
>
>
> Cheers
>
> Simon.
>
Hi Simon,
Thank you for the prompt reply/action, and sorry for the late reply. It took
me some time to backport the patch to the version we are using and to create
a reproducible test setup.
But all seems fine. The code looks good to me. The number of sub-processes
never exceeds MAX_PROCS. I verified with strace that both the IPv4 and IPv6
sockets can accept new connections in one and the same poll cycle.
17:38:18 poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6,
events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9,
events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}], 8, -1) = 3
([{fd=7, revents=POLLIN}, {fd=9, revents=POLLIN}, {fd=11, revents=POLLIN}])
<0.000065>
fds 7 and 9 are the TCP IPv4 and IPv6 sockets, respectively.
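For what it's worth, the reordering boils down to claiming a free tcp_pids
slot before calling accept(), so that every forked handler is guaranteed to
be tracked. A simplified, self-contained sketch of that idea (made-up names,
not the actual dnsmasq code) looks roughly like this:

#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_PROCS 20              /* same role as dnsmasq's tcp_pids limit */

static pid_t tcp_pids[MAX_PROCS]; /* 0 means the slot is free */

/* called for each TCP listening fd that polled readable */
static void check_tcp_listener(int listenfd)
{
  int slot, confd;
  pid_t p;

  /* claim a slot *before* accept(); if none is free, the pending
     connection stays queued in the kernel until a child is reaped */
  for (slot = 0; slot < MAX_PROCS && tcp_pids[slot] != 0; slot++)
    ;
  if (slot == MAX_PROCS)
    return;

  if ((confd = accept(listenfd, NULL, NULL)) == -1)
    return;

  if ((p = fork()) == 0)
    {
      /* child: a real handler would serve the TCP query on confd here */
      close(confd);
      _exit(0);
    }

  if (p != -1)
    tcp_pids[slot] = p;   /* every forked handler is now accounted for */

  close(confd);           /* parent's copy of the connection fd */
}

Connections that cannot be accepted right away simply stay in the kernel's
listen backlog until a child is reaped and a slot frees up.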
I did not test in combination with TFTP (whether new TFTP connections can be
handled while tcp_pids is depleted).
Best regards,
Tijs