[Dnsmasq-discuss] Leftover helper process after main process exit on FreeBSD

Simon Kelley simon at thekelleys.org.uk
Fri Jun 20 16:13:02 UTC 2025



On 6/13/25 13:30, Roman Bogorodskiy wrote:
> Hi,
> 
> I've noticed an issue on FreeBSD which I can reproduce this way:
> 
> # ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> $  ps aux|grep dnsm
> nobody     12741    0,0  0,0    14500    3128  -  I    13:43             0:00,00 ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> root       12742    0,0  0,0    14500    3008  -  I    13:43             0:00,00 ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> novel      12763    0,0  0,0    14192    2588  1  S+   13:44             0:00,00 grep dnsm
> $
> # kill 12741
> $ ps aux|grep dns
> root       12742    0,0  0,0    14500    3008  -  I    13:43             0:00,00 ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> novel      12785    0,0  0,0    14192    2560  1  S+   13:45             0:00,00 grep dns
> $
> 
>   There is a leftover process. When I attach to it using gdb I see:
> 
> (gdb) attach 12742
> Attaching to program: /usr/home/novel/code/dnsmasq/src/dnsmasq, process 12742
> Reading symbols from /lib/libc.so.7...
> Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...
> Reading symbols from /lib/libsys.so.7...
> Reading symbols from /usr/lib/debug//lib/libsys.so.7.debug...
> Reading symbols from /libexec/ld-elf.so.1...
> Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...
> _read () at _read.S:4
> 4       PSEUDO(read)
> (gdb) bt
> #0  _read () at _read.S:4
> #1  0x00000000002208a1 in read_write (fd=19, packet=0x8204deea8 "\260\236\212\"\b", size=112, rw=1) at util.c:783
> #2  0x000000000024e6ca in create_helper (event_fd=16, err_fd=18, uid=0, gid=0, max_fd=1877346) at helper.c:199
> #3  0x000000000023b1f1 in main (argc=5, argv=0x8204df170) at dnsmasq.c:743
> (gdb)
> 
> So it looks like it's stuck reading from pipefd[0]:
> 
> (gdb) fr 2
> #2  0x000000000024e6ca in create_helper (event_fd=16, err_fd=18, uid=0, gid=0, max_fd=1877346) at helper.c:199
> 199           if (!read_write(pipefd[0], (unsigned char *)&data, sizeof(data), RW_READ))
> (gdb)
> 
> It also looks like both fd's are open in the helper side:
> 
> (gdb) p pipefd
> $12 = {19, 20}
> (gdb)
> 
> (gdb) call fcntl(20, 1)
> $13 = 0
> (gdb)
> 
> Now if I close(20):
> 
> (gdb) call close(20)
> $14 = 0
> (gdb) c
> Continuing.
> [Inferior 1 (process 12742) exited normally]
> (gdb)
> 
> 
> So the following change fixed this for me:
> 
> --- a/src/helper.c
> +++ b/src/helper.c
> @@ -96,6 +96,8 @@ int create_helper(int event_fd, int err_fd, uid_t uid, gid_t gid, long max_fd)
>         close(pipefd[0]); /* close reader side */
>         return pipefd[1];
>       }
> +  else
> +      close(pipefd[1]);
> 
>     /* ignore SIGTERM and SIGINT, so that we can clean up when the main process gets hit
>        and SIGALRM so that we can use sleep() */
> 
> 
> FWIW, that's happening on FreeBSD 15.0-CURRENT amd64 and latest master
> of dnsmasq.
> 
> However, I'm not sure that these reproduction steps are 100% sufficient.
> I wasn't able to reproduce that on another FreeBSD 14.2-RELEASE amd64
> system with Dnsmasq version 2.91.


I'm not sure what the bug is, but I'm very suspicious of commit 
8a5fe8ce6bb6c2bd81f237a0f4a2583722ffbd1c, even though it's in the 2.91 
codebase.

The write side of the pipe in the helper process is supposed to be 
closed by the call

close_fds(max_fd, pipefd[0], event_fd, err_fd);

at line 134 of src/helper.c

That call should close() ALL open fds except STDIN, STDOUT and STDERR, 
and the three fds passed in as arguments. This preserves the 
reader-side, as pipefd[0] is one of the arguments, but the write side 
should be closed. I checked in Linux (which doesn't exhibit the bug) and 
that's exactly what does happen.
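
To illustrate why the leftover write side matters: a read on a pipe only reports end-of-file once every descriptor for the write side has been closed, including any copy held by the reading process itself. The sketch below is not dnsmasq code. The name helper_sees_eof is made up for illustration, and dup() stands in for the copy the helper inherits across fork(). The read is made non-blocking only so that the demonstration cannot hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo, not part of dnsmasq.
   Returns 1 if a read on the pipe reports EOF, 0 if it would block
   (a writer still exists), and -1 on an unexpected error. */
int helper_sees_eof(int close_inherited_writer)
{
  int pipefd[2];
  int inherited, result;
  char byte;
  ssize_t rc;

  if (pipe(pipefd) == -1)
    return -1;

  /* Stand-in for the write-side copy the helper inherits at fork(). */
  inherited = dup(pipefd[1]);
  if (inherited == -1)
    {
      close(pipefd[0]);
      close(pipefd[1]);
      return -1;
    }

  /* The main process goes away: its write side is closed. */
  close(pipefd[1]);

  if (close_inherited_writer)
    {
      close(inherited);
      inherited = -1;
    }

  /* Non-blocking, so "would block forever" shows up as EAGAIN. */
  if (fcntl(pipefd[0], F_SETFL, O_NONBLOCK) == -1)
    result = -1;
  else
    {
      rc = read(pipefd[0], &byte, 1);
      if (rc == 0)
        result = 1; /* EOF: no writers left */
      else if (rc == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = 0; /* a write descriptor is still open somewhere */
      else
        result = -1;
    }

  close(pipefd[0]);
  if (inherited != -1)
    close(inherited);
  return result;
}
```

helper_sees_eof(0) corresponds to the stuck helper in the backtrace above, where a blocking read waits indefinitely; helper_sees_eof(1) corresponds to the state after the close(20) issued from gdb.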

If you look at the code for close_fds() there are two code paths. A dumb 
one which calls close() for every possible fd between zero and the 
system max except for the six which are to be spared. Then there's a 
smart path which reads a directory in /proc to find out which fds are 
actually open, and only closes those.

The smart path saves a lot of work on servers which are configured to 
support enormous numbers of open files per process.

The smart path used to only exist on Linux, but was introduced on BSD 
during the 2.91 development at the end of 2024. My suspicion is that 
that is the cause of the regression.

The smart path is the same for Linux and BSD, except that the directory 
full of links to open files is at /proc/self/fd on Linux and /dev/fd on 
*BSD. If these directories don't exist then the code falls back to the 
dumb code path.
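
As a rough sketch of the two code paths described above (a simplified illustration, not the actual source of close_fds()): the name close_fds_sketch, the FD_DIR macro, the use of __linux__ to pick the directory, and the return value are all assumptions made for this example.

```c
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumption for this sketch: pick the directory by platform macro. */
#ifdef __linux__
#define FD_DIR "/proc/self/fd"
#else
#define FD_DIR "/dev/fd"
#endif

/* The six descriptors which are never closed. */
static int is_spared(long fd, int s1, int s2, int s3)
{
  return fd == STDIN_FILENO || fd == STDOUT_FILENO || fd == STDERR_FILENO ||
         fd == s1 || fd == s2 || fd == s3;
}

/* Returns 1 if the directory listing (smart path) was used,
   0 if the brute-force loop (dumb path) was used. */
int close_fds_sketch(long max_fd, int s1, int s2, int s3)
{
  DIR *d = opendir(FD_DIR);

  if (d)
    {
      struct dirent *de;

      while ((de = readdir(d)))
        {
          char *end = NULL;
          long fd;

          errno = 0;
          fd = strtol(de->d_name, &end, 10);

          /* Skip ".", ".." and anything that is not a plain number. */
          if (errno != 0 || end == de->d_name || *end != '\0')
            continue;

          /* Don't close the directory stream's own fd or a spared fd. */
          if (fd == dirfd(d) || is_spared(fd, s1, s2, s3))
            continue;

          close((int)fd);
        }

      closedir(d);
      return 1;
    }

  /* Dumb path: try every possible descriptor below max_fd. */
  for (max_fd--; max_fd >= 0; max_fd--)
    if (!is_spared(max_fd, s1, s2, s3))
      close((int)max_fd);

  return 0;
}
```

Note that in this sketch the smart path can only close descriptors which actually appear in the directory listing, so it is only as complete as that listing.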

So, can you try to determine why close_fds() is not closing the 
write-side of the pipe in the helper process, since that should 
already be doing what your patch does?
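
One possible way to check this, offered only as a sketch: test whether a descriptor that is known to be open actually shows up in the directory the smart path reads. The function name fd_listed_in is hypothetical and is not part of dnsmasq.

```c
#include <dirent.h>
#include <stdlib.h>

/* Hypothetical diagnostic, not part of dnsmasq.
   Returns 1 if `fd` appears as an entry in `dir`, 0 if it does not,
   and -1 if `dir` cannot be opened at all. */
int fd_listed_in(const char *dir, int fd)
{
  DIR *d = opendir(dir);
  struct dirent *de;
  int found = 0;

  if (!d)
    return -1;

  while ((de = readdir(d)))
    {
      char *end = NULL;
      long n = strtol(de->d_name, &end, 10);

      /* Only consider entries whose whole name is a number. */
      if (end != de->d_name && *end == '\0' && n == fd)
        {
          found = 1;
          break;
        }
    }

  closedir(d);
  return found;
}
```

For example, calling fd_listed_in("/dev/fd", pipefd[1]) in the helper just before close_fds() would show whether the write side is visible there. A result of 0 for a descriptor that is known to be open would mean a directory-based scan never sees it, and therefore never closes it.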


Cheers,

Simon.





> 
> Thanks,
> Roman
> 



