[Dnsmasq-discuss] Leftover helper process after main process exit on FreeBSD
Simon Kelley
simon at thekelleys.org.uk
Fri Jun 20 16:13:02 UTC 2025
On 6/13/25 13:30, Roman Bogorodskiy wrote:
> Hi,
>
> I've noticed an issue on FreeBSD which I can reproduce this way:
>
> # ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> $ ps aux|grep dnsm
> nobody 12741 0,0 0,0 14500 3128 - I 13:43 0:00,00 ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> root 12742 0,0 0,0 14500 3008 - I 13:43 0:00,00 ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> novel 12763 0,0 0,0 14192 2588 1 S+ 13:44 0:00,00 grep dnsm
> $
> # kill 12741
> $ ps aux|grep dns
> root 12742 0,0 0,0 14500 3008 - I 13:43 0:00,00 ./src/dnsmasq --interface=bridge0 --except-interface=lo0 --dhcp-range=192.168.127.2,192.168.127.254,255.255.255.0 --dhcp-script=/usr/bin/true
> novel 12785 0,0 0,0 14192 2560 1 S+ 13:45 0:00,00 grep dns
> $
>
> There is a leftover process. When I attach to it using gdb I see:
>
> (gdb) attach 12742
> Attaching to program: /usr/home/novel/code/dnsmasq/src/dnsmasq, process 12742
> Reading symbols from /lib/libc.so.7...
> Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...
> Reading symbols from /lib/libsys.so.7...
> Reading symbols from /usr/lib/debug//lib/libsys.so.7.debug...
> Reading symbols from /libexec/ld-elf.so.1...
> Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...
> _read () at _read.S:4
> 4 PSEUDO(read)
> (gdb) bt
> #0 _read () at _read.S:4
> #1 0x00000000002208a1 in read_write (fd=19, packet=0x8204deea8 "\260\236\212\"\b", size=112, rw=1) at util.c:783
> #2 0x000000000024e6ca in create_helper (event_fd=16, err_fd=18, uid=0, gid=0, max_fd=1877346) at helper.c:199
> #3 0x000000000023b1f1 in main (argc=5, argv=0x8204df170) at dnsmasq.c:743
> (gdb)
>
> So it looks like it's stuck reading from pipefd[0]:
>
> (gdb) fr 2
> #2 0x000000000024e6ca in create_helper (event_fd=16, err_fd=18, uid=0, gid=0, max_fd=1877346) at helper.c:199
> 199 if (!read_write(pipefd[0], (unsigned char *)&data, sizeof(data), RW_READ))
> (gdb)
>
> It also looks like both fds are open on the helper side:
>
> (gdb) p pipefd
> $12 = {19, 20}
> (gdb)
>
> (gdb) call fcntl(20, 1)
> $13 = 0
> (gdb)
>
> Now if I close(20):
>
> (gdb) call close(20)
> $14 = 0
> (gdb) c
> Continuing.
> [Inferior 1 (process 12742) exited normally]
> (gdb)
>
>
> So the following change fixed this for me:
>
> --- a/src/helper.c
> +++ b/src/helper.c
> @@ -96,6 +96,8 @@ int create_helper(int event_fd, int err_fd, uid_t uid, gid_t gid, long max_fd)
> close(pipefd[0]); /* close reader side */
> return pipefd[1];
> }
> + else
> + close(pipefd[1]);
>
> /* ignore SIGTERM and SIGINT, so that we can clean up when the main process gets hit
> and SIGALRM so that we can use sleep() */
>
>
> FWIW, that's happening on FreeBSD 15.0-CURRENT amd64 and latest master
> of dnsmasq.
>
> However, I'm not sure that these reproduction steps are 100% sufficient.
> I wasn't able to reproduce that on another FreeBSD 14.2-RELEASE amd64
> system with Dnsmasq version 2.91.

I'm not sure what the bug is, but I'm very suspicious of commit
8a5fe8ce6bb6c2bd81f237a0f4a2583722ffbd1c, even though it's in the 2.91
codebase.

The write side of the pipe in the helper process is supposed to be
closed by the call

    close_fds(max_fd, pipefd[0], event_fd, err_fd);

at line 134 of src/helper.c.

That call should close() ALL open fds except STDIN, STDOUT and STDERR,
and the three fds passed in as arguments. This preserves the read side,
as pipefd[0] is one of the arguments, but the write side should be
closed. I checked on Linux (which doesn't exhibit the bug) and that's
exactly what happens there.

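Why the stray write end matters can be shown with a small standalone sketch. This is not dnsmasq code: reader_status() is a hypothetical name, and dup() merely stands in for the copy of the write end that the helper inherits across fork(). The read is made non-blocking so the demo returns instead of hanging the way the helper does.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo, not dnsmasq code.
 * Returns 0 if read() reports EOF, 1 if it would block because a write
 * end of the pipe is still open, and -1 on an unexpected error. */
int reader_status(int keep_stray_writer)
{
  int pipefd[2];
  int stray, flags, result;
  char c;
  ssize_t n;

  if (pipe(pipefd) == -1)
    return -1;

  /* Stand-in for the write end the helper inherits from fork(). */
  stray = dup(pipefd[1]);
  if (stray == -1)
    {
      close(pipefd[0]);
      close(pipefd[1]);
      return -1;
    }

  /* The "main process" goes away: its write end is closed. */
  close(pipefd[1]);

  /* With the stray copy closed too, no writers remain. */
  if (!keep_stray_writer)
    {
      close(stray);
      stray = -1;
    }

  /* Non-blocking, so we can observe the state without hanging. */
  flags = fcntl(pipefd[0], F_GETFL);
  if (flags == -1 || fcntl(pipefd[0], F_SETFL, flags | O_NONBLOCK) == -1)
    result = -1;
  else
    {
      n = read(pipefd[0], &c, 1);
      if (n == 0)
        result = 0;   /* EOF: every write end is closed */
      else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = 1;   /* a writer still exists, a blocking read would wait */
      else
        result = -1;
    }

  close(pipefd[0]);
  if (stray != -1)
    close(stray);
  return result;
}
```

With the stray copy closed the read sees EOF at once; with it left open, a blocking read would wait indefinitely, which matches the backtrace in the report.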
If you look at the code for close_fds(), there are two code paths. A dumb
one which calls close() for every possible fd between zero and the
system max except for the six which are to be spared. Then there's a
smart path which reads a directory in /proc to find out which fds are
actually open, and only closes those.

The smart path saves a lot of work on servers which are configured to
support enormous numbers of open files per process.

The smart path used to only exist on Linux, but was introduced on BSD
during the 2.91 development at the end of 2024. My suspicion is that
that is the cause of the regression.

The smart path is the same for Linux and BSD, except that the directory
full of links to open files is at /proc/self/fd on Linux and /dev/fd on
*BSD. If these directories don't exist, then the code falls back to the
dumb code path.

So, can you try to determine why close_fds() is not closing the
write side of the pipe in the helper process, since that call should
already be doing what your patch does?

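One possible way to start investigating is to compare what the fd directory lists against what the kernel reports when each descriptor is probed directly. The sketch below is only a diagnostic idea, not part of dnsmasq; count_listed_fds() and count_probed_fds() are hypothetical names. A mismatch on the affected system would suggest the directory listing is incomplete, but that is not a confirmed cause.

```c
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>

/* Count the numeric entries in fd_dir, excluding the descriptor that
 * opendir() itself uses. Returns -1 if the directory cannot be opened. */
int count_listed_fds(const char *fd_dir)
{
  DIR *d = opendir(fd_dir);
  struct dirent *ent;
  int dir_fd, count = 0;

  if (!d)
    return -1;

  dir_fd = dirfd(d);
  while ((ent = readdir(d)))
    {
      char *end;
      long fd = strtol(ent->d_name, &end, 10);

      if (end == ent->d_name || *end != '\0')
        continue;                 /* skips "." and ".." */
      if (fd != dir_fd)
        count++;
    }
  closedir(d);
  return count;
}

/* Count the fds in [0, limit) that fcntl(F_GETFD) reports as open. */
int count_probed_fds(int limit)
{
  int fd, count = 0;

  for (fd = 0; fd < limit; fd++)
    if (fcntl(fd, F_GETFD) != -1)
      count++;
  return count;
}
```

Calling both from the helper just before close_fds(), with "/dev/fd" as the directory on FreeBSD, and logging the two numbers would show whether they agree.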
Cheers,
Simon.
>
> Thanks,
> Roman