[Dnsmasq-discuss] [PATCH] bpf.c: fix memory leak in arp_enumerate() on BSD
Simon Kelley
simon at thekelleys.org.uk
Sat May 9 21:54:34 UTC 2026
Patch looks good to me. Applied.
Thanks for that.
Simon.
On 08.05.2026 22:46, Sagie D. wrote:
> Declaring struct iovec buff as static indeed resolves the memory
> ballooning issue, and is the better approach because it minimizes the
> heap fragmentation, as you pointed out.
>
> Following is a patch that adds IPv6 neighbor support. Note the diff is
> against the previous bpf.c version (without the static declarations).
>
> ------------------------------------------------------
>
> The arp_enumerate() function (for *BSD) is extended into a per-family
> helper arp_enumerate_family(), called sequentially for AF_INET and
> AF_INET6, allowing the NDP neighbour cache to be enumerated alongside
> the ARP table. An empty table for either family is treated as vacuous
> success rather than an error. Both tables are acquired in *BSD via
> PF_ROUTE sysctl calls that return raw kernel structures; consequently,
> for IPv6, the sockaddr_in6 peer address extracted from them has an
> embedded link-local scope ID. This ID is extracted into sin6_scope_id,
> and bytes 2-3 of the address are then cleared, per the KAME API
> contract.
>
> Signed-off-by: Sagie D.
> ---
> diff --git a/src/bpf.c b/src/bpf.c
> index dd67735..8543b26 100644
> --- a/src/bpf.c
> +++ b/src/bpf.c
> @@ -47,56 +47,84 @@ static union all_addr del_addr;
>
> #if defined(HAVE_BSD_NETWORK) && !defined(__APPLE__)
>
> -int arp_enumerate(void *parm, callback_t callback)
> +static int arp_enumerate_family(int family, void *parm, callback_t callback)
> {
> int mib[6];
> size_t needed;
> char *next;
> struct rt_msghdr *rtm;
> - struct sockaddr_inarp *sin2;
> struct sockaddr_dl *sdl;
> - struct iovec buff;
> + static struct iovec buff = { NULL, 0 };
> int rc;
>
> - buff.iov_base = NULL;
> - buff.iov_len = 0;
> -
> mib[0] = CTL_NET;
> mib[1] = PF_ROUTE;
> mib[2] = 0;
> - mib[3] = AF_INET;
> + mib[3] = family;
> mib[4] = NET_RT_FLAGS;
> #ifdef RTF_LLINFO
> mib[5] = RTF_LLINFO;
> #else
> mib[5] = 0;
> -#endif
> +#endif
> +
> if (sysctl(mib, 6, NULL, &needed, NULL, 0) == -1 || needed == 0)
> - return 0;
> + return 1; /* not a failure: unsupported or empty table */
>
> - while (1)
> + while (1)
> {
> if (!expand_buf(&buff, needed))
> - return 0;
> + return 0;
> if ((rc = sysctl(mib, 6, buff.iov_base, &needed, NULL, 0)) == 0 ||
> - errno != ENOMEM)
> - break;
> + errno != ENOMEM)
> + break;
> needed += needed / 8;
> }
> +
> if (rc == -1)
> return 0;
> -
> - for (next = buff.iov_base ; next < (char *)buff.iov_base + needed;
> next += rtm->rtm_msglen)
> +
> + for (next = buff.iov_base; next < (char *)buff.iov_base + needed;
> next += rtm->rtm_msglen)
> {
> rtm = (struct rt_msghdr *)next;
> - sin2 = (struct sockaddr_inarp *)(rtm + 1);
> - sdl = (struct sockaddr_dl *)((char *)sin2 + SA_SIZE(sin2));
> - if (!callback.af_unspec(AF_INET, &sin2->sin_addr, LLADDR(sdl),
> sdl->sdl_alen, parm))
> - return 0;
> +
> + if (family == AF_INET)
> + {
> + struct sockaddr_inarp *sin2 = (struct sockaddr_inarp *)(rtm + 1);
> + sdl = (struct sockaddr_dl *)((char *)sin2 + SA_SIZE(sin2));
> + if (!callback.af_unspec(AF_INET, &sin2->sin_addr,
> + LLADDR(sdl), sdl->sdl_alen, parm))
> + return 0;
> + }
> + else
> + {
> + struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)(rtm + 1);
> + sdl = (struct sockaddr_dl *)((char *)sin6 + SA_SIZE(sin6));
> + if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
> + {
> + /* PF_ROUTE sysctl returns raw kernel structures with
> the interface
> + index embedded in bytes 2-3 of link-local addresses.
> Extract it
> + into sin6_scope_id per the KAME API contract before
> clearing. */
> + sin6->sin6_scope_id =
> ((uint32_t)(sin6->sin6_addr.s6_addr[2]) << 8) |
> + sin6->sin6_addr.s6_addr[3];
> + sin6->sin6_addr.s6_addr[2] = 0;
> + sin6->sin6_addr.s6_addr[3] = 0;
> + }
> + if (!callback.af_unspec(AF_INET6, &sin6->sin6_addr,
> + LLADDR(sdl), sdl->sdl_alen, parm))
> + return 0;
> + }
> }
>
> return 1;
> }
> +
> +static int arp_enumerate(void *parm, callback_t callback)
> +{
> + if (!arp_enumerate_family(AF_INET, parm, callback))
> + return 0;
> + return arp_enumerate_family(AF_INET6, parm, callback);
> +}
> #endif /* defined(HAVE_BSD_NETWORK) && !defined(__APPLE__) */
>
>
>
> On Mon, 4 May 2026 at 18:45, Simon Kelley <simon at thekelleys.org.uk> wrote:
>>
>> Good catch.
>>
>> That code has been there eating memory for 15 years, and was added just
>> before the dnsmasq git repository started, so how it ended up like that
>> is a bit of a mystery.
>>
>> The code looks like it's using a common design pattern in dnsmasq, where
>> the buffer is stored in a long-lived iovec and expand_buf() only does
>> anything if the existing buffer is too small. This avoids lots of
>> malloc()/free() calls in hot code paths and resulting heap
>> fragmentation. The only problem is that the iovec in this case is not
>> long-lived. My guess is that this code got copied from elsewhere, and
>> the importance of that detail was missed.
>>
>> The fix is to declare struct iovec buff as static. Then the buffer
>> becomes long-lived and all the rest of the code, without any free()
>> calls, makes sense.
>>
>> I just committed
>>
>> https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=4bcc9650fac9a48502679cd793d269ef60caef07
>>
>> which does exactly that. It would be great if you could check it works
>> OK on your tests.
>>
>>
>> What would be _really_ nice (hint, hint) would be to extend this code to
>> return IPv6 neighbours.
>>
>>
>> Cheers,
>>
>> Simon.
>>
>>
>>
>>
>> On 26.04.2026 17:18, Sagie Duchovne-Nave wrote:
>>> arp_enumerate() allocates a heap buffer via expand_buf() to hold the
>>> kernel ARP table dump retrieved through sysctl(NET_RT_FLAGS). This
>>> buffer is never freed on any return path -- neither the early error
>>> returns nor the normal return after iteration -- causing a leak on
>>> every call.
>>>
>>> The leak is most acute in the DHCPv6 path. get_client_mac() calls
>>> find_mac() up to five times per packet with lazy=0. Because the
>>> 'updated' flag is local to each find_mac() invocation, a cached
>>> ARP_EMPTY entry for an unresolvable IPv6 address does not short-
>>> circuit the kernel lookup: each call falls through to iface_enumerate()
>>> -> arp_enumerate(), leaking one buffer per call. This yields up to
>>> five leaked allocations per DHCPv6 SOLICIT packet. The leak size per
>>> call equals the full system-wide IPv4 ARP table dump across all
>>> interfaces.
>>>
>>> The condition is readily triggered by a DHCPv6 client whose MAC
>>> address cannot be resolved via NDP -- which is the common case on
>>> FreeBSD, because arp_enumerate() queries NET_RT_FLAGS/RTF_LLINFO,
>>> which returns IPv4 ARP entries only; IPv6 NDP neighbour entries are
>>> not included. As a result every IPv6 MAC lookup fails unconditionally
>>> on FreeBSD, every failed lookup produces an ARP_EMPTY record that is
>>> never promoted, and every subsequent packet for that client leaks five
>>> buffers.
>>>
>>> Fix: free buff.iov_base on all return paths in arp_enumerate(),
>>> including the early returns inside the retry loop where iov_base may
>>> already be non-NULL from a prior expand_buf() call.
>>>
>>> Reported against: FreeBSD 14, dnsmasq 2.91
>>> Observed symptom: steady process memory growth correlated with DHCPv6
>>> SOLICIT traffic from a client whose NDP entry is absent from the
>>> kernel table (confirmed by disabling the client stopping the balloon).
>>>
>>> Signed-off-by: Sagie D.
>>> ---
>>> bpf.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/bpf.c b/bpf.c
>>> index XXXXXXX..XXXXXXX 100644
>>> --- a/bpf.c
>>> +++ b/bpf.c
>>> @@ -xx,12 +xx,18 @@ int arp_enumerate(void *parm, callback_t callback)
>>> while (1)
>>> {
>>> if (!expand_buf(&buff, needed))
>>> - return 0;
>>> + {
>>> + free(buff.iov_base);
>>> + return 0;
>>> + }
>>> if ((rc = sysctl(mib, 6, buff.iov_base, &needed, NULL, 0)) == 0 ||
>>> errno != ENOMEM)
>>> break;
>>> needed += needed / 8;
>>> }
>>> if (rc == -1)
>>> - return 0;
>>> + {
>>> + free(buff.iov_base);
>>> + return 0;
>>> + }
>>>
>>> for (next = buff.iov_base ; next < (char *)buff.iov_base + needed;
>>> next += rtm->rtm_msglen)
>>> {
>>> rtm = (struct rt_msghdr *)next;
>>> sin2 = (struct sockaddr_inarp *)(rtm + 1);
>>> sdl = (struct sockaddr_dl *)((char *)sin2 + SA_SIZE(sin2));
>>> if (!callback.af_unspec(AF_INET, &sin2->sin_addr, LLADDR(sdl),
>>> sdl->sdl_alen, parm))
>>> - return 0;
>>> + {
>>> + free(buff.iov_base);
>>> + return 0;
>>> + }
>>> }
>>>
>>> - return 1;
>>> + free(buff.iov_base);
>>> + return 1;
>>> }
>>>
>>> _______________________________________________
>>> Dnsmasq-discuss mailing list
>>> Dnsmasq-discuss at lists.thekelleys.org.uk
>>> https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
>>>
>>
>
More information about the Dnsmasq-discuss
mailing list