[Dnsmasq-discuss] [PATCH] Fix buffer overflow in TCP requests

Frank fhriley at gmail.com
Tue Jun 30 20:22:05 BST 2020


Resending this because I realized I sent it to Simon rather than the list:

Hi Simon,

This bug is fairly easy to reproduce. It can take 10 mins or more to
reproduce a crash so I suggest checking the length you get in
cache_recv_insert. If the domains being used are > 4 characters, this
will catch the bug the first time it occurs. That usually happens
within a minute. Something like this:

if (m > MAXDNAME) abort();

Then bombard dnsmasq with TCP queries:

docker run --rm -v /root/dns_queries:/dns_queries
aiys0211/flamethrower_v0.10.2 -f
/dns_queries/queryfile-example-100thousand -l 1000 -c 20 -q 10 -d 300
-p 53 -P tcp <ip of dnsmasq>

I used the first 100 thousand entries from here:
https://www.dns-oarc.net/files/dnsperf/data/queryfile-example-10million-201202.gz

Frank

On Sun, Jun 28, 2020 at 1:58 PM Simon Kelley <simon at thekelleys.org.uk> wrote:
>
> That's a nasty bug, and could explain what pi-hole users are seeing.
>
> If I understand things correctly, this bug will only manifest itself
> when the write() or read() syscalls return EINTR ir EAGAIN, which is
> possible, but not common, hence the bugs wasn't detected earlier.
>
> Frank, did you find a way to reproduce this on demand, or just capture
> sporadic instances?
>
> I've applied Frank's patch, but immediately superseded it with a more
> general clean-up which should have the same effect. I've tagged a
> 2.82test1 release with this. If that fixes the problem for pi-hole, I
> plan to go to 2.82 fairly quickly: I don't want this hanging around.
>
> It's slightly ironic that the original change in 2.81 is to fix a
> theoretical hang which I've never actually seen and cannot reproduce,
> but it introduced a really crash bug.
>
>
> Cheers,
>
> Simon.
>
>
>
>
> On 18/06/2020 22:42, Dominik wrote:
> > Hey Frank, dear list-members,
> >
> > thanks for your proposed fix. The Pi-hole team adopted dnsmasq v2.81
> > early on and we're seeing reports for mysterious crashes scattered all
> > over the dnsmasq code since the very first days of releasing our latest
> > version. Crashes reports show several locations, e.g., in_zone(),
> > iface_check() and even main() in a few places. TCP forks were always
> > happening sometime (not necessarily immediately) before a crash.
> > Running dnsmasq in debug mode (prevents forking) stopped any crashes.
> >
> > So far, we haven't had the time to dig into this ourselves properly,
> > however, your description perfectly matches with my preliminary
> > conclusion for one week ago:
> >> the dangling pointer is in **all** bug reports known so far in the
> > daemon pointer
> > (https://github.com/pi-hole/FTL/issues/805#issuecomment-643157367)
> >
> > The addition of TCP forks being able to add cache replies in dnsmasq
> > v2.81 is what is causing this code (which has been like this for along
> > time) to become problematic. I recently looked at the same code but
> > missed the chance that the byte may be still in the pipe. This is a
> > very good catch!
> >
> > I'm writing this mail to raise awareness that this appears to be a high
> > severity bug as it can easily lead to anything between indeterminable
> > behavior to fatal failures (typically the case). Even when the bug can
> > only be triggered under certain conditions, it recently caused over 200
> > posts on the Pi-hole issue ticket system (Github). Crashes due to this
> > have been reported independently by several (more than 20) individual
> > users.
> > Best
> > Dominik
> >
> > On Wed, 2020-06-17 at 19:52 -0700, Frank wrote:
> >> Hello,
> >>
> >> This patch fixes a buffer overflow in TCP requests. Since the read is
> >> not actually being retried, the byte written by the child can be left
> >> in the pipe. When that happens, cache_recv_insert() reads the length
> >> of the name, which is now multiplied by 256 due to the extra 0 byte
> >> (8 bit shift) and results in daemon->namebuff being overflowed.
> >> Namebuff is immediately before the daemon struct in memory so it ends
> >> up corrupting the beginning of the daemon struct.
> >>
> >> diff --git a/src/dnsmasq.c b/src/dnsmasq.c
> >> index 6481de8..457dea3 100644
> >> --- a/src/dnsmasq.c
> >> +++ b/src/dnsmasq.c
> >> @@ -1887,7 +1887,7 @@ static void check_dns_listeners(time_t now)
> >>                            single byte comes back up the pipe, which
> >>                            is sent by the child after it has closed
> >> the
> >>                            netlink socket. */
> >> -                       retry_send(read(pipefd[0], &a, 1));
> >> +                       while (retry_send(read(pipefd[0], &a, 1)));
> >>  #endif
> >>                         break;
> >>                       }
> >> @@ -1928,7 +1928,7 @@ static void check_dns_listeners(time_t now)
> >>  #ifdef HAVE_LINUX_NETWORK
> >>                   /* See comment above re netlink socket. */
> >>                   close(daemon->netlinkfd);
> >> -                 retry_send(write(pipefd[1], &a, 1));
> >> +                 while (retry_send(write(pipefd[1], &a, 1)));
> >>  #endif
> >>                 }
> >>
> >> Frank
> >>
> >>
> >> _______________________________________________
> >> Dnsmasq-discuss mailing list
> >> Dnsmasq-discuss at lists.thekelleys.org.uk
> >> http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
> >
> >
> > _______________________________________________
> > Dnsmasq-discuss mailing list
> > Dnsmasq-discuss at lists.thekelleys.org.uk
> > http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
> >
>



More information about the Dnsmasq-discuss mailing list