[Dnsmasq-discuss] No DHCPOffer back but DHCPDiscover is being received by UML machine
Simon Kelley
simon at thekelleys.org.uk
Fri Apr 24 14:29:18 BST 2020
Having looked at the docs for UML, I doubt that this is a UML problem;
it looks like a pure kernel (in this case, the one running under UML)
problem.
As such, a regression test on those three kernels would be useful.
Googling for combinations of recvmsg MSG_PEEK regression UDP MSG_TRUNC
shows a few possible candidates over the last few years, but no obvious smoking gun.
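
A regression test along those lines could start from something like the sketch below. It is only an illustration, not dnsmasq code: the function name peek_leaves_datagram_queued is made up, it uses a plain UDP socket on loopback rather than a broadcast DHCP packet on a UML eth0 with IP_PKTINFO, and it passes MSG_DONTWAIT on the second call to stand in for the non-blocking socket. A kernel that only misbehaves on the UML network device may well pass it.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical check, not part of dnsmasq.
 * Returns 0 if the datagram is still queued after a MSG_PEEK read,
 * 1 if the peek appears to have consumed it (the suspected bug),
 * and -1 if the test could not be set up. */
int peek_leaves_datagram_queued(void)
{
    char payload[300], buf[548];   /* sizes taken from the strace output */
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
    struct msghdr msg;
    struct pollfd pfd;
    int result = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    /* Same situation as in the trace: the buffer is already big enough. */
    assert(sizeof(payload) <= sizeof(buf));
    if (rx < 0 || tx < 0)
        goto out;

    /* Bind the receiver to an ephemeral port on loopback. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
        goto out;

    /* Queue one 300-byte datagram on the receiver. */
    memset(payload, 0xA5, sizeof(payload));
    if (sendto(tx, payload, sizeof(payload), 0,
               (struct sockaddr *)&addr, sizeof(addr)) != (ssize_t)sizeof(payload))
        goto out;

    /* Wait up to one second for it to become readable. */
    pfd.fd = rx;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) != 1)
        goto out;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* First call: peek at the packet and ask for its real length. */
    if (recvmsg(rx, &msg, MSG_PEEK | MSG_TRUNC) != (ssize_t)sizeof(payload))
        goto out;

    /* Second call: the real read. With a correct MSG_PEEK the packet is
     * still queued; EAGAIN here would match the behaviour in the trace. */
    result = (recvmsg(rx, &msg, MSG_DONTWAIT) == (ssize_t)sizeof(payload)) ? 0 : 1;

out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return result;
}
```

Calling this from a small main() and using the result as the exit status on each of the candidate kernels would show whether the plain UDP path is affected at all.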
Assuming we've diagnosed the kernel misbehaviour correctly, the code in
dnsmasq could be changed to work around the problem at the expense of a
small probability of dropping a packet, which is not a problem in this case.
I'll look at doing that.
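
For illustration only, one shape such a work-around could take is sketched below. This is an assumption, not the actual dnsmasq change: the helper name recv_with_peek_fallback is hypothetical, the buffer is fixed-size (no expansion step), and the idea is simply to fall back on the result of the peek when the second recvmsg reports EAGAIN, since the peek has already filled the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper, not taken from dnsmasq.
 * Reads one datagram into buf using the peek-then-read pattern, but
 * tolerates a kernel that dequeues the packet on the MSG_PEEK call.
 * Returns the packet length, or -1 with errno set. */
ssize_t recv_with_peek_fallback(int fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    ssize_t peeked, got;

    assert(buf != NULL && len > 0);

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* First call: peek, with MSG_TRUNC so the real length is returned. */
    do {
        peeked = recvmsg(fd, &msg, MSG_PEEK | MSG_TRUNC);
    } while (peeked == -1 && errno == EINTR);
    if (peeked == -1)
        return -1;

    /* A real caller would grow the buffer and peek again at this point;
     * this sketch just reports the oversized packet. */
    if ((size_t)peeked > len) {
        errno = EMSGSIZE;
        return -1;
    }

    /* Second call: the real read, which should dequeue the packet. */
    do {
        got = recvmsg(fd, &msg, 0);
    } while (got == -1 && errno == EINTR);

    /* If the peek already consumed the packet, a non-blocking socket
     * reports EAGAIN here. The data from the peek is still in buf, so
     * use that length instead. The cost is a small race: occasionally a
     * packet could be handled twice or lost. */
    if (got == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        got = peeked;

    return got;
}
```

On a kernel that honours MSG_PEEK the fallback branch is never taken, so the helper behaves like the existing two-call sequence.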
Simon.
On 23/04/2020 21:05, Josh H wrote:
> Hi there,
>
> I'm not sure of a way of testing it with a real network device, but I'm
> happy to attempt to build an older UML kernel and test it from there. As
> I said in my original email, the last fully known working build was way
> back in kernel 3.2 and a lot has changed since then, so it could very
> well be a kernel issue and due to the edge use case, no one has ever
> really come across it. Is there a kernel version you'd like me to try
> out? Debian has a standard usermodelinux package which contains prebuilt
> UML images with kernel versions of 4.9, 4.19 or 5.5 if they'd be handy?
> https://tracker.debian.org/pkg/user-mode-linux.
>
> Thanks for the support,
> Josh
>
> On Thu, 23 Apr 2020 at 20:30, Simon Kelley <simon at thekelleys.org.uk> wrote:
>
> Ok, so Josh ran the strace and sent me the results as requested.
>
> The interesting bit is here.
>
> recvmsg(4, {msg_name={sa_family=AF_INET, sin_port=htons(68),
> sin_addr=inet_addr("0.0.0.0")}, msg_namelen=16,
> msg_iov=[{iov_base="\1\1\6\0\310\261\311+\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\366\226}H"...,
> iov_len=548}], msg_iovlen=1, msg_control=[{cmsg_len=24,
> cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO,
> cmsg_data={ipi_ifindex=if_nametoindex("eth0"),
> ipi_spec_dst=inet_addr("192.168.1.1"),
> ipi_addr=inet_addr("255.255.255.255")}}], msg_controllen=24,
> msg_flags=0}, MSG_PEEK|MSG_TRUNC) = 300
> recvmsg(4, {msg_namelen=16}, 0) = -1 EAGAIN (Resource
> temporarily unavailable)
>
>
>
> The first call to recvmsg has the MSG_PEEK and MSG_TRUNC flags set.
> MSG_TRUNC causes the result to be the actual length of the received
> packet, even if it's longer than the supplied buffer (548), and MSG_PEEK is
> defined as:
>
>
> MSG_PEEK
> This flag causes the receive operation to return data from the
> beginning of the receive queue without removing that data from
> the queue. Thus, a subsequent receive call will return the same
> data.
>
> So this allows the buffer to be expanded if necessary and then recvmsg
> gets called again when the buffer is big enough, to actually get the
> data and remove it from the queue. In this case the packet is 300 bytes
> long and the buffer is already 548 bytes, so no expansion is needed; we
> just do the call again, without the MSG_PEEK|MSG_TRUNC flags. That's the
> second call to recvmsg, which returns EAGAIN - the socket is
> non-blocking, and this return says there's no packet queued. It looks
> like the kernel is ignoring the MSG_PEEK flag, and dequeueing the data
> on the first call.
>
> I think this is a kernel bug.
>
> Josh, does this work with an older kernel or with a real network device,
> rather than the UML virtual device? It would be good to work out where
> the regression happened.
>
>
> Simon.
>
> On 16/04/2020 15:40, Josh H wrote:
> >
> > First, answer a simple question the answer to which I may have missed.
> > Is dnsmasq logging the receipt of DHCPDISCOVER messages? Can we see the
> > whole log showing that?
> >
> >
> > Based on the config I provided in the initial message, I have the log
> > file writing to /var/log/dnsmasq.log. This is the whole content of that
> > file:
> >
> > root at dns:~# cat /var/log/dnsmasq.log
> > Apr 16 15:36:50 dnsmasq[1695]: started, version 2.80 DNS disabled
> > Apr 16 15:36:50 dnsmasq[1695]: compile time options: IPv6 GNU-getopt
> > DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC
> > loop-detect inotify dumpfile
> > Apr 16 15:36:50 dnsmasq-dhcp[1695]: DHCP, IP range 192.168.1.3 --
> > 192.168.1.8, lease time 12h
> >
> > No mention of the DHCPDiscover being acknowledged.
> >
> > The next stage is to run dnsmasq under strace (check back here if you
> > need instructions on that) and see what system calls it's making.
> >
> >
> > What command would I need to run for this? And what service is best to
> > upload the strace result, pastebin?
> >
> > Thanks,
> > Josh
> >
> > On Thu, 16 Apr 2020 at 12:49, Simon Kelley <simon at thekelleys.org.uk> wrote:
> >
> >
> >
> > On 15/04/2020 19:27, Josh H wrote:
> >
> > > It's difficult for me to share the config outright as I'm using a
> > > modified version of netkit that I've updated to a much newer kernel -
> > > http://netkit-ng.github.io/. The netkit version that is available on
> > > that link is the one that worked with dnsmasq just fine, and that
> > > version was 2.62 and kernel 3.2. However I've updated it and am running
> > > 2.80 and kernel 5.6.
> > >
> > > Anything else I can provide you with that might help? It's a very unique
> > > setup so I appreciate it's probably not the easiest thing to try and
> > > debug.
> > >
> >
> > First, answer a simple question the answer to which I may have missed.
> > Is dnsmasq logging the receipt of DHCPDISCOVER messages? Can we see the
> > whole log showing that?
> >
> > The next stage is to run dnsmasq under strace (check back here if you
> > need instructions on that) and see what system calls it's making.
> >
> >
> > Simon.
> >
> >
> >
>