[Dnsmasq-discuss] 2.68rc4: bind-interfaces, FreeBSD, IPv6 2001:... addr fails and loses error code, parallel build brittle

Simon Kelley simon at thekelleys.org.uk
Mon Dec 2 12:01:25 GMT 2013


On 01/12/13 01:59, Matthias Andree wrote:
> Greetings,
>
> testing 2.68rc4, I have found three issues, on FreeBSD 9.2 amd64:
>
> 1. the Makefiles might not thoroughly list all dependencies required to
> build the dnsmasq executable; I found my build miss cache.o when linking
> (compiling with make -j + high number), re-running make immediately
> after the failure "solved" the problem.
> I am using a local UFS file system, so no NFS time skew.
> I am using GNU make 3.82.

By "miss cache.o", do you mean that cache.o was not included in the link 
command line, or that the link failed because cache.o had not been 
built? This is mysterious: there's nothing different about cache.o 
compared with the other object files.
>
> 2. Binding IPv6 addresses on FreeBSD does not appear to work properly,
> with bind-interfaces option.
>
> 3. And the error from 2. does not get properly reported either, the
> error code is apparently lost somewhere, or errno gets reset.
>

OK, that's an easy fix: when the bind fails, either don't call close() on 
the socket before reporting the error, or save and restore errno around 
the close().
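
For illustration, the save/restore variant might look roughly like the 
sketch below. This is not the actual dnsmasq code: the names 
close_keep_errno() and bind_or_close() are hypothetical helpers made up 
for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close fd without disturbing the caller's errno. */
void close_keep_errno(int fd)
{
    int saved = errno;   /* close() may overwrite errno */
    close(fd);
    errno = saved;
}

/* Hypothetical wrapper: bind fd to addr. On failure the socket is closed,
   but errno still holds the value set by bind() (e.g. EADDRNOTAVAIL),
   so the caller can report a meaningful error. */
int bind_or_close(int fd, const struct sockaddr *addr, socklen_t len)
{
    assert(addr != NULL);
    if (bind(fd, addr, len) == -1) {
        close_keep_errno(fd);
        return -1;
    }
    return 0;
}
```

With something like this, a caller that prints strerror(errno) after a 
failed bind_or_close() would see the bind() failure rather than whatever 
the cleanup left behind.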

> This is the dnsmasq.conf, domain made up (build configuration is below
> in the log of the successful start):
>
>> domain-needed
>> bogus-priv
>> no-resolv
>> no-poll
>> server=127.0.0.1
>> except-interface=lo0
>> bind-interfaces
>> expand-hosts
>> domain=EXAMPLE.org
>
>
> This is the interface configuration, with some information masked:
>
>> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST>  metric 0 mtu 1500
>> 	options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
>> 	ether ...
>> 	inet 192.168.33.14 netmask 0xffffff00 broadcast 192.168.33.255
>> 	inet6 fe80::a00:..ff:fe..:1234%em0 prefixlen 64 scopeid 0x1
>> 	inet6 2001:....:....:....:a00:..ff:fe..:1234 prefixlen 64 autoconf
>> 	inet6 2001:....:....:....:b0e1:f6da:....:.... prefixlen 64 autoconf temporary
>> 	nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
>> 	media: Ethernet autoselect (1000baseT<full-duplex>)
>> 	status: active
>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST>  metric 0 mtu 16384
>> 	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>> 	inet6 ::1 prefixlen 128
>> 	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
>> 	inet 127.0.0.1 netmask 0xff000000
>> 	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>> 	groups: lo
>
>
> This is the error when running things:
>
>> $ LC_ALL=C sudo /var/tmp/usr/ports.svn/dns/dnsmasq-devel/work/stage/usr/local/sbin/dnsmasq -d
>>
>> dnsmasq: failed to create listening socket for 2001:4dd0:ff00:893e:b0e1:f6da:b2d8:9720%em0: No error: 0
>
> Now, this doesn't help at all, so let's go for details and use truss:
>
>> $ LC_ALL=C sudo truss dnsmasq -d 2>&1 | egrep -v close.*ERR#9
>> mmap(0x0,32768,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34366386176 (0x800657000)
>> issetugid(0x800857a60,0x7fffffffefc8,0x40,0x0,0xffff800800858a98,0x0) = 0 (0x0)
>
> ...
>
>> open("/usr/local/etc/dnsmasq.conf",O_RDONLY,0666) = 3 (0x3)
>> fstat(3,{ mode=-rw-r--r-- ,inode=10763926,size=25245,blksize=16384 }) = 0 (0x0)
>> read(3,"# Configuration file for dnsmasq"...,16384) = 16384 (0x4000)
>> read(3,"#dhcp-option-force=208,f1:00:74:"...,16384) = 8861 (0x229d)
>> read(3,0x801c84000,16384)			 = 0 (0x0)
>> close(3)					 = 0 (0x0)
>> madvise(0x801cd2000,0x1000,0x5,0xd1,0x7fffffffce00,0xffffffff) = 0 (0x0)
>> madvise(0x801ccf000,0x1000,0x5,0xce,0x7fffffffce00,0xffffffff) = 0 (0x0)
>> madvise(0x801c84000,0x4000,0x5,0x83,0x7fffffffce00,0xffffffff) = 0 (0x0)
>> open("/dev/null",O_RDWR,0160002740)		 = 3 (0x3)
>> open("/dev/null",O_RDWR,0160002740)		 = 4 (0x4)
>> open("/dev/null",O_RDWR,0160002740)		 = 5 (0x5)
>> close(3)					 = 0 (0x0)
>> close(4)					 = 0 (0x0)
>> close(5)					 = 0 (0x0)
>> clock_gettime(13,{1385861339.000000000 })	 = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 3 (0x3)
>> __sysctl(0x7fffffffd730,0x6,0x0,0x7fffffffd748,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd730,0x6,0x801c80400,0x7fffffffd748,0x0,0x0) = 0 (0x0)
>> socket(PF_INET6,SOCK_DGRAM,0)			 = 4 (0x4)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 5 (0x5)
>> ioctl(5,SIOCGIFINDEX,0xffffd730)		 = 0 (0x0)
>> close(5)					 = 0 (0x0)
>> ioctl(4,SIOCGIFAFLAG_IN6,0xffffd7d0)		 = 0 (0x0)
>> ioctl(4,SIOCGIFALIFETIME_IN6,0xffffd7d0)	 = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x0,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x801c80400,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> ioctl(3,SIOCGIFFLAGS,0xffffd6e0)		 = 0 (0x0)
>> ioctl(3,SIOCGIFMTU,0xffffd6e0)			 = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 5 (0x5)
>> ioctl(5,SIOCGIFINDEX,0xffffd730)		 = 0 (0x0)
>> close(5)					 = 0 (0x0)
>> ioctl(4,SIOCGIFAFLAG_IN6,0xffffd7d0)		 = 0 (0x0)
>> ioctl(4,SIOCGIFALIFETIME_IN6,0xffffd7d0)	 = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x0,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x801cb1400,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> ioctl(3,SIOCGIFFLAGS,0xffffd6e0)		 = 0 (0x0)
>> ioctl(3,SIOCGIFMTU,0xffffd6e0)			 = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 5 (0x5)
>> ioctl(5,SIOCGIFINDEX,0xffffd730)		 = 0 (0x0)
>> close(5)					 = 0 (0x0)
>> ioctl(4,SIOCGIFAFLAG_IN6,0xffffd7d0)		 = 0 (0x0)
>> ioctl(4,SIOCGIFALIFETIME_IN6,0xffffd7d0)	 = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x0,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x801cb1400,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> ioctl(3,SIOCGIFFLAGS,0xffffd6e0)		 = 0 (0x0)
>> ioctl(3,SIOCGIFMTU,0xffffd6e0)			 = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 5 (0x5)
>> ioctl(5,SIOCGIFINDEX,0xffffd730)		 = 0 (0x0)
>> close(5)					 = 0 (0x0)
>> ioctl(4,SIOCGIFAFLAG_IN6,0xffffd7d0)		 = 0 (0x0)
>> ioctl(4,SIOCGIFALIFETIME_IN6,0xffffd7d0)	 = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x0,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x801cb1400,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> ioctl(3,SIOCGIFFLAGS,0xffffd6e0)		 = 0 (0x0)
>> ioctl(3,SIOCGIFMTU,0xffffd6e0)			 = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 5 (0x5)
>> ioctl(5,SIOCGIFINDEX,0xffffd730)		 = 0 (0x0)
>> close(5)					 = 0 (0x0)
>> ioctl(4,SIOCGIFAFLAG_IN6,0xffffd7d0)		 = 0 (0x0)
>> ioctl(4,SIOCGIFALIFETIME_IN6,0xffffd7d0)	 = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x0,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x801cb1400,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> ioctl(3,SIOCGIFFLAGS,0xffffd6e0)		 = 0 (0x0)
>> ioctl(3,SIOCGIFMTU,0xffffd6e0)			 = 0 (0x0)
>> close(4)					 = 0 (0x0)
>> __sysctl(0x7fffffffd730,0x6,0x0,0x7fffffffd748,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd730,0x6,0x801c81400,0x7fffffffd748,0x0,0x0) = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 4 (0x4)
>> ioctl(4,SIOCGIFINDEX,0xffffd730)		 = 0 (0x0)
>> close(4)					 = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x0,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x801c81400,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> ioctl(3,SIOCGIFFLAGS,0xffffd6e0)		 = 0 (0x0)
>> ioctl(3,SIOCGIFMTU,0xffffd6e0)			 = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 4 (0x4)
>> ioctl(4,SIOCGIFINDEX,0xffffd730)		 = 0 (0x0)
>> close(4)					 = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x0,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> __sysctl(0x7fffffffd610,0x6,0x801c81400,0x7fffffffd628,0x0,0x0) = 0 (0x0)
>> ioctl(3,SIOCGIFFLAGS,0xffffd6e0)		 = 0 (0x0)
>> ioctl(3,SIOCGIFMTU,0xffffd6e0)			 = 0 (0x0)
>> close(3)					 = 0 (0x0)
>> socket(PF_INET,SOCK_DGRAM,0)			 = 3 (0x3)
>> setsockopt(0x3,0xffff,0x4,0x7fffffffd8d4,0x4,0xffffffff) = 0 (0x0)
>> fcntl(3,F_GETFL,)				 = 2 (0x2)
>> fcntl(3,F_SETFL,O_NONBLOCK|0x2)			 = 0 (0x0)
>> bind(3,{ AF_INET 192.168.33.14:53 },16)		 = 0 (0x0)
>> socket(PF_INET,SOCK_STREAM,0)			 = 4 (0x4)
>> setsockopt(0x4,0xffff,0x4,0x7fffffffd8d4,0x4,0xffffffff) = 0 (0x0)
>> fcntl(4,F_GETFL,)				 = 2 (0x2)
>> fcntl(4,F_SETFL,O_NONBLOCK|0x2)			 = 0 (0x0)
>> bind(4,{ AF_INET 192.168.33.14:53 },16)		 = 0 (0x0)
>> listen(0x4,0x5,0x10,0x7fffffffd8d4,0x4,0xffffffff) = 0 (0x0)
>> socket(PF_INET6,SOCK_DGRAM,0)			 = 5 (0x5)
>> setsockopt(0x5,0xffff,0x4,0x7fffffffd8d4,0x4,0x0) = 0 (0x0)
>> fcntl(5,F_GETFL,)				 = 2 (0x2)
>> fcntl(5,F_SETFL,O_NONBLOCK|0x2)			 = 0 (0x0)
>> setsockopt(0x5,0x29,0x1b,0x7fffffffd8d4,0x4,0x0) = 0 (0x0)
>> bind(5,{ AF_INET6 [2001:....:....:....:b0e1:f6da:....:....]:53 },28) ERR#49 'Can't assign requested address'
>
> Corresponding to EADDRNOTAVAIL.  Removing the alias causes the same
> issue on the other 2001:... address.  Removing both 2001:... aliases
> lets dnsmasq start up, and lsof -i:53 -n gives this:
>
>> named    889 bind   20u  IPv4 0xfffffe0005d823d0      0t0  TCP 127.0.0.1:domain (LISTEN)
>> named    889 bind  512u  IPv4 0xfffffe0005b590c0      0t0  UDP 127.0.0.1:domain
>> dnsmasq 5727 root    3u  IPv4 0xfffffe0005b59890      0t0  UDP 192.168.33.14:domain
>> dnsmasq 5727 root    4u  IPv4 0xfffffe006fa6eb70      0t0  TCP 192.168.33.14:domain (LISTEN)
>> dnsmasq 5727 root    5u  IPv6 0xfffffe0005b0d2d0      0t0  UDP [fe80::a00:..ff:fe..:1234%em0]:domain
>> dnsmasq 5727 root    6u  IPv6 0xfffffe006f253000      0t0  TCP [fe80::a00:..ff:fe..:1234%em0]:domain (LISTEN)
>
> Not sure what's going on there when it's trying to bind 2001:... addresses.


I think I know. I've seen this before: it occurs when you try and bind() 
a socket to a local address which is still undergoing Duplicate Address 
Detection.

This came up some time ago, and code was added to detect interface 
addresses in this state (by looking at the flags) and defer bind()ing 
them until DAD was complete.

What's confusing is that the code _was_ complete only on Linux: the 
ability to determine if an address is still in DAD state was missing on 
*BSD, and it always assumed that DAD was complete, so I'd expect to see 
the problem you're seeing with older releases. From 2.67, the code is 
included for *BSD, so it should be OK, or at least not attempt to bind() 
at this point.
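
As a rough sketch of the deferral idea (not the real dnsmasq 
implementation), the logic amounts to something like the following. The 
flag name IFACE_TENTATIVE matches the one used in dnsmasq, but its value 
here, the struct addr_entry type, bind_ready_addresses() and the 
always_ok() stub are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Flag name as used in dnsmasq; the numeric value is an assumption. */
#define IFACE_TENTATIVE 1

/* Hypothetical record for one enumerated interface address. */
struct addr_entry {
    const char *text;   /* printable address, for illustration only */
    int flags;          /* IFACE_* bits gathered during enumeration */
    int bound;          /* nonzero once a listener exists */
};

/* Hypothetical stand-in for "create a socket and bind() it". */
typedef int (*bind_fn)(const struct addr_entry *);

/* Stub that always succeeds, used only to exercise the sketch. */
int always_ok(const struct addr_entry *e)
{
    (void)e;
    return 0;
}

/* Bind every address that is neither bound already nor tentative.
   Returns the number of addresses deferred because DAD is still running;
   the caller would re-enumerate and retry those later. */
size_t bind_ready_addresses(struct addr_entry *list, size_t n, bind_fn do_bind)
{
    size_t deferred = 0;
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (list[i].bound)
            continue;
        if (list[i].flags & IFACE_TENTATIVE) {
            deferred++;   /* bind() would likely fail with EADDRNOTAVAIL */
            continue;
        }
        if (do_bind(&list[i]) == 0)
            list[i].bound = 1;
    }
    return deferred;
}
```

The point is only that a tentative address is skipped on this pass and 
picked up on a later one, once the tentative flag has cleared.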


>
> Now, if I remove bind-interfaces, I get:
>
>> dnsmasq: started, version 2.67 cachesize 150
>> dnsmasq: compile time options: IPv6 GNU-getopt no-DBus i18n IDN DHCP DHCPv6 Lua TFTP no-conntrack no-ipset auth
>> dnsmasq: using nameserver 127.0.0.1#53
>> dnsmasq: read /etc/hosts - 5 addresses
>
> and
>
>> COMMAND  PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
>> named    889 bind   20u  IPv4 0xfffffe0005d823d0      0t0  TCP localhost:domain (LISTEN)
>> named    889 bind  512u  IPv4 0xfffffe0005b590c0      0t0  UDP localhost:domain
>> dnsmasq 4664 root    3u  IPv4 0xfffffe0005b59720      0t0  UDP *:domain
>> dnsmasq 4664 root    4u  IPv4 0xfffffe006f9f37a0      0t0  TCP *:domain (LISTEN)
>> dnsmasq 4664 root    5u  IPv6 0xfffffe0005b596f0      0t0  UDP *:domain
>> dnsmasq 4664 root    6u  IPv6 0xfffffe006fa6f7a0      0t0  TCP *:domain (LISTEN)
>
> Do you need anything else to debug this?
>


Yes,

I'd be interested to know the status of those addresses, and Duplicate 
Address Detection, since that's the best explanation I have of the 
source of that error return. Is it possible to turn DAD off, and see if 
that fixes things?

I'm also interested in what's happening in bpf.c in the interface 
enumeration code:

          ifr6.ifr_addr = *((struct sockaddr_in6 *) addrs->ifa_addr);
          if (fd != -1 && ioctl(fd, SIOCGIFAFLAG_IN6, &ifr6) != -1)
            {
              if (ifr6.ifr_ifru.ifru_flags6 & IN6_IFF_TENTATIVE)
                flags |= IFACE_TENTATIVE;

This should set IFACE_TENTATIVE in flags if the address is still 
undergoing DAD.


I'll commit a fix for the simple errno problem, to make things easier.


Cheers,

Simon.


