[Dnsmasq-discuss] dnsmasq v2.86?

Andre Heider a.heider at gmail.com
Thu Aug 12 10:31:24 UTC 2021


On 12/08/2021 11:06, Simon Kelley wrote:
> This is useful information, but the backtraces are puzzling: the code
> isn't in tight loop, certainly.
> 
> 
> I wonder if the v4only.arpa thing is not a coincidence?
> 
> Some things to try, please.
> 
> 1) When the dnsmasq process is faulted, run
> 
> strace -p <pid of dnsmasq process>

Will do, so far I haven't been able to reproduce the issue today...

> 2) Try doing a query on ipv4only.arpa to dnsmasq directly, just in case
> that's the whole trigger.
> 
> 3) Same but straight to stubby
> 
> dig @127.0.0.1 -p 5453 ipv4only.arpa

Both work just fine, so maybe that was just a red herring.

> 
> 
> 4) Take a look in /usr/share/dnsmasq/rfc6761.conf
> 
> Is .arpa mentioned in there?

Nope, it's the default file as Kevin pasted

> 
> Cheers,
> 
> Simon.
> 
> 
> 
> On 11/08/2021 15:50, Andre Heider wrote:
>> On 11/08/2021 16:19, Andre Heider wrote:
>>> Hi there,
>>>
>>> it seems I can trigger this semi-reliably while powering up 2 KVM
>>> windows instances in parallel.
>>>
>>> This is the tail of log-queries:
>>> dnsmasq[6591]: query[A] foo.internal from 192.168.0.foo
>>> dnsmasq[6591]: DHCP foo.internal is 192.168.0.foo
>>> dnsmasq[6591]: query[AAAA] foo.internal from 192.168.0.foo
>>> dnsmasq[6591]: config foo.internal is NODATA-IPv6
>>> dnsmasq[6591]: query[A] ipv4only.arpa from 192.168.0.foo
>>>
>>> And this is where dnsmasq stop replying altogether and hogs the cpu.
>>> Note that foo.internal is the kvm host with the ip 192.168.0.foo.
>>>
>>> Using gdbserver doesn't yield too much info, looks like because of lto
>>> maybe (which OpenWrt does for dnsmasq)?
>>> (gdb) info threads
>>>     Id   Target Id                  Frame
>>> * 1    Thread 6591.6591 "dnsmasq" 0x555715f1 in forward_query.lto_priv ()
>>> (gdb) bt full
>>> #0  0x555715f1 in forward_query.lto_priv ()
>>> No symbol table info available.
>>> Backtrace stopped: previous frame identical to this frame (corrupt
>>> stack?)
>>
>> Without LTO, another hang with the last log-queries 'dnsmasq[6842]:
>> query[A] ipv4only.arpa from 192.168.0.foo' again:
>>
>> #0  0x55568969 in forward_query (udpfd=<optimized out>,
>> udpaddr=0x7fffbab8, dst_addr=0x7fffbaa8, dst_iface=15,
>> header=0x5561c080, plen=31, limit=0x5561c280 "", now=958007,
>> forward=0x5563e8e0, ad_reqd=0, do_bit=0) at forward.c:319
>>          is_sign = 1432469663
>>          pheader = 0x0
>>          flags = 0
>>          fwd_flags = <optimized out>
>>          is_dnssec = 0
>>          master = <optimized out>
>>          gotname = 128
>>          hash = 0x55624a10
>>          oph = 0x0
>>          old_src = <optimized out>
>>          first = 1432469663
>>          last = 12
>>          start = 0
>>          subnet = 0
>>          cacheable = 1431977952
>>          forwarded = 0
>>          edns0_len = 1431725623
>>          pheader = 0x80088 <error: Cannot access memory at address 0x80088>
>>          ede = -1
>> #1  0x5556947d in receive_query (listen=0x5561f5c0, now=958007) at
>> forward.c:1640
>>          ad_reqd = 0
>>          header = 0x5561c080
>>          source_addr = {sa = {sa_family = 2, sa_data =
>> "\306d\300\250\000(\000\000\000\000\000\000\000"}, in = {sin_family = 2,
>> sin_port = 50788, sin_addr = {s_addr = 3232235560}, sin_zero =
>> "\000\000\000\000\000\000\000"}, in6 = {sin6_family = 2, sin6_port =
>> 50788, sin6_flowinfo = 3232235560, sin6_addr = {
>>                __in6_union = {__s6_addr =
>> "\000\000\000\000\000\000\000\000\000\000\001Ww\347", <incomplete
>> sequence \360>, __s6_addr16 = {0, 0, 0, 0, 0, 343, 30695, 3824},
>> __s6_addr32 = {0, 0, 343, 2011631344}}}, sin6_scope_id = 8162}}
>>          pheader = 0x7fffbb68 "UZ?\340w\373+x"
>>          type = 1
>>          udp_size = <optimized out>
>>          dst_addr = {addr4 = {s_addr = 3232235521}, addr6 = {__in6_union
>> = {__s6_addr =
>> "\300\250\000\001a\023\340\345\000\004\024\231w\354\066\240",
>> __s6_addr16 = {49320, 1, 24851, 57573, 4, 5273, 30700, 13984},
>> __s6_addr32 = {3232235521, 1628692709, 267417, 2011969184}}}, cname =
>> {target = {cache = 0xc0a80001,
>>                name = 0xc0a80001 <error: Cannot access memory at address
>> 0xc0a80001>}, uid = 1628692709, is_name_ptr = 267417}, key = {keydata =
>> 0xc0a80001, keylen = 24851, flags = 57573, keytag = 4, algo = 20
>> '\024'}, ds = {keydata = 0xc0a80001, keylen = 24851, keytag = 57573,
>> algo = 0 '\000', digest = 4 '\004'},
>>            srv = {target = 0xc0a80001, targetlen = 24851, srvport =
>> 57573, priority = 4, weight = 5273}, log = {keytag = 49320, algo = 1,
>> digest = 24851, rcode = 57573, ede = 267417}}
>>          netmask = {s_addr = <optimized out>}
>>          dst_addr_4 = {s_addr = 3232235521}
>>          m = 0
>>          n = 31
>>          if_index = 15
>>          auth_dns = 0
>>          do_bit = 0
>>          have_pseudoheader = 0
>>          mark = 0
>>          have_mark = 0
>>          is_single_query = <optimized out>
>>          allowed = <optimized out>
>>          local_auth = <optimized out>
>>          iov = {{iov_base = 0x5561c080, iov_len = 1280}}
>>          msg = {msg_name = 0x7fffbab8, msg_namelen = 16, msg_iov =
>> 0x7fffbaa0, msg_iovlen = 1, msg_control = 0x7fffbad4, msg_controllen =
>> 24, msg_flags = 0}
>>          cmptr = <optimized out>
>>          control_u = {align = {cmsg_len = 24, cmsg_level = 0, cmsg_type =
>> 8}, control6 =
>> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001\000\000\000\030\000\000\000",
>>
>>            control =
>> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001"}
>>
>>          family = <optimized out>
>>          check_dst = <optimized out>
>> #2  0x5556c853 in check_dns_listeners (now=<optimized out>) at
>> dnsmasq.c:1802
>>          serverfdp = <optimized out>
>>          listener = 0x5561f5c0
>>          rfl = <optimized out>
>>          i = <optimized out>
>>          pipefd = {1431977952, 2012949368}
>> #3  0x555570f9 in main (argc=<optimized out>, argv=<optimized out>) at
>> dnsmasq.c:1229
>>          timeout = <optimized out>
>>          bind_fallback = 0
>>          now = 958007
>>          sigact = {__sa_handler = {sa_handler = 0x1, sa_sigaction = 0x1},
>> sa_mask = {__bits = {0, 0, 0, 0, 0, 2013266592, 17788, 2013238320,
>> 2013238324, 2013245196, 4, 0, 80, 0, 2012700248, 2012708864, 2013167616,
>> 2013233152, 2013237244, 0, 691091872, 2013233152, 4, 2147466580,
>> 1431657089, 2013091140, 2013086164,
>>                2013165476, 0, 0, 2013095836, 15630}}, sa_flags = 0,
>> sa_restorer = 0x22358}
>>          if_tmp = <optimized out>
>>          piperead = 24
>>          pipefd = {24, 25}
>>          err_pipe = {0, -1}
>>          ent_pw = <optimized out>
>>          script_uid = 0
>>          script_gid = 0
>>          gp = <optimized out>
>>          i = <optimized out>
>>          max_fd = 1024
>>          baduser = <optimized out>
>>          log_err = 0
>>          chown_warn = 0
>>          hdr = 0x55634fa0
>>          data = <optimized out>
>>          need_cap_net_admin = <optimized out>
>>          need_cap_net_raw = <optimized out>
>>          need_cap_net_bind_service = <optimized out>
>>          bound_device = 0x0
>>          did_bind = 0
>>          serv = <optimized out>
>>          netlink_warn = 0x0
>>
>> Continuing and breaking again about 30 times mostly yields line 319 as
>> above (which is 'difftime(now, master->forwardtime) > FORWARD_TIME ||')
>>
>> A few times it hits 341 though ('while (forward->blocking_query)'):
>>
>> #0  0x5556896f in forward_query (udpfd=<optimized out>,
>> udpaddr=0x7fffbab8, dst_addr=0x7fffbaa8, dst_iface=15,
>> header=0x5561c080, plen=31, limit=0x5561c280 "", now=958007,
>> forward=0x5563e8e0, ad_reqd=0, do_bit=0) at forward.c:341
>>          is_sign = 1432469663
>>          pheader = 0x0
>>          flags = 0
>>          fwd_flags = <optimized out>
>>          is_dnssec = 0
>>          master = <optimized out>
>>          gotname = 128
>>          hash = 0x55624a10
>>          oph = 0x0
>>          old_src = <optimized out>
>>          first = 1432469663
>>          last = 12
>>          start = 0
>>          subnet = 0
>>          cacheable = 1431977952
>>          forwarded = 0
>>          edns0_len = 1431725623
>>          pheader = 0x80088 <error: Cannot access memory at address 0x80088>
>>          ede = -1
>> #1  0x5556947d in receive_query (listen=0x5561f5c0, now=958007) at
>> forward.c:1640
>>          ad_reqd = 0
>>          header = 0x5561c080
>>          source_addr = {sa = {sa_family = 2, sa_data =
>> "\306d\300\250\000(\000\000\000\000\000\000\000"}, in = {sin_family = 2,
>> sin_port = 50788, sin_addr = {s_addr = 3232235560}, sin_zero =
>> "\000\000\000\000\000\000\000"}, in6 = {sin6_family = 2, sin6_port =
>> 50788, sin6_flowinfo = 3232235560, sin6_addr = {
>>                __in6_union = {__s6_addr =
>> "\000\000\000\000\000\000\000\000\000\000\001Ww\347", <incomplete
>> sequence \360>, __s6_addr16 = {0, 0, 0, 0, 0, 343, 30695, 3824},
>> __s6_addr32 = {0, 0, 343, 2011631344}}}, sin6_scope_id = 8162}}
>>          pheader = 0x7fffbb68 "UZ?\340w\373+x"
>>          type = 1
>>          udp_size = <optimized out>
>>          dst_addr = {addr4 = {s_addr = 3232235521}, addr6 = {__in6_union
>> = {__s6_addr =
>> "\300\250\000\001a\023\340\345\000\004\024\231w\354\066\240",
>> __s6_addr16 = {49320, 1, 24851, 57573, 4, 5273, 30700, 13984},
>> __s6_addr32 = {3232235521, 1628692709, 267417, 2011969184}}}, cname =
>> {target = {cache = 0xc0a80001,
>>                name = 0xc0a80001 <error: Cannot access memory at address
>> 0xc0a80001>}, uid = 1628692709, is_name_ptr = 267417}, key = {keydata =
>> 0xc0a80001, keylen = 24851, flags = 57573, keytag = 4, algo = 20
>> '\024'}, ds = {keydata = 0xc0a80001, keylen = 24851, keytag = 57573,
>> algo = 0 '\000', digest = 4 '\004'},
>>            srv = {target = 0xc0a80001, targetlen = 24851, srvport =
>> 57573, priority = 4, weight = 5273}, log = {keytag = 49320, algo = 1,
>> digest = 24851, rcode = 57573, ede = 267417}}
>>          netmask = {s_addr = <optimized out>}
>>          dst_addr_4 = {s_addr = 3232235521}
>>          m = 0
>>          n = 31
>>          if_index = 15
>>          auth_dns = 0
>>          do_bit = 0
>>          have_pseudoheader = 0
>>          mark = 0
>>          have_mark = 0
>>          is_single_query = <optimized out>
>>          allowed = <optimized out>
>>          local_auth = <optimized out>
>>          iov = {{iov_base = 0x5561c080, iov_len = 1280}}
>>          msg = {msg_name = 0x7fffbab8, msg_namelen = 16, msg_iov =
>> 0x7fffbaa0, msg_iovlen = 1, msg_control = 0x7fffbad4, msg_controllen =
>> 24, msg_flags = 0}
>>          cmptr = <optimized out>
>>          control_u = {align = {cmsg_len = 24, cmsg_level = 0, cmsg_type =
>> 8}, control6 =
>> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001\000\000\000\030\000\000\000",
>>
>>            control =
>> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001"}
>>
>>          family = <optimized out>
>>          check_dst = <optimized out>
>> #2  0x5556c853 in check_dns_listeners (now=<optimized out>) at
>> dnsmasq.c:1802
>>          serverfdp = <optimized out>
>>          listener = 0x5561f5c0
>>          rfl = <optimized out>
>>          i = <optimized out>
>>          pipefd = {1431977952, 2012949368}
>>
>>>
>>> Does that help, maybe?
>>>
>>> Regards,
>>> Andre
>>>
>>> On 11/08/2021 08:36, Andre Heider wrote:
>>>> On 10/08/2021 23:11, Simon Kelley wrote:
>>>>> On 10/08/2021 14:53, Dominik wrote:
>>>>>> Hey Simon,
>>>>>>
>>>>>> various dnsmasq-2.86 test tags are around and it doesn't look like
>>>>>> there are any intermediate bugs around. The Pi-hole beta seems to have
>>>>>> attracted at least a few couple of dozen additional testers and
>>>>>> nothing
>>>>>> seems to have come up here, either.
>>>>>>
>>>>>> How do you feel about tagging a v2.86 release?
>>>>>>
>>>>>> Best regards,
>>>>>> Dominik
>>>>>>
>>>>>>
>>>>>
>>>>> I've been accumulating patches for a couple of weeks, and started
>>>>> working through those tonight. None are very intrusive or
>>>>> controversial.
>>>>> Once those are done, I'll tag rc1.
>>>>
>>>> I'm using 2.86test6 on OpenWrt, and I think I've found a bug.
>>>> Detail's are vague so far but ever since I've started DoT with stubby
>>>> as upstream server, dnsmasq every now and then gets into a mode where
>>>> it stops responding to request completely and just sits there using
>>>> 100% cpu power on one core.
>>>> I haven't found a way yet to trigger that reliably, but it feels like
>>>> timing/parallel requests.
>>>>
>>>> Does that ring any bells?
>>>>
>>>> Any hints on how to get more relevant info on the situation?
>>>> Sometimes it takes over a day, so verbose logs seem to be out of the
>>>> question (since it's a small router). How to I get a backtrace
>>>> without gdb?
>>>>
>>>> config created by OpenWrt:
>>>> conf-file=/etc/dnsmasq.conf
>>>> dhcp-authoritative
>>>> domain-needed
>>>> no-resolv
>>>> localise-queries
>>>> enable-ubus=dnsmasq
>>>> expand-hosts
>>>> bind-dynamic
>>>> local-service
>>>> quiet-dhcp
>>>> edns-packet-max=1280
>>>> domain=internal
>>>> local=/internal/
>>>> server=127.0.0.1#5453
>>>> addn-hosts=/tmp/hosts
>>>> dhcp-leasefile=/tmp/dhcp.leases
>>>> stop-dns-rebind
>>>> conf-file=/usr/share/dnsmasq/trust-anchors.conf
>>>> dnssec
>>>> dhcp-broadcast=tag:needs-broadcast
>>>> conf-dir=/tmp/dnsmasq.d
>>>> user=dnsmasq
>>>> group=dnsmasq
>>>> dhcp-ignore-names=tag:dhcp_bogus_hostname
>>>> conf-file=/usr/share/dnsmasq/dhcpbogushostname.conf
>>>> bogus-priv
>>>> conf-file=/usr/share/dnsmasq/rfc6761.conf
>>>> no-dhcp-interface=pppoe-wan
>>>> dhcp-range=set:lan,192.168.0.200,192.168.0.254,255.255.255.0,12h
>>>>
>>>> stubby config created by OpenWrt:
>>>> resolution_type: GETDNS_RESOLUTION_STUB
>>>> round_robin_upstreams: 1
>>>> appdata_dir: "/var/lib/stubby"
>>>> trust_anchors_backoff_time: 2500
>>>> tls_authentication: GETDNS_AUTHENTICATION_REQUIRED
>>>> tls_query_padding_blocksize: 128
>>>> edns_client_subnet_private: 1
>>>> idle_timeout: 10000
>>>> listen_addresses:
>>>>     - 127.0.0.1 at 5453
>>>> dns_transport_list:
>>>>     - GETDNS_TRANSPORT_TLS
>>>> upstream_recursive_servers:
>>>>     - address_data: 5.1.66.255
>>>>       tls_auth_name: "dot.ffmuc.net"
>>>>     - address_data: 185.150.99.255
>>>>       tls_auth_name: "dot.ffmuc.net"
>>>>     - address_data: 185.95.218.42
>>>>       tls_auth_name: "dns.digitale-gesellschaft.ch"
>>>>     - address_data: 185.95.218.43
>>>>       tls_auth_name: "dns.digitale-gesellschaft.ch"
>>>>
>>>
>>
>>




More information about the Dnsmasq-discuss mailing list