[Dnsmasq-discuss] dnsmasq v2.86?

Simon Kelley simon at thekelleys.org.uk
Thu Aug 12 09:06:31 UTC 2021


This is useful information, but the backtraces are puzzling: the code
isn't in tight loop, certainly.


I wonder if the v4only.arpa thing is not a coincidence?

Some things to try, please.

1) When the dnsmasq process is faulted, run

strace -p <pid of dnsmasq process>

2) Try doing a query on ipv4only.arpa to dnsmasq directly, just in case
that's the whole trigger.

3) Same but straight to stubby

dig @127.0.0.1 -p 5453 ipv4only.arpa


4) Take a look in /usr/share/dnsmasq/rfc6761.conf

Is .arpa mentioned in there?

Cheers,

Simon.



On 11/08/2021 15:50, Andre Heider wrote:
> On 11/08/2021 16:19, Andre Heider wrote:
>> Hi there,
>>
>> it seems I can trigger this semi-reliably while powering up 2 KVM
>> windows instances in parallel.
>>
>> This is the tail of log-queries:
>> dnsmasq[6591]: query[A] foo.internal from 192.168.0.foo
>> dnsmasq[6591]: DHCP foo.internal is 192.168.0.foo
>> dnsmasq[6591]: query[AAAA] foo.internal from 192.168.0.foo
>> dnsmasq[6591]: config foo.internal is NODATA-IPv6
>> dnsmasq[6591]: query[A] ipv4only.arpa from 192.168.0.foo
>>
>> And this is where dnsmasq stop replying altogether and hogs the cpu.
>> Note that foo.internal is the kvm host with the ip 192.168.0.foo.
>>
>> Using gdbserver doesn't yield too much info, looks like because of lto
>> maybe (which OpenWrt does for dnsmasq)?
>> (gdb) info threads
>>    Id   Target Id                  Frame
>> * 1    Thread 6591.6591 "dnsmasq" 0x555715f1 in forward_query.lto_priv ()
>> (gdb) bt full
>> #0  0x555715f1 in forward_query.lto_priv ()
>> No symbol table info available.
>> Backtrace stopped: previous frame identical to this frame (corrupt
>> stack?)
> 
> Without LTO, another hang with the last log-queries 'dnsmasq[6842]:
> query[A] ipv4only.arpa from 192.168.0.foo' again:
> 
> #0  0x55568969 in forward_query (udpfd=<optimized out>,
> udpaddr=0x7fffbab8, dst_addr=0x7fffbaa8, dst_iface=15,
> header=0x5561c080, plen=31, limit=0x5561c280 "", now=958007,
> forward=0x5563e8e0, ad_reqd=0, do_bit=0) at forward.c:319
>         is_sign = 1432469663
>         pheader = 0x0
>         flags = 0
>         fwd_flags = <optimized out>
>         is_dnssec = 0
>         master = <optimized out>
>         gotname = 128
>         hash = 0x55624a10
>         oph = 0x0
>         old_src = <optimized out>
>         first = 1432469663
>         last = 12
>         start = 0
>         subnet = 0
>         cacheable = 1431977952
>         forwarded = 0
>         edns0_len = 1431725623
>         pheader = 0x80088 <error: Cannot access memory at address 0x80088>
>         ede = -1
> #1  0x5556947d in receive_query (listen=0x5561f5c0, now=958007) at
> forward.c:1640
>         ad_reqd = 0
>         header = 0x5561c080
>         source_addr = {sa = {sa_family = 2, sa_data =
> "\306d\300\250\000(\000\000\000\000\000\000\000"}, in = {sin_family = 2,
> sin_port = 50788, sin_addr = {s_addr = 3232235560}, sin_zero =
> "\000\000\000\000\000\000\000"}, in6 = {sin6_family = 2, sin6_port =
> 50788, sin6_flowinfo = 3232235560, sin6_addr = {
>               __in6_union = {__s6_addr =
> "\000\000\000\000\000\000\000\000\000\000\001Ww\347", <incomplete
> sequence \360>, __s6_addr16 = {0, 0, 0, 0, 0, 343, 30695, 3824},
> __s6_addr32 = {0, 0, 343, 2011631344}}}, sin6_scope_id = 8162}}
>         pheader = 0x7fffbb68 "UZ?\340w\373+x"
>         type = 1
>         udp_size = <optimized out>
>         dst_addr = {addr4 = {s_addr = 3232235521}, addr6 = {__in6_union
> = {__s6_addr =
> "\300\250\000\001a\023\340\345\000\004\024\231w\354\066\240",
> __s6_addr16 = {49320, 1, 24851, 57573, 4, 5273, 30700, 13984},
> __s6_addr32 = {3232235521, 1628692709, 267417, 2011969184}}}, cname =
> {target = {cache = 0xc0a80001,
>               name = 0xc0a80001 <error: Cannot access memory at address
> 0xc0a80001>}, uid = 1628692709, is_name_ptr = 267417}, key = {keydata =
> 0xc0a80001, keylen = 24851, flags = 57573, keytag = 4, algo = 20
> '\024'}, ds = {keydata = 0xc0a80001, keylen = 24851, keytag = 57573,
> algo = 0 '\000', digest = 4 '\004'},
>           srv = {target = 0xc0a80001, targetlen = 24851, srvport =
> 57573, priority = 4, weight = 5273}, log = {keytag = 49320, algo = 1,
> digest = 24851, rcode = 57573, ede = 267417}}
>         netmask = {s_addr = <optimized out>}
>         dst_addr_4 = {s_addr = 3232235521}
>         m = 0
>         n = 31
>         if_index = 15
>         auth_dns = 0
>         do_bit = 0
>         have_pseudoheader = 0
>         mark = 0
>         have_mark = 0
>         is_single_query = <optimized out>
>         allowed = <optimized out>
>         local_auth = <optimized out>
>         iov = {{iov_base = 0x5561c080, iov_len = 1280}}
>         msg = {msg_name = 0x7fffbab8, msg_namelen = 16, msg_iov =
> 0x7fffbaa0, msg_iovlen = 1, msg_control = 0x7fffbad4, msg_controllen =
> 24, msg_flags = 0}
>         cmptr = <optimized out>
>         control_u = {align = {cmsg_len = 24, cmsg_level = 0, cmsg_type =
> 8}, control6 =
> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001\000\000\000\030\000\000\000",
> 
>           control =
> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001"}
> 
>         family = <optimized out>
>         check_dst = <optimized out>
> #2  0x5556c853 in check_dns_listeners (now=<optimized out>) at
> dnsmasq.c:1802
>         serverfdp = <optimized out>
>         listener = 0x5561f5c0
>         rfl = <optimized out>
>         i = <optimized out>
>         pipefd = {1431977952, 2012949368}
> #3  0x555570f9 in main (argc=<optimized out>, argv=<optimized out>) at
> dnsmasq.c:1229
>         timeout = <optimized out>
>         bind_fallback = 0
>         now = 958007
>         sigact = {__sa_handler = {sa_handler = 0x1, sa_sigaction = 0x1},
> sa_mask = {__bits = {0, 0, 0, 0, 0, 2013266592, 17788, 2013238320,
> 2013238324, 2013245196, 4, 0, 80, 0, 2012700248, 2012708864, 2013167616,
> 2013233152, 2013237244, 0, 691091872, 2013233152, 4, 2147466580,
> 1431657089, 2013091140, 2013086164,
>               2013165476, 0, 0, 2013095836, 15630}}, sa_flags = 0,
> sa_restorer = 0x22358}
>         if_tmp = <optimized out>
>         piperead = 24
>         pipefd = {24, 25}
>         err_pipe = {0, -1}
>         ent_pw = <optimized out>
>         script_uid = 0
>         script_gid = 0
>         gp = <optimized out>
>         i = <optimized out>
>         max_fd = 1024
>         baduser = <optimized out>
>         log_err = 0
>         chown_warn = 0
>         hdr = 0x55634fa0
>         data = <optimized out>
>         need_cap_net_admin = <optimized out>
>         need_cap_net_raw = <optimized out>
>         need_cap_net_bind_service = <optimized out>
>         bound_device = 0x0
>         did_bind = 0
>         serv = <optimized out>
>         netlink_warn = 0x0
> 
> Continuing and breaking again about 30 times mostly yields line 319 as
> above (which is 'difftime(now, master->forwardtime) > FORWARD_TIME ||')
> 
> A few times it hits 341 though ('while (forward->blocking_query)'):
> 
> #0  0x5556896f in forward_query (udpfd=<optimized out>,
> udpaddr=0x7fffbab8, dst_addr=0x7fffbaa8, dst_iface=15,
> header=0x5561c080, plen=31, limit=0x5561c280 "", now=958007,
> forward=0x5563e8e0, ad_reqd=0, do_bit=0) at forward.c:341
>         is_sign = 1432469663
>         pheader = 0x0
>         flags = 0
>         fwd_flags = <optimized out>
>         is_dnssec = 0
>         master = <optimized out>
>         gotname = 128
>         hash = 0x55624a10
>         oph = 0x0
>         old_src = <optimized out>
>         first = 1432469663
>         last = 12
>         start = 0
>         subnet = 0
>         cacheable = 1431977952
>         forwarded = 0
>         edns0_len = 1431725623
>         pheader = 0x80088 <error: Cannot access memory at address 0x80088>
>         ede = -1
> #1  0x5556947d in receive_query (listen=0x5561f5c0, now=958007) at
> forward.c:1640
>         ad_reqd = 0
>         header = 0x5561c080
>         source_addr = {sa = {sa_family = 2, sa_data =
> "\306d\300\250\000(\000\000\000\000\000\000\000"}, in = {sin_family = 2,
> sin_port = 50788, sin_addr = {s_addr = 3232235560}, sin_zero =
> "\000\000\000\000\000\000\000"}, in6 = {sin6_family = 2, sin6_port =
> 50788, sin6_flowinfo = 3232235560, sin6_addr = {
>               __in6_union = {__s6_addr =
> "\000\000\000\000\000\000\000\000\000\000\001Ww\347", <incomplete
> sequence \360>, __s6_addr16 = {0, 0, 0, 0, 0, 343, 30695, 3824},
> __s6_addr32 = {0, 0, 343, 2011631344}}}, sin6_scope_id = 8162}}
>         pheader = 0x7fffbb68 "UZ?\340w\373+x"
>         type = 1
>         udp_size = <optimized out>
>         dst_addr = {addr4 = {s_addr = 3232235521}, addr6 = {__in6_union
> = {__s6_addr =
> "\300\250\000\001a\023\340\345\000\004\024\231w\354\066\240",
> __s6_addr16 = {49320, 1, 24851, 57573, 4, 5273, 30700, 13984},
> __s6_addr32 = {3232235521, 1628692709, 267417, 2011969184}}}, cname =
> {target = {cache = 0xc0a80001,
>               name = 0xc0a80001 <error: Cannot access memory at address
> 0xc0a80001>}, uid = 1628692709, is_name_ptr = 267417}, key = {keydata =
> 0xc0a80001, keylen = 24851, flags = 57573, keytag = 4, algo = 20
> '\024'}, ds = {keydata = 0xc0a80001, keylen = 24851, keytag = 57573,
> algo = 0 '\000', digest = 4 '\004'},
>           srv = {target = 0xc0a80001, targetlen = 24851, srvport =
> 57573, priority = 4, weight = 5273}, log = {keytag = 49320, algo = 1,
> digest = 24851, rcode = 57573, ede = 267417}}
>         netmask = {s_addr = <optimized out>}
>         dst_addr_4 = {s_addr = 3232235521}
>         m = 0
>         n = 31
>         if_index = 15
>         auth_dns = 0
>         do_bit = 0
>         have_pseudoheader = 0
>         mark = 0
>         have_mark = 0
>         is_single_query = <optimized out>
>         allowed = <optimized out>
>         local_auth = <optimized out>
>         iov = {{iov_base = 0x5561c080, iov_len = 1280}}
>         msg = {msg_name = 0x7fffbab8, msg_namelen = 16, msg_iov =
> 0x7fffbaa0, msg_iovlen = 1, msg_control = 0x7fffbad4, msg_controllen =
> 24, msg_flags = 0}
>         cmptr = <optimized out>
>         control_u = {align = {cmsg_len = 24, cmsg_level = 0, cmsg_type =
> 8}, control6 =
> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001\000\000\000\030\000\000\000",
> 
>           control =
> "\000\000\000\030\000\000\000\000\000\000\000\b\000\000\000\017\300\250\000\001\300\250\000\001"}
> 
>         family = <optimized out>
>         check_dst = <optimized out>
> #2  0x5556c853 in check_dns_listeners (now=<optimized out>) at
> dnsmasq.c:1802
>         serverfdp = <optimized out>
>         listener = 0x5561f5c0
>         rfl = <optimized out>
>         i = <optimized out>
>         pipefd = {1431977952, 2012949368}
> 
>>
>> Does that help, maybe?
>>
>> Regards,
>> Andre
>>
>> On 11/08/2021 08:36, Andre Heider wrote:
>>> On 10/08/2021 23:11, Simon Kelley wrote:
>>>> On 10/08/2021 14:53, Dominik wrote:
>>>>> Hey Simon,
>>>>>
>>>>> various dnsmasq-2.86 test tags are around and it doesn't look like
>>>>> there are any intermediate bugs around. The Pi-hole beta seems to have
>>>>> attracted at least a few couple of dozen additional testers and
>>>>> nothing
>>>>> seems to have come up here, either.
>>>>>
>>>>> How do you feel about tagging a v2.86 release?
>>>>>
>>>>> Best regards,
>>>>> Dominik
>>>>>
>>>>>
>>>>
>>>> I've been accumulating patches for a couple of weeks, and started
>>>> working through those tonight. None are very intrusive or
>>>> controversial.
>>>> Once those are done, I'll tag rc1.
>>>
>>> I'm using 2.86test6 on OpenWrt, and I think I've found a bug.
>>> Detail's are vague so far but ever since I've started DoT with stubby
>>> as upstream server, dnsmasq every now and then gets into a mode where
>>> it stops responding to request completely and just sits there using
>>> 100% cpu power on one core.
>>> I haven't found a way yet to trigger that reliably, but it feels like
>>> timing/parallel requests.
>>>
>>> Does that ring any bells?
>>>
>>> Any hints on how to get more relevant info on the situation?
>>> Sometimes it takes over a day, so verbose logs seem to be out of the
>>> question (since it's a small router). How to I get a backtrace
>>> without gdb?
>>>
>>> config created by OpenWrt:
>>> conf-file=/etc/dnsmasq.conf
>>> dhcp-authoritative
>>> domain-needed
>>> no-resolv
>>> localise-queries
>>> enable-ubus=dnsmasq
>>> expand-hosts
>>> bind-dynamic
>>> local-service
>>> quiet-dhcp
>>> edns-packet-max=1280
>>> domain=internal
>>> local=/internal/
>>> server=127.0.0.1#5453
>>> addn-hosts=/tmp/hosts
>>> dhcp-leasefile=/tmp/dhcp.leases
>>> stop-dns-rebind
>>> conf-file=/usr/share/dnsmasq/trust-anchors.conf
>>> dnssec
>>> dhcp-broadcast=tag:needs-broadcast
>>> conf-dir=/tmp/dnsmasq.d
>>> user=dnsmasq
>>> group=dnsmasq
>>> dhcp-ignore-names=tag:dhcp_bogus_hostname
>>> conf-file=/usr/share/dnsmasq/dhcpbogushostname.conf
>>> bogus-priv
>>> conf-file=/usr/share/dnsmasq/rfc6761.conf
>>> no-dhcp-interface=pppoe-wan
>>> dhcp-range=set:lan,192.168.0.200,192.168.0.254,255.255.255.0,12h
>>>
>>> stubby config created by OpenWrt:
>>> resolution_type: GETDNS_RESOLUTION_STUB
>>> round_robin_upstreams: 1
>>> appdata_dir: "/var/lib/stubby"
>>> trust_anchors_backoff_time: 2500
>>> tls_authentication: GETDNS_AUTHENTICATION_REQUIRED
>>> tls_query_padding_blocksize: 128
>>> edns_client_subnet_private: 1
>>> idle_timeout: 10000
>>> listen_addresses:
>>>    - 127.0.0.1 at 5453
>>> dns_transport_list:
>>>    - GETDNS_TRANSPORT_TLS
>>> upstream_recursive_servers:
>>>    - address_data: 5.1.66.255
>>>      tls_auth_name: "dot.ffmuc.net"
>>>    - address_data: 185.150.99.255
>>>      tls_auth_name: "dot.ffmuc.net"
>>>    - address_data: 185.95.218.42
>>>      tls_auth_name: "dns.digitale-gesellschaft.ch"
>>>    - address_data: 185.95.218.43
>>>      tls_auth_name: "dns.digitale-gesellschaft.ch"
>>>
>>
> 
> 



More information about the Dnsmasq-discuss mailing list