[Dnsmasq-discuss] dnsmasq 2.86 crash
Simon Kelley
simon at thekelleys.org.uk
Tue Oct 19 14:51:28 UTC 2021
On 15/10/2021 04:45, Eloy Paris wrote:
> Hi Simon,
>
> I am running 2.87test4, which has the commit you mention below.
>
> I've tested bringing down and up the external interfaces of the machine
> (the ones that dnsmasq uses to reach the recursive DNS servers to
> fulfill DNS requests it receives) and have not been able to reproduce a
> crash anymore.
>
> However, after bringing down an interface and a few seconds later
> bringing it back up, DNS resolution stops working.
>
> I see this in the system log right after I re-enable the interface that
> I previously disabled:
>
> Oct 14 23:28:14 chapilu dnsmasq[79367]: reading /etc/resolv.conf
>
> and the packet capture shows:
>
> 23:29:24.737039 IP 192.168.122.165.60261 > 192.168.122.1.53: 18552+ A? google.com. (28)
> 23:29:24.737126 IP 192.168.122.1.53 > 192.168.122.165.60261: 18552 Refused 0/0/0 (28)
>
> Under what conditions does dnsmasq respond to a resolution request with
> REFUSED; no servers in /etc/resolv.conf?
>
> I guess there might be a race condition here because I just sent SIGHUP
> to the dnsmasq process and the system log shows this:
>
As Petr says, dnsmasq sends the REFUSED reply when it has no suitable
servers to forward a query to.
Since the previous crash was triggered by there being no configured
servers, this is expected. Reading /etc/resolv.conf is inherently racy,
but the code should keep retrying when it finds an empty
/etc/resolv.conf, to mitigate this.
You're not setting --no-poll, are you?
Simon.
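
To illustrate the behaviour described above (keep re-reading an empty
/etc/resolv.conf, and answer REFUSED while no upstream server is known),
here is a minimal Python sketch. It is not dnsmasq's code: the function
names parse_nameservers, reload_servers and refused_reply, the retry
count and the delay are all assumptions made for illustration, and the
reply builder assumes the query carries only a question section.

```python
import struct
import time

RCODE_REFUSED = 5  # DNS RCODE 5 (RFC 1035): server refuses the operation


def parse_nameservers(resolv_conf_text):
    """Return the addresses found on 'nameserver' lines, skipping comments."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def reload_servers(read_text, max_attempts=5, delay=1.0, sleep=time.sleep):
    """Re-read the file until it lists a server or the attempts run out.

    read_text is any callable returning the current file contents, so the
    loop can be exercised without touching the real /etc/resolv.conf.
    """
    for attempt in range(max_attempts):
        servers = parse_nameservers(read_text())
        if servers:
            return servers
        if attempt + 1 < max_attempts:
            sleep(delay)  # the file may be mid-rewrite; wait and retry
    return []


def refused_reply(query):
    """Build a REFUSED response that echoes the query's ID and question.

    Hypothetical helper: assumes the query has no answer, authority or
    additional records after the question.
    """
    qid, flags, qdcount = struct.unpack("!3H", query[:6])
    # Set QR (response), keep the opcode and RD bits, set RCODE to REFUSED.
    flags = 0x8000 | (flags & 0x7900) | RCODE_REFUSED
    header = struct.pack("!6H", qid, flags, qdcount, 0, 0, 0)
    return header + query[12:]
```

In this sketch a caller would use reload_servers whenever the file
changes and fall back to refused_reply for incoming queries while the
returned list is empty. The reply is the same length as the query, as in
the capture quoted above, where both packets are 28 bytes.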
> Oct 14 23:38:34 chapilu dnsmasq[79367]: read /etc/hosts - 4 addresses
>
> and now DNS resolution works again!
>
> No idea why dnsmasq is automatically detecting one change in
> /etc/resolv.conf, and it apparently is one that does not contain any
> servers.
>
> Cheers,
>
> Eloy Paris.-
>
> On Wed, Oct 13, 2021 at 09:44:56AM +0100, Simon Kelley wrote:
>
>> Based on the location of the crash, and the circumstances that cause it,
>> my guess is that this will be fixed by
>>
>> https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=d290630d31f4517ab26392d00753d1397f9a4114
>>
>> Please could you try that, and get back to us if it doesn't sort the
>> problem?
>>
>>
>> Cheers,
>>
>> Simon.
>>
>>
>> On 13/10/2021 05:03, Eloy Paris wrote:
>>> Sorry, I forgot the backtrace; here it is:
>>>
>>> (gdb) bt
>>> #0  0x0000561772564d46 in lookup_domain (domain=0x561772b1d7a0 "enterprise.activity.windows.com", flags=flags@entry=128,
>>>     lowout=lowout@entry=0x7ffe8a4b63ac, highout=highout@entry=0x7ffe8a4b63b0) at domain-match.c:234
>>> #1  0x0000561772532b1e in forward_query (udpfd=5, udpaddr=udpaddr@entry=0x7ffe8a4b6500, dst_addr=dst_addr@entry=0x7ffe8a4b64e0,
>>>     dst_iface=dst_iface@entry=13, header=header@entry=0x561772b1fae0, plen=plen@entry=49, limit=0x561772b1fce0 "", now=1634091453,
>>>     forward=0x0, ad_reqd=0, do_bit=0) at forward.c:258
>>> #2  0x000056177253387e in receive_query (listen=listen@entry=0x561772b1f400, now=now@entry=1634091453) at forward.c:1636
>>> #3  0x0000561772538b7c in check_dns_listeners (now=now@entry=1634091453) at dnsmasq.c:1810
>>> #4  0x000056177251888b in main (argc=<optimized out>, argv=<optimized out>) at dnsmasq.c:1237
>>> (gdb)
>>>
>>>
>>> On Tue, Oct 12, 2021 at 10:55:22PM -0400, Eloy Paris wrote:
>>>> Hello,
>>>>
>>>> I am experiencing crashes in dnsmasq (2.86). I am able to reproduce it
>>>> though I am not sure about the exact sequence of events -- it happens
>>>> when I enable/disable my Ethernet NIC and disable/enable my wireless
>>>> NIC.
>>>>
>>>> In this use case, dnsmasq provides DNS and DHCP for a virtual machine
>>>> (VM) running on KVM. Something like this:
>>>>
>>>>           +-- (Ethernet)-+
>>>>           |              |
>>>> Internet  +              Hypervisor (virbr0) --- VM
>>>>           |              |
>>>>           +--- (Wi-fi)---+
>>>>
>>>> The interfaces I am enabling and disabling are on the hypervisor
>>>> ("Ethernet" and "Wi-fi" above; virbr0 never goes down).
>>>>
>>>> I guess dnsmasq does not handle gracefully receiving a DNS request from
>>>> the VM when the external interfaces of the hypervisor (Ethernet or
>>>> wi-fi) go down?
>>>>
>>>> Some details below; happy to provide more information if that's not
>>>> enough to get to the bottom of it.
>>>>
>>>> Cheers,
>>>>
>>>> Eloy Paris.-
>>>>
>>>> ----------------------------------------------------------------------
>>>>
>>>> elparis@chapilu[0]:~$ sudo coredumpctl debug dnsmasq
>>>> [sudo] password for elparis:
>>>> PID: 46002 (dnsmasq)
>>>> UID: 65534 (nobody)
>>>> GID: 65534 (nobody)
>>>> Signal: 11 (SEGV)
>>>> Timestamp: Tue 2021-10-12 22:17:33 EDT (11min ago)
>>>> Command Line: /usr/bin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper
>>>> Executable: /usr/bin/dnsmasq
>>>> Control Group: /system.slice/libvirtd.service
>>>> Unit: libvirtd.service
>>>> Slice: system.slice
>>>> Boot ID: a42fe0fbd18649ad8a6951687e2855d8
>>>> Machine ID: 3d34409507634591951da9abb51a3942
>>>> Hostname: chapilu
>>>> Storage: /var/lib/systemd/coredump/core.dnsmasq.65534.a42fe0fbd18649ad8a6951687e2855d8.46002.1634091453000000.zst (present)
>>>> Disk Size: 204.6K
>>>> Message: Process 46002 (dnsmasq) of user 65534 dumped core.
>>>>
>>>> Found module linux-vdso.so.1 with build-id: c6d7bcf242640b81aef03f231535fa4c2486c744
>>>> Found module libffi.so.7 with build-id: de60e99f39569d11d09160bbdcd486cedc87d2b6
>>>> Found module libp11-kit.so.0 with build-id: 5314ec746546ada6f442b6fdfae15eab9f6d3cdc
>>>> Found module libcrypto.so.1.1 with build-id: 6d23f0a3f354825868d044684fad31d482cc9210
>>>> Found module libdl.so.2 with build-id: 5abc547e7b0949f89f3c0e21ab0c8331a7440a8a
>>>> Found module libcrypt.so.2 with build-id: 3743451bdaf36f951f926927633fd964813025d0
>>>> Found module libnss_systemd.so.2 with build-id: 22990ff716d182c427e26b7a3cf94048b55b3e75
>>>> Found module libnss_files.so.2 with build-id: 1a36dfc01d3a1010b2ee79766a24a8090a3266d5
>>>> Found module libgpg-error.so.0 with build-id: ba85170c2d9343ea05eea8fa2048c212ff4ef552
>>>> Found module libgcrypt.so.20 with build-id: db45f5d5e0f7af1e77324fea1885f974619ad268
>>>> Found module libcap.so.2 with build-id: c1674f9082fedd415876b9f7d9712269163259b5
>>>> Found module liblz4.so.1 with build-id: e63600ab23b2f6997f42fac2fa56e1f02ce159a1
>>>> Found module libzstd.so.1 with build-id: 4b10444c1560ebc574af4d5f488b7408b22d450e
>>>> Found module liblzma.so.5 with build-id: 8b615460aa230708c5183f16bede67aa0437d95e
>>>> Found module librt.so.1 with build-id: 75484da2d6f1515189eefa076e0a40328834cd16
>>>> Found module ld-linux-x86-64.so.2 with build-id: 040cc3dd10461562f177df39e3be2f3704258c3c
>>>> Found module libmnl.so.0 with build-id: fdf3a318247060fa3e451d511ebaf23a7396d1dd
>>>> Found module libnfnetlink.so.0 with build-id: 273cc877c7b2ff41e88753edda777d7f1c4017ca
>>>> Found module libunistring.so.2 with build-id: 015ac6d6bcb60b7d8bea31a80d1941b06e8636ab
>>>> Found module libsystemd.so.0 with build-id: f776aaa16b4e2ba7056d01d928e4b2726ffe2b8b
>>>> Found module libpthread.so.0 with build-id: 07c8f95b4f3251d08550217ad8a1f31066229996
>>>> Found module libc.so.6 with build-id: 4b406737057708c0e4c642345a703c47a61c73dc
>>>> Found module libgmp.so.10 with build-id: e58d34ab389d1b649c24195c2d145e3ff2e58290
>>>> Found module libhogweed.so.6 with build-id: 2d70cff7b1841b4d9ca4e8e7726cd4b944c07fdc
>>>> Found module libnettle.so.8 with build-id: 9a878e513c02007598fcf1e2e286c2203f13536e
>>>> Found module libnetfilter_conntrack.so.3 with build-id: 0ad526380b1a1986a1e471a84d88d5f2a7fedd80
>>>> Found module libidn2.so.0 with build-id: 1ce2b50ad9f9821c2c629b521cf5a3c99593d332
>>>> Found module libdbus-1.so.3 with build-id: 74f2ab9c60512f3a93c932c3f627564d42e0b11e
>>>> Found module dnsmasq with build-id: 17b49d0999133997748526c51d99bf8b932fb79d
>>>> Stack trace of thread 46002:
>>>> #0 0x0000561772564d46 lookup_domain (dnsmasq + 0x55d46)
>>>> #1 0x0000561772532b1e forward_query (dnsmasq + 0x23b1e)
>>>> #2 0x000056177253387e receive_query (dnsmasq + 0x2487e)
>>>> #3 0x0000561772538b7c check_dns_listeners (dnsmasq + 0x29b7c)
>>>> #4 0x000056177251888b main (dnsmasq + 0x988b)
>>>> #5 0x00007f85147e4b25 __libc_start_main (libc.so.6 + 0x27b25)
>>>> #6 0x00005617725197ae _start (dnsmasq + 0xa7ae)
>>>>
>>>> GNU gdb (GDB) 11.1
>>>> Copyright (C) 2021 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.
>>>> Type "show copying" and "show warranty" for details.
>>>> This GDB was configured as "x86_64-pc-linux-gnu".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> <https://www.gnu.org/software/gdb/bugs/>.
>>>> Find the GDB manual and other documentation resources online at:
>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from /usr/bin/dnsmasq...
>>>> [New LWP 46002]
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/usr/lib/libthread_db.so.1".
>>>> Core was generated by `/usr/bin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-'.
>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>> #0  0x0000561772564d46 in lookup_domain (domain=0x561772b1d7a0 "enterprise.activity.windows.com", flags=flags@entry=128,
>>>>     lowout=lowout@entry=0x7ffe8a4b63ac, highout=highout@entry=0x7ffe8a4b63b0) at domain-match.c:234
>>>> 234 domain-match.c: No such file or directory.
>>>> (gdb)
>>>>
>