[Dnsmasq-discuss] Occasional "communications error", how to diagnose?
Geert Stappers
stappers at stappers.nl
Thu Dec 14 21:55:56 UTC 2023
On Thu, Dec 14, 2023 at 10:04:02AM +0000, Chris Green wrote:
> On Wed, Dec 13, 2023 at 08:59:05PM +0000, Simon Kelley wrote:
> > On 13/12/2023 15:25, Chris Green wrote:
> > > I run dnsmasq version 2.89 on my laptop
> > > which is running [x]ubuntu 23.04.
> > >
> > > I have systemd-resolved disabled.
> > >
> > > I'm occasionally seeing the following error when getting a host's IP:-
> > >
> > > chris$ host homepi
> > > ;; communications error to 127.0.0.1#53: timed out
> > > homepi has address 192.168.1.113
> > > chris$ ps -ef | grep dnsmasq
> > > dnsmasq 933 1 0 Dec06 ? 00:00:22 /usr/sbin/dnsmasq -x /run/dnsmasq/dnsmasq.pid
> > >   -u dnsmasq -7 /etc/dnsmasq.d,.dpkg-dist,.dpkg-old,.dpkg-new --local-service
> > >   --trust-anchor=.,20326,8,2,e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
> > > chris 86541 3774 0 15:05 pts/1 00:00:00 grep --color=auto dnsmasq
> > > chris$
> > >
> > > As can be seen dnsmasq is running and subsequent queries work without any
> > > error (or delay). The above timeout is a few seconds, maybe five or a bit
> > > less.
> > >
> > > There's no dnsmasq related error message in syslog (nothing for today at
> > > all). The system homepi is a Raspberry Pi on the same LAN as the laptop
> > > running dnsmasq. The error isn't only for one particular host, I've seen
> > > it for other systems on my LAN.
> > >
> > > Can anyone suggest what might be causing the error and/or how to diagnose
> > > what's wrong?
> > >
> >
> > It looks like the first query (or its reply) was dropped, host retried,
> > and it worked second time around.
> >
> > Since DNS transport is normally across UDP, which is defined as
> > unreliable, this is completely normal. Except that the UDP packets are
> > not actually traversing a network, they're going via the lo interface
> > within one machine. I'm sure there are circumstances where UDP packets
> > can get dropped in the kernel when going via the lo interface, but it
> > shouldn't happen very often. Is the machine under heavy load or memory
> > pressure? Maybe a network reconfiguration event could drop packets?
> >
> No, it's not a heavily loaded system by any means.
Acknowledged.
> It's a Thinkpad T470 laptop with an I7 processor and is virtually
> never worked hard at all. Just randomly running top now shows:-
>
> top - 09:59:28 up 12:04, 3 users, load average: 0.20, 0.12, 0.10
> Tasks: 254 total, 1 running, 253 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 1.5 us, 0.2 sy, 0.0 ni, 97.9 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
> MiB Mem : 7790.8 total, 296.7 free, 1032.4 used, 6461.8 buff/cache
> MiB Swap: 15258.0 total, 15255.5 free, 2.5 used. 6370.8 avail Mem
>
> That's about the way it always is (three users are all me).
>
> What I don't understand is that there's nothing at all in the logs about the
> failure/timeout.
Imagination is more important than knowledge --Albert Einstein
The symptoms suggest that the client's request doesn't reach the server
(or that the reply doesn't get back), hence the report of "timed out".
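
One way to check the dropped-packet theory on Linux is to compare the
kernel's UDP counters before and after a timeout. What follows is only a
minimal sketch, assuming /proc/net/snmp is present; the function names are
made up for illustration and are not part of dnsmasq.

```python
def parse_udp_counters(snmp_text):
    """Return the 'Udp:' counters from /proc/net/snmp-style text as a dict."""
    # The file holds a header line and a value line, both prefixed "Udp:".
    # "UdpLite:" lines are skipped because they do not start with "Udp:".
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) < 2:
        raise ValueError("no Udp header/value pair found")
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def read_udp_counters(path="/proc/net/snmp"):
    """Read the current UDP counters from the kernel (Linux only)."""
    with open(path) as f:
        return parse_udp_counters(f.read())


def udp_error_delta(before, after,
                    names=("InErrors", "RcvbufErrors", "SndbufErrors")):
    """Return how much each error counter grew between two snapshots."""
    return {n: after.get(n, 0) - before.get(n, 0) for n in names}
```

Take one snapshot with read_udp_counters(), wait for the next timeout, take
another, and pass both to udp_error_delta(). The counters are system-wide,
so a non-zero delta is a hint rather than proof that the dropped datagram
was the DNS query.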
> Can I increase dnsmasq's logging to see if anything shows
> up? It's just 'my' laptop so there isn't a lot of DNS.
Add another DNS client to collect more data points.
So try to reproduce the issue with `dig` and/or `nslookup`
whenever you encounter it with `host`.
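
If the failure is too rare to catch by hand, a small probe can collect the
data points unattended. This is a rough sketch, not a replacement for `dig`:
it sends a plain A query over UDP to 127.0.0.1 port 53 and records whether
an answer arrived in time. The function names, the 2 second timeout and the
5 second interval are my own assumptions.

```python
import random
import socket
import struct
import time


def build_query(name, qid):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), 1 question, no other records.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels)
    return header + qname + b"\x00" + struct.pack("!HH", 1, 1)  # A, IN


def probe(name, server="127.0.0.1", port=53, timeout=2.0):
    """Send one query; return the round trip in seconds, or None on timeout."""
    qid = random.randint(0, 0xFFFF)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        start = time.monotonic()
        s.sendto(build_query(name, qid), (server, port))
        try:
            while True:
                data, _ = s.recvfrom(512)
                # Accept only a reply whose ID matches our query.
                if len(data) >= 2 and struct.unpack("!H", data[:2])[0] == qid:
                    return time.monotonic() - start
        except socket.timeout:
            return None


def watch(name, interval=5.0):
    """Probe forever, printing one timestamped line per query."""
    while True:
        rtt = probe(name)
        result = "TIMEOUT" if rtt is None else f"{rtt * 1000:.1f} ms"
        print(time.strftime("%H:%M:%S"), name, result, flush=True)
        time.sleep(interval)
```

Calling watch("homepi") in a spare terminal gives a timestamped log that can
be compared with syslog and with the moments `host` reports a timeout.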
Regards
Geert Stappers
P.S.
Thanks for replying inline, so that we can read the discussion in order.
--
Silence is hard to parse