[Dnsmasq-discuss] Avoid conflicts between dnsmasq and systemd-resolved.

Hongyi Zhao hongyi.zhao at gmail.com
Wed Sep 16 06:19:27 BST 2020


On Wed, Sep 16, 2020 at 11:18 AM Dominick C. Pastore
<dominickpastore at dcpx.org> wrote:
>
> On Tue, Sep 15, 2020, at 9:47 AM, Hongyi Zhao wrote:
> > On Tue, Sep 15, 2020 at 11:09 AM Dominick C. Pastore
> > <dominickpastore at dcpx.org> wrote:
> > >
> > > On Mon, Sep 14, 2020, at 8:03 PM, Hongyi Zhao wrote:
> > > > I run dnsmasq as follows:
> > > >
> > > > $ /usr/local/sbin/dnsmasq --port=53 -c10240 --server=127.0.0.1#6053
> > > > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > > > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > > >
> > > > The 127.0.0.1#6053 is a DNS proxy based on dnsproxy, which has
> > > > DoH, DoT, DoQ and DNSCrypt support.
> > > > The conf files here:
> > > > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf, are for
> > > > Chinese domains, which use DNS servers in mainland China.
> > > >
> > > > And the main dnsmasq.conf file has the following options enabled:
> > > >
> > > > $ egrep -v '^([[:blank:]]*#|$)'
> > > > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > > > dns-forward-max=10000
> > > > no-negcache
> > > > min-cache-ttl=3600
> > > > all-servers
> > > > domain-needed
> > > > bogus-priv
> > > > filterwin2k
> > > > no-resolv
> > > > no-poll
> > > > interface=lo
> > > > bind-interfaces
> > >
> > > I see. This is making more sense now.
> > >
> > > > > Why what? Why won't other programs on the host use Dnsmasq? That's the way systems with systemd-resolved work by default. Generally, programs on the host will query /etc/resolv.conf to determine which DNS servers to use (though the manpage for systemd-resolved.service(8) suggests that some programs do not use /etc/resolv.conf and connect to systemd-resolved through other means; to be honest, that part is a little unclear to me). By default, it's a symlink to a file that directs clients to systemd-resolved (127.0.0.53).
> > > > >
> > > > > The trouble is, systemd-resolved also uses resolv.conf to determine its own behavior. The moment you delete the symlink and replace it with your own file pointing to Dnsmasq (127.0.0.1), two things will happen:
> > > >
> > > > This is exactly my situation; see the following for more detail:
> > > >
> > > > werner at X10DAi-01:~$ cat /etc/resolv.conf
> > > > nameserver 127.0.0.1
> > > > werner at X10DAi-01:~$ realpath -e /etc/resolv.conf
> > > > /etc/resolv.conf
> > > >
> > > > > 1.) systemd-resolved will itself add Dnsmasq to its list of nameservers. This probably won't break systemd-resolved entirely, but it will potentially cause lots of retries and slowdowns.
> > > >
> > > > This seems so complicated, and I still can't figure out a perfect
> > > > solution for the coexistence of dnsmasq and systemd-resolved.
> > >
> > > Running both on the same system is complicated, and systemd-resolved adds little value when you already have Dnsmasq running. That is why it's usually not recommended, though I'm reasonably confident it can be done if you really want to.
> > >
> > > > > 2.) Unless you've manually configured a nameserver in /etc/dnsmasq.conf, Dnsmasq will not have anywhere to send queries. This *will* break some things. It's smart enough to know that it shouldn't use itself as the upstream server, but neither /etc/resolv.conf nor /etc/dnsmasq.conf gives it other options, so it fails.
> > > >
> > > > As you can see, I've set upstream nameservers for my dnsmasq, so this
> > > > shouldn't be the culprit for my case.
> > >
> > > Agreed.
> > >
> > > > >
> > > > > If you want other programs on the same host to go through Dnsmasq, you should use the first option I suggested.
> > > >
> > > > Do you mean the following, which you described earlier:
> > > >
> > > >     If you want Dnsmasq to query the upstream servers,
> > > > systemd-resolved to query Dnsmasq,
> > > >     and everything else on the host to query systemd-resolved:
> > >
> > > Yes, that is what I meant. That said, based on everything you just sent, it sounds like that's how you currently have things configured:
> > >
> > > 1.) Your Dnsmasq is configured to ignore /etc/resolv.conf and has manually configured servers for upstream. Dnsmasq should be working fine, as long as there isn't anything in /home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir causing problems. (But make sure you are escaping the asterisk for that option if you are running dnsmasq in a shell.)
> > >
> > > 2.) systemd-resolved should be working well. It gets its upstream servers from your network config. Since you have Netplan configured for 127.0.0.1, it should be using Dnsmasq as its upstream server. You also have a regular file for /etc/resolv.conf, so systemd-resolved will use the nameserver there as upstream too, but it's the same one, so there is no change.
> > >
> > > 3.) Other programs on your system will either use systemd-resolved or Dnsmasq for DNS, depending on whether they obey /etc/resolv.conf or not. Either way, since systemd-resolved is forwarding all queries to Dnsmasq, every request should eventually end up going through Dnsmasq. (By the way, you should safely be able to restore /etc/resolv.conf to its original symlink to /run/systemd/resolve/stub-resolv.conf since you don't have Dnsmasq reading from it.)
> > >
> > > So, at this point, I'm not quite sure what the problem is. You mentioned using dig earlier, so I'm not sure if you already tried this, but you can try connecting to each server directly to pinpoint which step in the chain is causing issues:
> >
> > For simplicity, I previously described only part of my local DNS
> > resolution topology. Now that you have some idea of the DNS
> > settings in my case, I'll describe the complete DNS resolution
> > topology/scheme on my Ubuntu 20.04 box. The full DNS configuration
> > is as follows:
> >
> > As you have seen, I use dnsmasq and dnsproxy to do the DNS resolution.
> > In detail, I run two dnsmasq instances and one dnsproxy instance for
> > the job. All the following commands are issued from a bash script,
> > so I don't need to escape the * character, which would otherwise need
> > to be escaped if issued directly in a terminal.
> >
> > The dnsproxy is started this way:
> >
> > $ dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u tls://8.8.4.4
> > -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u tls://9.9.9.9 -u
> > tls://9.9.9.10 -u tls://149.112.112.10
> >
> > It listens on 127.0.0.1:6053 and forwards queries to several DoT
> > upstream DNS servers.
> >
> > The two dnsmasq instances are as follows:
> >
> > $ /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> >
> > This dnsmasq instance listens on 127.0.0.1:6054 and uses the following
> > upstreams, which are located in mainland China:
> >
> > $ egrep -v '^[[:blank:]]*(#|$)'
> > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > server=114.114.114.114
> > server=114.114.115.115
> > server=114.114.114.119
> > server=114.114.115.119
> > server=114.114.114.110
> > server=114.114.115.110
> > server=223.5.5.5
> > server=223.6.6.6
> > server=180.76.76.76
> > server=112.124.47.27
> > server=114.215.126.16
> >
> > And the content of the main config file is shown as follows:
> >
> > $ egrep -v '^[[:blank:]]*(#|$)'
> > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > dns-forward-max=10000
> > cache-size=0
> > all-servers
> > domain-needed
> > bogus-priv
> > filterwin2k
> > no-resolv
> > no-poll
> > interface=lo
> > bind-interfaces
> > no-hosts
> >
> > $ /usr/local/sbin/dnsmasq --port=53 -c10240 --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> >
> > This dnsmasq instance listens on 127.0.0.1:53 and uses the two
> > upstreams set up previously: 127.0.0.1#6053 and 127.0.0.1#6054. The
> > former resolves DNS queries for hostnames that do not belong to
> > mainland China, and the latter resolves those that do.
> >
> > In detail, there are two .conf files under the directory
> > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir, shown as
> > follows:
> >
> > $ ls -1 *.conf
> > accelerated-domains.china.dnsmasq.conf
> > bogus-nxdomain.china.conf
> >
> > Their contents take the following form, respectively:
> >
> > $ head accelerated-domains.china.dnsmasq.conf
> > server=/0-100.com/127.0.0.1#6054
> > server=/0-6.com/127.0.0.1#6054
> > server=/0-gold.net/127.0.0.1#6054
> > server=/00.net/127.0.0.1#6054
> > server=/0000go.com/127.0.0.1#6054
> > server=/00042.com/127.0.0.1#6054
> > server=/0005pz.com/127.0.0.1#6054
> > server=/0006266.com/127.0.0.1#6054
> > server=/0007.net/127.0.0.1#6054
> > server=/000dn.com/127.0.0.1#6054
> >
> > $ egrep -v '^[[:blank:]]*(#|$)' bogus-nxdomain.china.conf | head
> > bogus-nxdomain=123.125.81.12
> > bogus-nxdomain=101.226.10.8
> > bogus-nxdomain=198.105.254.11
> > bogus-nxdomain=104.239.213.7
> > bogus-nxdomain=61.191.206.4
> > bogus-nxdomain=218.30.64.194
> > bogus-nxdomain=61.139.8.101
> > bogus-nxdomain=61.139.8.102
> > bogus-nxdomain=61.139.8.103
> > bogus-nxdomain=61.139.8.104
> >
> > And the content of the main config file is shown as follows:
> >
> > $ egrep -v '^[[:blank:]]*(#|$)'
> > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > dns-forward-max=10000
> > no-negcache
> > min-cache-ttl=3600
> > all-servers
> > domain-needed
> > bogus-priv
> > filterwin2k
> > no-resolv
> > no-poll
> > interface=lo
> > bind-interfaces
> >
> >
> > The netplan yaml file is as follows:
> >
> > $ cat /etc/netplan/99-networkd-local-dns.yaml
> > network:
> >  version: 2
> >  renderer: networkd
> >  ethernets:
> >    enp:
> >      match:
> >        name: enp*
> >      dhcp4: true
> >      dhcp4-overrides:
> >        use-dns: false
> >      nameservers:
> >       addresses:
> >        - 127.0.0.1
> >    docker:
> >      match:
> >        name: docker*
> >      dhcp4: true
> >      dhcp4-overrides:
> >        use-dns: false
> >      nameservers:
> >       addresses:
> >        - 127.0.0.1
> >
> > The /etc/resolv.conf is as follows:
> >
> > $ realpath -e /etc/resolv.conf
> > /run/systemd/resolve/stub-resolv.conf
> > $ egrep -v '^[[:blank:]]*(#|$)' /etc/resolv.conf
> > nameserver 127.0.0.53
> > options edns0
> >
> >
> > That covers all the configuration of my local DNS topology. Next,
> > I'll run the tests you suggested, shown below.
> >
> > First, please note the process info for the tools mentioned above:
> >
> > $ pgrep -ax dnsproxy
> > 21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
> > tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
> > tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
> >
> > $ pgrep -ax dnsmasq
> > 21369 /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > 21380 /usr/local/sbin/dnsmasq --port=53 -c10240
> > --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> >
> > As you can see, we have three processes running correctly,
> > corresponding to the setup I described above.
> >
> > >
> > > To test your DNS proxy:
> > > dig @127.0.0.1 -p 6053 <somedomain.com> ANY
> >
> > werner at X10DAi-01:~$ dig +short @127.0.0.1 -p 6053 www.baidu.com ANY
> > www.a.shifen.com.
> > werner at X10DAi-01:~$ pgrep -ax dnsproxy
> > 21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
> > tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
> > tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
> > werner at X10DAi-01:~$ pgrep -ax dnsmasq
> > 21369 /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > 21380 /usr/local/sbin/dnsmasq --port=53 -c10240
> > --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> >
> > As you can see, this step can be completed successfully.
> >
> > >
> > > If that is working as intended, then test Dnsmasq:
> > > dig @127.0.0.1 <somedomain.com> ANY
> >
> > werner at X10DAi-01:~$ dig +short @127.0.0.1 www.baidu.com ANY
> > ;; connection timed out; no servers could be reached
> >
> > werner at X10DAi-01:~$ pgrep -ax dnsmasq
> > 21369 /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > 21380 /usr/local/sbin/dnsmasq --port=53 -c10240
> > --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > 38755 /usr/local/sbin/dnsmasq --port=53 -c10240
> > --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > 38756 /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > 38812 /usr/local/sbin/dnsmasq --port=53 -c10240
> > --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > 38814 /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > 38864 /usr/local/sbin/dnsmasq --port=53 -c10240
> > --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > 38865 /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > werner at X10DAi-01:~$ pgrep -ax dnsproxy
> > 21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
> > tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
> > tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
> >
> >
> > As you can see, this step failed, and very strangely, many extra
> > dnsmasq processes were started/triggered. I still can't figure out
> > the reason or how to solve it.
>
> This does indeed seem strange. Unfortunately, I'm not sure either. The best I can suggest is to check the syslog for any clues, if you haven't yet.

If I have time later this afternoon, I will check it and report back.
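
A minimal sketch of what I plan to run, assuming dnsmasq logs through
syslog under the tag "dnsmasq" and that this box keeps its log in
/var/log/syslog (journalctl is the alternative). The helper name
dnsmasq_log_lines is my own, not something shipped with dnsmasq:

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the syslog lines written by dnsmasq.
# Reads log text on stdin, so it works with a file or journalctl output.
dnsmasq_log_lines() {
    # Match "dnsmasq:" or "dnsmasq[PID]:"; "|| true" avoids a non-zero
    # exit status when there are simply no matching lines.
    grep -E 'dnsmasq(\[[0-9]+\])?:' || true
}

# Usage (log path and tag are assumptions about this machine):
#   dnsmasq_log_lines < /var/log/syslog | tail -n 50
#   journalctl -t dnsmasq --no-pager | dnsmasq_log_lines
# To see which process actually holds port 53:
#   sudo ss -lntup 'sport = :53'
```

The ss check is there because several dnsmasq copies with --port=53
cannot all be bound to 127.0.0.1:53 at once, so knowing which PID owns
the socket may narrow things down.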

> Perhaps someone else here might have more insight. But, I don't *think* this actually has anything to do with systemd-resolved at all, based on all the configuration info you gave.
>
> > As a side note, I also changed the content of the /etc/resolv.conf to
> > the following and the problem is still the same:
> >
> > nameserver 127.0.0.1
> > options edns0
> >
> >
> > >
> > > If there's still no problem, then test systemd-resolved:
> > > dig @127.0.0.53 <somedomain.com> ANY
> >
> > werner at X10DAi-01:~$ dig +short @127.0.0.53 www.baidu.com ANY
> > www.a.shifen.com.
> > werner at X10DAi-01:~$ pgrep -ax dnsmasq
> > 21369 /usr/local/sbin/dnsmasq --port=6054
> > --servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
> > 21380 /usr/local/sbin/dnsmasq --port=53 -c10240
> > --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > --hostsdir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/hostsdir -C
> > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > werner at X10DAi-01:~$ pgrep -ax dnsproxy
> > 21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
> > tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
> > tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
> >
> > This test succeeds whether /etc/resolv.conf uses 127.0.0.1 or 127.0.0.53.
>
> I was a little surprised this one worked since the previous one didn't, but I suspect systemd-resolved is falling back to the FallbackDNS servers (which are hardcoded in if not set explicitly).

What are the FallbackDNS servers, and how can I find/list them?
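
In the meantime, a minimal sketch of how I would try to look them up.
It assumes FallbackDNS= is set in /etc/systemd/resolved.conf, as
resolved.conf(5) describes, and that the commented-out entries in the
shipped file show the compile-time defaults; I have not yet verified
what my Ubuntu 20.04 build uses. The function name fallback_dns_lines
is my own:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every FallbackDNS= line (commented or not)
# from resolved.conf-style text given on stdin.
fallback_dns_lines() {
    # "|| true" avoids a non-zero exit status when no line matches.
    grep -E '^[[:blank:]]*#?[[:blank:]]*FallbackDNS=' || true
}

# Usage (assumes the standard config location):
#   fallback_dns_lines < /etc/systemd/resolved.conf
# The running state, including the DNS servers in use per link:
#   resolvectl status
```

If the only match is a commented-out line, the value after "=" should
be the built-in default, which may well be empty on some distributions.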

>
> > Any hints about the problem in my case, based on the description above?
> >
> > Best regards,
> > HY
> > --
> > Hongyi Zhao <hongyi.zhao at gmail.com>
> >



-- 
Hongyi Zhao <hongyi.zhao at gmail.com>


