[Dnsmasq-discuss] Avoid conflicts between dnsmasq and systemd-resolved.
Hongyi Zhao
hongyi.zhao at gmail.com
Tue Sep 15 14:47:25 BST 2020
On Tue, Sep 15, 2020 at 11:09 AM Dominick C. Pastore
<dominickpastore at dcpx.org> wrote:
>
> On Mon, Sep 14, 2020, at 8:03 PM, Hongyi Zhao wrote:
> > I run dnsmasq as follows:
> >
> > $ /usr/local/sbin/dnsmasq --port=53 -c10240 --server=127.0.0.1#6053
> > --conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
> > -C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> >
> > 127.0.0.1#6053 is a DNS proxy based on dnsproxy, which supports
> > DoH, DoT, DoQ and DNSCrypt.
> > The conf files here:
> > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf, are for
> > China domains, which use mainland China DNS servers.
> >
> > And the main dnsmasq.conf file has the following options enabled:
> >
> > $ egrep -v '^([[:blank:]]*#|$)'
> > /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
> > dns-forward-max=10000
> > no-negcache
> > min-cache-ttl=3600
> > all-servers
> > domain-needed
> > bogus-priv
> > filterwin2k
> > no-resolv
> > no-poll
> > interface=lo
> > bind-interfaces
>
> I see. This is making more sense now.
>
> > > Why what? Why won't other programs on the host use Dnsmasq? That's the way systems with systemd-resolved work by default. Generally, programs on the host will query /etc/resolv.conf to determine which DNS servers to use (though the manpage for systemd-resolved.service(8) suggests that some programs do not use /etc/resolv.conf and connect to systemd-resolved through other means; to be honest, that part is a little unclear to me). By default, it's a symlink to a file that directs clients to systemd-resolved (127.0.0.53).
> > >
> > > The trouble is, systemd-resolved also uses resolv.conf to determine its own behavior. The moment you delete the symlink and replace it with your own file pointing to Dnsmasq (127.0.0.1), two things will happen:
> >
> > This is exactly my situation; see the following for more detail:
> >
> > werner at X10DAi-01:~$ cat /etc/resolv.conf
> > nameserver 127.0.0.1
> > werner at X10DAi-01:~$ realpath -e /etc/resolv.conf
> > /etc/resolv.conf
> >
> > > 1.) systemd-resolved will itself add Dnsmasq to its list of nameservers. This probably won't break systemd-resolved entirely, but it will potentially cause lots of retries and slowdowns.
> >
> > Seems so complicated and still can't figure out a perfect solution for
> > the coexistence of dnsmasq and systemd-resolved.
>
> Running both on the same system is complicated, and systemd-resolved adds little value when you already have Dnsmasq running. That is why it's usually not recommended, though I'm reasonably confident it can be done if you really want to.
>
> > > 2.) Unless you've manually configured a nameserver in /etc/dnsmasq.conf, Dnsmasq will not have anywhere to send queries. This *will* break some things. It's smart enough to know that it shouldn't use itself as the upstream server, but neither /etc/resolv.conf nor /etc/dnsmasq.conf gives it other options, so it fails.
> >
> > As you can see, I've set upstream nameservers for my dnsmasq, so this
> > shouldn't be the culprit for my case.
>
> Agreed.
>
> > >
> > > If you want other programs on the same host to go through Dnsmasq, you should use the first option I suggested.
> >
> > Do you mean the following thing you have told:
> >
> > If you want Dnsmasq to query the upstream servers,
> > systemd-resolved to query Dnsmasq,
> > and everything else on the host to query systemd-resolved:
>
> Yes, that is what I meant. That said, based on everything you just sent, it sounds like that's how you currently have things configured:
>
> 1.) Your Dnsmasq is configured to ignore /etc/resolv.conf and has manually configured servers for upstream. Dnsmasq should be working fine, as long as there isn't anything in /home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir causing problems. (But make sure you are escaping the asterisk for that option if you are running dnsmasq in a shell.)
>
> 2.) systemd-resolved should be working well. It gets its upstream servers from your network config. Since you have Netplan configured for 127.0.0.1, it should be using Dnsmasq as its upstream server. You also have a regular file for /etc/resolv.conf, so systemd-resolved will use the nameserver there as upstream too, but it's the same one, so there is no change.
>
> 3.) Other programs on your system will either use systemd-resolved or Dnsmasq for DNS, depending on whether they obey /etc/resolv.conf or not. Either way, since systemd-resolved is forwarding all queries to Dnsmasq, every request should eventually end up going through Dnsmasq. (By the way, you should safely be able to restore /etc/resolv.conf to its original symlink to /run/systemd/resolve/stub-resolv.conf since you don't have Dnsmasq reading from it.)
>
> So, at this point, I'm not quite sure what the problem is. You mentioned using dig earlier, so I'm not sure if you already tried this, but you can try connecting to each server directly to pinpoint which step in the chain is causing issues:
For simplicity, I previously described only part of my local DNS
resolution topology. Now that you have some idea of the DNS settings
in my case, here is the complete DNS resolution scheme on my Ubuntu
20.04 box.
As you have seen, I use dnsmasq and dnsproxy to do the DNS resolution.
In detail, I run two dnsmasq instances and one dnsproxy instance for
the job. All of the following commands are issued from a bash script,
so I don't need to escape the * character, which would otherwise need
escaping if issued directly in a terminal.
dnsproxy is started this way:
$ dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u tls://8.8.4.4
-u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u tls://9.9.9.9 -u
tls://9.9.9.10 -u tls://149.112.112.10
It listens on 127.0.0.1:6053 and forwards queries to several DoT
upstream servers.
The two dnsmasq instances are as follows:
$ /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
This dnsmasq instance listens on 127.0.0.1:6054 and uses the following
upstreams, which are located in mainland China:
$ egrep -v '^[[:blank:]]*(#|$)'
/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
server=114.114.114.114
server=114.114.115.115
server=114.114.114.119
server=114.114.115.119
server=114.114.114.110
server=114.114.115.110
server=223.5.5.5
server=223.6.6.6
server=180.76.76.76
server=112.124.47.27
server=114.215.126.16
And the content of the main config file is shown as follows:
$ egrep -v '^[[:blank:]]*(#|$)'
/home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
dns-forward-max=10000
cache-size=0
all-servers
domain-needed
bogus-priv
filterwin2k
no-resolv
no-poll
interface=lo
bind-interfaces
no-hosts
$ /usr/local/sbin/dnsmasq --port=53 -c10240 --server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
This dnsmasq instance listens on 127.0.0.1:53 and uses the two
upstreams set up above: 127.0.0.1#6053 and 127.0.0.1#6054. The former
resolves queries for hostnames that do not belong to mainland China,
and the latter handles mainland China domains.
In detail, there are two .conf files under the directory
/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir, shown as
follows:
$ ls -1 *.conf
accelerated-domains.china.dnsmasq.conf
bogus-nxdomain.china.conf
Their contents take the following forms, respectively:
$ head accelerated-domains.china.dnsmasq.conf
server=/0-100.com/127.0.0.1#6054
server=/0-6.com/127.0.0.1#6054
server=/0-gold.net/127.0.0.1#6054
server=/00.net/127.0.0.1#6054
server=/0000go.com/127.0.0.1#6054
server=/00042.com/127.0.0.1#6054
server=/0005pz.com/127.0.0.1#6054
server=/0006266.com/127.0.0.1#6054
server=/0007.net/127.0.0.1#6054
server=/000dn.com/127.0.0.1#6054
$ egrep -v '^[[:blank:]]*(#|$)' bogus-nxdomain.china.conf | head
bogus-nxdomain=123.125.81.12
bogus-nxdomain=101.226.10.8
bogus-nxdomain=198.105.254.11
bogus-nxdomain=104.239.213.7
bogus-nxdomain=61.191.206.4
bogus-nxdomain=218.30.64.194
bogus-nxdomain=61.139.8.101
bogus-nxdomain=61.139.8.102
bogus-nxdomain=61.139.8.103
bogus-nxdomain=61.139.8.104
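The per-domain routing expressed by the server=/domain/address#port lines above can be checked offline. The following is a minimal sketch, not part of the setup described here: the function name pick_upstream and its three arguments are hypothetical, and it only approximates dnsmasq's behaviour by choosing the longest matching domain suffix and falling back to a default upstream when nothing matches.

```shell
# Print the upstream that a file of "server=/domain/addr#port" lines would
# select for NAME, preferring the longest matching domain suffix.
# DEFAULT is printed when no domain-specific line matches.
# (pick_upstream is a hypothetical helper, not a dnsmasq tool.)
pick_upstream() {
    name=$1 conf=$2 default=$3
    awk -v name="$name" -v def="$default" -F/ '
        /^server=\// {
            dom = $2; up = $3
            n = length(name); d = length(dom)
            # Match when NAME equals the domain or ends in ".domain".
            if (name == dom || (n > d && substr(name, n - d) == "." dom)) {
                if (d > best) { best = d; pick = up }
            }
        }
        END { print (best ? pick : def) }
    ' "$conf"
}
```

For example, pick_upstream www.baidu.com accelerated-domains.china.dnsmasq.conf '127.0.0.1#6053' would print whichever upstream the list assigns to that name, or the default if the list has no matching entry.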
And the content of the main config file is shown as follows:
$ egrep -v '^[[:blank:]]*(#|$)'
/home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
dns-forward-max=10000
no-negcache
min-cache-ttl=3600
all-servers
domain-needed
bogus-priv
filterwin2k
no-resolv
no-poll
interface=lo
bind-interfaces
The netplan yaml file is as follows:
$ cat /etc/netplan/99-networkd-local-dns.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    enp:
      match:
        name: enp*
      dhcp4: true
      dhcp4-overrides:
        use-dns: false
      nameservers:
        addresses:
          - 127.0.0.1
    docker:
      match:
        name: docker*
      dhcp4: true
      dhcp4-overrides:
        use-dns: false
      nameservers:
        addresses:
          - 127.0.0.1
The /etc/resolv.conf is as follows:
$ realpath -e /etc/resolv.conf
/run/systemd/resolve/stub-resolv.conf
$ egrep -v '^[[:blank:]]*(#|$)' /etc/resolv.conf
nameserver 127.0.0.53
options edns0
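To confirm which resolver addresses clients reading a resolv.conf-style file will see, a small helper along these lines may be convenient. This is only a sketch; the function name nameservers is hypothetical, and it defaults to /etc/resolv.conf when no file is given.

```shell
# Print the address on every "nameserver" line of a resolv.conf-style file.
# Comment lines and "options" lines are ignored.
# (nameservers is a hypothetical helper name.)
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Running nameservers with no argument lists the addresses from /etc/resolv.conf, which makes it easy to see whether the stub (127.0.0.53) or dnsmasq (127.0.0.1) is currently configured.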
That is the complete configuration of my local DNS topology. Next,
I'll run the tests you suggested, as shown below.
First, please note the process info of all the tools mentioned above:
$ pgrep -ax dnsproxy
21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
$ pgrep -ax dnsmasq
21369 /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
21380 /usr/local/sbin/dnsmasq --port=53 -c10240
--server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
As you can see, three processes are running, matching the setup I
described above.
>
> To test your DNS proxy:
> dig @127.0.0.1 -p 6053 <somedomain.com> ANY
werner at X10DAi-01:~$ dig +short @127.0.0.1 -p 6053 www.baidu.com ANY
www.a.shifen.com.
werner at X10DAi-01:~$ pgrep -ax dnsproxy
21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
werner at X10DAi-01:~$ pgrep -ax dnsmasq
21369 /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
21380 /usr/local/sbin/dnsmasq --port=53 -c10240
--server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
As you can see, this step can be completed successfully.
>
> If that is working as intended, then test Dnsmasq:
> dig @127.0.0.1 <somedomain.com> ANY
werner at X10DAi-01:~$ dig +short @127.0.0.1 www.baidu.com ANY
;; connection timed out; no servers could be reached
werner at X10DAi-01:~$ pgrep -ax dnsmasq
21369 /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
21380 /usr/local/sbin/dnsmasq --port=53 -c10240
--server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
38755 /usr/local/sbin/dnsmasq --port=53 -c10240
--server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
38756 /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
38812 /usr/local/sbin/dnsmasq --port=53 -c10240
--server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
38814 /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
38864 /usr/local/sbin/dnsmasq --port=53 -c10240
--server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
38865 /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
werner at X10DAi-01:~$ pgrep -ax dnsproxy
21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
As you can see, this step failed, and strangely, many additional
dnsmasq processes were started/triggered. I still can't figure out the
reason or how to solve it.
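To make the growth in process count easier to see, the pgrep listing can be summarised per listening port. The following is a rough sketch; count_by_port is a hypothetical helper, and it assumes one process per input line, as pgrep -ax prints them (the listings in this message are wrapped by the mail client). It does not tell whether the extra entries are independent instances or children of the original two.

```shell
# Read `pgrep -ax dnsmasq`-style lines on stdin (one process per line) and
# print "port count" for each --port value seen, sorted by port number.
# Processes started without --port are grouped under "unknown".
# (count_by_port is a hypothetical helper name.)
count_by_port() {
    awk '
        {
            port = "unknown"
            for (i = 1; i <= NF; i++)
                if ($i ~ /^--port=/) port = substr($i, 8)
            seen[port]++
        }
        END { for (p in seen) print p, seen[p] }
    ' | sort -n
}
```

Typical use would be pgrep -ax dnsmasq | count_by_port. On Linux, ps -o pid,ppid,args -C dnsmasq additionally shows each process's parent PID, which may help establish where the extra processes come from.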
As a side note, I also changed the content of /etc/resolv.conf to
the following, and the problem is still the same:
nameserver 127.0.0.1
options edns0
>
> If there's still no problem, then test systemd-resolved:
> dig @127.0.0.53 <somedomain.com> ANY
werner at X10DAi-01:~$ dig +short @127.0.0.53 www.baidu.com ANY
www.a.shifen.com.
werner at X10DAi-01:~$ pgrep -ax dnsmasq
21369 /usr/local/sbin/dnsmasq --port=6054
--servers-file=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/servers-file/cn
-C /home/werner/Public/anti-gfw/dns/dnsmasq/conf/cn-dns.conf
21380 /usr/local/sbin/dnsmasq --port=53 -c10240
--server=127.0.0.1#6053
--conf-dir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/conf-dir,*.conf
--hostsdir=/home/werner/Public/anti-gfw/dns/dnsmasq/conf/hostsdir -C
/home/werner/Public/anti-gfw/dns/dnsmasq/conf/dnsmasq.conf
werner at X10DAi-01:~$ pgrep -ax dnsproxy
21355 ./dnsproxy -v -l 127.0.0.1 --port=6053 --all-servers -u
tls://8.8.4.4 -u tls://8.8.8.8 -u tls://1.0.0.1 -u tls://1.1.1.1 -u
tls://9.9.9.9 -u tls://9.9.9.10 -u tls://149.112.112.10
This test succeeds whether /etc/resolv.conf uses 127.0.0.1 or 127.0.0.53.
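The three tests above can be run in one pass. What follows is only a sketch under some assumptions: check_chain and the DIG override variable are hypothetical names, the hop list is hard-coded to the addresses and ports used in this message, and it asks for an A record with a short timeout instead of the ANY query used in the tests above.

```shell
# Query each hop of the chain in turn and stop at the first one that fails.
# DIG may be set to another command (for example a stub); it defaults to dig.
# (check_chain and DIG are hypothetical names used for illustration.)
check_chain() {
    domain=$1
    for hop in "dnsproxy 127.0.0.1 6053" \
               "dnsmasq 127.0.0.1 53" \
               "resolved 127.0.0.53 53"; do
        set -- $hop   # $1 = label, $2 = address, $3 = port
        if answer=$(${DIG:-dig} +short +time=2 +tries=1 "@$2" -p "$3" "$domain" A) \
            && [ -n "$answer" ]; then
            echo "ok   $1 ($2#$3)"
        else
            echo "FAIL $1 ($2#$3)"
            return 1
        fi
    done
}
```

Calling check_chain www.baidu.com prints one line per hop and returns non-zero at the first hop that times out or gives an empty answer.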
Any hints about the problem in my case, based on the descriptions above?
Best regards,
HY
--
Hongyi Zhao <hongyi.zhao at gmail.com>