[Dnsmasq-discuss] TTL in nested wild card CNAME

Simon Kelley simon at thekelleys.org.uk
Tue Mar 17 15:13:22 GMT 2020


On 17/03/2020 01:31, Sasha Litvak wrote:
> I couldn't find a specific answer anywhere so hopefully someone has a
> clue on this list
> 
> We are using dnsmasq on our servers as a caching dns solution.
> 
> Most of our domains are resolved by a wildcard record like this
> 
> $TTL 3600       ; 1 hour
>                         A       10.10.10.23
> $ORIGIN example.net.
> *                       CNAME   excontainers
> excontainers    CNAME   exservice.service.consul
> 
> dnsmasq handles resolution of .consul domain directly but the DNS
> server itself also forwards .consul to consul servers.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Can you elaborate? How does dnsmasq handle the resolution of the .consul
domain? If you have something like

10.0.48.13 exservice.service.consul

in /etc/hosts

then that defines, effectively, an immortal record for
exservice.service.consul, so a CNAME chain of two records, each with a
TTL of one hour, would result in that answer being returned for an hour.

> 
> I added min-ttl 5s to decrease the number of queries to consul
> 
> So when I do dig foo.example.net  @127.0.0.1 I get
> 
> foo.example.net. 3600 IN CNAME excontainers.example.net.
> excontainers.example.net. 3600 IN CNAME exservice.service.consul.
> exservice.service.consul. 5 IN A 10.0.48.13

This might be misleading: if you send that query to dnsmasq with a clean
cache, it will forward the query upstream and return the complete
result it gets, including the A record with a 5s TTL. Further queries
answered from the cache, however, would return a 0 (infinite) TTL for
the A record if it's defined locally.

The fix for this is to define the .consul A record using --host-record,
which allows you to specify the 5s TTL.
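
As a sketch only, assuming the name and address from the dig output
quoted above and a dnsmasq build recent enough to accept a TTL on
--host-record, the corresponding line in dnsmasq.conf would be
something like

host-record=exservice.service.consul,10.0.48.13,5

where the trailing 5 is the TTL in seconds. If there is an /etc/hosts
line for the same name, it is presumably best removed so that only this
definition applies.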



> 
> Now we often need to migrate subdomains by pointing them to a
> different consul cluster.  So our script uses nsupdate and creates a
> dynamic DNS record resulting in this reply
> 
> foo.example.net. 60 IN CNAME  exservice2.service.consul.
>  exservice2.service.consul. 5 IN A 10.0.48.35
> 
> So we have a record that is more explicit and it takes precedence over
> wild card.   On servers with little traffic, domain switch happens
> within a few seconds, but on the main busy server with 100s of queries
> a second, it takes an hour for dnsmasq to change its cache.  We see
> dnsmasq sending requests to the DNS server getting correct new records
> but still sending the old cached records to a client.
> 
> When we are going back from distinct to default wild card (removing
> distinct record in DNS) cache change happens almost immediately (a
> couple of seconds) regardless of how busy the server is.
> 
> Sorry for the long description but I would like to find out a reason
> why during switching from wild card to more explicit record dnsmasq
> cache update takes such a long time.
> 

I'm guessing at exactly what's going on here, and more details would be
useful, but if I've guessed right, the --host-record change above is the
solution.


Simon.



