[Dnsmasq-discuss] [PATCH] Stop treating SERVFAIL as a successful response from upstream servers

Simon Kelley simon at thekelleys.org.uk
Mon Feb 6 21:14:11 GMT 2017

Patch applied. Thank you.

And thank you for the comprehensive documentation.

The original change was made as part of the DNSSEC stuff, and I have
a nagging feeling that there was some theoretical situation that
could occur in conjunction with DNSSEC which prompted the change.
Despite much puzzling, I can't come up with what this might have been,
so I've applied the patch. If it breaks with DNSSEC, we'll find out,
but I suspect I'm chasing shadows.



On 01/02/17 22:54, Baptiste Jonglez wrote:
> From: Baptiste Jonglez <git at bitsofnetworks.org>
> This effectively reverts most of 51967f9807 ("SERVFAIL is an
> expected error return, don't try all servers.") and 4ace25c5d6
> ("Treat REFUSED (not SERVFAIL) as an unsuccessful upstream
> response").
> With the current behaviour, as soon as dnsmasq receives a SERVFAIL
> from an upstream server, it stops trying to resolve the query and
> simply returns SERVFAIL to the client.  With this commit, dnsmasq
> will instead try to query other upstream servers upon receiving a
> SERVFAIL response.
> According to RFC 1034 and 1035, the semantics of SERVFAIL are those
> of a temporary error condition.  Recursive resolvers are expected to
> encounter network or resource issues from time to time, and will
> respond with SERVFAIL in this case.  Similarly, if a validating
> DNSSEC resolver [RFC 4033] encounters issues when checking
> signatures (unknown signing algorithm, missing signatures, expired
> signatures because of a wrong system clock, etc), it will respond
> with SERVFAIL.
> Note that all those behaviours are entirely different from a
> negative response, which would provide a definite indication that
> the requested name does not exist.  In our case, if an upstream
> server responds with SERVFAIL, another upstream server may well
> provide a positive answer for the same query.
> Thus, this commit will increase robustness whenever some upstream
> servers encounter temporary issues or are misconfigured.
> Quoting RFC 1034, Section 4.3.1. "Queries and responses":
>
>    If recursive service is requested and available, the recursive
>    response to a query will be one of the following:
>
>       - The answer to the query, possibly preface by one or more CNAME
>         RRs that specify aliases encountered on the way to an answer.
>
>       - A name error indicating that the name does not exist.  This may
>         include CNAME RRs that indicate that the original query name was
>         an alias for a name which does not exist.
>
>       - A temporary error indication.
>
> Here is Section 5.2.3. of RFC 1034, "Temporary failures":
>
>    In a less than perfect world, all resolvers will occasionally be
>    unable to resolve a particular request.  This condition can be
>    caused by a resolver which becomes separated from the rest of the
>    network due to a link failure or gateway problem, or less often by
>    coincident failure or unavailability of all servers for a
>    particular domain.
>
> And finally, RFC 1035 specifies RCODE 2 for this usage, which is
> now more widely known as SERVFAIL (RFC 1035, Section 4.1.1. "Header
> section format"):
>
>    RCODE           Response code - this 4 bit field is set as part of
>                    responses.  The values have the following
>                    interpretation: (...)
>
>                    2               Server failure - The name server was
>                                    unable to process this query due to a
>                                    problem with the name server.
>
> For the DNSSEC-related usage of SERVFAIL, here is RFC 4033 Section
> 5. "Scope of the DNSSEC Document Set and Last Hop Issues":
>
>    A validating resolver can determine the following 4 states: (...)
>
>    Insecure: The validating resolver has a trust anchor, a chain of
>       trust, and, at some delegation point, signed proof of the
>       non-existence of a DS record.  This indicates that subsequent
>       branches in the tree are provably insecure.  A validating resolver
>       may have a local policy to mark parts of the domain space as
>       insecure.
>
>    Bogus: The validating resolver has a trust anchor and a secure
>       delegation indicating that subsidiary data is signed, but the
>       response fails to validate for some reason: missing signatures,
>       expired signatures, signatures with unsupported algorithms, data
>       missing that the relevant NSEC RR says should be present, and so
>       forth. (...)
>
>    This specification only defines how security-aware name servers
>    can signal non-validating stub resolvers that data was found to be
>    bogus (using RCODE=2, "Server Failure"; see [RFC4035]).
>
> Notice the difference between a definite negative answer
> ("Insecure" state), and an indefinite error condition ("Bogus"
> state).  The second type of error may be specific to a recursive
> resolver, for instance because its system clock has been
> incorrectly set, or because it does not implement newer
> cryptographic primitives.  Another recursive resolver may succeed
> for the same query.
> Other specifications prescribe behaviour similar to the one
> implemented by this commit.
> For instance, RFC 2136 specifies the behaviour of a "requestor"
> that wants to update a zone using the DNS UPDATE mechanism.  The
> requestor tries to contact all authoritative name servers for the
> zone, with the following behaviour specified in RFC 2136, Section
> 4:
>
>    4.6. If a response is received whose RCODE is SERVFAIL or NOTIMP,
>    or if no response is received within an implementation dependent
>    timeout period, or if an ICMP error is received indicating that the
>    server's port is unreachable, then the requestor will delete the
>    unusable server from its internal name server list and try the next
>    one, repeating until the name server list is empty.  If the
>    requestor runs out of servers to try, an appropriate error will be
>    returned to the requestor's caller.
>
> Signed-off-by: Baptiste Jonglez <git at bitsofnetworks.org>
> ---
>  src/forward.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/forward.c b/src/forward.c
> index 9b464d3..47409f0 100644
> --- a/src/forward.c
> +++ b/src/forward.c
> @@ -853,7 +853,8 @@ void reply_query(int fd, int family, time_t now)
>       we get a good reply from another server. Kill it when we've had replies from all
>       to avoid filling the forwarding table when everything is broken */
> -  if (forward->forwardall == 0 || --forward->forwardall == 1 || RCODE(header) != REFUSED)
> +  if (forward->forwardall == 0 || --forward->forwardall == 1 ||
> +      (RCODE(header) != REFUSED && RCODE(header) != SERVFAIL))
>      {
>        int check_rebind = 0, no_cache_dnssec = 0, cache_secure = 0, bogusanswer = 0;
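
For readers who want to see the effect of the changed condition in
isolation, here is a minimal sketch in C. It is not dnsmasq code: the
struct pending_query type and both function names are hypothetical, and
it assumes (as the comment in the hunk suggests) that forwardall is 0
when the query was not sent to several servers and otherwise counts
down to 1 once every server has replied. Only the two boolean
conditions are copied from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* RCODE values defined in RFC 1035, Section 4.1.1. */
enum { RCODE_NOERROR = 0, RCODE_SERVFAIL = 2, RCODE_NXDOMAIN = 3, RCODE_REFUSED = 5 };

/* Hypothetical stand-in for dnsmasq's forwarding record; only the
   counter used by the condition in the patch is modelled.
   Assumption: 0 means the query was not sent to several servers,
   otherwise the value counts down to 1 as replies arrive. */
struct pending_query {
  int forwardall;
};

/* Condition before the patch: only REFUSED makes us wait for
   another server's reply. */
static bool accept_reply_old(struct pending_query *q, int rcode)
{
  assert(q != NULL);
  return q->forwardall == 0 || --q->forwardall == 1 ||
         rcode != RCODE_REFUSED;
}

/* Condition after the patch: SERVFAIL is also held back until
   the last outstanding server has answered. */
static bool accept_reply_new(struct pending_query *q, int rcode)
{
  assert(q != NULL);
  return q->forwardall == 0 || --q->forwardall == 1 ||
         (rcode != RCODE_REFUSED && rcode != RCODE_SERVFAIL);
}
```

With forwardall starting at 4 in this model, accept_reply_old() accepts
a first SERVFAIL immediately, while accept_reply_new() returns false
for it and only accepts a failure code once the counter has reached 1.
Any other RCODE is accepted straight away in both versions.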