[Dnsmasq-discuss] SERVFAIL and all-servers
tobias+dnsmasq at trds.de
Thu Mar 10 13:32:08 UTC 2022
On 2022-03-07 16:46, Simon Kelley wrote:
> On 06/03/2022 15:16, Matus UHLAR - fantomas via Dnsmasq-discuss wrote:
>> On 02.03.22 19:24, Simon Kelley wrote:
>>> The behaviour on this alternated between what you observed and what
>>> you advocate a few times before settling.
>>>
>>> The problem with waiting for all replies is that a common source of
>>> SERVFAIL returns is domains with broken DNSSEC. In that case all the
>>> servers will return SERVFAIL, which is a bit of a pain if you have to
>>> wait for the slowest one, but a disaster if one server is not
>>> responding: in that case all you can do is wait for the timeout.
>>>
>>> Defining SERVFAIL as the response to DNSSEC validation failure has
>>> always seemed odd to me.
>>>
>>> all-servers is not necessarily more reliable: the default dnsmasq
>>> behaviour does a reasonably good job in most circumstances.
>>
>> I would expect a bit more reliability in this case just as the OP.
Indeed. Even if "more reliable" is not achievable in every case, in my
case all-servers actually makes resolution far less reliable.
>> How does dnsmasq reply if all-servers is not set and first server
>> returns SERVFAIL?
>
> If it sends to a single server, and that returns SERVFAIL, it will retry
> the query to all servers (as if all-servers was set.) This doesn't avoid
> the problem that the same "rogue" server could reply SERVFAIL again.
>
> I guess there's an argument to omit the already-failed server on the retry?
I'd argue that ideally there should be no situation where getting a
SERVFAIL is worse than getting no answer at all, but currently that
seems to be the case. If SERVFAIL can be handled differently, so that it
has no negative impact, omitting the failed server might no longer be
necessary.
Suggestion: answer with SERVFAIL once all upstreams have done so, but
additionally shortcut to SERVFAIL after a shorter timeout (adjustable,
maybe ~3s by default, measured from the start of the query) if at least
one SERVFAIL has arrived. Any other reply would still be forwarded as
soon as it arrives. Not an "ideal" solution, but I hope it is sufficient
for the DNSSEC example as well.
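
To make the suggestion concrete, here is a rough sketch of the decision
logic in Python. It is not dnsmasq code: the names (Reply, decide,
servfail_timeout, full_timeout) and the 10s overall timeout are made up
for illustration, and it assumes servfail_timeout <= full_timeout.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SERVFAIL = "SERVFAIL"


@dataclass
class Reply:
    server: str      # upstream that answered (hypothetical label)
    rcode: str       # e.g. "NOERROR", "NXDOMAIN", "SERVFAIL"
    arrival: float   # seconds since the query was sent upstream


def decide(replies: List[Reply], n_servers: int,
           servfail_timeout: float = 3.0,
           full_timeout: float = 10.0) -> Tuple[float, Optional[str]]:
    """Return (time, rcode) of the answer sent to the client.

    rcode is None if no upstream answered before full_timeout.
    """
    servfails = 0
    for r in sorted(replies, key=lambda r: r.arrival):
        if r.arrival > full_timeout:
            break  # too late, the query has already timed out
        if servfails and r.arrival > servfail_timeout:
            # A SERVFAIL is pending and the short deadline passed
            # before anything better arrived: give up early.
            return (servfail_timeout, SERVFAIL)
        if r.rcode != SERVFAIL:
            # Any non-SERVFAIL reply is forwarded immediately.
            return (r.arrival, r.rcode)
        servfails += 1
        if servfails == n_servers or r.arrival >= servfail_timeout:
            # Every upstream failed, or the short deadline has
            # already passed: no point in waiting any longer.
            return (r.arrival, SERVFAIL)
    if servfails:
        # Some SERVFAILs, remaining servers silent: short timeout.
        return (servfail_timeout, SERVFAIL)
    return (full_timeout, None)
```

With two upstreams, one answering SERVFAIL after 0.1s and the other not
responding at all, this returns SERVFAIL at 3s instead of waiting for
the full timeout. If the second upstream sends a usable answer within
those 3s, the client gets that answer instead.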
"Extended DNS Errors" (RFC 8914) might change things, but that's for the
future I guess. It could help to recognize the DNSSEC failure case (but
"EDE content should be treated only as diagnostic information and MUST
NOT alter DNS protocol processing", well...). On the other hand, with
transports other than classic UDP DNS (DoT, DoH, ...) on the rise, I'd
assume SERVFAILs unrelated to DNSSEC are becoming more common as well.
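
For illustration, recognizing the DNSSEC case from an EDE option could
look roughly like the Python sketch below. It assumes the caller has
already extracted the OPTION-DATA of an EDE option (EDNS option code 15)
from the OPT record. The function names are made up, and which info
codes count as "DNSSEC failure" is my own selection from RFC 8914.

```python
import struct
from typing import Tuple

# RFC 8914 info codes treated here as DNSSEC validation failures.
# The selection is an assumption made for this sketch.
DNSSEC_EDE_CODES = {
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    8: "Signature Not Yet Valid",
    9: "DNSKEY Missing",
    10: "RRSIGs Missing",
    11: "No Zone Key Bit Set",
    12: "NSEC Missing",
}


def parse_ede(option_data: bytes) -> Tuple[int, str]:
    """Split EDE OPTION-DATA into (INFO-CODE, EXTRA-TEXT)."""
    if len(option_data) < 2:
        raise ValueError("EDE option needs at least a 16-bit INFO-CODE")
    # INFO-CODE is a 16-bit unsigned integer in network byte order.
    (info_code,) = struct.unpack("!H", option_data[:2])
    # EXTRA-TEXT is optional UTF-8 text filling the rest of the option.
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    return info_code, extra_text


def is_dnssec_failure(option_data: bytes) -> bool:
    """True if the EDE info code is one of the DNSSEC codes above."""
    info_code, _ = parse_ede(option_data)
    return info_code in DNSSEC_EDE_CODES
```

Whether a forwarder may act on this at all is exactly the question
raised by the sentence quoted above.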