[Dnsmasq-discuss] [PATCH 1/1] forward.c: fix handling of truncated response
Simon Kelley
simon at thekelleys.org.uk
Wed Oct 2 20:34:32 UTC 2024
On 02/10/2024 17:00, Dominik Derigs via Dnsmasq-discuss wrote:
> Hey all,
>
>
> having this configurable makes sense particularly thinking about the
> multitude of IoT devices often receiving not all that much attention
> from their manufacturers. A particular example is the camera Tapo C-310
> which seems to be widely used and has often been reported in the Pi-hole
> forums to cause a huge amount of queries due to exactly this truncation
> method we are talking about here (when dnsmasq is used with DNSSEC, the
> query A time.nist.gov results in truncation).
>
>
> Said Tapo camera is not behaving nicely and does not retry over TCP.
> Instead, it retries the same query roughly one second later, resulting
> in a whooping 86,400 identical queries, all being truncated. A
> workaround is querying this domain periodically manually over TCP but it
> is a rather ugly workaround.
>
>
> If there would be an option to allow truncated content to remain, I
> would indeed find it very useful for situations like the one mentioned
> above. I confirmed that the camera could successfully reads the returned
> CNAME/A record. This is what I got when simply applying the submitted
> patch (but keeping the logging):
>
>
> Oct 2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 query[A]
> time.nist.gov from 127.0.0.1
> Oct 2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 forwarded
> time.nist.gov to 127.0.0.1#5335
> Oct 2 17:46:53 dnsmasq[1861633]: 26 dnssec-query[DS] gov to 127.0.0.1#5335
> Oct 2 17:46:53 dnsmasq[1861633]: 26 reply gov is DS for keytag 2536,
> algo 13, digest 2
> Oct 2 17:46:53 dnsmasq[1861633]: 27 dnssec-query[DS] nist.gov to
> 127.0.0.1#5335
> Oct 2 17:46:53 dnsmasq[1861633]: 28 dnssec-query[DNSKEY] gov to
> 127.0.0.1#5335
> Oct 2 17:46:53 dnsmasq[1861633]: 28 reply gov is DNSKEY keytag 2536,
> algo 13
> Oct 2 17:46:53 dnsmasq[1861633]: 28 reply gov is DNSKEY keytag 35496,
> algo 13
> Oct 2 17:46:53 dnsmasq[1861633]: 27 reply nist.gov is DS for keytag
> 33751, algo 8, digest 2
> Oct 2 17:46:53 dnsmasq[1861633]: 29 dnssec-query[DNSKEY] nist.gov to
> 127.0.0.1#5335
> Oct 2 17:46:53 dnsmasq[1861633]: 29 reply nist.gov is DNSKEY keytag
> 18303, algo 8
> Oct 2 17:46:53 dnsmasq[1861633]: 29 reply nist.gov is DNSKEY keytag
> 33751, algo 8
> Oct 2 17:46:53 dnsmasq[1861633]: 30 dnssec-query[DS] glb.nist.gov to
> 127.0.0.1#5335
> Oct 2 17:46:53 dnsmasq[1861633]: 30 reply glb.nist.gov is DS for keytag
> 56235, algo 7, digest 1
> Oct 2 17:46:53 dnsmasq[1861633]: 30 reply glb.nist.gov is DS for keytag
> 4395, algo 7, digest 1
> Oct 2 17:46:53 dnsmasq[1861633]: 31 dnssec-query[DNSKEY] glb.nist.gov
> to 127.0.0.1#5335
> Oct 2 17:46:53 dnsmasq[1861633]: 31 reply glb.nist.gov is truncated[DNSKEY]
> Oct 2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 validation result
> is TRUNCATED
> Oct 2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 reply is truncated
>
>
> For testing, "dig +ignore + notcp time.nist.gov" is handy as it is
> guaranteed to be truncated with dnsmasq and DNSSEC enabled if the domain
> isn't already in your local cache through a preceding TCP query.
>
>
I'm in the process of testing some code that fixes this problem. What's
happening here is not that the answer to the original query is too large
and truncated, it's that the answer to a subsidiary query needed to
validate the original answer is too large (in this case DNSKEY
glb.nist.gov) All existing dnsmasq releases return the answer to the
original query with the truncated bit set, even if the answer in fact
fits, just to cause the client to retry with TCP so that the subsidiary
query can be done over TCP and avoid truncation.
The new code swaps to TCP for subsidiary DNSSEC queries without having
to return an artificially truncated answer to the original query and,
once the validation is complete, returns the validated answer over UDP.
This is much cleaner, saves a round trip to the client, and avoids
breakage with clients which fail to do TCP properly.
It's not the whole solution, however. If the answer to a query exceeds
the truncation limit, then no client without TCP support can get an good
answer. DNSSEC adds one more possible state. That's when the answer
without DNSSEC records will fit untruncated, but the addition of
signatures makes the answer too large. Dnsmasq passes on the query with
the DO bit set to make upstream add the signature RRs and if that causes
truncation then the truncated answer goes back to client. In theory,
dnsmasq could switch to TCP, and get the validated answer. It then
strips the RRSIG records from the answer, and if it then fits, returns
it via UDP. Only if the answer still doesn't fit (or the client set the
DO bit, and wants RRSIGS) is TCP fallback by the client required.
I'm minded to aee if the code I have can be extended to handle this state.
Simon.
> Best,
>
> Dominik
>
>
> Am 10/2/24 um 12:30 AM schrieb Petr Menšík:
>>
>> I think Simon has pointed out this is intentional. Partial reply is
>> incomplete and for well behaving clients carries not useful
>> information. It would use TCP anyway instead, therefore it adds just
>> additional data.
>>
>>
>> I would consider clients not falling back to TCP as broken. TCP is not
>> considered optional nowadays. Attempts to fix such clients by relying
>> on incomplete responses instead are wrong.
>>
>>
>> It might make sense to allow enabling such behaviour by configuration,
>> if you have broken software not able to workaround. But I would insist
>> that software is broken and therefore sending incomplete responses
>> should be only workaround for them, not something behaved by default.
>>
>>
>> Is that something using Alpine C library?
>>
>>
>> Cheers,
>> Petr
>>
>>
>> On 30/09/2024 06:39, Rahul Thakur via Dnsmasq-discuss wrote:
>>> Hi Simon,
>>>
>>> So what do you think of my reasoning for this patch? Do you agree?
>>>
>>> Best regards,
>>> Rahul Thakur
>>> ------------------------------------------------------------------------
>>> *From:* Rahul Thakur <rahul.thakur at genexis.eu>
>>> *Sent:* 25 September 2024 15:29
>>> *To:* Simon Kelley <simon at thekelleys.org.uk>; dnsmasq-
>>> discuss at lists.thekelleys.org.uk <dnsmasq-discuss at lists.thekelleys.org.uk>
>>> *Subject:* Re: [Dnsmasq-discuss] [PATCH 1/1] forward.c: fix handling
>>> of truncated response
>>> Hi Simon,
>>>
>>> Thanks for responding to this patch, please find my justification for
>>> this patch as follows:
>>>
>>> I think rfc 2181 is defining the behaviour for DNS server and not DNS
>>> proxy.
>>>
>>> I am relying on and referring to rfc 5625 while making this change.
>>>
>>> In section 4.4 (https://datatracker.ietf.org/doc/html/
>>> rfc5625#section-4.4 <https://datatracker.ietf.org/doc/html/
>>> rfc5625#section-4.4>), the rfc 5625 states,
>>>
>>> If a proxy must unilaterally truncate a response, then the proxy MUST
>>> set the TC bit. Similarly, proxies MUST NOT remove the TC bit from
>>> responses.
>>>
>>> Dnsmasq is ofcourse complying to this behaviour and not meddling with
>>> the TC bit while setting the answers to 0. But, if I read further
>>> section 4.4.1 (https://datatracker.ietf.org/doc/html/
>>> rfc5625#section-4.4.1 <https://datatracker.ietf.org/doc/html/
>>> rfc5625#section-4.4.1>),
>>>
>>> Whilst TCP transport is not strictly mandatory, it is supported by
>>> the vast majority of stub resolvers and recursive servers.
>>>
>>> So, this indicates that it is not mandatory that the client ignores
>>> this truncated response. This is further supported by section 6.1.3.2
>>> of rfc 1123 (https://datatracker.ietf.org/doc/html/
>>> rfc1123#section-6.1.3.2). In paragraph 3 of the DISCUSSION in section
>>> 6.1.3.2, it states,
>>>
>>> Whether it is possible to use a truncated answer
>>> depends on the application.
>>>
>>> Hence, when dnsmasq explicitly deletes the answers, then it deprives
>>> clients that do not fallback to TCP and are happy with the truncated
>>> response to be able to resolve their queries.
>> Fix such client to fall back to TCP instead of making dnsmasq to
>> provide it. Or ensure it uses EDNS0 with buffer big enough to receive
>> whole message.
>>>
>>> To me, it sounds like a better strategy to forward the truncated
>>> response as is to the client and let the client decide what it wants
>>> to do rather than forcefully dropping the answers.
>>>
>>> Best regards,
>>> Rahul Thakur
>>>
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Dnsmasq-discuss <dnsmasq-discuss-
>>> bounces at lists.thekelleys.org.uk> on behalf of Simon Kelley
>>> <simon at thekelleys.org.uk>
>>> *Sent:* 25 September 2024 13:39
>>> *To:* dnsmasq-discuss at lists.thekelleys.org.uk <dnsmasq-
>>> discuss at lists.thekelleys.org.uk>
>>> *Subject:* Re: [Dnsmasq-discuss] [PATCH 1/1] forward.c: fix handling
>>> of truncated response
>>> I think that this is legitimate behaviour. RFC 2181 para 9 says
>>>
>>> Where TC is set, the partial RRSet that would not completely fit may
>>> be left in the response. When a DNS client receives a reply with TC
>>> set, it should ignore that response, and query again, using a
>>> mechanism, such as a TCP connection, that will permit larger replies.
>>>
>>> Which means the contents (or lack of them) of the answer, auth and
>>> additional sections has to be ignored by the client anyway.
>>>
>>> Do you have a standards reference which says otherwise? Test suites can
>>> tell you either that behaviour has changed over releases or that
>>> behaviour differs from other implementations. They cant tell you that
>>> behaviour is correct.
>>>
>>> There is a subtle reason for the code being as it is. Dnsmasq
>>> has various functions which change the contents of a packet being
>>> returned, and these can't reliably be applied to a truncated packet, so
>>> data in a truncated packet may (for instance) disclose DNS data which
>>> should be blocked.
>>>
>>> The patch is, in any case, broken because it gratuitously removes the
>>> call to the logging code.
>>>
>>>
>>> Cheers,
>>>
>>> Simon.
>>>
>>> On 24/09/2024 11:01, Rahul Thakur via Dnsmasq-discuss wrote:
>>> > From: Rahul Thakur <rahul.thakur at iopsys.eu>
>>> >
>>> > the handling of truncated reponse is broken in 2.90. The answers
>>> > are removed before forwarding in case TC bit is set, which
>>> > seems incorrect.
>>> >
>>> > test details-
>>> > the regression was caught by a CDrouter run and this change fixes
>>> > the regression.
>>> > ---
>>> > src/forward.c | 7 -------
>>> > 1 file changed, 7 deletions(-)
>>> >
>>> > diff --git a/src/forward.c b/src/forward.c
>>> > index 10e7496..c893d84 100644
>>> > --- a/src/forward.c
>>> > +++ b/src/forward.c
>>> > @@ -782,13 +782,6 @@ static size_t process_reply(struct dns_header
>>> *header, time_t now, struct server
>>> > server->flags |= SERV_WARNED_RECURSIVE;
>>> > }
>>> >
>>> > - if (header->hb3 & HB3_TC)
>>> > - {
>>> > - log_query(F_UPSTREAM, NULL, NULL, "truncated", 0);
>>> > - header->ancount = htons(0);
>>> > - header->nscount = htons(0);
>>> > - header->arcount = htons(0);
>>> > - }
>>> >
>>> > if (!(header->hb3 & HB3_TC) && (!bogusanswer || (header->hb4 &
>>> HB4_CD)))
>>> > {
>>>
>> --
>> Petr Menšík
>> Software Engineer, RHEL
>> Red Hat,https://www.redhat.com/
>> PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
>>
>> _______________________________________________
>> Dnsmasq-discuss mailing list
>> Dnsmasq-discuss at lists.thekelleys.org.uk
>> https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
>
> _______________________________________________
> Dnsmasq-discuss mailing list
> Dnsmasq-discuss at lists.thekelleys.org.uk
> https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
More information about the Dnsmasq-discuss
mailing list