[Dnsmasq-discuss] [PATCH 1/1] forward.c: fix handling of truncated response

Simon Kelley simon at thekelleys.org.uk
Wed Nov 20 23:45:52 UTC 2024



On 10/2/24 17:00, Dominik Derigs via Dnsmasq-discuss wrote:
> Hey all,
> 
> 
> having this configurable makes sense particularly thinking about the 
> multitude of IoT devices often receiving not all that much attention 
> from their manufacturers. A particular example is the camera Tapo C-310 
> which seems to be widely used and has often been reported in the Pi-hole 
> forums to cause a huge amount of queries due to exactly this truncation 
> method we are talking about here (when dnsmasq is used with DNSSEC, the 
> query A time.nist.gov results in truncation).
> 
> 
> Said Tapo camera is not behaving nicely and does not retry over TCP. 
> Instead, it retries the same query roughly one second later, resulting 
> in a whooping 86,400 identical queries, all being truncated. A 
> workaround is querying this domain periodically manually over TCP but it 
> is a rather ugly workaround.
> 
> 
> If there would be an option to allow truncated content to remain, I 
> would indeed find it very useful for situations like the one mentioned 
> above. I confirmed that the camera could successfully reads the returned 
> CNAME/A record. This is what I got when simply applying the submitted 
> patch (but keeping the logging):
> 
> 
> Oct  2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 query[A] 
> time.nist.gov from 127.0.0.1
> Oct  2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 forwarded 
> time.nist.gov to 127.0.0.1#5335
> Oct  2 17:46:53 dnsmasq[1861633]: 26 dnssec-query[DS] gov to 127.0.0.1#5335
> Oct  2 17:46:53 dnsmasq[1861633]: 26 reply gov is DS for keytag 2536, 
> algo 13, digest 2
> Oct  2 17:46:53 dnsmasq[1861633]: 27 dnssec-query[DS] nist.gov to 
> 127.0.0.1#5335
> Oct  2 17:46:53 dnsmasq[1861633]: 28 dnssec-query[DNSKEY] gov to 
> 127.0.0.1#5335
> Oct  2 17:46:53 dnsmasq[1861633]: 28 reply gov is DNSKEY keytag 2536, 
> algo 13
> Oct  2 17:46:53 dnsmasq[1861633]: 28 reply gov is DNSKEY keytag 35496, 
> algo 13
> Oct  2 17:46:53 dnsmasq[1861633]: 27 reply nist.gov is DS for keytag 
> 33751, algo 8, digest 2
> Oct  2 17:46:53 dnsmasq[1861633]: 29 dnssec-query[DNSKEY] nist.gov to 
> 127.0.0.1#5335
> Oct  2 17:46:53 dnsmasq[1861633]: 29 reply nist.gov is DNSKEY keytag 
> 18303, algo 8
> Oct  2 17:46:53 dnsmasq[1861633]: 29 reply nist.gov is DNSKEY keytag 
> 33751, algo 8
> Oct  2 17:46:53 dnsmasq[1861633]: 30 dnssec-query[DS] glb.nist.gov to 
> 127.0.0.1#5335
> Oct  2 17:46:53 dnsmasq[1861633]: 30 reply glb.nist.gov is DS for keytag 
> 56235, algo 7, digest 1
> Oct  2 17:46:53 dnsmasq[1861633]: 30 reply glb.nist.gov is DS for keytag 
> 4395, algo 7, digest 1
> Oct  2 17:46:53 dnsmasq[1861633]: 31 dnssec-query[DNSKEY] glb.nist.gov 
> to 127.0.0.1#5335
> Oct  2 17:46:53 dnsmasq[1861633]: 31 reply glb.nist.gov is truncated[DNSKEY]
> Oct  2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 validation result 
> is TRUNCATED
> Oct  2 17:46:53 dnsmasq[1861633]: 25 127.0.0.1/45027 reply is truncated
> 
> 
> For testing, "dig +ignore + notcp time.nist.gov" is handy as it is 
> guaranteed to be truncated with dnsmasq and DNSSEC enabled if the domain 
> isn't already in your local cache through a preceding TCP query.
> 
> 
> Best,
> 
> Dominik
> 

This sent my down a very deep rabbit-hole. In case people haven't 
realised, the answer to the time.nist.gov query is easily small enough 
to fit in a UDP packet. What's happening here is that one of the DNSSEC 
key queries needed to validate the time.nist.gov query is too long to 
fit in a UDP packet. Dnsmasq can validate a DNSSEC query either in UDP 
mode or TCP mode but not a mixture, so as soon  as it gets a truncated 
reply to the DNSKEY query, it abandons the validation and returns a 
truncated repsonse to the original query in the expectation that the 
client will send the query again over TCP. That will allow dnsmasq to 
validate it and eventually return a reply. That reply would be easily 
small enough to fit in UDP packet.

There are some other cases were the same thing can be triggered, for 
instance if the answer to the original query exceeds a UDP packet size 
when RRSIGs are added to it, but the answer without RRSIGS fits in a UDP 
packet. That also causes dnsmasq to add it's own TRUNCATED bit to the 
reply in the expectation that the client will retry (or break horribly, 
in the example above.)

I now have code which avoid this hack: dnsmasq can swap from the UDP 
code path to the TCP code path to get the answers to intermediate 
questions or a DNSSEC-decorated reply to the original question, and as 
long as the final answer fits in a UDP packet, it won't ever return a 
truncated reply.

This code has taken some time to get right and to test, but expect an 
upload in the next day or two for further testing purposes.

On the same theme of accommodating broken clients, and because dnsmasq 
will not longer be artificially truncating answers, expect that code 
dump to include Rahul's patch as well.


Cheers,

Simon.
> 
> Am 10/2/24 um 12:30 AM schrieb Petr Menšík:
>>
>> I think Simon has pointed out this is intentional. Partial reply is 
>> incomplete and for well behaving clients carries not useful 
>> information. It would use TCP anyway instead, therefore it adds just 
>> additional data.
>>
>>
>> I would consider clients not falling back to TCP as broken. TCP is not 
>> considered optional nowadays. Attempts to fix such clients by relying 
>> on incomplete responses instead are wrong.
>>
>>
>> It might make sense to allow enabling such behaviour by configuration, 
>> if you have broken software not able to workaround. But I would insist 
>> that software is broken and therefore sending incomplete responses 
>> should be only workaround for them, not something behaved by default.
>>
>>
>> Is that something using Alpine C library?
>>
>>
>> Cheers,
>> Petr
>>
>>
>> On 30/09/2024 06:39, Rahul Thakur via Dnsmasq-discuss wrote:
>>> Hi Simon,
>>>
>>> So what do you think of my reasoning for this patch? Do you agree?
>>>
>>> Best regards,
>>> Rahul Thakur
>>> ------------------------------------------------------------------------
>>> *From:* Rahul Thakur <rahul.thakur at genexis.eu>
>>> *Sent:* 25 September 2024 15:29
>>> *To:* Simon Kelley <simon at thekelleys.org.uk>; dnsmasq- 
>>> discuss at lists.thekelleys.org.uk <dnsmasq-discuss at lists.thekelleys.org.uk>
>>> *Subject:* Re: [Dnsmasq-discuss] [PATCH 1/1] forward.c: fix handling 
>>> of truncated response
>>> Hi Simon,
>>>
>>> Thanks for responding to this patch, please find my justification for 
>>> this patch as follows:
>>>
>>> I think rfc 2181 is defining the behaviour for DNS server and not DNS 
>>> proxy.
>>>
>>> I am relying on and referring to rfc 5625 while making this change.
>>>
>>> In section 4.4 (https://datatracker.ietf.org/doc/html/ 
>>> rfc5625#section-4.4 <https://datatracker.ietf.org/doc/html/ 
>>> rfc5625#section-4.4>), the rfc 5625 states,
>>>
>>>    If a proxy must unilaterally truncate a response, then the proxy MUST
>>>    set the TC bit.  Similarly, proxies MUST NOT remove the TC bit from
>>>    responses.
>>>
>>> Dnsmasq is ofcourse complying to this behaviour and not meddling with 
>>> the TC bit while setting the answers to 0. But, if I read further 
>>> section 4.4.1 (https://datatracker.ietf.org/doc/html/ 
>>> rfc5625#section-4.4.1 <https://datatracker.ietf.org/doc/html/ 
>>> rfc5625#section-4.4.1>),
>>>
>>>  Whilst TCP transport is not strictly mandatory, it is supported by
>>>    the vast majority of stub resolvers and recursive servers.
>>>
>>> So, this indicates that it is not mandatory that the client ignores 
>>> this truncated response. This is further supported by section 6.1.3.2 
>>> of rfc 1123 (https://datatracker.ietf.org/doc/html/ 
>>> rfc1123#section-6.1.3.2). In paragraph 3 of the DISCUSSION in section 
>>> 6.1.3.2, it states,
>>>
>>>                  Whether it is possible to use a truncated answer
>>>                  depends on the application.
>>>
>>> Hence, when dnsmasq explicitly deletes the answers, then it deprives 
>>> clients that do not fallback to TCP and are happy with the truncated 
>>> response to be able to resolve their queries.
>> Fix such client to fall back to TCP instead of making dnsmasq to 
>> provide it. Or ensure it uses EDNS0 with buffer big enough to receive 
>> whole message.
>>>
>>> To me, it sounds like a better strategy to forward the truncated 
>>> response as is to the client and let the client decide what it wants 
>>> to do rather than forcefully dropping the answers.
>>>
>>> Best regards,
>>> Rahul Thakur
>>>
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Dnsmasq-discuss <dnsmasq-discuss- 
>>> bounces at lists.thekelleys.org.uk> on behalf of Simon Kelley 
>>> <simon at thekelleys.org.uk>
>>> *Sent:* 25 September 2024 13:39
>>> *To:* dnsmasq-discuss at lists.thekelleys.org.uk <dnsmasq- 
>>> discuss at lists.thekelleys.org.uk>
>>> *Subject:* Re: [Dnsmasq-discuss] [PATCH 1/1] forward.c: fix handling 
>>> of truncated response
>>> I think that this is legitimate behaviour. RFC 2181 para 9 says
>>>
>>>     Where TC is set, the partial RRSet that would not completely fit may
>>>     be left in the response.  When a DNS client receives a reply with TC
>>>     set, it should ignore that response, and query again, using a
>>>     mechanism, such as a TCP connection, that will permit larger replies.
>>>
>>> Which means the contents (or lack of them) of the answer, auth and
>>> additional sections has to be ignored by the client anyway.
>>>
>>> Do you have a standards reference which says otherwise? Test suites can
>>> tell you either that behaviour has changed over releases or that
>>> behaviour differs from other implementations. They cant tell you that
>>> behaviour is correct.
>>>
>>> There is a subtle reason for the code being as it is. Dnsmasq
>>> has various functions which change the contents of a packet being
>>> returned, and these can't reliably be applied to a truncated packet, so
>>> data in a truncated packet may (for instance) disclose DNS data which
>>> should be blocked.
>>>
>>> The patch is, in any case, broken because it gratuitously removes the
>>> call to the logging code.
>>>
>>>
>>> Cheers,
>>>
>>> Simon.
>>>
>>> On 24/09/2024 11:01, Rahul Thakur via Dnsmasq-discuss wrote:
>>> > From: Rahul Thakur <rahul.thakur at iopsys.eu>
>>> >
>>> > the handling of truncated reponse is broken in 2.90. The answers
>>> > are removed before forwarding in case TC bit is set, which
>>> > seems incorrect.
>>> >
>>> > test details-
>>> > the regression was caught by a CDrouter run and this change fixes
>>> > the regression.
>>> > ---
>>> >   src/forward.c | 7 -------
>>> >   1 file changed, 7 deletions(-)
>>> >
>>> > diff --git a/src/forward.c b/src/forward.c
>>> > index 10e7496..c893d84 100644
>>> > --- a/src/forward.c
>>> > +++ b/src/forward.c
>>> > @@ -782,13 +782,6 @@ static size_t process_reply(struct dns_header 
>>> *header, time_t now, struct server
>>> >        server->flags |= SERV_WARNED_RECURSIVE;
>>> >       }
>>> >
>>> > -  if (header->hb3 & HB3_TC)
>>> > -    {
>>> > -      log_query(F_UPSTREAM, NULL, NULL, "truncated", 0);
>>> > -      header->ancount = htons(0);
>>> > -      header->nscount = htons(0);
>>> > -      header->arcount = htons(0);
>>> > -    }
>>> >
>>> >     if  (!(header->hb3 & HB3_TC) && (!bogusanswer || (header->hb4 & 
>>> HB4_CD)))
>>> >       {
>>>
>> -- 
>> Petr Menšík
>> Software Engineer, RHEL
>> Red Hat,https://www.redhat.com/
>> PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
>>
>> _______________________________________________
>> Dnsmasq-discuss mailing list
>> Dnsmasq-discuss at lists.thekelleys.org.uk
>> https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
> 
> _______________________________________________
> Dnsmasq-discuss mailing list
> Dnsmasq-discuss at lists.thekelleys.org.uk
> https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss




More information about the Dnsmasq-discuss mailing list