<div dir="ltr"><div>Thanks, Petr, for having a look at this.</div><div>Since IDN processing turns uppercase letters into lowercase, I deliberately left uppercase letters out of the set of characters that bypass it.</div><div>I think your approach of putting everything in check_name() makes sense, even if that function is growing and starting to become a bit hard to read.</div><div><br></div><div>I took your patch and made a couple of changes:</div><div>- IDN processing is performed if there are uppercase letters present in the name (unless we have an old version of libidn2 and there is an underscore in the name).</div><div>- IDN processing is always performed if there are non-ASCII characters in the name, whether or not there are underscores (passing through an unprocessed name containing non-ASCII characters sounds dangerous).</div><div><br></div><div>The blacklist is about 230000 lines long, and I agree that it would make sense to also file a bug against libidn2.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 8 Sep 2021 at 15:25, Petr Menšík <<a href="mailto:pemensik@redhat.com">pemensik@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>I think your check should also accept uppercase ASCII letters.
Anyway, a similar check is already done in check_names, which is
there to skip names containing an underscore with older libidn2
versions. I guess it could also return 2 in case ascii-only
characters were detected, instead of checking the name again in
another loop.</p>
<p>Attached is an alternative change, which would run IDN processing
only on names that are not ASCII-only. It changes check_names to
return 2 when IDN should be used. Printing ASCII names should be
safe, even when they contain characters not allowed in hostnames,
such as _, +, = or whatever garbage is present. As long as it is
readable in logs, it should not matter.</p>
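<p>A minimal sketch of how a name check could signal through its return value that IDN processing is needed. The function name classify_name and the two constants are hypothetical and chosen for illustration only; they are not taken from the attached patch or from dnsmasq's own check function.</p>

```c
#include <stddef.h>

/* Hypothetical return codes for this sketch only;
   the values used inside dnsmasq may differ. */
enum { NAME_ASCII_ONLY = 1, NAME_NEEDS_IDN = 2 };

/* Scan the name once. Any byte with the high bit set is non-ASCII,
   so the caller should hand the name to the IDN encoder. ASCII-only
   names, including ones with '_', '+' or '=', are passed through. */
int classify_name(const char *name)
{
  const unsigned char *p;

  for (p = (const unsigned char *)name; *p != '\0'; p++)
    if (*p >= 0x80)
      return NAME_NEEDS_IDN;

  return NAME_ASCII_ONLY;
}
```

<p>A caller would then invoke the IDN conversion only when the result is NAME_NEEDS_IDN, so the name is scanned in a single loop.</p>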
<p>How many lines does your dnsmasq.blacklist.txt contain? Those
differences are significant. Maybe a bug should be filed against
libidn2. Conversion from an ASCII-only name to an ASCII name should
not take too long even if it was called.<br>
</p>
<div>On 9/6/21 3:27 PM, Gustaf Ullberg
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Simon and dnsmasq contributors,<br>
<br>
I am running dnsmasq with a blocklist from<br>
<a href="https://github.com/notracking/hosts-blocklists/blob/master/dnsmasq/dnsmasq.blacklist.txt" target="_blank">https://github.com/notracking/hosts-blocklists/blob/master/dnsmasq/dnsmasq.blacklist.txt</a><br>
<br>
I have noticed that building dnsmasq with libidn2 support (which
my distro does) can cause extreme slowdowns. The slowdowns seem
to come from the call to idn2_to_ascii_lz in canonicalise()
being very slow.<br>
<br>
idn2_to_ascii_lz is run on every domain name in the blocklist to
encode special characters, and this is surprisingly slow even
when there are no special characters. I developed a patch
(attached to this email) that checks a domain name for
characters other than . - a-z 0-9. If any such character is found, the
domain name will be encoded. If no such character is found, the
domain name will not be encoded (as encoding won't change it).
This removes most of the overhead of using libidn2. Unless you
find any problems with this approach, I hope the patch can be
mainlined.<br>
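<br>
The check described above could look roughly like the following sketch. The helper name name_needs_idn is hypothetical and this is not the code from the attached patch; it only illustrates skipping the encoder when every character is one of . - a-z 0-9.<br>
<br>

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Returns 0 when every character is one of ". - a-z 0-9", in which
   case IDN encoding would not change the name and can be skipped.
   Returns 1 as soon as any other character is found, including
   uppercase letters, underscores and non-ASCII bytes. */
int name_needs_idn(const char *name)
{
  const unsigned char *p;

  for (p = (const unsigned char *)name; *p != '\0'; p++)
    {
      unsigned char c = *p;

      if (c == '.' || c == '-' ||
          (c >= 'a' && c <= 'z') ||
          (c >= '0' && c <= '9'))
        continue;

      return 1;
    }

  return 0;
}
```

<br>
The expensive conversion would then run only for names where name_needs_idn() returns 1; all other names are used unchanged.<br>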
<br>
Some benchmarks on a Raspberry Pi (slow, but probably not an
uncommon device for running dnsmasq) running ArchLinux and
dnsmasq git master:<br>
<br>
# Without libidn2: Acceptable speed<br>
> make<br>
> time ./src/dnsmasq -C dnsmasq.blacklist.txt --test<br>
dnsmasq: syntax check OK.<br>
<br>
real 0m3.699s<br>
user 0m3.468s<br>
sys 0m0.200s<br>
<br>
<br>
<br>
# With libidn2: Too slow to be usable<br>
> make COPTS="-DHAVE_LIBIDN2"<br>
> time ./src/dnsmasq -C dnsmasq.blacklist.txt --test<br>
dnsmasq: syntax check OK.<br>
<br>
real 1m6.921s<br>
user 0m59.509s<br>
sys 0m0.606s<br>
<br>
<br>
# With libidn2 and attached patch: Back to acceptable speed<br>
> git am 0001-Avoid-IDN-translations-when-not-needed.patch<br>
> make COPTS="-DHAVE_LIBIDN2"<br>
> time ./src/dnsmasq -C dnsmasq.blacklist.txt --test<br>
dnsmasq: syntax check OK.<br>
<br>
real 0m3.903s<br>
user 0m3.643s<br>
sys 0m0.219s<br>
<div><br>
</div>
<div>Best regards,</div>
<div>Gustaf</div>
</div>
<br>
<pre>_______________________________________________
Dnsmasq-discuss mailing list
<a href="mailto:Dnsmasq-discuss@lists.thekelleys.org.uk" target="_blank">Dnsmasq-discuss@lists.thekelleys.org.uk</a>
<a href="https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss" target="_blank">https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss</a>
</pre>
</blockquote>
<pre cols="72">--
Petr Menšík
Software Engineer
Red Hat, <a href="http://www.redhat.com/" target="_blank">http://www.redhat.com/</a>
email: <a href="mailto:pemensik@redhat.com" target="_blank">pemensik@redhat.com</a>
PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB</pre>
</div>
</blockquote></div>