[Dnsmasq-discuss] Dnsmasq with Gigantic hosts file

Simon Kelley simon at thekelleys.org.uk
Wed Jan 11 20:24:27 GMT 2012


On 11/01/12 18:44, Jan Seiffert wrote:
> 2012/1/11 Simon Kelley <simon at thekelleys.org.uk>:
>> On 11/01/12 18:15, Jan Seiffert wrote:
>>> 
>>> 2012/1/11 Simon Kelley<simon at thekelleys.org.uk>:
> [snip]
>>>> 
>>>> Try commenting out the code around line 650 in cache.c which
>>>> starts with a big comment block explaining the tweak I mention
>>>> above, starting
>>>> 
>>>> /* Ensure there is only one address ->  name mapping (first one
>>>> trumps)
>>>> 
>>>> If it goes fast then I've guessed right. Assuming I have, there
>>>> are various solutions: add hashing for by-address cache
>>>> lookups,
>>> 
>>> 
>>> Should i refresh my reverse tree code? Nice thing was it was so
>>> unintrusive, so small device could disable it,
>> 
>> 
>> Remind me: I remember the regex patch, but not that one.
>> 
> 
> The patch in: 
> http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2007q1/001120.html
>
> 
> Or should i attach it again?

Got it, thanks.
> 
>> I've thought about this a bit more, and I think it's pretty easy to
>> hash the addresses only during the process of reading hostsfiles
>> with basically no extra resource: There is a pointer field in cache
>> entries which is unused at that time, and could be used to hold an
>> open-hash chain. All that's needed is an array of pointers, one per
>> hash bucket, which can be freed once files are read.
>> 
> 
> *mumble, mumble* While i envy your genius, and am intrigued by the
> nifty trick, this is on the verge of ... insanity?
> 
We eat insanity for breakfast round here.

> Ok, ok, a reverse tree is "complicated", but would put the whole 
> reverse thing to rest, also during runtime.

Tree is good, because it works for lookup too. Downside is extra memory
use, and malloc/free during cache operation: dnsmasq at the moment is
written so that it never calls malloc/free during DNS operations - saves
memory fragmentation, especially in MMU-less systems (Do they still exist?)



> I mean now Preston seems to be slowed down during read in, but i
> guess later reverse lookups will also not be fast due to the sheer
> number of unique IP/hosts.
> 

Hash-during-read doesn't cost extra memory and doesn't do malloc, but it
won't speed up reverse operations. This isn't normally a problem, even
for gigantic hosts files, since the lookup cost is limited by the number
of _reverse_ entries in the cache. For ad-blocking gigantic hosts files,
almost none of the entries are reverse, so no problem. For Preston's
workload it may be more of a problem, but reverse lookups are much less
frequent than forward ones on most systems, so it may still be OK.
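
A rough sketch of the hash-during-read idea, to make it concrete. This is
not the code in cache.c: struct entry, its "spare" pointer field, and the
load_hash_* names are all made up for illustration, and it only handles
IPv4 addresses. It assumes each entry has a pointer that is idle while
hosts files are being read, and uses it as the chain link.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified cache entry; NOT dnsmasq's real struct. */
struct entry {
    uint32_t addr;        /* IPv4 address, host byte order */
    const char *name;
    int reverse;          /* 1 if this entry answers address -> name */
    struct entry *spare;  /* assumed idle while hosts files are read */
};

/* Temporary table: one pointer per bucket, freed after loading. */
struct load_hash {
    struct entry **buckets;
    size_t nbuckets;
};

static int load_hash_init(struct load_hash *h, size_t nbuckets)
{
    assert(nbuckets > 0);
    h->buckets = calloc(nbuckets, sizeof *h->buckets);
    h->nbuckets = nbuckets;
    return h->buckets ? 0 : -1;
}

static size_t bucket_of(const struct load_hash *h, uint32_t addr)
{
    /* Multiplicative hash; any reasonable mixing function would do. */
    return (size_t)((addr * 2654435761u) % h->nbuckets);
}

/* Record one entry read from a hosts file. The first name seen for an
   address keeps the reverse mapping ("first one trumps"); later names
   for the same address become forward-only and are not chained. */
static void load_hash_add(struct load_hash *h, struct entry *e)
{
    size_t b = bucket_of(h, e->addr);
    struct entry *p;

    for (p = h->buckets[b]; p; p = p->spare)
        if (p->addr == e->addr) {
            e->reverse = 0;
            return;
        }

    e->reverse = 1;
    e->spare = h->buckets[b];  /* chain through the borrowed pointer */
    h->buckets[b] = e;
}

/* Once all files are read, only the bucket array needs freeing; the
   borrowed pointer field can go back to its normal use. */
static void load_hash_free(struct load_hash *h)
{
    free(h->buckets);
    h->buckets = NULL;
    h->nbuckets = 0;
}
```

The only allocation is the bucket array, made once before reading and
released afterwards, so nothing here runs during DNS operations. The
duplicate-address check becomes a walk down one short chain instead of a
scan over every entry loaded so far.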

Simon.


