[Dnsmasq-discuss] Dnsmasq with Gigantic hosts file

Wed Jan 11 18:26:05 GMT 2012

On 11/01/12 18:15, Jan Seiffert wrote:
> 2012/1/11 Simon Kelley<simon at thekelleys.org.uk>:
> [snip]
>> To implement this, as each line is read in a host file, the address is
>> looked up in the cache to determine if it's been seen before. As the cache
>> is NOT hashed for by-address lookups, this is an O(n) operation, and reading
>> a hostfile is O(n^2) worst-case.
>>
>
> Ouch!
>
> [snip]
>> Try commenting out the code around line 650 in cache.c which starts with a
>> big comment block explaining the tweak I mention above, starting
>>
>> /* Ensure there is only one address ->  name mapping (first one trumps)
>>
>> If goes fast then I've guessed right. Assuming I have, there are various
>> solutions: add hashing for by-adddress cache lookups,
>
> Should i refresh my reverse tree code?
> Nice thing was it was so unintrusive, so small device could disable it,

Remind me: I remember the regex patch, but not that one.

I've thought about this a bit more, and I think it's pretty easy to hash 
the addresses only during the process of reading hostsfiles with 
basically no extra resource: There is a pointer field in cache entries 
which is unused at that time, and could be used to hold an open-hash 
chain. All that's needed is an array of pointers, one per hash bucket, 
which can be freed once files are read.

>
> Preston, which version are you running, or can you run the latest version?

and a further question, can you send me a copy of our hosts file? It's 
easy to benchmark accurately if we have real data.

Cheers,

Simon.