[Dnsmasq-discuss] Dnsmasq with Gigantic hosts file

Jan 'RedBully' Seiffert redbully at cc.fh-luh.de
Tue Jan 30 15:36:34 GMT 2007


Simon Kelley wrote:
> Jan 'RedBully' Seiffert wrote:
> 
>> While it is common that most of these entries will point to one address
>> (be it 127.0.0.1 or a local LAN machine) and F_REVERSE not set for them,
>> it's a pity it will still slow down reverse lookups...
> 
> Reverse lookups currently iterate over every cache entry, including (in
> the case) the 700k from /etc/hosts which don't have F_REVERSE set. It
> will ignore them, of course.
> 
> Actually the observation that most of these entries don't set F_REVERSE
> is a valuable insight; it's also case that they will have the F_IMMORTAL
> bit set.
> 
At least for the "common" 10k+ entrys from /etc/hosts case. Maybe a
large installation can also generate a lot of reverse entries, but that
should be more in the 100'th, which a server should be able to search.
Everything else could be simply over engineering.

> This allows a very simple optimisation: just put the reverse entries at
> the start of the hash-chain. Reverse lookups still need to check every
> hash chain, but only the first few entries which have F_REVERSE set, all
> the thousands of entries from /etc/hosts further down the list can just
> be skipped.
> 
> The IMMORTAL bit allows a similar optimisation for garbage-collection
> (the cache_scan_free() function.) IMMORTAL entries never need to be
> garbage collected, so  by putting them at the _end_ of the hash chain,
> iterating over every entry to find expired ones is no longer impacted by
> having lots of /etc/hosts names.
> 
Hmmmm, OK, thats a impressive simple way to go.
This (re)sorting (last hit gets placed in front of chain?) and the
garbage-collection are things which scared me away not to touch the
complete rest of the code ;)

> I've not tested this yet, but I think the following patch should be enough.
[snip - patch]

But doesn't this sorting clash with the resorting of cache_find_by_name
(according to line 487)?
Is it necessary to traverse the lists to the end of the "region" on insert?

I will also test it, time to update to the last version...

>> Maybe the following patch will help:
>> Maintain a tree for reverse lookups.
>> Patch is compile tested only, i don't know if this will work, I surely
>> dropped the ball somewhere. Also because the cache logic 'as is' is
>> quite complex IMHO (but that's maybe just my brain always needing a
>> moment wrapping around hash tables when looking into the impl.).
> 
> That code is clearly in the spirit of the existing cache code, which
> scares me whenever I look at it!

I only tried to make it least intrusive because the lack of in deep
knowledge of the original cache code (and how every little bit works
together in every corner case). Also this way you can switch off the
tree to save like 2k RAM for memory constrained system like the OpenWRT.
Thats maybe not the optimal approach (and may not work this way) but as
i said, i tried to make it least intrusive, and from my look over the
code, these places looked like the best place to hook in.

> I'll test mine first, just because it's
> simpler and I (think I) understand it. If it's no good I'll work on
> understanding yours and go with that.
> 
No problem, mine is more like a sketch how this could be made. I thought
in terms of "talk is cheap, show me the Code".
And even if you do not take it, I could use it in a project of mine
where i also need to map back from IP addresses to "objects" for UDP
packets (but I need to worry on concurrent access, coding on dnsmasq is
mind relaxing in this regard ;)

> Cheers,
> 
> Simon.
> 
> 
Greetings
	Jan

-- 
Live proud, live free - code in C



More information about the Dnsmasq-discuss mailing list