[Dnsmasq-discuss] Dnsmasq with Gigantic hosts file
Simon Kelley
simon at thekelleys.org.uk
Tue Jan 30 13:33:12 GMT 2007
Jan 'RedBully' Seiffert wrote:
> While it is common that most of these entries will point to one address
> (be it 127.0.0.1 or a local LAN machine) and F_REVERSE not set for them,
> it's a pity it will still slow down reverse lookups...
Reverse lookups currently iterate over every cache entry, including (in
the case) the 700k from /etc/hosts which don't have F_REVERSE set. It
will ignore them, of course.
Actually the observation that most of these entries don't set F_REVERSE
is a valuable insight; it's also case that they will have the F_IMMORTAL
bit set.
This allows a very simple optimisation: just put the reverse entries at
the start of the hash-chain. Reverse lookups still need to check every
hash chain, but only the first few entries which have F_REVERSE set, all
the thousands of entries from /etc/hosts further down the list can just
be skipped.
The IMMORTAL bit allows a similar optimisation for garbage-collection
(the cache_scan_free() function.) IMMORTAL entries never need to be
garbage collected, so by putting them at the _end_ of the hash chain,
iterating over every entry to find expired ones is no longer impacted by
having lots of /etc/hosts names.
I've not tested this yet, but I think the following patch should be enough.
--- dnsmasq-2.36/dnsmasq-2.36/src/cache.c 2006-12-30
13:44:08.000000000 +0000
+++ dnsmasq-2.37/dnsmasq-2.37/src/cache.c 2007-01-30
12:28:49.000000000 +0000
@@ -157,9 +157,24 @@
static void cache_hash(struct crec *crecp)
{
- struct crec **bucket = hash_bucket(cache_get_name(crecp));
- crecp->hash_next = *bucket;
- *bucket = crecp;
+ /* maintain an invariant that all entries with F_REVERSE set
+ are at the start of the hash-chain and all non-reverse
+ immortal entries are at the end of the hash-chain.
+ This allows reverse searches and garbage collection to be optimised */
+
+ struct crec **up = hash_bucket(cache_get_name(crecp));
+
+ if (!(crecp->flags & F_REVERSE))
+ {
+ while (*up && ((*up)->flags & F_REVERSE))
+ up = &((*up)->hash_next);
+
+ if (crecp->flags & F_IMMORTAL)
+ while (*up && (!(*up)->flags & F_IMMORTAL))
+ up = &((*up)->hash_next);
+ }
+ crecp->hash_next = *up;
+ *up = crecp;
}
static void cache_free(struct crec *crecp)
@@ -295,8 +310,13 @@
#else
int addrlen = INADDRSZ;
#endif
+ /* take advantage of the fact that has chains have stuff in the
+ order <reverse>,<other>,<immortal> so that when we hit an entry
+ which isn't reverse and is immortal, we're done. */
for (i = 0; i < hash_size; i++)
- for (crecp = hash_table[i], up = &hash_table[i]; crecp; crecp =
crecp->hash_next)
+ for (crecp = hash_table[i], up = &hash_table[i];
+ crecp && ((crecp->flags & F_REVERSE) || !(crecp->flags &
F_IMMORTAL));
+ crecp = crecp->hash_next)
if (is_expired(now, crecp))
{
*up = crecp->hash_next;
@@ -567,12 +587,16 @@
else
{
/* first search, look for relevant entries and push to top of list
- also free anything which has expired */
+ also free anything which has expired. All the reverse entries
are at the
+ start of the hash chain, so we can give up when we find the first
+ non-REVERSE one. */
int i;
struct crec **up, **chainp = &ans;
- for(i=0; i<hash_size; i++)
- for (crecp = hash_table[i], up = &hash_table[i]; crecp; crecp =
crecp->hash_next)
+ for (i=0; i<hash_size; i++)
+ for (crecp = hash_table[i], up = &hash_table[i];
+ crecp && (crecp->flags & F_REVERSE);
+ crecp = crecp->hash_next)
if (!is_expired(now, crecp))
{
if ((crecp->flags & F_REVERSE) &&
>
> Maybe the following patch will help:
> Maintain a tree for reverse lookups.
> Patch is compile tested only, i don't know if this will work, I surely
> dropped the ball somewhere. Also because the cache logic 'as is' is
> quite complex IMHO (but that's maybe just my brain always needing a
> moment wrapping around hash tables when looking into the impl.).
That code is clearly in the spirit of the existing cache code, which
scares me whenever I look at it! I'll test mine first, just because it's
simpler and I (think I) understand it. If it's no good I'll work on
understanding yours and go with that.
Cheers,
Simon.
More information about the Dnsmasq-discuss
mailing list