<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hello All,<div><br></div><div>We use dnsmasq as a very effective replacement for the ISC software, thus far with great success. However, we have run into occasional performance problems at scale.</div><div><br></div><div>These manifest as a general slowdown of the request-to-reply cycle, which can sometimes exceed 90 seconds.</div><div><br></div><div>The slowdown only hurts when there is a penalty for missing a request, e.g. when we boot an entire rack of servers. In their default configuration, many servers will not retry the boot sequence, forcing us to detect hung machines and issue a remote reboot. We often have to do this several times when building more than a few hundred servers in a single batch, which we do frequently.</div><div><br></div><div>Here are the symptoms:</div><div><br></div><div>1. We never have any issue, even with mass installation, as long as the configuration contains fewer than a few thousand static entries and a few hundred subnets.</div><div><br></div><div>2. 
At some point, DHCP slows down. This capture was taken with 7383 static host entries defined; the reply to XID 2472c541 arrives about 108 seconds after the request:</div><div><br></div><div><div><font class="Apple-style-span" face="'Courier New'">---------------------------------------------------------------------------</font></div><div><font class="Apple-style-span" face="'Courier New'"> TIME: 19:43:47.480189</font></div><div><font class="Apple-style-span" face="'Courier New'"> IP: > (00:01:e8:92:fd:41) > (00:21:9b:a2:e4:47)</font></div><div><font class="Apple-style-span" face="'Courier New'"> OP: 1 (BOOTPREQUEST)</font></div><div><font class="Apple-style-span" face="'Courier New'"> HTYPE: 1 (Ethernet)</font></div><div><font class="Apple-style-span" face="'Courier New'"> HLEN: 6</font></div><div><font class="Apple-style-span" face="'Courier New'"> HOPS: 1</font></div><div><font class="Apple-style-span" face="'Courier New'"> XID: 2472c541</font></div></div><div><font class="Apple-style-span" face="'Courier New'"><br></font></div><div><font class="Apple-style-span" face="'Courier New'">&lt;SNIP&gt;</font></div><div><div><font class="Apple-style-span" face="'Courier New'"><br></font></div><div><font class="Apple-style-span" face="'Courier New'">---------------------------------------------------------------------------</font></div><div><font class="Apple-style-span" face="'Courier New'"> TIME: 19:45:35.963209</font></div><div><font class="Apple-style-span" face="'Courier New'"> IP: > (00:21:9b:a2:e4:47) > (00:00:5e:00:01:8d)</font></div><div><font class="Apple-style-span" face="'Courier New'"> OP: 2 (BOOTPREPLY)</font></div><div><font class="Apple-style-span" face="'Courier New'"> HTYPE: 1 (Ethernet)</font></div><div><font class="Apple-style-span" face="'Courier New'"> HLEN: 6</font></div><div><font class="Apple-style-span" face="'Courier New'"> HOPS: 1</font></div><div><font class="Apple-style-span" face="'Courier New'"> XID: 2472c541</font></div></div><div><font class="Apple-style-span" face="'Courier New'"><br></font></div><div><font 
class="Apple-style-span" face="'Courier New'">&lt;SNIP&gt;</font></div><div><font class="Apple-style-span" face="'Courier New'"><br></font></div><div>3. If I reduce the number of static entries to, for example, 2408, the response time returns to sub-second.</div><div><br></div><div>General configuration notes:</div><div><br></div><div>This request is handled by a relay, and we have the following config options in play:</div><div><br></div><div><div>no-ping</div><div>no-hosts</div><div>no-resolv</div><div>cache-size=0</div><div>dhcp-lease-max=20000</div><div>dhcp-authoritative</div><div>conf-dir=/etc/dnsmasq.d</div><div>domain=sekret.zynga.com</div><div>port=0 </div></div><div><br></div><div>Fast:</div><div><br></div><div><div># ls -la /etc/dnsmasq.conf </div><div>-rw-r--r-- 1 root root 408222 Jun 14 20:20 /etc/dnsmasq.conf</div></div><div># grep dhcp-range /etc/dnsmasq.conf | wc</div><div><div> 80 80 6340</div><div># grep dhcp-host /etc/dnsmasq.conf | wc</div><div> 2408 2410 209798</div></div><div><br></div><div>Slow:</div><div><br></div><div><div><div># ls -la /etc/dnsmasq.conf </div><div>-rw-r--r-- 1 root root 1205132 Jun 14 20:31 /etc/dnsmasq.conf</div></div></div><div># grep dhcp-range /etc/dnsmasq.conf | wc</div><div><div> 80 80 6340</div><div># grep dhcp-host /etc/dnsmasq.conf | wc</div><div> 7383 7387 650118</div><div><br></div></div><div>Are there any obvious inflection points that would cause server performance to drop by a few orders of magnitude once many hosts are defined? Are there any recommendations for tuning, beyond introducing more dnsmasq servers?</div><div><br></div><div>-Mike</div></body></html>