[Dnsmasq-discuss] Dnsmasq on high load

Simon Kelley simon at thekelleys.org.uk
Wed Mar 11 21:49:28 GMT 2015


On 11/03/15 09:16, Анатолий Мулярский wrote:
> I've looked into the code. retry_send() is called multiple times
> from different places, BUT the static variable retries is not
> reset before each retry loop. There can be a situation where
> retries is already 999, so the first EAGAIN produces an error
> because retries reaches its upper limit of 1000.

A good point. The problem is that retry_send() isn't called when
sendto() or another syscall succeeds, so it can't reset retries.

By changing the boilerplate code from

while (sendto(....) == -1 && retry_send());

to

while (retry_send(sendto(....)));

we can make retry_send reset the retries variable when the syscall
succeeds.

int retry_send(ssize_t rc)
{
  if (rc != -1)
    {
      retries = 0;
      return 0;
    }

.... other code as before ....
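
As an illustration of the pattern above, here is a minimal, self-contained
sketch. It is not the code checked into the repo: the names
retry_send_sketch(), fake_send() and MAX_RETRIES are hypothetical, and the
1 ms sleep per retry is an assumption chosen so that 1000 retries add up to
roughly the one-second limit discussed in this thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() and ssize_t */
#endif

#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <time.h>

#define MAX_RETRIES 1000   /* assumed cap, from the "1000" mentioned above */

/* Returns 1 if the caller should retry the syscall, 0 if it should stop
   (either because the call succeeded or because we gave up). */
static int retry_send_sketch(ssize_t rc)
{
  static int retries = 0;
  struct timespec waiter;

  if (rc != -1)
    {
      /* Success: reset, so the next failing call gets a full retry budget. */
      retries = 0;
      return 0;
    }

  if (errno == EAGAIN || errno == EWOULDBLOCK)
    {
      if (retries++ < MAX_RETRIES)
        {
          waiter.tv_sec = 0;
          waiter.tv_nsec = 1000000;   /* 1 ms, an assumed interval */
          nanosleep(&waiter, NULL);
          return 1;
        }
    }

  /* Gave up on EAGAIN, or hit a different error: start fresh next time. */
  retries = 0;
  return errno == EINTR;   /* an interrupted call is always worth retrying */
}

/* Hypothetical stand-in for sendto(): fails with EAGAIN while
   *failures_left is positive, then succeeds. */
static ssize_t fake_send(int *failures_left)
{
  assert(failures_left != NULL);
  if (*failures_left > 0)
    {
      (*failures_left)--;
      errno = EAGAIN;
      return -1;
    }
  return 0;
}
```

With this shape, two consecutive loops of the form
while (retry_send_sketch(fake_send(&n))); that each see 600 EAGAIN failures
both run to completion, because the counter goes back to zero after the
first success. With a counter that is never reset on success, the second
loop would give up after 400 failures.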


I just checked in this change to the git repo (and some extra checks
on the return value of close(), for good measure.)

Cheers,

Simon.





> The solution I can propose is to call retry_send() with a
> parameter identifying the call site. Inside retry_send() we would
> compare that parameter with the one from the previous invocation
> and reset the retries counter if they differ. The second
> suggestion is to make the timeout value a configurable option. And
> the last: it would be helpful for the error message to include
> more information about the problem request, at minimum the client
> address and its request. Sorry if I'm asking too much :) But this
> does not solve my problem; it can only help identify the source of
> the problem.
> 
> 
> 2015-03-11 9:56 GMT+02:00, Анатолий Мулярский <tm1tvk at gmail.com>:
>> Thank you for the advice, I'll try it later.
>> 
>> 2015-03-10 19:31 GMT+02:00, Simon Kelley
>> <simon at thekelleys.org.uk>:
>>> On 10/03/15 15:15, Анатолий Мулярский wrote:
>>>> As far as I know, the error message means an EAGAIN error.
>>>> But what is the reason?
>>> 
>>> There's a recent change to dnsmasq, which limits it to waiting
>>> for one second for the EAGAIN error to go away.
>>> 
>>> See retry_send() in src/util.c
>>> 
>>> /* Linux kernels can return EAGAIN in perpetuity when calling 
>>> sendmsg() and the relevant interface has gone. Here we loop 
>>> retrying in EAGAIN for 1 second max, to avoid this hanging 
>>> dnsmasq. */
>>> 
>>> You might try tweaking the code below that to make it wait
>>> longer, or not have a timeout.
>>> 
>>> The reason for the EAGAIN is likely that the send queue on the
>>> socket is full.
>>> 
>>> 
>>> Cheers,
>>> 
>>> Simon.
>>> 
>> 
>> 
>> -- Best regards Anatoly Muliarski
>> 
> 
> 
