Bird dying on nl_get_reply

Vincent Bernat bernat at luffy.cx
Tue May 2 15:16:08 CEST 2017


Hey!

I just had an instance of BIRD die unexpectedly after logging the
following message:

nl_get_reply: No buffer space available

It's from netlink.c:

	  int x = recvmsg(nl->fd, &m, 0);
	  if (x < 0)
	    die("nl_get_reply: %m");

The netlink(7) manpage says an application should expect this
condition:

       However, reliable transmissions from kernel to user are
       impossible in any case.  The kernel can't send a netlink message
       if the socket buffer is full: the message will be dropped and the
       kernel and the user-space process will no longer have the same
       view of kernel state.  It is up to the application to detect when
       this happens (via the ENOBUFS error returned by recvmsg(2)) and
       resynchronize.
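
To make the "detect" part concrete, here is a rough sketch of how the
errno from a failed recvmsg(2) could be sorted into cases instead of
always calling die(). This is not BIRD code: nl_rx_action and
nl_classify_rx_errno are names I made up for illustration, and what
"resync" should actually trigger is left open.

```c
#include <errno.h>

/* Hypothetical names, for illustration only; none of this is BIRD code. */
enum nl_rx_action {
  NL_RX_RETRY,   /* transient condition: call recvmsg() again */
  NL_RX_RESYNC,  /* messages were dropped: rescan kernel state */
  NL_RX_FATAL    /* anything else: keep the current die() behaviour */
};

/* Map the errno left by a failed recvmsg() to what the caller should do. */
static enum nl_rx_action
nl_classify_rx_errno(int err)
{
  switch (err)
    {
    case EINTR:
    case EAGAIN:
      return NL_RX_RETRY;
    case ENOBUFS:
      /* netlink(7): the kernel dropped messages, our view is stale */
      return NL_RX_RESYNC;
    default:
      return NL_RX_FATAL;
    }
}
```

The caller would only reach die() on NL_RX_FATAL, so a full socket
buffer would no longer take the daemon down.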

Another possibility would be to use the NETLINK_NO_ENOBUFS socket option:

       This flag can be used by unicast and broadcast listeners to avoid
       receiving ENOBUFS errors.

I don't think using this flag is a good idea.

I thought this problem had already been reported recently, but I
couldn't find the thread. I see three options: the receive buffer could
be increased dynamically when this happens, we could just ignore the
error and wait for the next kernel sync to catch up, or the 8192 value
could be made configurable at build time. What's the best option?
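
For the first option, a minimal sketch of growing the socket receive
buffer with SO_RCVBUF could look like the following. The helper names
(rx_next_size, rx_grow_buffer) and the doubling policy are my own
assumptions, not anything BIRD does today.

```c
#include <sys/socket.h>

/* Hypothetical helper: next size to request, doubling up to a cap. */
static int
rx_next_size(int cur, int max)
{
  if (cur > max / 2)
    return max;          /* doubling would exceed the cap */
  return cur * 2;
}

/* Hypothetical helper: ask for a receive buffer of `want` bytes on fd.
 * Returns the size the kernel reports afterwards, or -1 on error. */
static int
rx_grow_buffer(int fd, int want)
{
  int got = 0;
  socklen_t len = sizeof(got);

  if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) < 0)
    return -1;
  if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
    return -1;
  return got;
}
```

On ENOBUFS one could call rx_grow_buffer(fd, rx_next_size(cur, max))
and then resync. Note that, per socket(7), Linux doubles the requested
value for bookkeeping and silently caps it at net.core.rmem_max, so the
returned size should be checked rather than assumed.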
-- 
10.0 times 0.1 is hardly ever 1.0.
            - The Elements of Programming Style (Kernighan & Plauger)

