Hello!
Sep 20 11:50:48 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
[...]
Sep 20 11:51:19 ganges bird: Kernel dropped some netlink messages, will resync on next scan.
This is somewhat inevitable, as the netlink manpage states:

"However, reliable transmissions from kernel to user are impossible in any case. The kernel can't send a netlink message if the socket buffer is full: the message will be dropped and the kernel and the user-space process will no longer have the same view of kernel state. It is up to the application to detect when this happens (via the ENOBUFS error returned by recvmsg(2)) and resynchronize."

This unreliability is also a good reason to have periodic table scans, just to be sure that the kernel is in sync with BIRD.
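To make the manpage's "detect and resynchronize" advice concrete, here is a minimal sketch of a receive loop that treats ENOBUFS as a "messages were lost" signal. It is not BIRD's code: the function name `drain_netlink`, the `handle_message` callback and the 64 KiB read size are assumptions made for illustration, and `sock` is assumed to be a non-blocking socket-like object.

```python
import errno


def drain_netlink(sock, handle_message, bufsize=65536):
    """Read until the socket would block.

    Returns True if the kernel reported ENOBUFS at least once, meaning
    some notifications were dropped and a full rescan is needed.
    """
    need_resync = False
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:
            # Nothing more to read right now (EAGAIN/EWOULDBLOCK).
            return need_resync
        except OSError as exc:
            if exc.errno == errno.ENOBUFS:
                # The receive buffer overflowed; the error is reported once,
                # so remember it and keep reading what did fit.
                need_resync = True
                continue
            raise
        if not data:
            return need_resync
        handle_message(data)
```

A caller would schedule a table scan whenever this returns True, which is what the "will resync on next scan" log message announces.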
I'm seeing netlink drops when upstream internet churn is, say, more than 200 updates/sec or so; not huge, but quite frequent, and it can continue for minutes or hours.
Yes, this is a well-known situation. We can't do much about it in single-threaded BIRD: the ENOBUFS error signals that the kernel had no more room in the socket buffer to store route updates. (See the more thorough explanation below.)
Some items I've investigated so far:
Increasing the net.core.rmem_max and net.core.wmem_max sysctls doesn't seem to help much; an strace of bird doesn't show any EAGAIN or blocking when writing to the netlink sockets.
Here somebody suggests increasing net.core.rmem_default before starting BIRD. https://bird.network.cz/pipermail/bird-users/2017-September/011541.html
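One possible reason why rmem_default matters more than rmem_max: on Linux, rmem_max only caps what a program may request with SO_RCVBUF, while a socket that never asks for anything gets rmem_default. The sketch below just inspects that behaviour on an ordinary UDP socket; the helper name `rcvbuf_report` is made up for illustration, and whether a given BIRD version requests a buffer size itself is not something this example shows.

```python
import socket


def rcvbuf_report(sock, requested=None):
    """Return (before, after) receive buffer sizes in bytes.

    If `requested` is given, ask the kernel for that size via SO_RCVBUF.
    On Linux the kernel doubles the requested value and caps the request
    at net.core.rmem_max, so `after` need not equal `requested`.
    """
    before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    if requested is not None:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
    after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return before, after


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # Without a request, the socket keeps the system default size.
        print("default:", rcvbuf_report(s))
        # With a request, the result is limited by net.core.rmem_max.
        print("after asking for 1 MiB:", rcvbuf_report(s, 1 << 20))
```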
strace shows some room for optimization in the kernel protocol (these would obviously be code changes). For example, when a route changes next-hop/interface, 2 netlink messages are sent, a delete followed by an add, instead of a single change/replace (this would complicate bird, but would cut the number of netlink messages in half for updates).
This would be feasible in a world with one single kernel and no bugs, yet there have been quite a few bugs that needed to be worked around, and we have no useful detection mechanism to check whether a given kernel version suffers from such a bug. (There are still people running new BIRD on old kernels.)
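For readers unfamiliar with the two approaches being compared, here is a rough sketch of the netlink message headers involved. It builds only the 16-byte nlmsghdr and no route payload, so it is not something to send to a kernel as is. The numeric constants are the Linux values from the netlink and rtnetlink headers; the helper names and the exact flag combinations are illustrative assumptions, not a copy of what BIRD sends.

```python
import struct

# Message types and flags (values from the Linux netlink/rtnetlink headers).
RTM_NEWROUTE, RTM_DELROUTE = 24, 25
NLM_F_REQUEST, NLM_F_ACK = 0x001, 0x004
NLM_F_REPLACE, NLM_F_EXCL, NLM_F_CREATE = 0x100, 0x200, 0x400

# struct nlmsghdr: length, type, flags, sequence number, port id.
NLMSG_HDR = struct.Struct("=IHHII")


def header(msg_type, flags, seq, payload_len=0):
    """Pack a bare nlmsghdr (the route payload is omitted in this sketch)."""
    return NLMSG_HDR.pack(NLMSG_HDR.size + payload_len, msg_type, flags, seq, 0)


def update_as_delete_add(seq):
    """Next-hop change expressed as two requests: delete, then add."""
    return [
        header(RTM_DELROUTE, NLM_F_REQUEST | NLM_F_ACK, seq),
        header(RTM_NEWROUTE,
               NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL,
               seq + 1),
    ]


def update_as_replace(seq):
    """The same change expressed as a single create-or-replace request."""
    return [
        header(RTM_NEWROUTE,
               NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_REPLACE,
               seq),
    ]
```

The replace variant halves the number of requests per update, but it relies on the kernel handling NLM_F_REPLACE correctly for every route type, which is exactly the concern raised above.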
There are plenty of CPU cycles available, bird is at <1%, etc.
Any pointers on tuning or config changes that may help here are appreciated.
Well, to be honest, I think this may be fixed by having a separate netlink thread (which is a work-almost-in-progress), yet without that, it is almost impossible. The reason is how it works now:

1) BGP receives a packet (quite a big one, or several of them).
2) BGP parses the input data, and for each single route:
   2A) the import filter is run
   2B) the best route in the table is recalculated
   2C) all exports are run; in the case of kernel, the netlink message is sent
   2D) the kernel generates a netlink message in response, confirming the route update
   (repeat this for all the data)
3) BGP is done and another socket is read. For simplicity, let's assume it is the netlink receive socket.
4) Netlink parses the incoming messages, gets ENOBUFS, and realizes that there were more updates that didn't fit into the receive buffer, issuing that warning.
5) After a while, a netlink scan is issued, successfully checking that all routes are there.

The actual reason for BIRD showing these warnings in tables where only BIRD writes is simply the impossibility of reading the netlink socket while exporting routes from another protocol.

This will be fixed in future BIRD versions supporting multithreaded execution, where the netlink thread should have enough time to read the netlink socket, and the exports for netlink (and all other protocols) will properly queue and wait to be processed until the protocol decides to actually export.

Maria
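The sequence in steps 1) to 5) can be imitated with a toy model. This is only a sketch under simplifying assumptions: the receive buffer is counted in messages rather than bytes, every exported route produces exactly one confirmation, and the names `BoundedQueue` and `export_burst` are invented for this example.

```python
from collections import deque


class BoundedQueue:
    """Stand-in for the netlink receive buffer (capacity in messages)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def push(self, item):
        # A full buffer means the kernel drops the notification.
        if len(self.items) >= self.capacity:
            self.dropped += 1
        else:
            self.items.append(item)

    def drain(self):
        # The reader consumes everything currently buffered.
        count = len(self.items)
        self.items.clear()
        return count


def export_burst(routes, capacity, read_every=None):
    """Export `routes` routes and return how many confirmations were lost.

    With read_every=None the buffer is read only after the whole burst,
    as in the single-threaded loop. Otherwise it is drained after every
    `read_every` exports, imitating a reader that runs alongside.
    """
    queue = BoundedQueue(capacity)
    for i in range(1, routes + 1):
        queue.push(i)  # kernel confirms the route update (step 2D)
        if read_every and i % read_every == 0:
            queue.drain()
    queue.drain()  # the loop finally reaches the netlink socket (step 3)
    return queue.dropped
```

With a buffer of 100 messages, `export_burst(1000, 100)` loses 900 confirmations, while `export_burst(1000, 100, read_every=50)` loses none, which is the effect a dedicated netlink reader is expected to have.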