Hi Ondrej,

On Fri, Jan 14, 2022 at 11:17 PM Ondrej Zajicek <santiago@crfreenet.org> wrote:
On Mon, Jan 10, 2022 at 11:47:57PM +0100, Tomas Hlavacek wrote:
Add a netlink KRT dump filter on Linux to avoid receiving PMTU cache records from the FNHE table, which is dumped along with the KRT.
Linux Kernel added FNHE table dump to the netlink API in patch https://patchwork.ozlabs.org/project/netdev/patch/8d3b68cd37fb5fddc470904cdd...
The filter mitigates the risk of receiving an unknown and potentially large number of FNHE records that would block BIRD I/O in each sync. There is a known issue with GRE tunnels on Linux, which seem to create one FNHE record for each destination IP address routed through the tunnel, even when the PMTU equals the GRE interface MTU (tested with kernels 5.5 through 5.16-rc7).
Thanks, merged with some modifications:
https://gitlab.nic.cz/labs/bird/-/commit/e818f16448e918ed07633480291283f3449...
Instead of switching NETLINK_GET_STRICT_CHK on and off, I just used strict checking for all dumps (including link and address).
Great! That is definitely a better way! Cool!
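For reference, a minimal sketch of what "strict checking for all dumps" could look like on a plain rtnetlink socket. It is not the code from the merged commit; the names open_strict_rtnl, build_route_dump and struct route_dump_req are made up for illustration. The assumption here (worth checking against the kernel source) is that with NETLINK_GET_STRICT_CHK enabled the kernel treats the rtmsg header of an RTM_GETROUTE dump as a filter, and that a request without RTM_F_CLONED in rtm_flags asks for regular routes only, leaving out the FNHE exception records.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Fallbacks for older userspace headers; values match the Linux UAPI. */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_GET_STRICT_CHK
#define NETLINK_GET_STRICT_CHK 12
#endif

/* Hypothetical request layout: netlink header followed by a route header. */
struct route_dump_req {
  struct nlmsghdr nh;
  struct rtmsg rtm;
};

/* Open a NETLINK_ROUTE socket and enable strict checking once, so that it
 * applies to every dump sent through the socket. Returns the fd, or -1 on
 * failure (for example on a kernel that does not know the option). */
int open_strict_rtnl(void)
{
  int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
  if (fd < 0)
    return -1;

  int one = 1;
  if (setsockopt(fd, SOL_NETLINK, NETLINK_GET_STRICT_CHK, &one, sizeof(one)) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}

/* Fill in an RTM_GETROUTE dump request for one address family. All other
 * header fields stay zero; in particular RTM_F_CLONED is left unset. */
void build_route_dump(struct route_dump_req *req, uint32_t seq, unsigned char family)
{
  assert(req != NULL);
  memset(req, 0, sizeof(*req));
  req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
  req->nh.nlmsg_type = RTM_GETROUTE;
  req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
  req->nh.nlmsg_seq = seq;
  req->rtm.rtm_family = family;
}
```

The request built this way would be written to the fd returned by open_strict_rtnl() with send(), and the replies read until NLMSG_DONE.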
Also, removed the SO_SNDBUF/SO_RCVBUF change. That seems unrelated and has some issues:
1) Why these values? 32k for SO_SNDBUF is smaller than the default value (208k), so it in fact makes the buffer smaller (which probably does not matter), while 1M for SO_RCVBUF is bigger than the max value, so it is capped at 416k.
I took the values from iproute2 and then tried to fine-tune the speed of the large FNHE dumps by tweaking these parameters. It is not relevant anymore. So yes, it's OK to drop it.
2) It applies just to nl_scan and nl_req, and not to the async fd, where it would make the most sense.
3) We may want a big rx buffer for the async fd; in that case we may consider using SO_RCVBUFFORCE.
I am not sure which netlink socket operations are really synchronous or flow-controlled, so that a big buffer is not needed.
I didn't realize that. Sure, you are right.

Best regards,
Tomas