freebsd mbuf cluster leak with bird 2.14
Jan Bramkamp
crest at rlwinm.de
Wed Feb 7 15:55:52 CET 2024
On 11.01.24 21:34, Thomas Steen Rasmussen via Bird-users wrote:
> Hello :)
>
> Yesterday I had one of my FreeBSD routers stop forwarding because it
> ran out of mbuf clusters. It usually operates far from the limit, but
> there is (was) something leaking mbuf clusters bad, and I suspect it
> might be bird.
>
> ----
>
> Some background:
>
> Due to a missing/misconfigured kernel export filter bird was
> repeatedly trying to export some routes to the kernel which the kernel
> already knew (from statically configured blackhole routes). So these
> errors have been repeating in the logs for some time (more than a year):
>
> Jan 11 19:09:04 dgncr2a bird[30963]: KRT: Error sending route 2a09:94c0::/29 to kernel: File exists
> Jan 11 19:10:04 dgncr2a syslogd: last message repeated 1 times
> Jan 11 19:10:04 dgncr2a bird[30963]: KRT: Error sending route 85.209.116.0/22 to kernel: File exists
> Jan 11 19:11:04 dgncr2a syslogd: last message repeated 1 times
>
> Over the holidays I upgraded from bird 2.0.9 to bird 2.14, as well as
> upgrading FreeBSD from 13-STABLE-384a885111ad to
> 13-STABLE-2cbd132986a7. I suspect one of these two changes made this
> problem appear. I made no changes to bird or router config other than
> the upgrades.
>
> ----
>
> The mbuf cluster leak was pretty bad, like 8-10 clusters per second at
> a pretty steady rate. The kern.ipc.nmbclusters limit on my routers was
> around 2 million and I raised it to 4 million now.
>
> Since I had no idea what was causing the leak and I was desperate for
> a fix, I at one point tried adding the missing kernel export filter (to
> at least silence the noisy warnings in the logs), and imagine my
> surprise when the mbuf cluster leak stopped.
>
> I tried removing the filters again, the leak started again, and stopped
> again when I re-added the filters. It appears some combination of bird
> 2.14 and exporting routes already found in the kernel means leaking
> mbuf clusters like crazy.
>
> I have no idea if this is a bird or a FreeBSD problem, but I have to
> start somewhere :) I can to some extent test stuff, but the routers
> are in production (BGP with 1 ebgp and 1 ibgp peer and no full table)
> so nothing too wild.
Can you please also open a FreeBSD PR for this? It looks like bird is
only the reproducer for a kernel bug. Can you periodically record the
`vmstat -m` and `netstat -m` output as the resources are leaked?
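
Something along these lines might do for the periodic recording. It is only a rough sketch: the function names, the 60 second default interval and the log file path are my own assumptions, not anything bird or FreeBSD prescribes. It appends a timestamped copy of each command's output to a file so the snapshots can be compared later.

```python
import datetime
import subprocess
import time

# The two commands named above; adjust the list if more counters are wanted.
COMMANDS = (("vmstat", "-m"), ("netstat", "-m"))


def record_snapshot(out, commands=COMMANDS):
    """Append one timestamped snapshot of each command's output to `out`."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    for cmd in commands:
        out.write(f"=== {stamp} {' '.join(cmd)} ===\n")
        try:
            result = subprocess.run(cmd, capture_output=True, text=True)
            out.write(result.stdout)
            if result.returncode != 0:
                out.write(f"(exit status {result.returncode}) {result.stderr}\n")
        except OSError as exc:
            # Command missing or not executable: note it and keep going.
            out.write(f"(could not run: {exc})\n")
    out.flush()


def record_periodically(path, interval=60.0, count=None, commands=COMMANDS):
    """Snapshot every `interval` seconds; count=None runs until interrupted."""
    taken = 0
    with open(path, "a") as out:
        while count is None or taken < count:
            record_snapshot(out, commands)
            taken += 1
            if count is None or taken < count:
                time.sleep(interval)
```

For example, `record_periodically("/var/tmp/mbuf-leak.log")` (a hypothetical path) would keep appending until interrupted, and the resulting file could be attached to the PR.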