freebsd mbuf cluster leak with bird 2.14

Jan Bramkamp crest at rlwinm.de
Wed Feb 7 15:55:52 CET 2024


On 11.01.24 21:34, Thomas Steen Rasmussen via Bird-users wrote:
> Hello :)
>
> Yesterday I had one of my FreeBSD routers stop forwarding because it 
> ran out of mbuf clusters. It usually operates far from the limit, but 
> there is (was) something leaking mbuf clusters badly, and I suspect it 
> might be bird.
>
> ----
>
> Some background:
>
> Due to a missing/misconfigured kernel export filter bird was 
> repeatedly trying to export some routes to the kernel which the kernel 
> already knew (from statically configured blackhole routes). So these 
> errors have been repeating in the logs for some time (more than a year):
>
> Jan 11 19:09:04 dgncr2a bird[30963]: KRT: Error sending route 
> 2a09:94c0::/29 to kernel: File exists
> Jan 11 19:10:04 dgncr2a syslogd: last message repeated 1 times
> Jan 11 19:10:04 dgncr2a bird[30963]: KRT: Error sending route 
> 85.209.116.0/22 to kernel: File exists
> Jan 11 19:11:04 dgncr2a syslogd: last message repeated 1 times
>
> Over the holidays I upgraded from bird 2.0.9 to bird 2.14, as well as 
> upgrading FreeBSD from 13-STABLE-384a885111ad to 
> 13-STABLE-2cbd132986a7. I suspect one of these two changes made this 
> problem appear. I made no changes to bird or router config other than 
> the upgrades.
>
> ----
>
> The mbuf cluster leak was pretty bad, like 8-10 clusters per second at 
> a pretty steady rate. The kern.ipc.nmbclusters limit on my routers was 
> around 2 million and I raised it to 4 million now.
>
> Since I had no idea what was causing the leak and I was desperate for 
> a fix, I at one point tried adding the missing kernel export filter (to 
> at least silence the noisy warnings in the logs), and imagine my 
> surprise when the mbuf cluster leak stopped.
>
> I tried removing the filters again, the leak started again, and stopped 
> again when I re-added the filters. It appears some combination of bird 
> 2.14 and exporting routes already found in the kernel means leaking 
> mbuf clusters like crazy.
>
> I have no idea if this is a bird or a FreeBSD problem but I have to 
> start somewhere :) I can to some extent test stuff, but the routers 
> are in production (BGP with 1 ebgp and 1 ibgp peer and no full table) 
> so nothing too wild.
Can you please also open a FreeBSD PR for this? It looks like bird is 
only the reproducer for a kernel bug. Can you periodically record the 
`vmstat -m` and `netstat -m` output while the resources are being leaked?
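
A minimal sketch of how such periodic recording could be done, assuming 
Python 3 is available on the router. The function names (`snapshot`, 
`record`), the output directory `mbuf-leak-log` and the 60 second interval 
are illustrative choices, not something prescribed by FreeBSD or bird; 
only the two commands come from the request above.

```python
"""Periodically record command output to timestamped files (sketch)."""
import subprocess
import time
from datetime import datetime, timezone
from pathlib import Path

# The two commands requested above; everything else here is an assumption.
COMMANDS = [["vmstat", "-m"], ["netstat", "-m"]]


def snapshot(commands, out_dir):
    """Run each command once and write all output to one timestamped file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # UTC timestamp in the file name so snapshots sort chronologically.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = out_dir / f"mbuf-{stamp}.txt"
    with path.open("w") as fh:
        for cmd in commands:
            result = subprocess.run(cmd, capture_output=True, text=True)
            # Header line identifies the command and its exit status.
            fh.write(f"### {' '.join(cmd)} (exit {result.returncode})\n")
            fh.write(result.stdout)
            fh.write(result.stderr)
            fh.write("\n")
    return path


def record(commands=COMMANDS, out_dir="mbuf-leak-log", interval=60, count=None):
    """Take a snapshot every `interval` seconds.

    With count=None the loop runs until interrupted (Ctrl-C).
    """
    taken = 0
    while count is None or taken < count:
        snapshot(commands, out_dir)
        taken += 1
        if count is None or taken < count:
            time.sleep(interval)
```

Calling `record()` while the leak is active and attaching the resulting 
files to the PR should make it possible to compare snapshots and see which 
allocation counters keep growing.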

