IPv6 BGP & kernel 4.19

micah anderson micah at riseup.net
Thu Sep 24 17:03:28 CEST 2020


Oliver <bird-o at sernet.de> writes:

> Hello,
>
> after upgrading to Debian Buster with kernel 4.19 we also had problems.
>
> By adjusting net.ipv6.route.max_size we have fixed the following messages:
> watchdog: BUG: soft lockup - CPU#X stuck for 22s! 
> and
> ixgbe 0000:02:00.0 ens2fX: initiating reset due to tx timeout
>
> But we still had a lot of jitter on the line. Downgrading to 4.9.0 fixed the
> problem, but this is not a permanent solution.
>
> What else we tried:
> * Increasing gc_threshX
> net.ipv6.neigh.default.gc_thresh1 = 2048
> net.ipv6.neigh.default.gc_thresh2 = 4096
> net.ipv6.neigh.default.gc_thresh3 = 8192
> => Did not help

The Linux kernel is getting rid of IPv6 route caching, like it did with
IPv4, but it will take some time to get there. It seems that in this
kernel they have set a small default for net.ipv6.route.max_size (4096!),
and when we increased this parameter (e.g. to 1048576), the problem went
away for us.

I'm not 100% clear on what units this value is in. I had around 89k IPv6
routes, so 1048576 is definitely higher than that. I'm sure that setting
it too high could result in some memory issues.
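
To see how the configured limit compares with what the box is carrying,
something like the following rough sketch might help. It assumes that the
number of lines in /proc/net/ipv6_route is a fair proxy for the route
count, which may not be the unit the kernel actually checks against
max_size. The function names are made up for illustration.

```python
from pathlib import Path


def count_routes(ipv6_route_text):
    """Count non-empty lines; /proc/net/ipv6_route has one line per entry."""
    return sum(1 for line in ipv6_route_text.splitlines() if line.strip())


def headroom(route_count, max_size):
    """Return max_size divided by the route count (inf if there are no routes)."""
    return float("inf") if route_count == 0 else max_size / route_count


def report(proc=Path("/proc")):
    """Read the live values from procfs and summarise them in one line."""
    routes = count_routes((proc / "net/ipv6_route").read_text())
    # Assumption: this sysctl file holds a single integer.
    limit = int((proc / "sys/net/ipv6/route/max_size").read_text())
    return f"{routes} entries, max_size={limit}, ratio={headroom(routes, limit):.1f}"
```

Calling print(report()) on the router gives a one-line summary. A ratio
below 1 would mean the table already holds more entries than max_size.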

You also want to raise net.ipv6.route.gc_thresh to avoid running the
garbage collector too often. I found that the rule of thumb here is 1/4
of net.ipv6.route.max_size.
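
As a small sketch of that rule of thumb, the helper below derives
gc_thresh from a chosen max_size and renders both as sysctl lines. The
1/4 ratio is only the rule of thumb mentioned above, not a kernel
requirement, and the helper names are hypothetical.

```python
def route_sysctls(max_size):
    """Pair a chosen max_size with gc_thresh at one quarter of it."""
    return {
        "net.ipv6.route.max_size": max_size,
        "net.ipv6.route.gc_thresh": max_size // 4,  # rule of thumb, not a hard rule
    }


def render(settings):
    """Format the settings as 'key = value' lines, as used in sysctl.conf."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


if __name__ == "__main__":
    print(render(route_sysctls(1048576)))
```

The output can be dropped into a file under /etc/sysctl.d/ and loaded with
sysctl --system.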

I did find that in Linux kernel 5.2 a message is written to the kernel
ring buffer when net.ipv6.route.max_size is hit, so you at least have a
*clue* what is going on. In 4.19, which is what Debian Buster ships, you
don't get that clue.

-- 
        micah

