On Mon, Aug 17, 2015 at 03:41:00PM -0700, Matthew Walster wrote:
When a BGP sessions comes up/down, there is a period of high CPU while all the new best-routes are computed -- this is understandable.
However, on boxes I have where this table is then promoted to the Linux kernel, the CPU usage stays high for some considerable time. For a full (~600,000 prefix)
From the looks of things, a bunch of netlink messages are being generated to the kernel to RTM_DELROUTE then RTM_NEWROUTE -- each of which is causing the kernel's trie to rebalance which is a fairly costly operation. ... Unfortunately there doesn't appear to be an RTM_CHANGE or similar in Linux, so the DELROUTE will seemingly cause a tree to either be pruned or re-branched, followed by the NEWROUTE causing a full rebalance run -- whereas a CHANGE would (could) hopefully just over-write the value.
Hi I don't know much of kernel internal code, so i cannot answer the further questions, although it seems silly to me to not have some kind of amortized behavior for trie rebalancing. Note that there is NLM_F_REPLACE flag that with RTM_NEWROUTE works like RTM_CHANGE. And BIRD used that in some historic versions. But there is a semantic problem with these replacements - route options works like selector for RTM_DELROUTE, but not for replacement (where they just represent new values). Namely BIRD can specify rtm_protocol RTPROT_BIRD for RTM_DELROUTE and only BIRD routes are affected (and RTM_NEWROUTE would succeed only if the RTM_DELROUTE succeded), while for replacement specifying rtm_protocol RTPROT_BIRD just means that the current route (regardless its rtm_protocol) is replaced with the new one with that rtm_protocol value, which is undesirable. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."