On Thu, Sep 18, 2014 at 03:27:21PM +0300, Javor Kliachev wrote:
Hello,
We use BIRD 1.4.2 as a route server with multiple RIBs and ~100 active BGP sessions. Over one of these sessions we receive ~360k prefixes, which we re-announce to all other sessions.
By my calculations, the total number of prefixes across all RIBs is ~3600000, and until now everything was OK.
*But today we experienced the following issue:*
When we stopped the session over which we receive the ~360k prefixes, the BIRD daemon went to 100% CPU usage and stayed there for ~5-6 min. This caused a lot of (but not all) sessions to start flapping.
Here is the output taken from our log during the event:
Sep 18 10:40:51 rs2 bird: R0_69: Received: Hold timer expired
Sep 18 10:40:51 rs2 bird: R0_69: BGP session closed
Sep 18 10:40:52 rs2 bird: R0_69: Down
Sep 18 10:41:11 rs2 bird: R0_69: Startup delayed by 60 seconds
Sep 18 10:41:45 rs2 bird: R0_69: Incoming connection from 10.0.0.69 (port 13073) rejected
Sep 18 10:41:59 rs2 bird: R0_69: Started
Sep 18 10:42:36 rs2 bird: R0_69: Incoming connection from 10.0.0.69 (port 24222) accepted
Sep 18 10:42:36 rs2 bird: R0_69: BGP session established
The above lines were repeated for all other affected sessions.
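
To see how many sessions flapped and for how long, log lines in the format shown above could be summarized with a short script. The following is only a minimal sketch: the function name `session_outages`, the `year` parameter (syslog timestamps carry no year), and the assumption that every relevant line looks like `<timestamp> <host> bird: <protocol>: <message>` are illustrative choices, not anything provided by BIRD itself.

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Sep 18 10:40:52 rs2 bird: R0_69: Down
# Assumption: month names are in English (C locale), as in the excerpt above.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ bird: "
    r"(?P<proto>[^:]+): (?P<msg>.*)$"
)


def session_outages(lines, year=2014):
    """Return (protocol, closed_at, established_at, seconds) tuples.

    An outage is counted from "BGP session closed" to the next
    "BGP session established" for the same protocol name.
    """
    closed = {}   # protocol name -> time the session was closed
    outages = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a BIRD protocol message in the assumed format
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        proto, msg = m["proto"], m["msg"]
        if msg == "BGP session closed":
            closed[proto] = ts
        elif msg == "BGP session established" and proto in closed:
            start = closed.pop(proto)
            seconds = int((ts - start).total_seconds())
            outages.append((proto, start, ts, seconds))
    return outages


# Usage with two lines taken from the excerpt above.
sample = [
    "Sep 18 10:40:51 rs2 bird: R0_69: BGP session closed",
    "Sep 18 10:42:36 rs2 bird: R0_69: BGP session established",
]
for proto, start, end, seconds in session_outages(sample):
    print(f"{proto}: down from {start:%H:%M:%S} to {end:%H:%M:%S} ({seconds} s)")
```

Run over the whole log, the same loop would list each affected session with its outage window, which makes it easier to line the flaps up against the period of 100% CPU usage.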
Hello

Thanks for the bug report. Could you send me the config file, the whole log, and information on when exactly BIRD went to 100% and then back? Even under permanent 100% CPU load, BIRD shouldn't miss timers for sending keepalive packets.

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."