On 27.03.2012 20:44, Ondrej Zajicek wrote:
On Tue, Mar 27, 2012 at 07:13:11PM +0400, Alexander V. Chernikov wrote:
On 26.03.2012 03:25, Ondrej Zajicek wrote:
On Mon, Mar 12, 2012 at 11:22:10PM +0400, Oleg wrote:
On Mon, Mar 12, 2012 at 02:27:23PM +0400, Alexander V. Chernikov wrote:
On 12.03.2012 13:25, Oleg wrote:
Hi, all.
I have some experience with bird under heavy CPU load. I had a situation where bird was frequently updating the kernel table because BGP sessions kept going down and up (due to CPU and/or network load).
Hello
Answering collectively for the whole thread:
I did some preliminary testing: on my test machine, exporting a full BGP feed (cca 400k routes) to a kernel table took 1-2 sec on Linux and 5-6 sec on BSD. Flushing the kernel table took a similar time. Therefore, if we devote half a CPU to kernel sync, we have about 200 kr/s (kiloroutes per second) for Linux and 40 kr/s for BSD, which still seems more than enough for an edge router. Are there any estimates (using protocol statistics) of the number of updates to the kernel proto in this case? How many protocols, tables and pipes do you have in your case?
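As a rough check of the arithmetic above, here is a minimal Python sketch. The function name and the 0.5 CPU share parameter are illustrative assumptions; only the route count and timings come from the message.

```python
def sync_rate_kr_per_s(routes, seconds, cpu_share=0.5):
    """Kiloroutes per second when only `cpu_share` of a CPU does kernel sync."""
    return routes / seconds / 1000 * cpu_share


ROUTES = 400_000  # "cca 400k routes" from the measurement above

# (best, worst) export times in seconds, as quoted in the message
TIMINGS = {"Linux": (1, 2), "BSD": (5, 6)}

for name, (best, worst) in TIMINGS.items():
    high = sync_rate_kr_per_s(ROUTES, best)
    low = sync_rate_kr_per_s(ROUTES, worst)
    print(f"{name}: {low:.0f}-{high:.0f} kr/s at half a CPU")
```

With the best-case timings (1 s and 5 s) this gives the 200 kr/s and 40 kr/s quoted above. The slower timings give roughly half of the Linux figure and a bit less for BSD.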
The key to responsiveness (and the ability to send keepalives on time) during heavy CPU load is granularity. The main problem in BIRD is that the whole route propagation is done synchronously: when a route is received, it is propagated through all pipes and all routing tables to all final receivers in one step, which is problematic if you have several hundred BGP sessions (but probably not too problematic with
I've been playing with the BGP/core code in preparation for the peer-groups implementation.
Setup: one peer supplying a full view (peer 1) and one peer acting as a full-view receiver (peer 2). Both are disabled by default. We start bird and enable peer 1. After the full view is received, we enable the second peer.
Some BGP bucket statistics:

max_feed  iterations  buckets  routes  effect
     256        1551   362184  397056      8%
     512         775   351902  396800     11%
    1024         387   335773  396288     15%
    2048         193   300434  395264     23%
    4096          96   255752  393216     34%
    8192          48   216780  393216     44%
'Effect' means (routes - buckets) * 100 / routes, i.e. what share of prefixes were stored in already existing buckets.
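The 'effect' column can be recomputed from the statistics above with a short Python sketch. Two things here are observations on the quoted data rather than statements from the message: the routes column equals max_feed * iterations in every row, and the percentages match integer (truncating) division.

```python
def effect(routes, buckets):
    """Percentage of routes that landed in an already existing bucket."""
    return (routes - buckets) * 100 // routes  # integer division, as in the quoted figures


# (max_feed, iterations, buckets) copied from the statistics above
STATS = [
    (256, 1551, 362184),
    (512, 775, 351902),
    (1024, 387, 335773),
    (2048, 193, 300434),
    (4096, 96, 255752),
    (8192, 48, 216780),
]

for max_feed, iterations, buckets in STATS:
    routes = max_feed * iterations  # matches the "routes" column in every row
    print(f"max_feed={max_feed:5d} routes={routes} effect={effect(routes, buckets)}%")
```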
Maybe we can consider making the max_feed value auto-tuned? E.g., 8k or 16k for a small total number of protocols. If we assume max_feed * proto_count to be constant (which keeps granularity at the same level), and say that we use the default feed (256) for 256 protocols, we can automatically recalculate max_feed on every { configure, protocol enabled/disabled state change, whatever }.
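A minimal Python sketch of this auto-tuning rule follows. The function name and the clamping bounds (256 and 16384) are assumptions made for illustration, and none of this is existing BIRD code. Only the constant product of 256 * 256 comes from the proposal.

```python
DEFAULT_MAX_FEED = 256        # default feed size named in the proposal
REFERENCE_PROTO_COUNT = 256   # protocol count at which the default applies
FEED_BUDGET = DEFAULT_MAX_FEED * REFERENCE_PROTO_COUNT  # kept constant: 65536


def auto_max_feed(proto_count, lower=256, upper=16384):
    """Hypothetical helper: keep max_feed * proto_count roughly constant.

    `lower` and `upper` are assumed clamping bounds, not values from the thread.
    """
    proto_count = max(1, proto_count)  # avoid division by zero
    return max(lower, min(upper, FEED_BUDGET // proto_count))


for n in (1, 4, 8, 64, 256, 1000):
    print(n, auto_max_feed(n))
```

With these assumptions, 8 protocols give 8192 and 4 protocols give 16384, which is the "8 or 16k" range mentioned above. Such a function would be re-evaluated on reconfiguration or when a protocol is enabled or disabled.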
Is there any point in trying to achieve efficient packing of routes into buckets? Most rx processing is done per route, so buckets just save some TCP data transfers.
It decreases the number of packets sent to a large number of readers (actually, peer-group members). Yes, this is not where the main CPU-intensive work is done, but it can still save, say, 1-5% at no cost on our side. Why not be polite if we can?
Our typical firewall, see the number of messages received:

Neighbor  V  AS     MsgRcvd   MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
XXX-BIRD  4  XXXXX  11394287  93753    0       0    0     03:53:52  403577
X-QUAGGA  4  XXXXX  11185307  92512    0       0    0     01w5d18h  403583
X-QUAGGA  4  XXXXX  26569910  93805    0       0    0     05w6d06h  403411
BTW, these results depend on many things, like how big the kernel's TCP buffers are and how fast the other side is able to acknowledge received data. I guess that at first BIRD flushes buckets from BGP to TCP as fast as they are generated, with minimal packing (depending mostly on granularity, i.e. max_feed); later the TCP buffers become full, sending updates is postponed and the BGP bucket cache starts to fill (you could see that in 'show memory').
If you want to get efficient packing, probably the most elegant solution would be to add some delay (like 2 s) before activating/scheduling the sending of BGP update packets. Or some smarter approach: if the BGP bucket cache contains at least x buckets, schedule updates immediately, otherwise schedule them after 2 s.
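The "smart approach" can be sketched in Python as a single decision function. The name `update_delay` and the default threshold of 1000 buckets are hypothetical: the message leaves the threshold as x and names only the 2 s delay.

```python
def update_delay(bucket_count, threshold=1000, delay_s=2.0):
    """Return how long to wait before scheduling BGP update packets.

    `threshold` stands in for the unspecified "x" from the message;
    1000 is an arbitrary placeholder, not a recommended value.
    """
    if bucket_count >= threshold:
        return 0.0   # enough buckets cached: send immediately
    return delay_s   # otherwise wait so more prefixes can share a bucket


print(update_delay(5000))  # large cache, no delay
print(update_delay(10))    # small cache, wait 2 s
```

The design trade-off is between latency and packing: a longer delay lets more prefixes with the same attributes end up in one update, at the cost of slower propagation for small changes.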
Thanks for the idea. I should have thought about that approach.
Another possible problem is the kernel scan, which is also done in one step, but at least on Linux it could probably be split into smaller steps, and it does not take too much time if the kernel table is in the expected state. ...
The CLI interface can easily be another abuser:
bird> show route count
2723321 of 2723321 routes for 407158 networks
If I do 'show route' for such a table, this can block bird for 10? seconds.
Really? 'show route' processing is split into chunks of 64 routes, so I suppose that only the CLI session is blocked.
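The idea of splitting work into chunks of 64 routes can be illustrated with a small Python sketch. This is not BIRD's CLI code. The generator below is a hypothetical stand-in that shows only the granularity idea: after each chunk, control can return to the event loop so that other work (such as sending keepalives) gets a turn.

```python
from itertools import islice


def in_chunks(items, size=64):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical route table: 130 dummy entries
routes = [f"route-{i}" for i in range(130)]

for chunk in in_chunks(routes):
    # A real daemon would print or process this chunk, then yield to its event loop.
    print(len(chunk))
```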
Not exactly. The 'show route' result shows up immediately, but when I press 'q' to quit the results I see the 'bird' process eating CPU for ~5 seconds.