@Joe, thanks for starting this thread; I was just thinking along the same lines about making sure BIRD is running. Apollon, I would love to give your plugin a try.

To summarize, I'm deploying BIRD on 120 machines (2 NICs each) plus 6 routers (240 point-to-point links in total, in a Clos configuration), all participating in a single-area OSPF. Does anyone know whether BIRD has been deployed or tested in this kind of use case?

thanks,
-bn
0216331C

On Mon, Apr 22, 2013 at 4:35 AM, Apollon Oikonomopoulos <apollon@skroutz.gr> wrote:
On 19:20 Mon 22 Apr , Joe Wooller wrote:
On a similar note, does anyone monitor the process externally (say, via Nagios or the like)? I would be interested to see how people monitor the active process, and whether anyone monitors sessions and prefixes received, used or filtered.
I haven't been able to find anything out there that suits my needs, so with the assistance of a friend we have come up with this. It is still a work in progress, though.
https://github.com/dowlingw/bird-tool
Cheers Joe
We are monitoring the process with Icinga and Check_MK. Check_MK has a notion of state, so we essentially persist ("inventorize") the admin state of all protocols (up/down) and their status (Connected, Running, etc.), and if anything changes we get an alert.
I could share the check_mk plugin that does all this, if anyone is interested.
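For anyone who wants to experiment before the plugin is shared, the "persist the state, alert on change" idea could be sketched roughly as follows. This is not the Check_MK plugin mentioned above, only an illustration. It assumes the text comes from `birdc show protocols` with columns name, proto, table, state, since, info, and that the "since" column is a single token; the function names `parse_protocols` and `changed_protocols` are made up for this example.

```python
def parse_protocols(text):
    """Map protocol name -> (state, info) from 'show protocols'-style text.

    Assumes six whitespace-separated columns (name, proto, table, state,
    since, info) with a single-token 'since' field; info may be empty.
    """
    protocols = {}
    for line in text.splitlines():
        fields = line.split(None, 5)
        # Skip blank lines, banner lines and the header row.
        if len(fields) < 5 or fields[0] == "name":
            continue
        info = fields[5].strip() if len(fields) > 5 else ""
        protocols[fields[0]] = (fields[3], info)
    return protocols


def changed_protocols(baseline, current):
    """Return one message per protocol that differs from the baseline."""
    changes = []
    for name in sorted(baseline):
        if name not in current:
            changes.append(f"{name}: missing, expected {baseline[name]}")
        elif current[name] != baseline[name]:
            changes.append(f"{name}: {baseline[name]} -> {current[name]}")
    return changes
```

The baseline would be captured once (the "inventory" step) and stored; each later run parses fresh output and raises an alert if `changed_protocols` returns anything.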
We also monitor the prefix count directly in the Linux kernel (via /proc/net/fib_triestat and /proc/net/rt6_stats) and use it to score our keepalived processes higher or lower. This can trigger a failover of the access interface IPs if one router seems to receive significantly fewer prefixes than the other for some reason.
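As a rough illustration of reading those two files (not our actual check), something like the sketch below could work. It assumes that fib_triestat has a "Prefixes:" line under each table heading such as "Main:", and that the fourth hexadecimal field of rt6_stats is the IPv6 route entry count; both are worth verifying against the running kernel. The function names are hypothetical.

```python
def ipv4_prefix_count(triestat_text, table="Main"):
    """Return the 'Prefixes:' value for one table, or None if not found.

    Assumes table headings are unindented lines ending in ':' (e.g. 'Main:')
    and that the per-table statistics follow on indented lines.
    """
    current = None
    for line in triestat_text.splitlines():
        stripped = line.strip()
        if stripped.endswith(":") and not line[:1].isspace():
            current = stripped[:-1]
        elif current == table and stripped.startswith("Prefixes:"):
            return int(stripped.split(":", 1)[1])
    return None


def ipv6_route_entries(rt6_stats_text):
    """Return the fourth hex field of rt6_stats as an integer.

    Assumption: that field is the number of IPv6 route entries.
    """
    fields = rt6_stats_text.split()
    return int(fields[3], 16)


def read_counts():
    """Read both counters from /proc on a Linux host."""
    with open("/proc/net/fib_triestat") as f:
        v4 = ipv4_prefix_count(f.read())
    with open("/proc/net/rt6_stats") as f:
        v6 = ipv6_route_entries(f.read())
    return v4, v6
```

A small wrapper could then compare the counts with an expected minimum and exit non-zero when they fall short, which is the kind of result a keepalived vrrp_script with a weight can turn into a priority change.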
Cheers, Apollon