HA setup (v 1.4.5)
I have BIRD installed to test working with a custom forwarding plane. The forwarding plane supports HA in the following way:

- Syncs up routing tables
- Provides millisecond failover (since routing tables are synced up)
- Shared MAC
- At any given time, only one of the appliances is actively receiving and processing packets.
- Both appliances have BIRD running (rte_notify sends route updates to the forwarding plane) and a small wrapper that processes forwarding table routing updates (static routes, etc.)

Problem: At any point in time, the forwarding plane has the correct set of routes on both appliances, but BIRD's routing tables are out of sync, since only the active appliance is processing packets (e.g. OSPF LSAs). So when we fail over, BIRD running on the secondary appliance does not have any routes until it learns them through route updates (since the secondary appliance is now the one processing the LSAs). This can also result in stale routes in the forwarding plane.

How do we support failover in BIRD, for both OSPF and BGP (multiple instances)? I would highly appreciate some thoughts on this issue.

I did think about BFD, but currently (if I interpreted it correctly) it supports only a single instance of OSPF or BGP. Graceful recovery is only implemented for BGP, so it is not completely usable.

-- Jigar Mehta
On Tue, Sep 15, 2015 at 09:08:00AM -0400, Jigar Mehta wrote:
> I have BIRD installed to test working with a custom forwarding plane. The forwarding plane supports HA in the following way:
>
> - Syncs up routing tables
> - Provides millisecond failover (since routing tables are synced up)
> - Shared MAC
> - At any given time, only one of the appliances is actively receiving and processing packets.
> - Both appliances have BIRD running (rte_notify sends route updates to the forwarding plane) and a small wrapper that processes forwarding table routing updates (static routes, etc.)

Hello

Do I understand correctly that you have two BIRD instances on two separate appliances/OSes that share a FIB and network interfaces? Does one instance see routes in the FIB from the other instance? How does failover work: is it somehow signaled to the appliances, or does BIRD just start receiving packets? Can both appliances communicate with each other?

Could the hardware be configured in such a way that the interfaces have separate MACs for both appliances, so that both could receive and process packets destined for them? In that case you could just run two regular routing instances (from the neighbors' point of view), both generating the same routing table.

I don't think BFD could be used for this purpose.

Graceful recovery could be used in this case, but you are right that BIRD currently does not support it for OSPF, and it would also need some modifications: BIRD starts graceful recovery immediately after start, so you could either postpone BIRD startup until after failover or modify BIRD to wait for it.

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
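
The first option mentioned above (postponing BIRD startup until after failover) could be sketched roughly as follows. This is only an illustration under assumptions: `is_active` stands for whatever hypothetical check the appliance's HA layer offers for "this node is now active", the config path is a placeholder, and none of the helper names come from BIRD itself. The only BIRD detail relied on is the `-c` command-line option for choosing the config file.

```python
import subprocess
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until_active(is_active: Callable[[], bool],
                      poll_interval: float = 0.05,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Block until is_active() returns True; return how many times it was polled.

    is_active is a hypothetical hook into the HA keep-alive logic.
    """
    polls = 1
    while not is_active():
        sleep(poll_interval)
        polls += 1
    return polls


def start_when_active(is_active: Callable[[], bool],
                      start_daemon: Callable[[], T],
                      poll_interval: float = 0.05,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Start the routing daemon only once this node has become active."""
    wait_until_active(is_active, poll_interval, sleep)
    return start_daemon()


def launch_bird(config_path: str = "/etc/bird.conf") -> subprocess.Popen:
    """Start BIRD with the given config (the path is a placeholder)."""
    return subprocess.Popen(["bird", "-c", config_path])
```

On the standby appliance the wrapper would call something like `start_when_active(ha_says_active, launch_bird)`, where `ha_says_active` is again hypothetical. The trade-off is that BIRD then starts with empty tables and has to form adjacencies and sessions from scratch, so the synced forwarding plane has to carry traffic in the meantime.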
I do have two BIRD instances on two separate appliances (BIRD uses a custom forwarding plane instead of the kernel). These appliances have their own FIBs but share interfaces. The appliances send keep-alives to each other to provide failover. They cannot use separate interfaces. For this reason, only the 'active' appliance receives packets (say, LSAs).

BIRD running on the active appliance has all the routes in its FIB, but the one running on the secondary appliance has an empty FIB. However, since the appliances talk to each other, they sync up their FIBs (the forwarding plane routing table is synced up), so after failover the secondary appliance can take over with minimal outage.

My problem, as I mentioned earlier, is this: after failover, BIRD running on the secondary instance will need to learn routes, since it will now start receiving packets from peers. Apart from graceful recovery, is there a way for BIRD to know that it has synced up all the routes from its neighbors? (I need to support both OSPF and BGP.)

On Thu, Sep 17, 2015 at 5:37 AM, Ondrej Zajicek <santiago@crfreenet.org> wrote:
> Hello
>
> Do I understand correctly that you have two BIRD instances on two separate appliances/OSes that share a FIB and network interfaces? Does one instance see routes in the FIB from the other instance? How does failover work: is it somehow signaled to the appliances, or does BIRD just start receiving packets? Can both appliances communicate with each other?
>
> Could the hardware be configured in such a way that the interfaces have separate MACs for both appliances, so that both could receive and process packets destined for them? In that case you could just run two regular routing instances (from the neighbors' point of view), both generating the same routing table.
>
> I don't think BFD could be used for this purpose.
>
> Graceful recovery could be used in this case, but you are right that BIRD currently does not support it for OSPF, and it would also need some modifications: BIRD starts graceful recovery immediately after start, so you could either postpone BIRD startup until after failover or modify BIRD to wait for it.
-- Jigar Mehta
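
The stale-route concern raised in this thread is commonly handled with a mark-and-sweep pass, which is also the general idea behind graceful restart. A minimal Python sketch, assuming the wrapper on the secondary appliance can see both the synced forwarding table and the route updates BIRD emits after failover: every synced route is marked stale at failover, a route is unmarked when BIRD re-announces it, and whatever is still marked after a hold time is removed. The class, its methods and the `hold_time` value are all hypothetical and not part of BIRD.

```python
from typing import Dict, List, Optional, Set


class FailoverReconciler:
    """Mark-and-sweep over a forwarding table synced from the former active node."""

    def __init__(self, synced_routes: Dict[str, str], hold_time: float) -> None:
        # prefix -> next hop, as copied over by the HA sync before failover
        self.fib: Dict[str, str] = dict(synced_routes)
        # Every synced route is unconfirmed until the local daemon re-learns it.
        self.stale: Set[str] = set(synced_routes)
        self.hold_time = hold_time
        self.started_at: Optional[float] = None

    def failover(self, now: float) -> None:
        """Record the moment this node became active."""
        self.started_at = now

    def route_update(self, prefix: str, next_hop: str) -> None:
        """Route (re)announced by the local daemon: install it and unmark it."""
        self.fib[prefix] = next_hop
        self.stale.discard(prefix)

    def route_withdraw(self, prefix: str) -> None:
        """Route explicitly withdrawn by the local daemon."""
        self.fib.pop(prefix, None)
        self.stale.discard(prefix)

    def sweep(self, now: float) -> List[str]:
        """After the hold time, drop routes never confirmed; return them sorted."""
        if self.started_at is None or now - self.started_at < self.hold_time:
            return []
        removed = sorted(self.stale)
        for prefix in removed:
            self.fib.pop(prefix, None)
        self.stale.clear()
        return removed
```

For example, with two synced routes and a 40-second hold time, a route that BIRD re-announces within the window survives the sweep with its new next hop, while the other is removed once the window has passed. A fixed timer is only a heuristic for "synced with all neighbors"; with BGP graceful restart the End-of-RIB marker gives a more precise signal for that protocol.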