Best practices for redundant iBGP/eBGP route distribution? [bird 2.0.7]

Ondrej Zajicek santiago at crfreenet.org
Mon Dec 16 14:26:14 CET 2019


On Mon, Dec 16, 2019 at 01:24:23AM +0100, Nico Schottelius wrote:
> 
> Good morning,
> 
> I was wondering what the best practice is for running 2 data
> centers, each with one uplink and a dark fiber in between them?
> 
> I think it's pretty much a standard situation, but will show our setup
> and assumptions as well:
> 
> upstream 1        upstream 2
>  |                |
> dc1--darkfiber----dc 2
> 
> Additionally, each DC has 2 routers:
> 
> dc1:            dc2:
> |---router1 ------- router 1
> |              /
> |             /
> |   /--------/
> |  router2---------router2
> |                     |
> |---------------------|
> 
> (both routers are connected to each other and both routers are connected
> to each upstream)

Hi

I assume that the routers in DC1 have IBGP sessions to the routers in DC2,
and that only the routers in DC2 have an EBGP session to upstream 2 (and
vice versa for DC1 and upstream 1).


> The objective is to stay online in each DC as long as possible, so in
> theory:
> 
> - upstream 1 can die or
> - upstream 2 can die or
> - the darkfiber can die
> 
> And in each of the situations, both DCs should still be reachable from
> outside and should still be able to reach each other.

If you want to survive a split of your AS (when the dark fiber dies), you
need to use and announce a separate address range in each DC (or you need
some backup inter-DC connection, like a GRE tunnel). If you announce
2a0a:e5c0::/29 on both sides, then each DC also receives traffic destined
for the other DC, and without the dark fiber it cannot deliver it.


> Question 1: Is "direct;" the right protocol for all links?
> 
> As all links are layer 2 connections, we have configured all links to be
> direct. However this causes the "Invalid NEXT_HOP attribute" in various
> situations.

Generally, BGP assumes you already have all internal routes, either from
an IGP such as OSPF or from equivalent static/direct routes. These are
necessary for recursive next hop resolution, but are not needed with
direct next hop resolution.

If you do not use IGP, then direct IBGP sessions may have some advantages,
namely they can react to link-down events.

Note that direct/multihop session and direct/recursive gateway resolution
are two separate options, although the default for the second depends on
the first. So it is, e.g., possible to configure a direct session with
recursive resolution.
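
As a rough sketch of that last combination (the protocol name, AS number
and neighbor address below are made up for illustration, and filters are
left out), the session type is a protocol option while the gateway mode
is a channel option:

        protocol bgp ibgp_dc2_r1 {
                local as 64512;                         # placeholder AS
                neighbor 2001:db8:0:1::2 as 64512;      # placeholder address
                direct;                 # session type: neighbor on a shared link
                ipv6 {
                        gateway recursive;      # resolve NEXT_HOP by table lookup
                };
        }

Without the 'gateway' line, a direct session defaults to 'gateway direct'
and a multihop session to 'gateway recursive'.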


> Question 2: Is "next hop self ebgp;" the correct answer to the Invalid
> NEXT_HOP attribute?

If you use direct next hop resolution, then received NEXT_HOP is supposed
to be directly reachable, ideally on the same interface. So in this case
you should use 'next hop self;'.

The variant 'next hop self ebgp;' is meant more for cases where you use
recursive resolution, but your IGP/internal routes do not cover
border/inter-AS links.
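
For illustration, the two variants would look roughly like this in the
channel of the IBGP session towards the other DC (protocol name, AS
number and neighbor address are placeholders, not taken from your setup):

        protocol bgp ibgp_dc2_r1 {
                local as 64512;                         # placeholder AS
                neighbor 2001:db8:0:1::2 as 64512;      # placeholder address
                direct;
                ipv6 {
                        # Advertise our own address as NEXT_HOP for all
                        # routes sent to this neighbor:
                        next hop self;

                        # Alternative: do so only for routes that were
                        # received from EBGP sessions:
                        # next hop self ebgp;
                };
        }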


> Question 3: Is "not direct" (aka multihop) the right thing for iBGP?

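As I wrote above, with multihop sessions (and therefore recursive next
hop resolution) you would need static/direct/IGP routes for your internal
networks (e.g. from the 'direct' protocol), so that received next hops
can be resolved.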
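
A minimal sketch of such a setup, assuming the internal links are all on
interfaces matching "eth*" (the interface pattern, protocol names, AS
number and neighbor address are invented for the example):

        protocol device {
        }

        protocol direct {
                ipv6;
                interface "eth*";       # assumed interface naming
        }

        protocol bgp ibgp_dc2_r1 {
                local as 64512;                         # placeholder AS
                neighbor 2001:db8:0:1::2 as 64512;      # placeholder address
                multihop;               # the default for IBGP anyway
                ipv6 {
                        import all;
                        # 'gateway recursive' is the default for multihop;
                        # next hops are looked up in the channel's table,
                        # where the 'direct' routes above are present.
                };
        }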


> So our dcs are directly connected via layer 2, but the default for iBGP
> is multihop. If we omit the "direct" keyword, the result is that no
> routes are in the end imported from the other DC and that we get various
> warnings like the following in syslog:
> 
> Dec 16 00:58:35 router2 daemon.warn bird: Next hop address 2a0a:e5c0:1:8::5 resolvable through recursive route for 2a0a:e5c0:1:8::/64

You probably do not want to export your internal routes
(2a0a:e5c0:1:8::/64) through IBGP.
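
One way to do that, as a sketch only: reject routes that come from the
'direct' protocol in the IBGP export filter. The filter name is made up;
adjust the condition if your internal routes come from elsewhere.

        filter ibgp_out {
                # Routes generated by the 'direct' protocol have
                # source RTS_DEVICE; keep them out of IBGP.
                if source = RTS_DEVICE then reject;
                accept;
        }

and then in the ipv6 channel of the IBGP protocol:

                export filter ibgp_out;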


> Question 4: How to (not) announce umbrella networks?
> 
> So both data centers are reachable within the 2a0a:e5c0::/29
> prefix and all routers have a statement
> 
>         route 2a0a:e5c0::/29 unreachable;
> 
> in the static protocol. However if the dark fiber goes down, it is not
> clear that / how the upstreams should / will propagate the smaller (per
> DC) /48s.
> 
> If we remove the full /29 announcement, we might run into the problem
> that a packet is being sent out through one upstream and returns back to
> us and thus creates a routing loop.

You should split your range into, say, one /32 per DC and declare each
/32 only in its own DC. These /32 ranges would then be announced by IBGP
from one DC to the other, so routers in both DCs would announce both
ranges to EBGP. If the dark fiber goes down, IBGP is broken and each DC
would announce only its own /32 range.
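
Roughly, on the DC1 routers this could look like the sketch below. Which
/32 belongs to which DC is my assumption (2a0a:e5c0::/32 for DC1,
2a0a:e5c1::/32 for DC2), and the protocol and filter names are made up:

        protocol static dc1_range {
                ipv6;
                route 2a0a:e5c0::/32 unreachable;       # DC1 only
        }

        filter upstream_out {
                # Own /32 (from static) and the other DC's /32 (from
                # IBGP, present only while the dark fiber is up).
                if net ~ [ 2a0a:e5c0::/32, 2a0a:e5c1::/32 ] then accept;
                reject;
        }

and in the ipv6 channel of the EBGP protocol towards the upstream:

                export filter upstream_out;

The DC2 routers would declare 2a0a:e5c1::/32 instead. The IBGP export
filter has to let the local static /32 through, otherwise the other DC
never learns it.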

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."