Best practices for redundant iBGP/eBGP route distribution? [bird 2.0.7]

Nico Schottelius nico.schottelius at ungleich.ch
Mon Dec 16 15:18:33 CET 2019


Hello Ondrej,


Ondrej Zajicek <santiago at crfreenet.org> writes:
> I assume that routers from DC1 use IBGP connection to routers in DC2, and
> only routers from DC2 have EBGP connection to upstream 2 (and vice
> versa).

Correct, it is an identical setup on both sides.

>> The objective is to stay online in each DC as long as possible, so in
>> theory:
>>
>> - upstream 1 can die or
>> - upstream 2 can die or
>> - the darkfiber can die
>>
>> In each of these situations, both DCs should still be reachable from
>> outside, and each should still be able to reach the other.
>
> If you want to survive splitting AS (when darkfiber dies), you need to
> use and announce separate address range in each DC (or you need some
> backup inter-DC connection, like GRE tunnel). If you announce 2a0a:e5c0::/29
> on both sides, then you get traffic for both sides on both DC.

Thanks, I was afraid of that. We will probably need to renumber some
things, as both DCs (historically) have addresses from a single /44.

>> Question 1: Is "direct;" the right protocol for all links?
>>
>> As all links are layer 2 connections, we have configured all links to be
>> direct. However this causes the "Invalid NEXT_HOP attribute" in various
>> situations.
>
> Generally, BGP assumes you already have all internal routes - either from
> OSPF, or equivalent static/direct routes. These are necessary for recursive
> next hop resolution, but can be avoided in direct next hop resolution.

To put a bit more detail into this: we have a transfer/peering vlan that
utilises 2a0a:e5c0:1:8::/64. So every
router/switch/whatever-needs-to-speak-bgp is in this network.

Because every speaker is in this network, my assumption is that it
should be possible to resolve every route (recursively). Am I wrong about
this?

> If you do not use IGP, then direct IBGP sessions may have some advantages,
> namely they can react to link-down events.
>
> Note that direct/multihop session and direct/recursive gateway resolution
> are two options, although the second by default depends on the first. So
> it is e.g. possible to configure direct session with recursive resolution.

Can you elaborate a bit more on the effects?

That is, looking only at the effect of the first setting
(direct/multihop) without the adjusted defaults, or only at the second
setting (gateway direct/recursive), what is the exact outcome of each of
the 4 combinations

direct/gateway direct
direct/gateway recursive
multihop/gateway direct
multihop/gateway recursive

?
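
To make the question concrete, one of the four combinations (direct
session, recursive gateway) could be written roughly as below in BIRD 2
syntax. This is only a sketch: the protocol name, the AS number 65000 and
the import/export policy are placeholders, not the real configuration. The
neighbor address is the dc2 router in the peering VLAN.

```
# Sketch only: protocol name, AS number and policy are placeholders.
protocol bgp ibgp_dc2_v6 {
    local as 65000;
    neighbor 2a0a:e5c0:1:8::5 as 65000;   # same AS on both sides, so iBGP

    direct;                 # session type; "multihop;" is the iBGP default

    ipv6 {
        gateway recursive;  # next hop resolution; alternative: "gateway direct;"
        import all;
        export none;        # placeholder policy
    };
}
```

The other three combinations would follow by swapping "direct;" for
"multihop;" and "gateway recursive;" for "gateway direct;".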

>> Question 2: Is "next hop self ebgp;" the correct answer to the Invalid
>> NEXT_HOP attribute?
>
> If you use direct next hop resolution, then received NEXT_HOP is supposed
> to be directly reachable, ideally on the same interface. So in this case
> you should use 'next hop self;'.

I am a bit puzzled by this statement, because of the following
documentation on the bird website:

"
gateway direct|recursive

For received routes, their gw (immediate next hop) attribute is computed
from received bgp_next_hop attribute. This option specifies how it is
computed. Direct mode means that the IP address from bgp_next_hop is
used if it is directly reachable, otherwise the neighbor IP address is
used.
"

So if I read this correctly, in the direct/direct configuration the
gateway should be the peer's IPv6 address if the next hop address is
not directly reachable. However, this is *not* what I see in the
bird 2.0.7 setup, although it *is* what I see in the bird 1.6 setup.

There is another problem which makes "next hop self" tricky: as every
router peers with every other router, we are hiding the origin,
i.e. we announce networks from dc1 in dc2, and this results in
traffic being routed dc1->dc2->dc1.

> Variant 'next hop self ebgp;' is more for cases where you use recursive
> resolution, but your IGP/internal routes do not cover border/inter-AS
> links.

That sounds a bit similar to the actual situation we have.
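
For reference, a minimal sketch of what that variant might look like on
an iBGP session in BIRD 2, where both "gateway" and "next hop self" are
channel options. The protocol name, the AS number 65000 and the export
condition are placeholders chosen for illustration.

```
# Sketch only: protocol name, AS number and policy are placeholders.
protocol bgp ibgp_dc2_v6 {
    local as 65000;
    neighbor 2a0a:e5c0:1:8::5 as 65000;
    direct;

    ipv6 {
        gateway recursive;
        next hop self ebgp;            # rewrite NEXT_HOP only for routes learned via eBGP
        import all;
        export where source = RTS_BGP; # placeholder: pass on BGP routes only
    };
}
```

With this, routes learned from the upstream would be passed on with the
local router as next hop, so the other DC would not need a route to the
border/inter-AS link.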

>> Question 3: Is "not direct" (aka multihop) the right thing for iBGP?
>
> As I wrote above, you would need static/direct/IGP routes for internal
> networks (e.g. from 'direct' protocol).
>
>> So our DCs are directly connected via layer 2, but the default for iBGP
>> is multihop. If we omit the "direct" keyword, the result is that in the
>> end no routes are imported from the other DC, and we get various
>> warnings like the following in syslog:
>>
>> Dec 16 00:58:35 router2 daemon.warn bird: Next hop address 2a0a:e5c0:1:8::5 resolvable through recursive route for 2a0a:e5c0:1:8::/64
>
> You probably do not want to export your internal routes
> (2a0a:e5c0:1:8::/64) through IBGP.

This is actually a direct result of the "direct" protocol that we
set up as you described :-)
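
A minimal sketch of how the two could be combined, assuming the goal is
to keep learning connected networks locally while not sending them over
iBGP. The filter name is made up for illustration.

```
# Learn connected networks such as 2a0a:e5c0:1:8::/64 as device routes.
protocol direct {
    ipv6;
}

# Hypothetical export filter for the iBGP channel: keep device routes local.
filter ibgp_export_v6 {
    if source = RTS_DEVICE then reject;
    accept;
}
```

The filter would then be referenced from the iBGP channel with
"export filter ibgp_export_v6;".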

I just reconfigured the test routers to use the combination
direct/gateway recursive and it seems to mostly work. However, there is
at least one route that does not resolve: a route that the router in
dc2 forwards to another (non-BGP) router.

It is shown as unreachable in dc1, while it clearly can be reached and
routed in dc2.

The setup is

router2.place5 (2a0a:e5c0:1:8::4)
      | (bgp, routing)
router1.place6 (2a0a:e5c0:1:8::5)
      |
2a0a:e5c0:2:2:0:84ff:fe41:f24d (reachable via 2a0a:e5c0:2:2::/64 router1.place6)
      |
2a0a:e5c1:111::/48 network


dc1 (=place5):
--------------------------------------------------------------------------------
[14:56] router2.place5:~#  ping 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
PING 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d (2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d): 56 data bytes
ping: sendto: Host is unreachable
[15:07] router2.place5:~#

bird> show route all for 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
Table master6:
2a0a:e5c1:100::/40   unreachable [router1_place6_ungleich_ch_v6 14:57:01.575 from 2a0a:e5c0:1:8::5] * (100) [i]
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path:
        BGP.next_hop: 2a0a:e5c0:2:2:0:84ff:fe41:f24d
        BGP.local_pref: 500



dc2 (=place6):
--------------------------------------------------------------------------------
[15:11] router1.place6:~# ping -c2 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
PING 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d (2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d): 56 data bytes
64 bytes from 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d: seq=0 ttl=62 time=50.923 ms
64 bytes from 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d: seq=1 ttl=62 time=130.140 ms

--- 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 50.923/90.531/130.140 ms
[15:12] router1.place6:~#

bird> show route all for 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
Table master6:
2a0a:e5c1:100::/40   unicast [place6_v6 23:55:06.876] * (200)
        via 2a0a:e5c0:2:2:0:84ff:fe41:f24d on bond0.12
        Type: static univ
bird>
--------------------------------------------------------------------------------

Shouldn't this resolve via direct/gateway recursive mode?
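
One way to narrow this down, assuming the cause is that router2.place5
has no non-BGP route covering the BGP next hop
2a0a:e5c0:2:2:0:84ff:fe41:f24d, would be a static route in dc1 like the
sketch below. The protocol name is made up, the config is untested, and
the gateway ::5 is taken from the "from 2a0a:e5c0:1:8::5" shown in the
dc1 output above.

```
# On router2.place5 (dc1). Sketch only, untested; protocol name is made up.
protocol static dc2_nexthops_v6 {
    ipv6;
    # Cover the BGP next hop with a non-BGP route that points at
    # router1.place6 in the peering VLAN.
    route 2a0a:e5c0:2:2::/64 via 2a0a:e5c0:1:8::5;
}
```

If the /40 then becomes reachable in dc1, the missing piece was the
route to the next hop rather than the gateway mode itself.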

Best,

Nico


--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch

