Hello Ondrej,

Ondrej Zajicek <santiago@crfreenet.org> writes:
I assume that routers from DC1 use IBGP connection to routers in DC2, and only routers from DC2 have EBGP connection to upstream 2 (and vice versa).
Correct, it is an identical setup on both sides.
The objective is to stay online in each DC as long as possible, so in theory:
- upstream 1 can die, or
- upstream 2 can die, or
- the darkfiber can die

And in each of these situations, both DCs should still be reachable from outside and still be able to reach each other.
If you want to survive splitting the AS (when the darkfiber dies), you need to use and announce a separate address range in each DC (or you need some backup inter-DC connection, like a GRE tunnel). If you announce 2a0a:e5c0::/29 on both sides, then you get traffic for both sides in both DCs.
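A minimal sketch of the first option (one announced range per DC), in BIRD 2 syntax. The prefix 2001:db8:1::/48, both AS numbers, the upstream neighbor address, and all protocol and filter names are placeholders, not values from this thread. The second DC would mirror this with its own range.

```
# DC1 border router (sketch only; all values below are placeholders)
protocol static dc1_range {
    ipv6;
    # Originate only the range that lives in this DC.
    route 2001:db8:1::/48 unreachable;
}

filter upstream_out {
    # Announce this DC's own range and nothing else.
    if net ~ [ 2001:db8:1::/48 ] then accept;
    reject;
}

protocol bgp upstream1 {
    local as 64512;                       # placeholder local AS
    neighbor 2001:db8:ffff::1 as 64496;   # placeholder upstream
    ipv6 {
        import all;
        export filter upstream_out;
    };
}
```

With such a split, a darkfiber failure should leave each DC attracting only the traffic for its own range.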
Thanks, I was fearing that. We will probably need to renumber some things, as both DCs have addresses (historically) from a single /44.
Question 1: Is "direct;" the right protocol for all links?
As all links are layer 2 connections, we have configured all links to be direct. However, this causes "Invalid NEXT_HOP attribute" errors in various situations.
Generally, BGP assumes you already have all internal routes - either from OSPF, or equivalent static/direct routes. These are necessary for recursive next hop resolution, but can be avoided in direct next hop resolution.
To put a bit more detail into this: we have a transfer/peering VLAN that uses 2a0a:e5c0:1:8::/64, so every router/switch/whatever-needs-to-speak-bgp is in this network. Because every speaker is in this network, my assumption is that it should be possible to (recursively) resolve every route. Am I wrong on this?
If you do not use IGP, then direct IBGP sessions may have some advantages, namely they can react to link-down events.
Note that direct/multihop session and direct/recursive gateway resolution are two options, although the second by default depends on the first. So it is e.g. possible to configure direct session with recursive resolution.
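As a sketch of that last case (direct session, recursive resolution) in BIRD 2 syntax: the session type is a protocol-level option, while the gateway mode is set on the channel. The protocol name and AS number are placeholders; the neighbor address is the one that appears in the log line later in this thread.

```
protocol bgp ibgp_peer {                  # hypothetical name
    local as 64512;                       # placeholder AS
    neighbor 2a0a:e5c0:1:8::5 as 64512;   # same AS on both sides => IBGP
    direct;                               # session type: neighbor on a directly attached network
    ipv6 {
        import all;
        export none;                      # placeholder policy, adjust as needed
        gateway recursive;                # override the default that follows from "direct"
    };
}
```

Replacing "direct;" with "multihop;" and "gateway recursive;" with "gateway direct;" gives the other three combinations.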
Can you elaborate a bit more on the effects? I.e. if looking only at the effects of the first setting (direct/multihop) without the adjusted defaults, or only at the second setting (direct/recursive), what are the exact outcomes of the 4 combinations:

- direct / gateway direct
- direct / gateway recursive
- multihop / gateway direct
- multihop / gateway recursive

?
Question 2: Is "next hop self ebgp;" the correct answer to the Invalid NEXT_HOP attribute?
If you use direct next hop resolution, then received NEXT_HOP is supposed to be directly reachable, ideally on the same interface. So in this case you should use 'next hop self;'.
I am a bit puzzled by this statement, because of the following documentation on the bird website:

"
gateway direct|recursive
For received routes, their gw (immediate next hop) attribute is computed from received bgp_next_hop attribute. This option specifies how it is computed. Direct mode means that the IP address from bgp_next_hop is used if it is directly reachable, otherwise the neighbor IP address is used.
"

So if I read this correctly, in the direct/direct configuration the gateway should be the peer's IPv6 address if the next hop address is not directly reachable. However, this is *not* what I experience in the bird 2.0.7 setup, while it *is* what I experience in the bird 1.6 setup.

There is another problem which makes "next hop self" tricky: as all routers are peered with every other router, we are hiding the origin, i.e. we announce networks from dc1 in dc2, and this results in routing from dc1->dc2->dc1.
Variant 'next hop self ebgp;' is more for cases where you use recursive resolution, but your IGP/internal routes do not cover border/inter-AS links.
That sounds a bit similar to the actual situation we have.
Question 3: Is "not direct" (aka multihop) the right thing for iBGP?
As I wrote above, you would need static/direct/IGP routes for internal networks (e.g. from 'direct' protocol).
So our DCs are directly connected via layer 2, but the default for iBGP is multihop. If we omit the "direct" keyword, no routes end up being imported from the other DC, and we get various warnings like the following in syslog:
Dec 16 00:58:35 router2 daemon.warn bird: Next hop address 2a0a:e5c0:1:8::5 resolvable through recursive route for 2a0a:e5c0:1:8::/64
You probably do not want to export your internal routes (2a0a:e5c0:1:8::/64) through IBGP.
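One way this could look in BIRD 2 syntax, as a sketch only: the 'direct' protocol supplies the connected route for the transfer network locally, and an export filter keeps that prefix out of the IBGP session. The interface name, AS number, and protocol and filter names are placeholders; the prefix and neighbor address are taken from this thread.

```
protocol direct {
    ipv6;
    interface "eth0";                     # placeholder for the transfer VLAN interface
}

filter ibgp_out {
    # Keep the transfer network local; each router learns it from its own interface.
    if net ~ [ 2a0a:e5c0:1:8::/64 ] then reject;
    accept;
}

protocol bgp ibgp_peer {                  # hypothetical name
    local as 64512;                       # placeholder AS
    neighbor 2a0a:e5c0:1:8::5 as 64512;
    ipv6 {
        import all;
        export filter ibgp_out;
    };
}
```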
This is actually directly the result from the "direct" protocol that we tried to implement as you described :-)

I just reconfigured the test routers to use the combination direct/gateway recursive and it seems to mostly work, however there is at least one route that it does not resolve: a route that the router in dc2 routes to another (non-bgp) router. It is shown as unreachable in dc1, while it clearly can be reached and routed in dc2.

The setup is

    router2.place5 (2a0a:e5c0:1:8::4)
        |  (bgp, routing)
    router1.place6 (2a0a:e5c0:1:8::4)
        |
    2a0a:e5c0:2:2:0:84ff:fe41:f24d (reachable via 2a0a:e5c0:2:2::/64 router1.place6)
        |
    2a0a:e5c1:111::/48 network

dc1 (=place5):
--------------------------------------------------------------------------------
[14:56] router2.place5:~# ping 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
PING 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d (2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d): 56 data bytes
ping: sendto: Host is unreachable
[15:07] router2.place5:~#

bird> show route all for 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
Table master6:
2a0a:e5c1:100::/40   unreachable [router1_place6_ungleich_ch_v6 14:57:01.575 from 2a0a:e5c0:1:8::5] * (100) [i]
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path:
        BGP.next_hop: 2a0a:e5c0:2:2:0:84ff:fe41:f24d
        BGP.local_pref: 500

dc2 (=place6):
--------------------------------------------------------------------------------
[15:11] router1.place6:~# ping -c2 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
PING 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d (2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d): 56 data bytes
64 bytes from 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d: seq=0 ttl=62 time=50.923 ms
64 bytes from 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d: seq=1 ttl=62 time=130.140 ms

--- 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 50.923/90.531/130.140 ms
[15:12] router1.place6:~#

bird> show route all for 2a0a:e5c1:111:111:a2af:bdff:fe2a:7b2d
Table master6:
2a0a:e5c1:100::/40   unicast [place6_v6 23:55:06.876] * (200)
        via 2a0a:e5c0:2:2:0:84ff:fe41:f24d on bond0.12
        Type: static univ
bird>
--------------------------------------------------------------------------------

Shouldn't this resolve via direct/gateway recursive mode?

Best,

Nico

--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch