Force gateway recursive lookup in iBGP routes
Hello there,

I am currently using BIRD to propagate routes from our bare-metal Kubernetes clusters (Calico CNI, iBGP sessions within AS65100) into our infrastructure via eBGP (private AS), and it works well.

The issue appears when I also create a BGP peering between BIRD and a (MetalLB) service inside the Kubernetes cluster (multihop, no NAT involved, session established OK). I receive /32 routes whose BGP.next_hop is an IP within the Kubernetes cluster (i.e. not directly connected). These routes are marked as "unreachable" even if I explicitly set "gateway recursive".

I know this recursive gateway lookup works well for routes learned from eBGP, but I can't make the Calico peers external because that won't work in my setup: the Calico nodes form an iBGP full mesh, so I would receive all routes from every single node and would not be able to tell which prefix lives where.

Unfortunately, the BGP peer inside the cluster cannot modify next_hop and always sends itself, so I am looking for a workaround. I also cannot set a specific "gw" in the import filter, because these multihop peers are configured with a BGP neighbor range subnet (there will be several of them, and I don't know their exact IP addresses in advance).
The configuration snippet looks like this:

    # calico cluster peers
    filter bgp_in_okubedev1_calico {
        if net ~ [ 10.96.16.0/20+ ] then accept;
        reject;
    }

    protocol bgp okubedev1m1 {
        local 10.30.20.180 as 65100;
        neighbor 10.30.21.19 as 65100;
        passive yes;
        ipv4 {
            import filter bgp_in_okubedev1_calico;
            export none;
        };
    }

    # metallb multihop peers
    filter bgp_in_okubedev1_metallb {
        # gw is recursively looked up locally and passed into BGP.next_hop
        #bgp_next_hop = gw;
        if net ~ [ 10.96.255.32/28+ ] then accept;
        reject;
    }

    protocol bgp okubedev1_lb_tpl {
        local 10.30.20.180 as 65100;
        neighbor range 10.96.16.0/20 as 65100;
        passive yes;
        multihop;
        ipv4 {
            gateway recursive;
            import filter bgp_in_okubedev1_metallb;
            export none;
        };
    }

It produces the following routing table:

    bird> show route all
    Table master4:
    10.96.255.33/32      unreachable [okubedev1_lb1 10:32:14.973 from 10.96.20.25] * (100) [?]
            Type: BGP univ
            BGP.origin: Incomplete
            BGP.as_path:
            BGP.next_hop: 10.96.20.25
            BGP.local_pref: 0
    10.96.20.0/26        unicast [okubedev1m1 11:15:45.704] * (100) [i]
            via 10.30.21.19 on enp0s4
            Type: BGP univ
            BGP.origin: IGP
            BGP.as_path:
            BGP.next_hop: 10.30.21.19
            BGP.local_pref: 100
    10.30.20.0/22        unicast [direct1 13:52:46.301] * (240)
            dev enp0s4
            Type: device univ

I am almost sure I am missing some key BIRD or BGP feature that I need to know to understand this behavior properly. Any comment or suggestion would be appreciated.

Best regards

-- 
Miroslav Kalina
Systems development specialist
miroslav.kalina@livesport.eu
+420 773 071 848
Livesport s.r.o.
Aspira Business Centre
Bucharova 2928/14a, 158 00 Praha 5
www.livesport.eu
On Fri, Feb 28, 2020 at 02:01:40PM +0100, Miroslav Kalina wrote:
> Hello there,
>
> I am currently trying to use BIRD for route propagation from our baremetal Kubernetes clusters (Calico CNI, iBGP sessions within AS65100) into infrastructure via eBGP (private AS) and it works well.
>
> The issue I have is when I want also to create BGP peering between BIRD and (MetalLB) service inside Kubernetes cluster (multihop, no NAT involved, session established OK) and I receive routes /32 with BGP.next_hop to IP within Kubernetes cluster (=not directly connected). These routes are marked as "unreachable" even if I explicitly set "gateway recursive".
Hello,

The issue here is that BIRD does not support resolving a recursive gateway through another route that itself has a recursive next hop. The recursive route 10.96.255.33/32 uses next hop 10.96.20.25, which is resolved through 10.96.20.0/26, which itself has a recursive next hop.

Perhaps you could modify okubedev1m1 / 10.96.20.0/26 to have a direct next hop.
-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
Thank you for your reply, I just don't understand it well enough.

Why is route 10.96.20.0/26 recursive when its next hop (10.30.21.19) is in my directly connected network (10.30.20.0/22)?

When I changed the peers to different ASes (okubedev1m1 AS65102, okubedev1_lb1 AS65101), it works well, even though I don't see much difference in the routing table. To me it looks like the route for 10.96.20.0/26 doesn't change much. Even fiddling with the AS path in a filter doesn't seem to help.

    10.96.255.33/32      unicast [okubedev1_lb1 12:06:22.708 from 10.96.20.57] * (100/?) [AS65101?]
            via 10.30.21.19 on enp0s4
            Type: BGP univ
            BGP.origin: Incomplete
            BGP.as_path: 65102 65101
            BGP.next_hop: 10.96.20.57
            BGP.local_pref: 100
    10.96.20.0/26        unicast [okubedev1m1 12:04:17.932] * (100) [AS65102i]
            via 10.30.21.19 on enp0s4
            Type: BGP univ
            BGP.origin: IGP
            BGP.as_path: 65102
            BGP.next_hop: 10.30.21.19
            BGP.local_pref: 100
    10.30.20.0/22        unicast [direct1 08:32:56.003] * (240)
            dev enp0s4
            Type: device univ

Unfortunately, this setup works well only in single-node clusters (okubedev1). In multi-node clusters I get duplicated routes. I believe I know why that happens (the inner iBGP full mesh within the cluster), and I don't want to use it that way.

    10.96.2.128/26       unicast [okubedev2m1 12:06:04.814] * (100) [AS65102i]
            via 10.30.21.4 on enp0s4
            Type: BGP univ
            BGP.origin: IGP
            BGP.as_path: 65102
            BGP.next_hop: 10.30.21.4
            BGP.local_pref: 100
                         unicast [okubedev2n1 12:06:09.604] (100) [AS65102i]
            via 10.30.21.5 on enp0s4
            Type: BGP univ
            BGP.origin: IGP
            BGP.as_path: 65102
            BGP.next_hop: 10.30.21.5
            BGP.local_pref: 100

Thanks for your time.

Best regards

On 2/28/20 4:51 PM, Ondrej Zajicek wrote:
> The issue here is that BIRD does not support resolution of recursive gateway through another route with recursive next hop. Recursive route 10.96.255.33/32 uses next hop 10.96.20.25, which is resolved through 10.96.20.0/26, which itself has a recursive next hop.
>
> Perhaps you could modify okubedev1m1 / 10.96.20.0/26 to have direct next hop.

-- 
Miroslav Kalina
On Mon, Mar 02, 2020 at 08:24:19AM +0100, Miroslav Kalina wrote:
> Thank you for your reply, I just don't understand it well enough.
>
> Why is route 10.96.20.0/26 recursive, when its next hop (10.30.21.19) is in my directly connected network (10.30.20.0/22)?
Because it was received via IBGP (okubedev1m1), and all IBGP routes are recursive by default, even if they are resolved using a direct route.

If you configure okubedev1m1 as EBGP (or just set it to 'gateway direct' mode), that should fix the routes from okubedev1_lb1.

-- 
Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
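One way to read the 'gateway direct' suggestion, as a minimal sketch: it reuses the okubedev1m1 definition from the original post and adds only the channel-level `gateway direct` option. It assumes BIRD 2.x syntax and that 10.30.21.19 sits on a directly connected network, as the posted routing table indicates. It has not been tested against this setup.

```
protocol bgp okubedev1m1 {
    local 10.30.20.180 as 65100;
    neighbor 10.30.21.19 as 65100;
    passive yes;
    ipv4 {
        # Take the gateway straight from bgp_next_hop instead of resolving
        # it through the routing table; the address must be directly reachable.
        gateway direct;
        import filter bgp_in_okubedev1_calico;
        export none;
    };
}
```

With this in place, 10.96.20.0/26 should no longer count as a recursive route, so the recursive lookup for the MetalLB /32 routes can resolve through it. The okubedev1_lb_tpl protocol would keep `multihop` and `gateway recursive` unchanged.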
Wow, perfect. Thank you for clarifying this.

So iBGP sessions are multihop by default, and all routes learned from them are treated as "gateway recursive". As soon as I turned off the multihop feature with the "direct" keyword (because there is nothing like "multihop off"), BIRD started receiving the routes in the same manner as over eBGP, and it started working.

Is there any way to display routes (like "show route all") together with information about which routes are treated as direct and which as recursive?

Thank you very much for your time.

Best regards

On 03. 03. 20 15:04, Ondrej Zajicek wrote:
> Because it is received by IBGP (okubedev1m1) and all IBGP routes are by default recursive, even if they are resolved using a direct route.
>
> If you configure okubedev1m1 as EBGP (or just set it to 'gateway direct' mode), it should fix the routes from okubedev1_lb1.

-- 
Miroslav Kalina
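For readers landing on this thread, the change described in the last message might look roughly like the following. This is a sketch assuming BIRD 2.x syntax, built from the okubedev1m1 definition in the first message. Only the `direct` line is new, and the exact configuration the poster ended up with was not shown.

```
protocol bgp okubedev1m1 {
    local 10.30.20.180 as 65100;
    neighbor 10.30.21.19 as 65100;
    passive yes;
    # iBGP sessions default to multihop; 'direct' declares the neighbor
    # directly connected, which makes the gateway mode default to direct.
    direct;
    ipv4 {
        import filter bgp_in_okubedev1_calico;
        export none;
    };
}
```

The difference from setting `gateway direct` inside the channel is scope: `direct` changes how the session itself is treated, while `gateway direct` only changes how the gateway of received routes is computed.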