Hi list,

I've been experimenting with bird's ECMP features, added to the current git head a while back [1], on my Debian-based Linux system. I tried the setup below with git head as of today.

I have three Linux routers running bird: two gateways (gw) and one frontend. Each gateway establishes one BGP session to the frontend and advertises fd57::1 and fd57::2 to it.

The frontend accepts both advertisements and exports both to the kernel table. All output below comes from the frontend's bird. I configured bird like this:

log syslog { debug, trace, info, remote, warning, error, auth, fatal, bug };
router id 10.3.101.3;

# Filters
filter kernel_export {
    if net ~ [ fd57::/64{128,128} ] then accept;
    reject;
}

# BGP Filters
filter bgp_import {
    if net ~ [ fd57::/64{128,128} ] then accept;
    reject;
}

filter bgp_export {
    reject;
}

# Local devices
protocol device {
    scan time 10;
}

protocol direct {
    interface "*";
}

protocol kernel {
    import none;
    #learn;
    merge paths on;
    export filter kernel_export;
}

# BGP peers
protocol bgp 'gw1' {
    description "gw1";
    default bgp_local_pref 100;
    local fc57::3 as 65001;
    neighbor fc57::1 as 65000;
    next hop self;
    import filter bgp_import;
    export filter bgp_export;
    hold time 30;
    error wait time 5, 30;
}

protocol bgp 'gw2' {
    description "gw2";
    default bgp_local_pref 100;
    local fc57::3 as 65001;
    neighbor fc57::2 as 65000;
    next hop self;
    import filter bgp_import;
    export filter bgp_export;
    hold time 30;
    error wait time 5, 30;
}

On the system I have this address configuration:

root@debian:~# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 64000 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:01:93:0b:5a:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.10.216.12/24 brd 10.10.216.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fc57::3/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::1:93ff:fe0b:5ae9/64 scope link
       valid_lft forever preferred_lft forever

In bird:

root@debian:~# birdc6
BIRD 1.5.0 ready.
bird> show protocols
name     proto    table    state  since       info
device1  Device   master   up     14:53:54
direct1  Direct   master   up     14:53:54
kernel1  Kernel   master   up     14:53:54
static1  Static   master   up     14:53:54
gw1      BGP      master   up     14:53:58    Established
gw2      BGP      master   up     14:53:55    Established
bird> show route
fd57::2/128        via fc57::1 on eth0 [gw1 14:53:58] ! (100) [AS65000i]
                   via fc57::2 on eth0 [gw2 14:53:55] (100) [AS65000i]
fd57::1/128        via fc57::1 on eth0 [gw1 14:53:58] ! (100) [AS65000i]
                   via fc57::2 on eth0 [gw2 14:53:55] (100) [AS65000i]
fc57::/64          dev eth0 [direct1 14:53:54] * (240)

As a result, the multipath routes are installed into the kernel routing table as expected once the sessions are up:

root@debian:~# ip -6 route show
fc57::/64 dev eth0 proto kernel metric 256
fd57::1 via fc57::1 dev eth0 proto bird metric 1024
fd57::1 via fc57::2 dev eth0 proto bird metric 1024
fd57::2 via fc57::1 dev eth0 proto bird metric 1024
fd57::2 via fc57::2 dev eth0 proto bird metric 1024
fe80::/64 dev eth0 proto kernel metric 256

However, after some time, bird seems to get confused by the routes it installed and removes the multipath route again. This can be seen again in ip route show:

root@debian:~# ip -6 route show
fc57::/64 dev eth0 proto kernel metric 256
fd57::1 via fc57::2 dev eth0 proto bird metric 1024
fd57::2 via fc57::2 dev eth0 proto bird metric 1024
fe80::/64 dev eth0 proto kernel metric 256

In bird, however, both routes are still received:

root@ps:~# birdc6 show route all
BIRD 1.5.0 ready.
fd57::2/128        via fc57::1 on eth0 [gw1 16:25:25] ! (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::1 fe80::1:bdff:feab:7f12
        BGP.local_pref: 100
                   via fc57::2 on eth0 [gw2 16:25:23] (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::2 fe80::1:bdff:fe05:64b7
        BGP.local_pref: 100
fd57::1/128        via fc57::1 on eth0 [gw1 16:25:25] ! (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::1 fe80::1:bdff:feab:7f12
        BGP.local_pref: 100
                   via fc57::2 on eth0 [gw2 16:25:23] (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::2 fe80::1:bdff:fe05:64b7
        BGP.local_pref: 100
...

In the bird log, with debug output enabled, I can see:

Jan 11 15:12:46 ps bird6: gw1: Got KEEPALIVE
Jan 11 15:12:49 ps bird6: gw2: Got KEEPALIVE
Jan 11 15:12:50 ps bird6: gw1: Sending KEEPALIVE
Jan 11 15:12:52 ps bird6: gw2: Sending KEEPALIVE
Jan 11 15:12:54 ps bird6: device1: Scanning interfaces
Jan 11 15:12:54 ps bird6: kernel1: Scanning routing table
Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: will be updated
Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: already seen
Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: will be updated
Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: already seen
Jan 11 15:12:54 ps bird6: kernel1: Pruning table master
Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: updating
Jan 11 15:12:54 ps bird6: Netlink: File exists
Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: updating
Jan 11 15:12:54 ps bird6: Netlink: File exists

After a while, the problem fixes itself, both routes are installed, and then the problem reappears on the next cycle.

Is there a way around this, or is this actually a bug? To me this looks like bird is scanning its own routes and wrongly sees only one of them. Experimenting with "import all", "learn" etc. for the kernel protocol seems to make no difference.

[1] https://gitlab.labs.nic.cz/labs/bird/commit/8d9eef17713a9b38cd42bd59c4ce76c3...

-- 
Arno Töll
GnuPG Key-ID: 0x9D80F36D
On 11/01/16 18:38, Arno Töll wrote:
Is there a way around this, or is this actually a bug? To me this looks like bird is scanning its own routes and wrongly sees only one of them. Experimenting with "import all", "learn" etc. for the kernel protocol seems to make no difference.
This is actually a known bug, because the Linux kernel does not properly do ECMP for IPv6.

In this case, the API is not symmetrical. You can set routes via the multipath structures, but the Linux kernel splits this up into separate routes internally, because with IPv6 you can now have multiple routes to the same destination that are not linked together (why? Maybe to remove/add one of the nexthops independently or something).

When scanning, to bird it looks as though the routes have changed, because they are represented differently from how bird has installed those routes.

Because multiple routes to the same destination in the Linux kernel have no relation to each other anymore, don't expect weighted ECMP to work on IPv6 either.

I would love it if somebody fixed this shit and made it work like IPv4, or at least put weights into ECMP on IPv6 and allowed per-packet and per-flow ECMP.

-- 
Wilco
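A minimal sketch of the asymmetry described above, written as a toy model rather than BIRD's actual kernel-sync code. It assumes the daemon remembers one multipath route per prefix, while a table scan hands back one entry per next hop. The names `installed`, `scanned`, `naive_mismatches` and `merged_mismatches` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# What the daemon believes it installed: one multipath route per prefix.
installed = {
    "fd57::1/128": frozenset({"fc57::1", "fc57::2"}),
    "fd57::2/128": frozenset({"fc57::1", "fc57::2"}),
}

# What an IPv6 table scan returns in this model: one entry per next hop.
scanned = [
    ("fd57::1/128", "fc57::1"),
    ("fd57::1/128", "fc57::2"),
    ("fd57::2/128", "fc57::1"),
    ("fd57::2/128", "fc57::2"),
]


def naive_mismatches(installed, scanned):
    """Compare every scanned entry on its own against the installed route."""
    return [(p, nh) for p, nh in scanned if installed.get(p) != frozenset({nh})]


def merged_mismatches(installed, scanned):
    """Group scanned entries by prefix first, then compare next-hop sets."""
    merged = defaultdict(set)
    for prefix, nexthop in scanned:
        merged[prefix].add(nexthop)
    return sorted(p for p in installed if frozenset(merged[p]) != installed[p])


# Each single-next-hop entry differs from the two-next-hop route, so the
# naive comparison flags all four entries as "changed".
print(naive_mismatches(installed, scanned))
# After grouping by prefix, nothing differs.
print(merged_mismatches(installed, scanned))
```

In this model the per-entry comparison reports every scanned route as different, which matches the "will be updated" lines in the log, while grouping by prefix before comparing reports no difference.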
Hi,

On Tue, Jan 12, 2016 at 12:09 AM, Wilco Baan Hofman <wilco@baanhofman.nl> wrote:
In this case, the API is not symmetrical. You can set routes via the multipath structures, but the Linux kernel splits this up into separate routes internally, because with IPv6 you can now have multiple routes to the same destination that are not linked together (why? Maybe to remove/add one of the nexthops independently or something).
You are right. If I do

ip -6 route add fd57::1/128 nexthop via fc57::1 nexthop via fc57::2

I get:

root@ps:~# ip -6 route show
..
fd57::1 via fc57::2 dev eth0 metric 1024
fd57::1 via fc57::1 dev eth0 metric 1024

With IPv4 I get:

root@ps:~# ip route add 192.168.0.1/32 nexthop via 10.10.216.1 nexthop via 10.10.216.2
root@ps:~# ip route show
...
192.168.0.1
    nexthop via 10.10.216.1 dev eth0 weight 1
    nexthop via 10.10.216.2 dev eth0 weight 1

This sucks. I suppose this is more of a Linux "feature" than a bug in bird. Also, as I take it, there is no way around this in bird? That means ECMP with bird on IPv6 is basically useless currently.

-- 
Arno Töll
GnuPG Key-ID: 0x9D80F36D
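To make the difference between the two listings concrete, here is a small sketch that reads both output shapes into the same structure, a mapping from destination to its list of gateways. It assumes the IPv4 multipath entry is printed as a destination line followed by `nexthop` continuation lines, and the IPv6 entries as independent one-line routes, as in the output quoted above. `parse_routes` is a hypothetical helper, not part of iproute2 or bird.

```python
def parse_routes(text):
    """Map each destination to the gateways listed for it.

    Handles both one-line routes ("dst via gw ...") and multipath entries
    printed as a destination line plus "nexthop via gw ..." lines.
    """
    routes = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] != "nexthop":
            # A new route line: the first token is the destination.
            current = tokens[0]
            routes.setdefault(current, [])
        if "via" in tokens and current is not None:
            routes[current].append(tokens[tokens.index("via") + 1])
    return routes


ipv6_output = """\
fd57::1 via fc57::2 dev eth0 metric 1024
fd57::1 via fc57::1 dev eth0 metric 1024
"""

ipv4_output = """\
192.168.0.1
    nexthop via 10.10.216.1 dev eth0 weight 1
    nexthop via 10.10.216.2 dev eth0 weight 1
"""

print(parse_routes(ipv6_output))
print(parse_routes(ipv4_output))
```

Both listings end up as one destination with two gateways, even though the IPv6 case arrives as two separate routes and the IPv4 case as a single multipath route.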
Hi,

On Tuesday 12 January 2016 00:09:34 Wilco Baan Hofman wrote:
In this case, the API is not symmetrical. You can set routes via the multipath structures, but the Linux kernel splits this up into separate routes internally, because with IPv6 you can now have multiple routes to the same destination that are not linked together (why? Maybe to remove/add one of the nexthops independently or something).
Arguably, I think bird should adapt to whatever the public APIs of Linux provide, and not the other way around, as long as Linux is a supported platform for bird.

This makes me wonder whether you guys would accept patches, based on Ondrej Z.'s patch, that work around this asymmetry for ECMP routes so that bird has compliant ECMP support for IPv6? If so, do you have any constraints? What about bird 2?

(Sorry for the bounce, Ondrej.)

-- 
Arno Töll
GnuPG Key-ID: 0x9D80F36D
On Tue, Jan 19, 2016 at 04:11:10PM +0100, Arno Töll wrote:
Arguably, I think bird should adapt to whatever the public APIs of Linux provide, and not the other way around, as long as Linux is a supported platform for bird.
This makes me wonder whether you guys would accept patches, based on Ondrej Z.'s patch, that work around this asymmetry for ECMP routes so that bird has compliant ECMP support for IPv6? If so, do you have any constraints? What about bird 2?
Hi

Sorry for the late answer. Patches for handling IPv6 ECMP support in Linux with the current API could be accepted if they are not too crazy. Unfortunately, the API has several problems that complicate its usage from BIRD. E.g.:

1. When asynchronous updates are received, they do not contain the whole route, just the modified next hop.
2. When a new ECMP route appears, it is announced as a sequence of next hops, but there is AFAIK no flag for 'this is the last next hop'.

It seems to me that there are several options for working around that. For a start, we could support IPv6 ECMP only in non-learn mode. Or we could support learn/import (for IPv6 ECMP) during periodic scans only.

RTA_MULTIPATH in the IPv6 Linux API exists for backwards compatibility (although it does not really provide it), so it would make sense not to use it and just send multiple routes.

The patch could be against the int-new branch for BIRD 2. Or you could send two patches, one against master for BIRD 1.x and one against int-new for BIRD 2.

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
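A rough sketch of the "just send multiple routes" option, under the assumption that the daemon can obtain the full set of next hops currently in the kernel for a prefix (for instance from a periodic scan). It only computes which per-next-hop operations would be needed; the `("add", ...)` and `("del", ...)` tuples are placeholders, not real netlink messages, and `plan_updates` is a hypothetical name.

```python
def plan_updates(prefix, desired, in_kernel):
    """Return per-next-hop operations that turn in_kernel into desired.

    Each next hop is treated as its own route, so the result is a list of
    single-next-hop adds and deletes instead of one multipath replace.
    """
    desired, in_kernel = set(desired), set(in_kernel)
    adds = [("add", prefix, nh) for nh in sorted(desired - in_kernel)]
    dels = [("del", prefix, nh) for nh in sorted(in_kernel - desired)]
    return adds + dels


# State from the original report: bird wants both gateways,
# but only fc57::2 is left in the kernel table.
print(plan_updates("fd57::1/128", {"fc57::1", "fc57::2"}, {"fc57::2"}))

# Nothing to do when the kernel already holds every wanted next hop.
print(plan_updates("fd57::1/128", {"fc57::1", "fc57::2"}, {"fc57::1", "fc57::2"}))
```

Because only the missing or stale next hops are touched, a scan that already shows every wanted next hop yields an empty plan, which would avoid re-adding routes that exist.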
I would also be very much interested in this. Back in 2015-07 I started a similar thread, and would be willing to help implement something. I'd need a few pointers into the code, though, as Bird isn't exactly trivial and time is unfortunately a scarce resource for us all.

This is the one feature that I really miss on Bird, coming from other Linux routing daemons. I migrated to Bird because it's much more practical to configure, and stable for that matter. But the IPv6 ECMP would sure be handy :)

-- 
Israel G. Lugo
Núcleo de Redes e Comunicações
Direção de Serviços de Informática
Instituto Superior Técnico

On 27-01-2016 10:37, Ondrej Zajicek wrote:
Hi
Sorry for the late answer. Patches for handling IPv6 ECMP support in Linux with the current API could be accepted if they are not too crazy. Unfortunately, the API has several problems that complicate its usage from BIRD. E.g.:
1. When asynchronous updates are received, they do not contain the whole route, just the modified next hop.
2. When a new ECMP route appears, it is announced as a sequence of next hops, but there is AFAIK no flag for 'this is the last next hop'.
It seems to me that there are several options for working around that. For a start, we could support IPv6 ECMP only in non-learn mode. Or we could support learn/import (for IPv6 ECMP) during periodic scans only.
RTA_MULTIPATH in the IPv6 Linux API exists for backwards compatibility (although it does not really provide it), so it would make sense not to use it and just send multiple routes.
The patch could be against the int-new branch for BIRD 2. Or you could send two patches, one against master for BIRD 1.x and one against int-new for BIRD 2.
Participants (4): Arno Töll, Israel G. Lugo, Ondrej Zajicek, Wilco Baan Hofman