BIRD and ECMP on Linux seems flaky

Wilco Baan Hofman wilco at baanhofman.nl
Tue Jan 12 00:09:34 CET 2016


On 11/01/16 18:38, Arno Töll wrote:
> Hi list,
>
> I've been experimenting with bird's ECMP features added to the current git
> head a while back [1] on my Debian based Linux system. I tried the setup below
> with git head as of today.
>
> I have three Linux routers running bird: two I call gw (gateway) and one
> acts as a frontend. Each gateway establishes one BGP session to the
> frontend and advertises fd57::1 and fd57::2 to it.
>
> The frontend accepts both advertisements and exports both to the
> kernel table. All output below comes from this bird. I configured bird
> like this:
>
>
>
> log syslog { debug, trace, info, remote, warning, error, auth, fatal, bug };
>
> router id 10.3.101.3;
>
> # Filters
>
> filter kernel_export {
>     if net ~ [ fd57::/64{128,128} ] then accept;
>     reject;
> }
>
>
> # BGP Filters
>
> filter bgp_import {
>     if net ~ [ fd57::/64{128,128} ] then accept;
>     reject;
> }
>
> filter bgp_export {
>     reject;
> }
>
> # Local devices
> protocol device {
>     scan time 10;
> }
>
> protocol direct {
>     interface "*";
> }
>
> protocol kernel {
>     import none;
>     #learn;
>     merge paths on;
>     export filter kernel_export;
> }
>
>
> # BGP peers
>
> protocol bgp 'gw1' {
>     description "gw1";
>     default bgp_local_pref 100;
>     local fc57::3 as 65001;
>     neighbor fc57::1 as 65000;
>     next hop self;
>     import filter bgp_import;
>     export filter bgp_export;
>     hold time 30;
>     error wait time 5, 30;
> }
>
> protocol bgp 'gw2' {
>     description "gw2";
>     default bgp_local_pref 100;
>     local fc57::3 as 65001;
>     neighbor fc57::2 as 65000;
>     next hop self;
>     import filter bgp_import;
>     export filter bgp_export;
>     hold time 30;
>     error wait time 5, 30;
> }
>
> On the system I have this address configuration:
>
> root@debian:~# ip addr show eth0
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 64000 qdisc pfifo_fast state UP group default qlen 1000
>     link/ether 02:01:93:0b:5a:e9 brd ff:ff:ff:ff:ff:ff
>     inet 10.10.216.12/24 brd 10.10.216.255 scope global eth0
>        valid_lft forever preferred_lft forever
>     inet6 fc57::3/64 scope global
>        valid_lft forever preferred_lft forever
>     inet6 fe80::1:93ff:fe0b:5ae9/64 scope link
>        valid_lft forever preferred_lft forever
>
>
> in bird:
>
>
> root@debian:~# birdc6
> BIRD 1.5.0 ready.
> bird> show protocols
> name     proto    table    state  since       info
> device1  Device   master   up     14:53:54
> direct1  Direct   master   up     14:53:54
> kernel1  Kernel   master   up     14:53:54
> static1  Static   master   up     14:53:54
> gw1      BGP      master   up     14:53:58    Established
> gw2      BGP      master   up     14:53:55    Established
> bird> show route
> fd57::2/128        via fc57::1 on eth0 [gw1 14:53:58] ! (100) [AS65000i]
>                    via fc57::2 on eth0 [gw2 14:53:55] (100) [AS65000i]
> fd57::1/128        via fc57::1 on eth0 [gw1 14:53:58] ! (100) [AS65000i]
>                    via fc57::2 on eth0 [gw2 14:53:55] (100) [AS65000i]
> fc57::/64          dev eth0 [direct1 14:53:54] * (240)
>
>
>
> As a result, the multipath routes are installed into the kernel routing
> table as expected once the sessions are up:
>
> root@debian:~# ip -6 route show
> fc57::/64 dev eth0  proto kernel  metric 256
> fd57::1 via fc57::1 dev eth0  proto bird  metric 1024
> fd57::1 via fc57::2 dev eth0  proto bird  metric 1024
> fd57::2 via fc57::1 dev eth0  proto bird  metric 1024
> fd57::2 via fc57::2 dev eth0  proto bird  metric 1024
> fe80::/64 dev eth0  proto kernel  metric 256
>
> However, after some time, bird seems to get confused by the routes it
> installed and removes the multipath route again. This shows up in ip -6
> route show:
>
> root@debian:~# ip -6 route show
> fc57::/64 dev eth0  proto kernel  metric 256
> fd57::1 via fc57::2 dev eth0  proto bird  metric 1024
> fd57::2 via fc57::2 dev eth0  proto bird  metric 1024
> fe80::/64 dev eth0  proto kernel  metric 256
>
> In bird, however, both routes are still received:
>
> root@ps:~# birdc6 show route all
> BIRD 1.5.0 ready.
> fd57::2/128        via fc57::1 on eth0 [gw1 16:25:25] ! (100) [AS65000i]
>         Type: BGP unicast univ
>         BGP.origin: IGP
>         BGP.as_path: 65000
>         BGP.next_hop: fc57::1 fe80::1:bdff:feab:7f12
>         BGP.local_pref: 100
>                    via fc57::2 on eth0 [gw2 16:25:23] (100) [AS65000i]
>         Type: BGP unicast univ
>         BGP.origin: IGP
>         BGP.as_path: 65000
>         BGP.next_hop: fc57::2 fe80::1:bdff:fe05:64b7
>         BGP.local_pref: 100
> fd57::1/128        via fc57::1 on eth0 [gw1 16:25:25] ! (100) [AS65000i]
>         Type: BGP unicast univ
>         BGP.origin: IGP
>         BGP.as_path: 65000
>         BGP.next_hop: fc57::1 fe80::1:bdff:feab:7f12
>         BGP.local_pref: 100
>                    via fc57::2 on eth0 [gw2 16:25:23] (100) [AS65000i]
>         Type: BGP unicast univ
>         BGP.origin: IGP
>         BGP.as_path: 65000
>         BGP.next_hop: fc57::2 fe80::1:bdff:fe05:64b7
>         BGP.local_pref: 100
> ...
>
>
> In the bird log, with debug output enabled I can see:
>
> Jan 11 15:12:46 ps bird6: gw1: Got KEEPALIVE
> Jan 11 15:12:49 ps bird6: gw2: Got KEEPALIVE
> Jan 11 15:12:50 ps bird6: gw1: Sending KEEPALIVE
> Jan 11 15:12:52 ps bird6: gw2: Sending KEEPALIVE
> Jan 11 15:12:54 ps bird6: device1: Scanning interfaces
> Jan 11 15:12:54 ps bird6: kernel1: Scanning routing table
> Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: will be updated
> Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: already seen
> Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: will be updated
> Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: already seen
> Jan 11 15:12:54 ps bird6: kernel1: Pruning table master
> Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: updating
> Jan 11 15:12:54 ps bird6: Netlink: File exists
> Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: updating
> Jan 11 15:12:54 ps bird6: Netlink: File exists
>
>
> After a while, the problem fixes itself, both routes are installed again, and
> then the problem reappears in the next cycle.
>
> Is there a way around this, or is this actually a bug? To me this looks as if
> bird were scanning its own routes and wrongly picking up only one of them.
> Experimenting with "import all", "learn" etc. for the kernel protocol seems to
> make no difference.
>
>
This is actually a known bug: the Linux kernel does not properly do ECMP
for IPv6.

In this case, the API is not symmetrical. You can install a route using
the multipath structures, but the Linux kernel splits it up into separate
routes internally, because with IPv6 you can have multiple routes to the
same destination that are not linked together (why? Maybe so that one of
the nexthops can be added or removed independently).
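
To make the split concrete, here is a minimal sketch in Python that regroups
the per-nexthop lines printed by "ip -6 route show" into one nexthop set per
destination. The sample text is the output from the original report; the
function name group_nexthops is made up for illustration, and the parser only
handles the simple "via ... dev ..." form shown there.

```python
from collections import defaultdict

# Output of "ip -6 route show" as quoted in the original report.
SAMPLE = """\
fc57::/64 dev eth0  proto kernel  metric 256
fd57::1 via fc57::1 dev eth0  proto bird  metric 1024
fd57::1 via fc57::2 dev eth0  proto bird  metric 1024
fd57::2 via fc57::1 dev eth0  proto bird  metric 1024
fd57::2 via fc57::2 dev eth0  proto bird  metric 1024
fe80::/64 dev eth0  proto kernel  metric 256
"""


def group_nexthops(text, proto="bird"):
    """Collect (gateway, device) pairs per destination for one route protocol.

    Hypothetical helper: it merges the separate kernel entries back into
    the multipath view that was originally installed.
    """
    routes = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and routes without a gateway, device or protocol.
        if not all(key in fields for key in ("via", "dev", "proto")):
            continue
        if fields[fields.index("proto") + 1] != proto:
            continue
        gateway = fields[fields.index("via") + 1]
        device = fields[fields.index("dev") + 1]
        routes[fields[0]].add((gateway, device))
    return dict(routes)


grouped = group_nexthops(SAMPLE)
for destination, nexthops in sorted(grouped.items()):
    print(destination, sorted(nexthops))
```

Each destination ends up with both gateways in a single set, which is the view
a multipath-aware consumer would want, whereas the kernel hands back two
unrelated entries.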

When scanning, it looks to bird as though the routes have changed,
because the kernel represents them differently from how bird installed
them.
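
The mismatch can be modelled in a few lines of Python. This is an illustrative
model only, not BIRD's actual scan code; the function names and the
representation of a route as a set of gateway addresses are assumptions made
for the sketch.

```python
# Illustrative model only; this is not BIRD's actual scan code.

def changed_per_entry(installed, scanned):
    """Treat every scanned kernel entry as a complete route on its own."""
    return any(frozenset([gateway]) != installed for gateway in scanned)


def changed_merged(installed, scanned):
    """Merge all scanned entries for the destination before comparing."""
    return frozenset(scanned) != installed


# One multipath route for fd57::1 with two nexthops, as installed.
installed = frozenset({"fc57::1", "fc57::2"})
# The same route as the kernel reports it back: two unrelated entries.
scanned = ["fc57::1", "fc57::2"]

print(changed_per_entry(installed, scanned))  # True: looks like a change
print(changed_merged(installed, scanned))     # False: same nexthop set
```

Comparing entry by entry flags the route as changed on every scan even though
nothing changed, which would match the repeated "will be updated" lines in the
log; comparing the merged set does not.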

Because multiple routes to the same destination in the Linux kernel have
no relation to each other anymore, don't expect weighted ECMP to work
on IPv6 either.

I would love it if somebody fixed this shit and made it work like IPv4,
or at least put weights into ECMP on IPv6 and allowed per-packet and
per-flow ECMP.


-- Wilco