BIRD BGP and VRF - Cannot assign requested address
Hello,

I am trying to set up 2 BIRD routers as AS border routers (on Debian 9). Installation is as follows:

  |BIRD router 1|-----IBGP-----|BIRD router 2|
         |                            |
       EBGP                         EBGP
         |                            |
|Hardware router 1|---IBGP---|Hardware router 2|
|  (downstream)   |          |  (downstream)   |

(Later the BIRD routers would also be connected to our upstream ISPs, but for the sake of simplicity let's just consider this part of the setup.)

I plan to use both IPv4 and IPv6. On the BIRD routers, both the forwarding and the BGP peerings are to take place in a Linux VRF named "internet" (using l3mdev, not namespaces or custom ip rules). At the BIRD level I also set up a table named "internet".

Now, to my problem: the EBGP peerings with the downstream routers are up, but the IBGP session between the two BIRD routers won't come up. I get the following error messages.

On the IPv4 peering, the service complains that:

    ibgp_internet: Socket error: bind: Cannot assign requested address

On the IPv6 peering, the service doesn't complain, but using birdc6 I can see that the protocol says:

    Socket: Network is unreachable

Both routers have almost identical config (apart from the IP addresses of course). Here is the conf on BIRD router 1:

------------------------------------------------------
---------------------bird.conf------------------------
------------------------------------------------------
table internet;
router id 10.206.81.36;
define my_as=205555;

protocol device { }

protocol kernel {
    kernel table 10;
    table internet;
    metric 64;
    import all;
    export all;
    scan time 10;
}

protocol static default_route_to_bgp {
    import all;
    table internet;
    route 0.0.0.0/0 reject;
}

template bgp template_base_bgp {
    med metric;
}

protocol bgp ibgp_internet from template_base_bgp {
    table internet;
    source address 10.206.81.81;
    local as my_as;
    neighbor 10.206.81.82 as my_as;
    import all;
    export all;
    next hop self;
}

protocol bgp downstream_internet from template_base_bgp {
    table internet;
    export where proto = "default_route_to_bgp";
    source address 10.206.81.90;
    local as my_as;
    neighbor 10.206.81.89 as 65206;
}
------------------------------------------------------
------------------end of bird.conf--------------------
------------------------------------------------------

------------------------------------------------------
---------------------bird6.conf-----------------------
------------------------------------------------------
table internet;
router id 10.206.81.36;
define my_as=205555;

protocol device { }

protocol kernel {
    kernel table 10;
    table internet;
    metric 64;
    import all;
    export all;
    scan time 10;
}

template bgp template_base_bgp {
    med metric;
}

protocol bgp ibgp_internet from template_base_bgp {
    table internet;
    source address 2a0b:95c0:0000:0101::1;
    local as my_as;
    neighbor 2a0b:95c0:0000:0101::2 as my_as;
    import all;
    export all;
    next hop self;
}

protocol static default_route_to_bgp {
    import all;
    table internet;
    route ::/0 reject;
}

protocol bgp downstream_internet from template_base_bgp {
    table internet;
    export where proto = "default_route_to_bgp";
    source address 2a0b:95c0:0000:0102::2;
    local as my_as;
    neighbor 2a0b:95c0:0000:0102::1 as 65206;
}
------------------------------------------------------
-----------------end of bird6.conf--------------------
------------------------------------------------------

The network interfaces seem to be properly up and attached to the internet VRF. Here is the IBGP one:

# ip addr show eth1.3
7: eth1.3@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master internet state UP group default qlen 1000
    link/ether 94:18:82:ab:8b:ec brd ff:ff:ff:ff:ff:ff
    inet 10.206.81.81/29 brd 10.206.81.87 scope global eth1.3
       valid_lft forever preferred_lft forever
    inet6 2a0b:95c0:0:101::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::9618:82ff:feab:8bec/64 scope link
       valid_lft forever preferred_lft forever

and here is the EBGP one (the one which has the functional peering):

# ip addr show ens1.14
8: ens1.14@ens1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master internet state UP group default qlen 1000
    link/ether 9c:dc:71:45:5f:e0 brd ff:ff:ff:ff:ff:ff
    inet 10.206.81.90/29 brd 10.206.81.95 scope global ens1.14
       valid_lft forever preferred_lft forever
    inet6 2a0b:95c0:0:102::2/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::9edc:71ff:fe45:5fe0/64 scope link
       valid_lft forever preferred_lft forever

Now the "internet" kernel routing table. We can see that the local and broadcast routes are attached to it too:

# ip route show table internet
unreachable default proto bird metric 64
broadcast 10.206.81.80 dev eth1.3 proto kernel scope link src 10.206.81.81
10.206.81.80/29 dev eth1.3 proto kernel scope link src 10.206.81.81
local 10.206.81.81 dev eth1.3 proto kernel scope host src 10.206.81.81
broadcast 10.206.81.87 dev eth1.3 proto kernel scope link src 10.206.81.81
broadcast 10.206.81.88 dev ens1.14 proto kernel scope link src 10.206.81.90
10.206.81.88/29 dev ens1.14 proto kernel scope link src 10.206.81.90
local 10.206.81.90 dev ens1.14 proto kernel scope host src 10.206.81.90
broadcast 10.206.81.95 dev ens1.14 proto kernel scope link src 10.206.81.90

Finally, netstat shows that the BGP service is listening on * as it is supposed to. We also see the established EBGP downstream sessions:

# netstat -an | grep 179
tcp        0      0 0.0.0.0:179             0.0.0.0:*               LISTEN
tcp        0      0 10.206.81.90:179        10.206.81.89:9266       ESTABLISHED
tcp6       0      0 :::179                  :::*                    LISTEN
tcp6       0      0 2a0b:95c0:0:102::2:179  2a0b:95c0:0:102:::64247 ESTABLISHED

and tcp_l3mdev_accept is enabled in the kernel config:

# sysctl net.ipv4.tcp_l3mdev_accept
net.ipv4.tcp_l3mdev_accept = 1

So overall it seems like BIRD fails to bind to one interface. This is possibly linked to the kernel VRF setup, but I can't see why it doesn't work when the peerings to the downstream routers, which use the same VRF, do work properly.

Any help would be greatly appreciated.

Thanks.
clément
On Fri, Aug 04, 2017 at 04:43:01PM +0200, Clément Guivy wrote:
Hello, I am trying to set up 2 BIRD routers as AS border routers (on Debian 9). ... I plan to use both ipv4 and ipv6 and also, on BIRD routers both the forwarding and the BGP peerings are to take place in a linux vrf named "internet" (using l3mdev, not namespaces or custom ip rules). At the BIRD level I also set up a table named "internet". ... So overall it seems like BIRD fails to bind to one interface, this is possibly linked to the kernel VRF setup but I can't see why it doesn't work when the peering to downstream routers using the same VRF do work properly.
Hi,

I guess the difference is that EBGP sessions are bound to the associated interface (direct mode), while IBGP sessions are multihop and not associated with a particular interface (multihop mode). Although a source address is specified, it is used just for the regular bind() operation, which AFAIK does not cause the TCP connection to be associated with a particular VRF/l3mdev. For that it is necessary to use the SO_BINDTODEVICE socket option, which is used only in direct mode.

You could try the 'direct' option for IBGP to run it in direct mode.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
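To make the distinction between a plain bind() and SO_BINDTODEVICE concrete, here is a minimal Python sketch. It is not BIRD's code: the helper name open_tcp_socket is hypothetical, and the fallback value 25 is the Linux constant, assumed only for Python builds that do not export socket.SO_BINDTODEVICE.

```python
import socket

# Fallback value 25 is the Linux constant, used only if this Python build
# does not export socket.SO_BINDTODEVICE (assumption for illustration).
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def open_tcp_socket(source_addr, device=None):
    """Create a TCP socket bound to source_addr, optionally tied to a device."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if device is not None:
        # Ties the socket to one interface, and through it to that
        # interface's VRF. May require root or CAP_NET_RAW, depending
        # on the kernel version.
        sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                        device.encode() + b"\0")
    # Plain bind(): selects the source address only; port 0 means any free port.
    sock.bind((source_addr, 0))
    return sock
```

With the addresses from this thread, open_tcp_socket("10.206.81.81") would correspond to the multihop-style socket (source address only), and open_tcp_socket("10.206.81.81", "eth1.3") to the direct-style socket that is also tied to an interface.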
On 04/08/2017 23:00, Ondrej Zajicek wrote:
You could try the 'direct' option for IBGP to run it in direct mode.
Thanks, I missed this difference between IBGP and EBGP. Now with this "direct" setting it's getting better: no more complaining from the service. However, the BGP session is now stuck in the "Connect" state. Netstat shows SYN_SENT status for the TCP session, and a tcpdump trace shows that the conversation between the two BIRD routers is limited to SYN packet => immediate SYN+ACK response => immediate RST response, this 3-step process being repeated every few seconds. Not sure what to conclude from there.

I tried to enable debug with the "debug ibgp_internet all" command and check the syslog, but it just keeps repeating the following:

bird: ibgp_internet: Connecting to 10.206.81.82 from local address 10.206.81.81

(same log on the other router, with the IP addresses in reverse order of course)

I also tried to modify the IPv6 peering in order to use the link-local addresses, and added the following setting:

interface "eth1.3";

Which established the IPv6 peering! This does not really fit my needs (since my understanding is that this setting is limited to IPv6 link-local peerings), but maybe it's relevant.

Please tell me if you need any more info. I appreciate your help.
On Sat, Aug 05, 2017 at 01:53:11AM +0200, Clément Guivy wrote:
Thanks, I missed this difference between IBGP and EBGP. Now with this "direct" setting it's getting better, no more complaining from the service. However BGP session is now stuck in "Connect" state. Netstat shows SYN_SENT status for the TCP session, and a tcpdump trace shows that conversation between the two BIRD routers is limited to SYN packet => immediate SYN+ACK response => immediate RST response, this 3-step process being repeated every few seconds. Not sure what to conclude from there.
I tried to enable debug with "debug ibgp_internet all" command and check the syslog, but it just keeps repeating the following :
bird: ibgp_internet: Connecting to 10.206.81.82 from local address 10.206.81.81
It seems that the receiving side does not accept the incoming connections, hence the RST response and no receive event in the log. AFAIK, a socket listening on 0.0.0.0 should receive connections regardless of VRFs, so that should not be the problem.

Do the EBGP sessions work if you use the 'passive' option, i.e. force BIRD to accept incoming sessions instead of relying on outgoing ones?

One thing to try would be to use the 'listen bgp' option with the IP address and interface used for the IBGP session.

Also, which Linux kernel version do you use?
On Sat, Aug 05, 2017 at 01:02:08PM +0200, Ondrej Zajicek wrote:
It seems like receiving sides do not accept incoming connections, hence RST response and no receiving event in log. AFAIK, socket listening on 0.0.0.0 should receive connections regardless of VRFs, so it should not be a problem.
Hi,

I replicated the problem and noticed that the RST comes from the sender, so it is not related to the listening socket but to the connecting socket.

I found that it is probably a bug/behavior of the Linux VRF implementation. A socket can be bound to an iface, which is also used to choose the related VRF. For UDP sockets, this works for both VRF ifaces and underlying (real) ifaces. But for TCP (and perhaps ICMP) sockets it seems to work only for VRF ifaces, while BIRD tries to bind the socket to the real iface. Similarly, I cannot ping in the VRF using 'ping -I eth0 A.B.C.D', while I can ping with 'ping -I vrf0 A.B.C.D' when eth0 is an interface under vrf0.

A very ugly workaround for BIRD BGP is to also add the appropriate IP addresses to the vrf iface (with the 'noprefixroute' option so as not to mess up the routing table) and then use the 'interface' BGP protocol option with the vrf interface. In your case:

    ip addr add 10.206.81.81/29 dev internet noprefixroute

    protocol bgp ibgp_internet from template_base_bgp {
        table internet;
        local 10.206.81.81 as my_as;
        neighbor 10.206.81.82 as my_as;
        interface "internet";
        direct;
        ...
    }
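The comparison described above (binding to the real iface versus the VRF iface) can be reproduced outside BIRD with a small probe. The following Python sketch is only an illustration under assumptions: probe_connect is a hypothetical helper, the fallback constant 25 is the Linux value of SO_BINDTODEVICE, and setting that option may need root or CAP_NET_RAW.

```python
import socket

# Fallback value 25 is the Linux constant, used only if this Python build
# does not export socket.SO_BINDTODEVICE (assumption for illustration).
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def probe_connect(dest_addr, dest_port, device=None, timeout=2.0):
    """Try one TCP connect, optionally bound to `device`; return a short status."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        if device is not None:
            # Same socket option that a direct-mode session relies on.
            sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                            device.encode() + b"\0")
        sock.connect((dest_addr, dest_port))
        return "connected"
    except OSError as exc:
        # Covers refused connections, timeouts and permission errors alike.
        return f"failed: {type(exc).__name__}"
    finally:
        sock.close()
```

On a setup like the one in this thread, one would call probe_connect("10.206.81.82", 179, device="eth1.3") and then the same with device="internet" and compare the two results. What each call returns depends on the kernel version, so no expected output is given here.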
On 05/08/2017 23:55, Ondrej Zajicek wrote:
I found that it is probably a bug/behavior of Linux VRF implementation. Socket can be bound to an iface, which is also used to choose related VRF. For UDP sockets, it works for both VRF ifaces and underlying (real) ifaces. But for TCP (and perhaps ICMP) sockets it seems to work only for VRF ifaces, while BIRD tries to bind the socket with the real iface.
A very ugly workaround for BIRD BGP is to add appropriate IP addresses also to vrf iface (with 'noprefixroute' option to not mess routing table) and then use 'interface' BGP protocol option with vrf interface.
Thanks for your answer.

First, to respond to your previous mail: I'm using the stock Debian kernel 4.9.0.3. I have read the changelogs for versions 4.10 and 4.11 and didn't find anything related to my case.

What I don't get with the Linux bug/behavior idea is that the peering with the downstream router works fine, where I would expect it to fail as well since it uses the same vrf setup (it is EBGP instead of IBGP, but I don't see that making a difference from the kernel point of view?).

I tried the replicated address on the vrf interface trick and the "interface" option as you suggested, but the service won't start:

###########################################
bird: /etc/bird/bird.conf, line 58: Link-local address and interface scope must be used together
###########################################

As per the documentation this error makes sense, as the option should only be used with link-local addresses. Am I missing something?

Nonetheless, with just the replicated address on the vrf interface, the peering establishes. bird6 just complains a little, but that doesn't seem too bad:

###########################################
bird6: ibgp_internet: Missing link local address on interface internet
###########################################

But I wonder if this behavior is deterministic (and if yes, according to which algorithm), or if the system could at some point revert to binding to eth1.3 and get back to its prior behaviour (sending RST after receiving SYN+ACK). I tried to reboot and to bring interfaces down/up; for now it keeps re-establishing the peering.

Also, being bound to a virtual interface, bird doesn't benefit from physical link failure detection. The "check link" option doesn't work, which I guess makes sense since it probably tracks the state of the vrf interface itself, which doesn't go down. At least I could use BFD to circumvent that, I suppose.
On Sun, Aug 06, 2017 at 08:05:38AM +0200, Clément Guivy wrote:
Thanks for your answer. First to respond to your previous mail, I'm using stock Debian kernel 4.9.0.3. I have read the changelog for version 4.10 and 4.11, didn't find anything related to my case.
What I don't get with the Linux bug/behavior idea is that the peering with the downstream router works fine where I would expect it to fail as well since it uses the same vrf setup (it is EBGP instead of IBGP but I don't see that making a difference from the kernel point of view ?).
Hi,

The difference is that this bug affects only outgoing connections, not incoming connections. In your IBGP case, both routers are affected by the bug, so no connection is possible. In your EBGP case, the incoming connections come from the hardware routers, which are not affected by the bug.
I tried the replicated address in the vrf interface trick and the "interface" option as you suggested, but the service won't start :
As per the documentation this error makes sense as it should be only used with link-local addresses. Am I missing something ?
My mistake. That usage of the 'interface' option was added in commit 33b6c292c3e3a8972d0b9f43d156aae50db65720 [1], which is newer than version 1.6.3, which you are probably using.

[1] https://gitlab.labs.nic.cz/labs/bird/commit/33b6c292c3e3a8972d0b9f43d156aae5...
Nonetheless, with just the replicated address in the vrf interface, the peering establishes. bird6 just complains a little but that doesn't seem too bad :
########################################### bird6: ibgp_internet: Missing link local address on interface internet ###########################################
This can be ignored.
But I wonder if this behavior is deterministic (and if yes according to which algorithm), or if the system could at some point revert to bind to eth1.3 and get back to its prior behaviour (sending RST after receiving SYN+ACK). I tried to reboot and bring down/up interfaces, for now it keeps re-establishing peering.
I don't think it is deterministic in the BIRD code. If you have multiple interfaces with the same prefix, the interface selected for a given IP probably depends on the order in which BIRD found them.
On 06/08/2017 12:27, Ondrej Zajicek wrote:
The difference is that this bug affects only outgoing connections, not incoming connections. In your IBGP case, both routers are affected by the bug, so no connection is possible. In your EBGP case, incoming connections are from hardware routers not affected by the bug.
ok, that makes sense.
My mistake. It is commit 33b6c292c3e3a8972d0b9f43d156aae50db65720 [1], which is newer than the version 1.6.3, which you are probably using.
[1] https://gitlab.labs.nic.cz/labs/bird/commit/33b6c292c3e3a8972d0b9f43d156aae5...
that's in version 2 I suppose ? (don't have much knowledge of gitlab sorry). If so I'd rather not use it until it's out of beta. Good to know it's coming though.
I don't think it is deterministic by BIRD code, if you have multiple interfaces with the same prefix, the selected interface for given IP depends probably on the order in which BIRD found them.
Not sure I fully understand here. What if I set different IPs on eth1.3 and on the internet vrf interface (while leaving them in the same prefix)? Since in bird.conf we specify a source IP address (not a prefix), wouldn't that be enough to guarantee that the internet vrf interface is picked and not eth1.3?
On Sun, Aug 06, 2017 at 05:03:00PM +0200, Clément Guivy wrote:
On 06/08/2017 12:27, Ondrej Zajicek wrote:
The difference is that this bug affects only outgoing connections, not incoming connections. In your IBGP case, both routers are affected by the bug, so no connection is possible. In your EBGP case, incoming connections are from hardware routers not affected by the bug.
ok, that makes sense.
I got confirmation from David Ahern that a kernel patch fixing the behavior for sockets bound to VRF-enslaved devices was just released yesterday:

http://www.spinics.net/lists/netdev/msg448040.html
http://www.spinics.net/lists/netdev/msg448223.html
My mistake. It is commit 33b6c292c3e3a8972d0b9f43d156aae50db65720 [1], which is newer than the version 1.6.3, which you are probably using.
[1] https://gitlab.labs.nic.cz/labs/bird/commit/33b6c292c3e3a8972d0b9f43d156aae5...
that's in version 2 I suppose ? (don't have much knowledge of gitlab sorry). If so I'd rather not use it until it's out of beta. Good to know it's coming though.
It is devel code for branch 1.6.x.
I don't think it is deterministic by BIRD code, if you have multiple interfaces with the same prefix, the selected interface for given IP depends probably on the order in which BIRD found them.
Not sure I fully understand here, what if I set different IP to eth1.3 and internet vrf interface (while leaving them in the same prefix) ? Since in the bird.conf we specify a source IP address (not a prefix), wouldn't that be enough to guarantee that internet vrf interface is picked and not eth1.3 ?
No, because the search is based on the IP address of the neighbor. But if your interfaces are created in a fixed order during boot, then they will be enumerated in a fixed order and the same iface will get selected.
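The neighbor-based search described above can be illustrated with a simplified model. This Python sketch is not BIRD's actual code: the first-match rule and the pick_interface helper are assumptions made for illustration, based only on the description in this thread.

```python
import ipaddress


def pick_interface(neighbor, interfaces):
    """Return the first interface whose prefix contains the neighbor address.

    `interfaces` is an ordered list of (name, address/prefixlen) pairs that
    stands in for the order in which the interfaces were enumerated.
    """
    neighbor_ip = ipaddress.ip_address(neighbor)
    for name, prefix in interfaces:
        # strict=False lets a host address such as 10.206.81.81/29 be
        # interpreted as its containing network, 10.206.81.80/29.
        if neighbor_ip in ipaddress.ip_network(prefix, strict=False):
            return name
    return None


# Same prefix on the real interface and on the VRF device, as in the workaround.
eth_first = [("eth1.3", "10.206.81.81/29"), ("internet", "10.206.81.81/29")]
vrf_first = list(reversed(eth_first))

print(pick_interface("10.206.81.82", eth_first))  # eth1.3
print(pick_interface("10.206.81.82", vrf_first))  # internet
```

In this model, giving the VRF device a different host address inside the same /29 changes nothing, because the lookup key is the neighbor address 10.206.81.82 and both prefixes still contain it. Only the enumeration order decides which interface is returned.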
On 08/08/2017 22:52, Ondrej Zajicek wrote:
I got confirmed from David Ahern that kernel patch fixing the behavior for sockets bound to VRF-enslaved devices was just released yesterday:
Great, I had seen the start of that patch proposal and initial refusal but didn't catch the follow-up. Thanks.
But if your interfaces are created in fixed order during boot, then they will be enumerated in fixed order and you get selected the same iface.
By chance do you know how I can determine whether or not my interfaces are created in fixed order at boot? If that matters they are all simply configured in the interfaces files, except the vrf one which is created using a @reboot crontab.
Participants (2): Clément Guivy, Ondrej Zajicek