Hi,

I'm trying to set up a BFD multihop session between Bird 1.6.3 and a Juniper MX router to get fast failover for eBGP multihop sessions.

I tried different configurations with no success. The current one is:
------------------------
protocol bfd {
    interface "*" {
        interval 300 ms;
        multiplier 3;
    };
    multihop {
        interval 300 ms;
        multiplier 3;
    };
}

protocol bgp {
    import none;
    multihop 2;
    export filter vips_filter;
    local 208.80.153.77 as 64605;
    neighbor 208.80.153.192 as 14907;
    bfd yes;
}
------------------------
And the Juniper side:
------------------------
type external;
multihop {
    ttl 2;
}
local-address 208.80.153.192;
import anycast_import;
family inet {
    unicast {
        prefix-limit {
            maximum 50;
            teardown;
        }
    }
}
export NONE;
peer-as 64605;
bfd-liveness-detection {
    minimum-interval 300;
}
multipath;
neighbor 208.80.153.77;
------------------------

The firewalls on both sides allow udp/4784, udp/3784 and udp/3785 to/from each side.

Bird's BFD state stays at Init, while Junos stays at Down.
------------------------
bird> show bfd sessions
bfd1:
IP address                Interface  State      Since       Interval  Timeout
208.80.153.192            ---        Init       15:47:25      2.000    6.000
------------------------
cr1-codfw> show bfd session address 208.80.153.77 extensive
                                                  Detect   Transmit
Address                  State     Interface      Time     Interval  Multiplier
208.80.153.77            Down                     0.000     1.000        3
 Client BGP, TX interval 0.300, RX interval 0.300
 Local diagnostic None, remote diagnostic None
 Remote state AdminDown, version 1
 Replicated
 Session type: Multi hop BFD
 Min async interval 0.300, min slow interval 1.000
 Adaptive async TX interval 2.000, RX interval 2.000
 Local min TX interval 2.000, minimum RX interval 0.300, multiplier 3
 Remote min TX interval 0.000, min RX interval 0.000, multiplier 0
 Local discriminator 3556, remote discriminator 0
 Echo mode disabled/inactive, no-absorb, no-refresh
 Multi-hop min-recv-TTL 254, route table 0, local-address 208.80.153.192
 Session ID: 0x4d954

1 sessions, 1 clients
Cumulative transmit rate 1.0 pps, cumulative receive rate 0.0 pps
------------------------

We can see that Bird does send and receive BFD control packets, from/to the proper IPs.
------------------------
dns2001:~$ sudo tcpdump -p -i eno1 "host 208.80.153.192"
16:11:23.585910 IP cr1-codfw.wikimedia.org.49152 > dns2001.wikimedia.org.4784: UDP, length 24
16:11:23.643194 IP dns2001.wikimedia.org.35807 > cr1-codfw.wikimedia.org.4784: UDP, length 24
------------------------
dns2001:~$ tailf /tmp/bird-debug.log
2018-11-20 15:49:20 <TRACE> bfd1: Sending CTL to 208.80.153.192 [Init]
2018-11-20 15:49:20 <TRACE> bfd1: CTL received from 208.80.153.192 [Down C]
------------------------

Running "monitor traffic" on the proper Junos interface doesn't show any BFD packets, but I believe that's because BFD is offloaded to the hardware (and so not seen on the routing engine).
BGP, on the other hand, is properly established, so it's not a routing issue.

I have some more troubleshooting notes in our ticket: https://phabricator.wikimedia.org/T209989

Any pointers on misconfigurations, items to verify or other troubleshooting commands are welcome.

Thank you.

-- Arzhel
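For anyone who wants to look inside the 24-byte UDP payloads seen in the capture above, here is a minimal Python sketch that decodes the mandatory section of a BFD control packet as laid out in RFC 5880. The names `BfdControl` and `parse_bfd_control` are made up for illustration, and the sample bytes in the usage note are hand-built from values in the Junos output (discriminator 3556, TX 2.000 s, RX 0.300 s), not taken from a real capture.

```python
import struct
from dataclasses import dataclass

# State codes and flag bits from the RFC 5880 control packet header.
STATES = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}
FLAG_BITS = (("P", 0x20), ("F", 0x10), ("C", 0x08),
             ("A", 0x04), ("D", 0x02), ("M", 0x01))


@dataclass
class BfdControl:  # hypothetical container, for illustration only
    version: int
    diag: int
    state: str
    flags: str
    detect_mult: int
    length: int
    my_disc: int
    your_disc: int
    desired_min_tx_us: int
    required_min_rx_us: int
    required_min_echo_rx_us: int


def parse_bfd_control(payload: bytes) -> BfdControl:
    """Decode the 24-byte mandatory section of a BFD control packet."""
    if len(payload) < 24:
        raise ValueError("BFD control packet needs at least 24 bytes")
    b0, b1, mult, length, my_disc, your_disc, tx, rx, echo = struct.unpack(
        "!BBBBIIIII", payload[:24]
    )
    # Byte 1 carries the 2-bit state followed by the six flag bits.
    flags = "".join(name for name, bit in FLAG_BITS if b1 & bit)
    return BfdControl(
        version=b0 >> 5,
        diag=b0 & 0x1F,
        state=STATES[b1 >> 6],
        flags=flags,
        detect_mult=mult,
        length=length,
        my_disc=my_disc,
        your_disc=your_disc,
        desired_min_tx_us=tx,
        required_min_rx_us=rx,
        required_min_echo_rx_us=echo,
    )
```

A packet built with `struct.pack("!BBBBIIIII", 1 << 5, (1 << 6) | 0x08, 3, 24, 3556, 0, 2000000, 300000, 0)` decodes to state `Down` with the `C` flag set, which matches the `[Down C]` shown in the Bird trace.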
On Tue, Mar 05, 2019 at 05:21:44PM -0500, Arzhel Younsi wrote:
Hi,
I'm trying to setup BFD multihop session between Bird 1.6.3 and a Juniper MX router to fast failover eBGP multihop sessions.
Hi

I guess this could be strict src port checking in Juniper, see
https://bird.network.cz/?get_doc&v=16&f=bird-6.html#ss6.2 :

BFD packets are sent with a dynamic source port number. Linux systems use by default a bit different dynamic port range than the IANA approved one (49152-65535). If you experience problems with compatibility, please adjust /proc/sys/net/ipv4/ip_local_port_range

And also:
https://bird.network.cz/pipermail/bird-users/2015-August/009846.html

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
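The port-range check suggested above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the example value `32768 60999` is an assumption about a common Linux default rather than something stated in this thread.

```python
from typing import Tuple

# IANA dynamic/private port range, as quoted from the BIRD documentation.
IANA_DYNAMIC = (49152, 65535)


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the two whitespace-separated numbers found in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def within_iana_dynamic(low: int, high: int) -> bool:
    """True if every port in [low, high] lies inside the IANA dynamic range."""
    return IANA_DYNAMIC[0] <= low <= high <= IANA_DYNAMIC[1]


def read_local_port_range(
    path: str = "/proc/sys/net/ipv4/ip_local_port_range",
) -> Tuple[int, int]:
    """Read the current range from procfs (Linux only)."""
    with open(path) as handle:
        return parse_port_range(handle.read())
```

On a Linux host, `within_iana_dynamic(*read_local_port_range())` returning `False` would mean that BFD source ports such as the 35807 seen in the capture can fall outside 49152-65535.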
Thanks for your reply Ondrej,

I changed the port range as suggested and confirmed that BFD packets were leaving from a correct port, but the BFD session still stays down.

-- Arzhel
On Thu, Mar 07, 2019 at 07:13:58PM -0500, Arzhel Younsi wrote:
Thanks for your reply Ondrej,
I changed the port range as suggested, confirmed that BFD packets were leaving from a correct port, but the BFD session still stays down.
208.80.153.77            Down                     0.000     1.000        3
 Client BGP, TX interval 0.300, RX interval 0.300
 Local diagnostic None, remote diagnostic None
 Remote state AdminDown, version 1
 Replicated
 Session type: Multi hop BFD
 Min async interval 0.300, min slow interval 1.000
 Adaptive async TX interval 2.000, RX interval 2.000
 Local min TX interval 2.000, minimum RX interval 0.300, multiplier 3
 Remote min TX interval 0.000, min RX interval 0.000, multiplier 0
 Local discriminator 3556, remote discriminator 0
 Echo mode disabled/inactive, no-absorb, no-refresh
 Multi-hop min-recv-TTL 254, route table 0, local-address 208.80.153.192

Perhaps there is an issue with 'min-recv-TTL 254'. For single-hop BFD sessions, RFC 5881 requires the TTL security mechanism and therefore BIRD specifies outgoing TTL 255. For multi-hop BFD there is no such requirement and therefore BIRD uses the OS default TTL, which is AFAIK 64 on Linux.

You can check that with tcpdump and perhaps disable the check on Juniper or set /proc/sys/net/ipv4/ip_default_ttl on Linux.

--
Ondrej 'Santiago' Zajicek
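To make the suspected failure concrete, here is a small Python sketch of the receive-side TTL check. The function names are hypothetical, and the single router hop between the two peers is an assumption (the thread only shows `multihop 2`, not the actual path).

```python
def arrival_ttl(sent_ttl: int, router_hops: int) -> int:
    """TTL left when the packet reaches the receiver (one decrement per router)."""
    return sent_ttl - router_hops


def passes_min_recv_ttl(sent_ttl: int, router_hops: int, min_recv_ttl: int) -> bool:
    """True if the receiver's minimum-TTL check would accept the packet."""
    return arrival_ttl(sent_ttl, router_hops) >= min_recv_ttl


# Assumed path length: one router between Bird and the MX.
HOPS = 1

print(passes_min_recv_ttl(64, HOPS, 254))   # Linux default TTL vs. min-recv-TTL 254
print(passes_min_recv_ttl(255, HOPS, 254))  # TTL 255 vs. min-recv-TTL 254
```

With these assumptions, a packet sent with TTL 64 arrives with 63 and fails a minimum of 254, while one sent with TTL 255 arrives with 254 and passes.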
Bingo! As soon as I set the system TTL to 255, the session went up. Thanks a lot!

Now that we know where to look, we started to dig a bit in the code so that we don't have to change the TTL system-wide but only for Bird.

It seems like there is a TODO to make the TTL value customizable:
https://github.com/BIRD/bird/blob/master/proto/bfd/packets.c#L453
And in some (so far unknown) cases, it sets the TTL to 255:
https://github.com/BIRD/bird/blob/master/proto/bfd/packets.c#L456

-- Arzhel
On Tue, Mar 12, 2019 at 01:04:28PM -0400, Arzhel Younsi wrote:
Bingo! As soon as I set the system TTL to 255, the session went up. Thanks a lot!
Now that we know where to look, we started to dig a bit in the code so that we don't have to change the TTL system-wide but only for Bird.
It seems like there is a TODO to make the TTL value customizable:
https://github.com/BIRD/bird/blob/master/proto/bfd/packets.c#L453
And in some (so far unknown) cases, it sets the TTL to 255:
https://github.com/BIRD/bird/blob/master/proto/bfd/packets.c#L456
That is for single-hop BFD cases. As I wrote in the previous e-mail:

For single-hop BFD sessions, RFC 5881 requires the TTL security mechanism and therefore BIRD specifies outgoing TTL 255.

You can just change it to "sk->ttl = 255;" and recompile.

Is this 'min-recv-TTL 254' some special setting in Juniper, or its default BFD behavior? If the latter, then perhaps it would be best to file a bug report with Juniper, as they have packet checks that are not required by the BFD specifications.

--
Ondrej 'Santiago' Zajicek
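The point of patching `sk->ttl` is that the TTL becomes a property of the BFD socket rather than of the whole system. The sketch below shows the same idea with the standard `IP_TTL` socket option in Python. It is an analogy, not BIRD code, and it assumes (without checking the BIRD source) that BIRD's socket layer applies `sk->ttl` in an equivalent way. The helper name is hypothetical.

```python
import socket


def udp_socket_with_ttl(ttl: int) -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets carry the given TTL."""
    if not 1 <= ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Per-socket override; the system-wide ip_default_ttl stays untouched.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock


if __name__ == "__main__":
    sock = udp_socket_with_ttl(255)
    print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
    sock.close()
```

Only packets sent through this socket leave with TTL 255, so other traffic from the host keeps the OS default.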
Thanks again!

Junos uses the BGP multihop TTL value for BFD as well, and assumes the other side's default TTL is 255.

So if I do:
[edit protocols bgp group Anycast4 multihop]
-   ttl 2;
+   ttl 3;
Then Multi-hop min-recv-TTL drops to 253.

I couldn't find any knob to set the default TTL of the remote side.

So here is an easier workaround than recompiling Bird: I set that TTL to 193, which sets min-recv-TTL to 63, and the session went up.
Since this loosens the TTL check, it requires firewall filters that only allow BGP and BFD from authorized peers.

-- Arzhel
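The arithmetic behind the Junos workaround can be written down as a short Python sketch. The formula is inferred only from the values reported in this thread (multihop ttl 2 gives 254, 3 gives 253, 193 gives 63) and is not taken from Juniper documentation. The function names, and the single router hop used in the usage note, are assumptions.

```python
def junos_min_recv_ttl(multihop_ttl: int) -> int:
    """min-recv-TTL as observed in this thread: 255 - multihop_ttl + 1."""
    return 255 - multihop_ttl + 1


def multihop_ttl_for_sender(sender_ttl: int, router_hops: int) -> int:
    """Smallest multihop ttl whose derived min-recv-TTL still accepts the sender.

    sender_ttl is the TTL the peer puts on outgoing BFD packets, and
    router_hops is the assumed number of routers between the peers.
    """
    ttl_on_arrival = sender_ttl - router_hops
    return 256 - ttl_on_arrival


for ttl in (2, 3, 193):
    print(ttl, junos_min_recv_ttl(ttl))
```

For a Linux sender using TTL 64 and one assumed router hop, `multihop_ttl_for_sender(64, 1)` gives 193, which is the value that brought the session up.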
Participants (2): Arzhel Younsi, Ondrej Zajicek