Good morning bird'ers,

we have a bit of a strange error with regard to bfd: on two sessions we continuously get the following error messages:

server142:
--------------------------------------------------------------------------------
2024-08-10 09:50:25.533 <ERR> bfd1: Socket error: Destination address required
2024-08-10 09:50:25.738 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc - wrong TTL (254)
2024-08-10 09:50:25.840 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eb04 - wrong TTL (254)
2024-08-10 09:50:26.287 <ERR> bfd1: Socket error: Destination address required
2024-08-10 09:50:26.512 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc - wrong TTL (254)
2024-08-10 09:50:26.639 <RMT> bfd1: Bad packet from 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eb04 - wrong TTL (254)
--------------------------------------------------------------------------------

The two devices with the incorrect TTL are openwrt devices. All routers are running BIRD version 2.15.1.

Now things get even more interesting, but let me first show the rough topology:

--------------------------------------------------------------------------------
    s141 ------|------s123
(alpine linux) | (alpine linux)
    s142 -- ibgp ---s122
(alpine linux) | (alpine linux)
               |
   vigir28 ----------- vigir29
  (openwrt)            (openwrt)

All connections are layer2, direct, vigirs only connect to servers, not to each other.
--------------------------------------------------------------------------------

Now come the interesting facts:

- s141 has bfd up with s122, s123, s142, vigir28
- s142 has bfd up with s122, s123, s141, no vigir
- s122 has bfd up with s123, s141, s142, no vigir
- s123 has bfd up with s122, s141, s142, vigir29
- vigir28 has bfd s141
- vigir29 has bfd s123

Each and every device can ping the other one, so I am strangely confused as to what is going on.

Additionally, probably correctly, the bgp sessions fail to initiate and/or are down:

--------------------------------------------------------------------------------
s122:
ibgp_s123     BGP  ---  up     2024-07-10    Established
ibgp_s141     BGP  ---  up     2024-08-04    Established
ibgp_s142     BGP  ---  up     2024-08-08    Established
ibgp_vigir28  BGP  ---  start  10:17:53.412  Idle         BGP Error: Hold timer expired
ibgp_vigir29  BGP  ---  start  10:23:57.781  Idle         BGP Error: Hold timer expired

s123: (bfd & bgp fluctuate for vigir28)
ibgp_s122     BGP  ---  up     2024-07-10    Established
ibgp_s141     BGP  ---  up     2024-08-04    Established
ibgp_s142     BGP  ---  up     2024-08-08    Established
ibgp_vigir28  BGP  ---  up     10:21:16.449  Established
ibgp_vigir29  BGP  ---  up     2024-08-08    Established

s141:
ibgp_s122     BGP  ---  up     2024-08-04    Established
ibgp_s123     BGP  ---  up     2024-08-04    Established
ibgp_s142     BGP  ---  up     2024-08-08    Established
ibgp_vigir28  BGP  ---  start  10:20:42.819  OpenSent     Socket: Connection closed
ibgp_vigir29  BGP  ---  start  10:25:10.338  OpenSent     BGP Error: Hold timer expired

s142:
ibgp_s122     BGP  ---  up     2024-08-08    Established
ibgp_s123     BGP  ---  up     2024-08-08    Established
ibgp_s141     BGP  ---  up     2024-08-08    Established
ibgp_vigir28  BGP  ---  start  10:27:20.079  OpenSent     BGP Error: Hold timer expired
ibgp_vigir29  BGP  ---  start  10:26:21.088  OpenSent     BGP Error: Hold timer expired

vigir28:
bgp1          BGP  ---  start  10:26:00.453  OpenConfirm  Received: Hold timer expired
bgp2          BGP  ---  up     10:21:30.592  Established
bgp3          BGP  ---  start  10:25:09.416  OpenConfirm  BGP Error: Hold timer expired
bgp4          BGP  ---  start  10:25:07.000  OpenConfirm  Socket: Host is unreachable

vigir29:
bgp1          BGP  ---  start  10:24:21.241  OpenConfirm  Socket: Host is unreachable
bgp2          BGP  ---  up     2024-08-08    Established
bgp3          BGP  ---  start  10:25:15.541  OpenConfirm  Socket: Host is unreachable
bgp4          BGP  ---  start  10:28:48.584  Idle         Received: Hold timer expired
--------------------------------------------------------------------------------

Some configuration samples:

--------------------------------------------------------------------------------
vigir28:

log syslog all;
router id 0.0.1.28;

protocol device { }
protocol bfd { }

# Just announce, no kernel interaction
protocol static static6 {
    ipv6;
    route 2a0a:e5c0:10:10::/96 unreachable;
}

# for getting iBGP routes
protocol babel {
    interface "br-lan", "wan" {
        type wired;
        authentication mac;
        password "...";
    };
    ipv6 {
        export where (source = RTS_DEVICE) || (source = RTS_BABEL);
    };
}

protocol kernel kernel_v6 {
    ipv6 { export where source ~ [ RTS_BABEL ]; };
}

protocol bgp {
    local as 213081;
    neighbor 2a0a:e5c0:10:1::122 as 213081;
    direct;
    bfd on;

    ipv6 {
        import none;
        export where source ~ [ RTS_STATIC ];
    };
}

(repeat bgp session for each ibgp peer)
--------------------------------------------------------------------------------

And s141:

--------------------------------------------------------------------------------
log stderr all;

protocol device { }

# Using BFD virtually everywhere, enable it globally
protocol bfd { }

protocol babel {
    interface "eth*" {
        type wired;
        authentication mac;
        password "...";
    };

    # This matches the default of babeld: redistribute all addresses
    # configured on local interfaces, plus re-distribute all routes received
    # from other babel peers.
    ipv4 {
        export where (source = RTS_DEVICE) || (source = RTS_BABEL);
    };
    ipv6 {
        export where (source = RTS_DEVICE) || (source = RTS_BABEL);
    };
}

protocol bgp ibgp_vigir28 {
    local as myas;
    neighbor 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc as myas;
    direct;
    bfd on;

    ipv6 {
        import all;
        export filter static_and_bgp;
        gateway recursive;
    };

    ipv4 {
        import all;
        export filter static_and_bgp;
        gateway recursive;
        extended next hop on;
    };
}

(repeat bgp session for each ibgp peer)
--------------------------------------------------------------------------------

s122 + s123 are virtually identical, as well as s141+s142, their configurations are generated.

Any help in this direction would be appreciated.
My next try will probably be to disable bfd on all sessions to see if the bgp sessions then stay up.

Best regards,

Nico

--
Sustainable and modern Infrastructures by ungleich.ch

Hello,

The BFD 'bfd1: Bad packet from ... - wrong TTL (254)' error can happen when the associated BGP protocol has the 'direct' option set in a setup where multihop BGP should be used. Some node in the middle decreases the TTL of the BFD packets, but BFD expects no middle node. This is probably also the reason why the BGP sessions are not up.

Is it possible that the direct connections are established through wireguard (or something similar) in a point-to-multipoint manner? If so, I would try changing the BGP sessions to multihop mode.

I have not managed to observe the 'bfd1: Socket error: Destination address required' error in our simulated setup and would need more information about the topology (and the config of the wireguard tunnels, if they are used).

Hope this helps,

David

--
David Petera (he/him) | BIRD Tech Support | CZ.NIC, z.s.p.o.
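
For reference, the multihop change suggested above could look roughly like this on s141. This is only a sketch derived from the s141 sample in the original message and has not been tested against this topology. The only assumed change is replacing 'direct' with 'multihop'; everything else, including the 'myas' constant and the 'static_and_bgp' filter, is taken as-is from that sample. The same change would presumably be needed on the peer (vigir28 here) as well, so that both ends agree on the session type.

--------------------------------------------------------------------------------
protocol bgp ibgp_vigir28 {
    local as myas;
    neighbor 2a0a:e5c0:10:1:fa5e:3cff:fe2d:eafc as myas;
    multihop;    # was: direct; the neighbor may be more than one hop away
    bfd on;      # the BFD session requested by this protocol follows the BGP mode

    ipv6 {
        import all;
        export filter static_and_bgp;
        gateway recursive;
    };

    # ipv4 channel unchanged from the original sample
}
--------------------------------------------------------------------------------

Background on the TTL value: single-hop BFD packets are sent with TTL 255 and the receiver rejects anything lower, so a received TTL of 254 indicates that the packet crossed exactly one forwarding hop on the way.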