Bird setting TTL to 1 at the end of a passive BGP session opening
Hello Bird users,

First time poster and new subscriber. I noticed something strange and wanted to report it here in case this is in fact a bug that deserves attention.

I run bird 2.0.7-4.1 on Debian 11. I have a BGP section configured as passive that acts as a TCP health-check endpoint. It is as follows:

*--- cut --*
protocol bgp HEALTHCHECKv4 {
    hold time 6;
    startup hold time 20;
    connect delay time 3;
    connect retry time 6;
    error wait time 3, 12;
    passive on;

    local 100.64.0.5 as 65000;
    neighbor 100.64.0.4 as 65535;
}
*--- cut --*

What ends up happening on the wire is this:

*--- cut --*
23:15:09.443792 IP (tos 0x0, ttl 254, id 4040, offset 0, flags [DF], proto TCP (6), length 60)
    100.64.0.4.16141 > 100.64.0.5.179: Flags [S], cksum 0xa78f (correct), seq 723435095, win 8961, options [mss 8621,sackOK,TS val 3290475421 ecr 0,nop,wscale 0], length 0
23:15:09.443823 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    100.64.0.5.179 > 100.64.0.4.16141: Flags [S.], cksum 0xc8b7 (incorrect -> 0x0785), seq 124371865, ack 723435096, win 62643, options [mss 8961,sackOK,TS val 2210037294 ecr 3290475421,nop,wscale 7], length 0
23:15:09.444437 IP (tos 0x0, ttl 254, id 4041, offset 0, flags [DF], proto TCP (6), length 52)
    100.64.0.4.16141 > 100.64.0.5.179: Flags [.], cksum 0x2550 (correct), seq 1, ack 1, win 8961, options [nop,nop,TS val 3290475422 ecr 2210037294], length 0
23:15:09.444471 IP (tos 0x0, ttl 254, id 4042, offset 0, flags [DF], proto TCP (6), length 52)
    100.64.0.4.16141 > 100.64.0.5.179: Flags [F.], cksum 0x254e (correct), seq 1, ack 1, win 8961, options [nop,nop,TS val 3290475423 ecr 2210037294], length 0
23:15:09.444576 IP (tos 0xc0, *ttl 1*, id 55411, offset 0, flags [DF], proto TCP (6), length 99)
    100.64.0.5.179 > 100.64.0.4.16141: Flags [P.], cksum 0xc8de (incorrect -> 0x58b6), seq 1:48, ack 2, win 490, options [nop,nop,TS val 2210037294 ecr 3290475423], length 47: BGP
        Open Message (1), length: 47
          Version 4, my AS 65000, Holdtime 6s, ID 100.64.0.5
          Optional parameters, length: 18
            Option Capabilities Advertisement (2), length: 16
              Route Refresh (2), length: 0
              Graceful Restart (64), length: 2
                Restart Flags: [none], Restart Time 120s
                0x0000: 0078
              32-Bit AS Number (65), length: 4
                4 Byte AS 65000
                0x0000: 0000 fde8
              Enhanced Route Refresh (70), length: 0
                no decoder for Capability 70
              Long-lived Graceful Restart (71), length: 0
23:15:09.444602 IP (tos 0xc0, *ttl 1*, id 55412, offset 0, flags [DF], proto TCP (6), length 52)
    100.64.0.5.179 > 100.64.0.4.16141: Flags [F.], cksum 0xc8af (incorrect -> 0x4635), seq 48, ack 2, win 490, options [nop,nop,TS val 2210037294 ecr 3290475423], length 0
23:15:09.444670 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto ICMP (1), length 56)
    100.64.0.4 > 100.64.0.5: ICMP time exceeded in-transit, length 36
        IP (tos 0xc0, ttl 1, id 55411, offset 0, flags [DF], proto TCP (6), length 99)
    100.64.0.5.179 > 100.64.0.4.16141: [|tcp]
*--- cut --*

As you can see the TTL on our packets is initially set to 255. At the end of the connection during the last PUSH and FIN packets all of a sudden bird sets the TTL to 1.

I have no ttl security enabled and even if I explicitly disable it the problem persists. A workaround that I found to work is to pass multihop 20 directive which then changes the ttl 1 above to ttl 20 which alleviates the problem.

Let me know if you need any additional information.

Regards,
--
Rumen Telbizov
Site Reliability Engineer <http://telbizov.com/>
The setup of the TCP session is handled by the kernel, hence the higher TTL. Once TCP is established, (e)BGP tends to use a TTL of 1 unless it's a multihop session or you're using GTSM.

It's expected, and is partly due to the limitations of how sockets are implemented in Linux.

M
Not exactly. Yes, the session is handled by the kernel. But it should be possible to set the TTL on the socket before listening on it. It looks like bird just does not apply the TTL to listening BGP sockets before it gets an incoming connection. That is somewhat OK for the case when bird listens on a single socket on 0.0.0.0 for all connections, since it cannot determine the TTL beforehand.

But when strict bind is used for BGP sockets, I think it would be better to set the TTL. Although even in that case there can be several sessions with the same source IP and various TTLs. I tried to make a quick patch that sets the TTL for strict bind sessions. It does not handle the case of mixed TTLs on the same socket, but otherwise it should work. Let's see what bird's developers say about it.
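The order of operations discussed in the messages above (the kernel completes the TCP handshake on the listening socket, and the TTL is only changed once the connection has been accepted) can be illustrated with a small sketch. This is not BIRD's code: it is a minimal Python illustration on the loopback interface, the helper name `accept_then_set_ttl` is hypothetical, and it assumes a platform where `IP_TTL` is available (for example Linux).

```python
import socket


def accept_then_set_ttl(listener: socket.socket, ttl: int) -> socket.socket:
    """Accept one pending connection, then set the TTL on that connection only.

    Hypothetical helper for illustration; it is not taken from BIRD.
    """
    conn, _peer = listener.accept()
    # By the time accept() returns, the kernel has already answered the SYN
    # with whatever TTL applied to the listening socket. Only packets sent
    # on `conn` from this point on use the new value.
    conn.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return conn


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # loopback and ephemeral port, demo only
    listener.listen(1)

    # The handshake completes in the kernel before accept() is called.
    client = socket.create_connection(listener.getsockname())

    conn = accept_then_set_ttl(listener, 1)
    print("TTL on accepted socket:",
          conn.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))

    conn.close()
    client.close()
    listener.close()
```

In the capture from the original post, the SYN-ACK corresponds to the step before `accept()`, while the BGP OPEN and FIN sent with ttl 1 correspond to packets sent after the per-connection TTL was set.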
On Fri, Apr 01, 2022 at 03:57:12PM +0200, Alexander Zubkov wrote:
> Not exactly. Yes, the session is handled by the kernel. But it should be possible to set the TTL on the socket before listening on it. It looks like bird just does not apply the TTL to listening BGP sockets before it gets an incoming connection. That is somewhat OK for the case when bird listens on a single socket on 0.0.0.0 for all connections, since it cannot determine the TTL beforehand.
Hi,

As Matthew Walster wrote, we just listen on a common socket. After the connection is accepted, we associate it with a session, see the expected TTL (1 for direct EBGP, more for multihop sessions), and set it for the new TCP connection. I think this behavior is okay and should work. I do not know why it failed for Rumen Telbizov with an 'ICMP time exceeded in-transit' error, as judging by the IPs (100.64.0.4 > 100.64.0.5) it is a direct connection.

It would be better if we could configure a per-remote-IP TTL on the listening socket a priori, like we can do for MD5 auth, but I do not think that is possible with the Linux API.
> But when strict bind is used for BGP sockets, I think it would be better to set the TTL. Although even in that case there can be several sessions with the same source IP and various TTLs. I tried to make a quick patch that sets the TTL for strict bind sessions. It does not handle the case of mixed TTLs on the same socket, but otherwise it should work. Let's see what bird's developers say about it.
That is an interesting idea. It need not be restricted to 'strict bind'; each listening socket could be set to use the maximum TTL value of the BGP instances bound to that socket. So if all these BGP instances are direct EBGP, it would be 1. It would need to be done correctly w.r.t. reconfigurations, and it would also need to handle the case that for multihop sessions we do not explicitly handle the TTL value and use the system default.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
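The selection rule proposed in this message (take the maximum TTL over the BGP instances bound to a listening socket) can be sketched as follows. This is only an illustration of the idea, not an existing BIRD function: the name `listener_ttl` is hypothetical, and representing "uses the system default" as `None`, and leaving the socket at the system default whenever any bound instance does so, are assumptions made for this sketch.

```python
from typing import Iterable, Optional


def listener_ttl(instance_ttls: Iterable[Optional[int]]) -> Optional[int]:
    """Pick the TTL for a listening socket shared by several BGP instances.

    Each entry is the explicit TTL of one instance bound to the socket, or
    None when that instance relies on the system default (assumed here to
    model multihop sessions without an explicit TTL).

    Returns None when the socket should be left at the system default.
    """
    ttls = list(instance_ttls)
    if not ttls or any(ttl is None for ttl in ttls):
        # Nothing bound, or at least one instance wants the system default:
        # do not override the socket's TTL (an assumption of this sketch).
        return None
    return max(ttls)


if __name__ == "__main__":
    print(listener_ttl([1, 1]))      # all direct EBGP
    print(listener_ttl([1, 20]))     # direct EBGP mixed with 'multihop 20'
    print(listener_ttl([1, None]))   # one instance uses the system default
```

With only direct EBGP instances the result is 1, which matches the case described in the message. A real implementation would also have to recompute the value on reconfiguration, which this sketch does not cover.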
participants (4)
- Alexander Zubkov
- Matthew Walster
- Ondrej Zajicek
- Rumen Telbizov