[PATCH] BGP: Do not start connect retry timer if connect immediately succeeds
BGP connect retry timer is unconditionally started in bgp_connect(), which will be stopped by bgp_send_open() in bgp_connected() with successful connection. However, if the sk_connect() in sk_open() immediately succeeds, this code path will be immediately called. This happens before the timer is even started, thus the timer will never be stopped again. Signed-off-by: Ze Xia <billxia135@gmail.com> --- proto/bgp/bgp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/proto/bgp/bgp.c b/proto/bgp/bgp.c index ad51e626..6358afc6 100644 --- a/proto/bgp/bgp.c +++ b/proto/bgp/bgp.c @@ -1628,8 +1628,10 @@ bgp_connect(struct bgp_proto *p) /* Enter Connect state and start establishing c s->ao_keys_num = 0; s->ao_keys_init = NULL; - DBG("BGP: Waiting for connect success\n"); - bgp_start_timer(conn->connect_timer, p->cf->connect_retry_time); + if (s->type == SK_TCP_ACTIVE) { + DBG("BGP: Waiting for connect success\n"); + bgp_start_timer(p, conn->connect_timer, p->cf->connect_retry_time); + } return; err: -- 2.45.2
Hello to the community! I have 2 things to clarify here. On Fri, Oct 17, 2025 at 9:43 PM Ze Xia <billxia135@gmail.com> wrote:
+ bgp_start_timer(p, conn->connect_timer, p->cf->connect_retry_time);
First, the above line is copied from 3.1.2 code and is inconsistent with master branch, the p parameter should be removed in master branch, I'm sorry for this mistake. The second thing is about how to reproduce the bug. It happens whenever TCP connect() succeeds immediately, however it's pretty hard to happen in real life if it is ever possible. I tried starting two bird daemons that binds to 127.0.0.1 with different ports, and strace shows that connect() returns -1 with EINPROGRESS like this: connect(12, {sa_family=AF_INET, sin_port=htons(179), sin_addr=inet_addr("127.0.0.1")}, 32) = -1 EINPROGRESS (Operation now in progress) which means that the bug won't be reproduced. I discovered this bug in a very special use case. I'm running a lot of bird daemons on a single physical machine to acquire routing tables for a network, and I'm trying to replace TCP sockets with UDS(Unix Domain Socket), by LD_PRELOAD-ing a customized .so, as in this proxychains project: https://github.com/haad/proxychains/tree/master Basically I'm replacing the implementation of connect() function (along with others) in libc.so with my own one, so all connect() calls made by bird jumps into my code. Then, UDS connect()-s can success immediately, thus I returned 0 to connect() call, representing immediate success, causing the bug to happen. The phenomenon is that bird actively tears down BGP connections after connect retry timeout (around 90s, default value:120s). I can stably reproduce this. In my use case, I should fix this by emulating real-world behaviours of connect(), i.e. returning EINPROGRESS and inject POLLOUT in later poll(). However, I don't think that bird should rely on the fact that connect() never immediately success, thus this patch should be applied.
Hello Ze Xia, this looks like a real bug, yet I'm not sure whether we happen to observe it in real world often. Please, do you have any instructions how to trigger it reliably so that we can add it to our CI? Thanks, Maria On October 17, 2025 3:43:26 PM GMT+02:00, Ze Xia <billxia135@gmail.com> wrote:
BGP connect retry timer is unconditionally started in bgp_connect(), which will be stopped by bgp_send_open() in bgp_connected() with successful connection. However, if the sk_connect() in sk_open() immediately succeeds, this code path will be immediately called. This happens before the timer is even started, thus the timer will never be stopped again.
Signed-off-by: Ze Xia <billxia135@gmail.com> --- proto/bgp/bgp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/proto/bgp/bgp.c b/proto/bgp/bgp.c index ad51e626..6358afc6 100644 --- a/proto/bgp/bgp.c +++ b/proto/bgp/bgp.c @@ -1628,8 +1628,10 @@ bgp_connect(struct bgp_proto *p) /* Enter Connect state and start establishing c s->ao_keys_num = 0; s->ao_keys_init = NULL;
- DBG("BGP: Waiting for connect success\n"); - bgp_start_timer(conn->connect_timer, p->cf->connect_retry_time); + if (s->type == SK_TCP_ACTIVE) { + DBG("BGP: Waiting for connect success\n"); + bgp_start_timer(p, conn->connect_timer, p->cf->connect_retry_time); + } return;
err: -- 2.45.2
-- Maria Matejka (she/her) | BIRD Team Leader | CZ.NIC, z.s.p.o.
On Fri, Oct 17, 2025 at 10:57 PM Maria Matejka <maria.matejka@nic.cz> wrote:
Hello Ze Xia,
this looks like a real bug, yet I'm not sure whether we happen to observe it in real world often. Please, do you have any instructions how to trigger it reliably so that we can add it to our CI?
Thanks, Maria
I tried to trigger it "naturally" by creating 2 bird daemons connected through veth-pair, this fails to reproduce the bug. According to strace, connect() always returns -1 with errno = EINPROGRESS. However, I figured out that I can wait a little while for connect() to success by preloading a custom dynamically-linked library. My current implementation: #include <dlfcn.h> #include <errno.h> #include <poll.h> #include <sys/socket.h> // milliseconds #define MAX_CONNECT_BLOCKTIME 10 typedef int (*connect_t)(int, const struct sockaddr *, socklen_t); __attribute__((visibility("default"))) int connect(int sock, const struct sockaddr *addr, socklen_t len) { int orig_errno = errno; connect_t true_connect = dlsym(RTLD_NEXT, "connect"); int r = true_connect(sock, addr, len); if (!(addr->sa_family == AF_INET && r == -1 && errno == EINPROGRESS)) return r; struct pollfd fds[1] = {{.fd = sock, .events = POLLOUT | POLLERR | POLLHUP}}; int poll_res = poll(fds, 1, MAX_CONNECT_BLOCKTIME); if (poll_res == 0) { errno = EINPROGRESS; return -1; } int err; socklen_t errlen = sizeof(err); getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &errlen); if (err == 0) { errno = orig_errno; return 0; } else { errno = err; return -1; } } Compile it with: gcc bird-preload.c -fPIC -fvisibility=hidden -shared -o libpreload.so Then write the absolute path of libpreload.so to /etc/ld.so.preload (man ld.so for more information about LD_PRELOAD). I started 2 bird daemons inside a docker container with config file as in attachment of this mail, and connected them with veth-pair. When this libpreload.so is preloaded, the connect retry timer (2s) should fire every time and tears down the connection, causing a reconnection, which can be checked in the debug log. With the libpreload.so, bird should behave just like the thread does not get scheduled for a while (<10ms) when calling connect(), it seems to have no other side-effect to me. I'm not sure does this fits in your CI workflow though. Hope this helps! Regards, Ze Xia
Hi all, My guess is that it could depend on socket type. I.e. nonblocking tcp socket in Linux might always return EINPROGRESS, but for unix socket other situations might be possible. Regards, Alexander On Fri, Oct 17, 2025, 22:02 Ze Xia <billxia135@gmail.com> wrote:
On Fri, Oct 17, 2025 at 10:57 PM Maria Matejka <maria.matejka@nic.cz> wrote:
Hello Ze Xia,
this looks like a real bug, yet I'm not sure whether we happen to
observe it in real world often. Please, do you have any instructions how to trigger it reliably so that we can add it to our CI?
Thanks, Maria
I tried to trigger it "naturally" by creating 2 bird daemons connected through veth-pair, this fails to reproduce the bug. According to strace, connect() always returns -1 with errno = EINPROGRESS.
However, I figured out that I can wait a little while for connect() to success by preloading a custom dynamically-linked library. My current implementation:
#include <dlfcn.h> #include <errno.h> #include <poll.h> #include <sys/socket.h>
// milliseconds #define MAX_CONNECT_BLOCKTIME 10
typedef int (*connect_t)(int, const struct sockaddr *, socklen_t);
__attribute__((visibility("default"))) int connect(int sock, const struct sockaddr *addr, socklen_t len) { int orig_errno = errno;
connect_t true_connect = dlsym(RTLD_NEXT, "connect"); int r = true_connect(sock, addr, len); if (!(addr->sa_family == AF_INET && r == -1 && errno == EINPROGRESS)) return r;
struct pollfd fds[1] = {{.fd = sock, .events = POLLOUT | POLLERR | POLLHUP}}; int poll_res = poll(fds, 1, MAX_CONNECT_BLOCKTIME); if (poll_res == 0) { errno = EINPROGRESS; return -1; } int err; socklen_t errlen = sizeof(err); getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &errlen); if (err == 0) { errno = orig_errno; return 0; } else { errno = err; return -1; } }
Compile it with:
gcc bird-preload.c -fPIC -fvisibility=hidden -shared -o libpreload.so
Then write the absolute path of libpreload.so to /etc/ld.so.preload (man ld.so for more information about LD_PRELOAD). I started 2 bird daemons inside a docker container with config file as in attachment of this mail, and connected them with veth-pair. When this libpreload.so is preloaded, the connect retry timer (2s) should fire every time and tears down the connection, causing a reconnection, which can be checked in the debug log.
With the libpreload.so, bird should behave just like the thread does not get scheduled for a while (<10ms) when calling connect(), it seems to have no other side-effect to me. I'm not sure does this fits in your CI workflow though. Hope this helps!
Regards, Ze Xia
participants (4)
-
Alexander Zubkov -
Maria Matejka -
Ze Xia -
夏泽