Hi,
Could it be an I/O issue? We had similar problems with BIRD blocking
on I/O for so long that the hold timers expired in the meantime. It
was probably caused by writing a log file on a busy HDD. We run into
the same problem with syslog too, because that write also blocks
BIRD.
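
One rough way to check whether writes on the log disk can stall for long is to time a synchronous write there. The Python sketch below is only an illustration: the helper name `timed_write` and the path are made up, and it does not reflect how BIRD writes its logs. It calls fsync to make the stall visible, whereas a plain write may only block when the page cache is under pressure.

```python
import os
import time


def timed_write(path, data):
    """Append data to path, force it to disk, and return the elapsed seconds."""
    start = time.monotonic()
    with open(path, "ab") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # wait until the kernel reports the data as written
    return time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical path; point it at the disk that holds the real log file.
    elapsed = timed_write("/tmp/io-stall-test.log", b"test line\n")
    print(f"write took {elapsed:.3f} s")
```

If such a write regularly takes a noticeable fraction of the hold time while the disk is busy, blocking log writes become a plausible suspect.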
Nevertheless, the OS should have replied with something in the TCP
session in your case: either acknowledging the segments or advertising
that the window is full. As far as I know, BIRD does not have its own
TCP stack, so the OS is to blame for that part. It could be stuck
because of some bug, or, as other people suggested, it could be
sending packets somewhere else or not knowing where to send them.
On Fri, Feb 28, 2020 at 4:46 PM Ondrej Zajicek <santiago@crfreenet.org> wrote:
On Fri, Feb 28, 2020 at 03:33:06PM +0100, Stavros Konstantaras wrote:
Hi Alarig,
Thank you for sharing your experiences. I don’t have the MSS at the moment, but if that were the case, wouldn’t we have experienced the drops more frequently?
Currently it happens once per month (or 0.8 times per month) and, contrary to your case, which was 100% network related, in our case we don’t even see the
reply packet being generated and leaving the box.
What also puzzles me, based on the capture, is that I don’t see the TCP ACK messages being sent to the customer. If BIRD opens a TCP socket
(not a simple RAW socket), I assume that the TCP connection is handled by the OS and BIRD pushes data segments (BGP keepalive messages) when ready.
But as per the output, I don’t see the TCP ACK messages at all. Is BIRD handling the TCP communication as well?
Hi
That is a good point. BIRD uses a regular TCP socket, so if you do not
see TCP ACKs, then it is likely an underlying (kernel) issue. There were
some reports of IPv6 issues in recent kernels [*].
Also, the log message:
Feb 20 21:46:11 rs1-mng bird6: 2001:7F8:1::A500:19:7727:1: Received: Hold timer expired
shows that the notification message was received by BIRD. The packet
dump shows that keepalives were not sent by the BIRD side. You could
enable 'debug all' for the given peer to see if BIRD tries to send
keepalives. You could also monitor the state of the socket using the
'ss' tool.
[*] https://bird.network.cz/pipermail/bird-users/2020-February/014270.html
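
For unattended monitoring over a month-long window, the socket queues can also be read from /proc instead of running 'ss' by hand. The following Python sketch assumes a Linux host and the usual /proc/net/tcp6 line layout; the function names are made up for illustration. As far as I know, the tx_queue and rx_queue fields correspond to the Send-Q and Recv-Q columns shown by 'ss' for established sockets.

```python
BGP_PORT = 179
TCP_ESTABLISHED = 1  # state value used in /proc/net/tcp and /proc/net/tcp6


def parse_proc_net_tcp_line(line):
    """Parse one data line of /proc/net/tcp or /proc/net/tcp6."""
    fields = line.split()
    # Fields 1 and 2 are "HEXADDR:HEXPORT"; field 4 is "TXQUEUE:RXQUEUE" in hex.
    local_port = int(fields[1].rsplit(":", 1)[1], 16)
    remote_port = int(fields[2].rsplit(":", 1)[1], 16)
    state = int(fields[3], 16)
    tx_queue, rx_queue = (int(x, 16) for x in fields[4].split(":"))
    return {
        "local_port": local_port,
        "remote_port": remote_port,
        "state": state,
        "tx_queue": tx_queue,
        "rx_queue": rx_queue,
    }


def bgp_sessions(path="/proc/net/tcp6"):
    """Yield parsed entries for established sockets that use the BGP port."""
    with open(path) as f:
        next(f)  # skip the header line
        for line in f:
            if not line.strip():
                continue
            entry = parse_proc_net_tcp_line(line)
            ports = (entry["local_port"], entry["remote_port"])
            if entry["state"] == TCP_ESTABLISHED and BGP_PORT in ports:
                yield entry


if __name__ == "__main__":
    try:
        for entry in bgp_sessions():
            print(entry)
    except FileNotFoundError:
        print("/proc/net/tcp6 not available on this system")
```

Logged periodically, this could help tell the two cases apart: a tx_queue that keeps growing would suggest the kernel holds data it cannot get sent or acknowledged, while an empty tx_queue with no keepalives on the wire would suggest BIRD is not writing to the socket.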
--
Elen sila lumenn' omentielvo
Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."