BFD socket bind error upon reboot
Hi,

upon a reboot of a system running BFD I recently noticed BFD breaking with error "Cannot assign requested address" once the system came back. Restarting the BFD protocol in question solved the problem. Hence, apparently the system's network wasn't fully up and running at the time bird started; according to systemd documentation (https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/), one possible fix is to change bird.service from

After=network.target

to

After=network-online.target
Wants=network-online.target

Alternatively, IP_FREEBIND on the BFD tx socket would allow binding to an IP address that does not (yet) exist, which seems more elegant since it doesn't need to take into account that different systems may define "online" differently. I'm wondering whether there's something I'm missing as to why IP_FREEBIND shouldn't be used?

Thanks,
J.
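For illustration, the effect of IP_FREEBIND described above can be reproduced outside of BIRD with a minimal Python sketch. It is not BIRD code, and several things in it are assumptions: the helper name try_bind is made up, 192.0.2.1 (TEST-NET-1) merely stands in for an address that is not configured on the host yet, and the numeric fallback 15 is the Linux value of IP_FREEBIND, used in case the Python build does not export the constant. The option is Linux-specific.

```python
import errno
import socket

# Linux value of IP_FREEBIND from <linux/in.h>; fallback in case this
# Python build does not expose socket.IP_FREEBIND (assumption: Linux host).
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def try_bind(addr, port=0, freebind=False):
    """Bind a UDP socket to (addr, port); return 0 or the errno of the failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            if freebind:
                # Allow binding to an address that is not (yet) local.
                sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
            sock.bind((addr, port))
        except OSError as exc:
            return exc.errno
        return 0


if __name__ == "__main__":
    # 192.0.2.1 stands in for an address the host does not have yet.
    for freebind in (False, True):
        rc = try_bind("192.0.2.1", freebind=freebind)
        print(f"freebind={freebind}: {errno.errorcode.get(rc, 'OK')}")
```

On a typical Linux host the plain bind is expected to fail with EADDRNOTAVAIL ("Cannot assign requested address"), while the bind with IP_FREEBIND set is expected to succeed; the exact behaviour also depends on the net.ipv4.ip_nonlocal_bind sysctl.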
On Fri, Aug 10, 2018 at 11:23:56PM +0200, J. Kendzorra wrote:
> Hi,
>
> upon a reboot of a system running BFD I recently noticed BFD breaking with error "Cannot assign requested address" once the system came back. Restarting the BFD protocol in question solved the problem. Hence, apparently the system's network wasn't fully up and running at the time bird started; according to systemd documentation (https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/), one possible fix is to change bird.service from
>
> After=network.target
>
> to
>
> After=network-online.target
> Wants=network-online.target
>
> Alternatively, IP_FREEBIND on the BFD tx socket would allow binding to an IP address that does not (yet) exist, which seems more elegant since it doesn't need to take into account that different systems may define "online" differently. I'm wondering whether there's something I'm missing as to why IP_FREEBIND shouldn't be used?
Hi

Generally, BIRD should not try to use an address before it notices that the address is available/active. If BIRD tries to bind the socket before that, then it is a bug.

Which BIRD version is it? Is it an IPv6 address? Which protocol caused the BFD session, or is it a static one? I would suspect that the issue is related to IPv6 DAD.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
Hi,
> Generally, BIRD should not try to use an address before it notices that the address is available/active. If BIRD tries to bind the socket before that, then it is a bug.
This seems to be a common pattern for services that are started when network is supposedly ready, but it really isn't (see many discussions around network.target vs. network-online.target).
> Which BIRD version is it? Is it an IPv6 address? Which protocol caused the BFD session, or is it a static one? I would suspect that the issue is related to IPv6 DAD.
Unfortunately, that's not the case; we have BFD enabled only for IPv4:

protocol bfd bfd_internal {
        neighbor 192.168.1.2;
        (...)
        neighbor 192.168.1.10;
};

We're running bird 1.6.3-3 (Ubuntu Bionic). The error I've seen when the BFD sessions didn't come up was this:

<ERR> bfd_internal: Socket error: bind: Cannot assign requested address

Since the listeners on ports 3784 and 4784 are wildcard binds, those shouldn't generate the error. However, there's a tx socket for communication with BFD peers on a random port (192.168.1.1:38164 as of today :) which I believe is the reason for the error message. Since the interface in question is a vlan on top of a bond (with multiple NICs involved), my working theory has been that upon reboot bird tried binding to 192.168.1.1:0 but, since this interface wasn't fully available, got EADDRNOTAVAIL returned and BFD broke as a result. Note that I was able to restart the protocol later (birdc restart bfd_internal), which immediately made things work.

I added some debug code to the sections where the BFD protocol binds sockets to catch when the bind happens and the error occurs (to support my theory); however, I haven't been able to replicate the problem yet.

Regards,
J.
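The difference between the wildcard listeners and the address-bound tx socket in the theory above can be sketched in a few lines of Python. This is only an illustration, not BIRD's socket code: the helper names open_udp and describe are made up, 127.0.0.1 stands in for a local address that is already configured, and 192.0.2.1 (TEST-NET-1) stands in for one that is not there yet. Port 0 asks the kernel to pick an ephemeral port, which would explain a "random" port such as 38164.

```python
import errno
import socket


def open_udp(addr, port):
    """Return a UDP socket bound to (addr, port); raise OSError on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((addr, port))
    except OSError:
        sock.close()
        raise
    return sock


def describe(addr, port):
    """Return the bound address as 'ip:port', or the errno name if bind fails."""
    try:
        with open_udp(addr, port) as sock:
            ip, bound_port = sock.getsockname()
            return f"{ip}:{bound_port}"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    # Wildcard bind: does not depend on any particular address being present.
    print("wildcard listener:         ", describe("0.0.0.0", 0))
    # Address-bound socket with port 0: kernel picks an ephemeral port.
    print("tx socket, address present:", describe("127.0.0.1", 0))
    # Same bind against an address the host does not have.
    print("tx socket, address missing:", describe("192.0.2.1", 0))
```

The wildcard bind should succeed whether or not the interface is up, whereas the bind to a specific address that is missing is expected to report EADDRNOTAVAIL, matching the error text in the log line.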
On Wed, Aug 15, 2018 at 11:23:33AM +0200, J. Kendzorra wrote:
> Hi,
>
> > Generally, BIRD should not try to use an address before it notices that the address is available/active. If BIRD tries to bind the socket before that, then it is a bug.
>
> This seems to be a common pattern for services that are started when network is supposedly ready, but it really isn't (see many discussions around network.target vs. network-online.target).
Well, BIRD should not depend on starting when network is supposedly ready, as it should handle when interfaces/addresses are added or removed during its run.
> > Which BIRD version is it? Is it an IPv6 address? Which protocol caused the BFD session, or is it a static one? I would suspect that the issue is related to IPv6 DAD.
>
> Unfortunately, that's not the case; we have BFD enabled only for IPv4
>
> protocol bfd bfd_internal {
>         neighbor 192.168.1.2;
>         (...)
>         neighbor 192.168.1.10;
> };
>
> We're running bird 1.6.3-3 (Ubuntu Bionic). The error I've seen when the BFD sessions didn't come up was this:
>
> <ERR> bfd_internal: Socket error: bind: Cannot assign requested address
>
> Since the listeners on ports 3784 and 4784 are wildcard binds, those shouldn't generate the error. However, there's a tx socket for communication with BFD peers on a random port (192.168.1.1:38164 as of today :) which I
Just a guess: don't you have two IP addresses from the same range on the machine, or some other address/range that covers 192.168.1.2-10? BIRD does not really check for availability of the local listening address, but for availability of the neighbor address. But in such a case it would use the second address, unless the first one is explicitly configured.
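The guess above can be modelled with a small Python sketch. It is a deliberately simplified model and not BIRD's neighbor or BFD code: the function pick_source, its parameters, and the second address 192.168.1.50/24 are all hypothetical. It only captures the idea that a neighbor counts as available once some local prefix covers it, and that an explicitly configured local address is used without checking whether it is assigned yet.

```python
import ipaddress


def pick_source(neighbor, local_ifaddrs, configured_local=None):
    """Return the local address to use for `neighbor`, or None if no
    local prefix covers it (the neighbor is then treated as unavailable)."""
    nbr = ipaddress.ip_address(neighbor)
    covering = [
        iface
        for iface in map(ipaddress.ip_interface, local_ifaddrs)
        if nbr in iface.network
    ]
    if not covering:
        return None
    if configured_local is not None:
        # Used as-is, even if this address is not assigned to any interface yet.
        return configured_local
    # Otherwise fall back to an address from a prefix that covers the neighbor.
    return str(covering[0].ip)


if __name__ == "__main__":
    # Hypothetical state early in boot: 192.168.1.1/24 is not up yet, but
    # another address from the same range already is.
    present = ["192.168.1.50/24"]
    print(pick_source("192.168.1.2", present))
    print(pick_source("192.168.1.2", present, configured_local="192.168.1.1"))
    print(pick_source("192.168.1.2", []))
```

In this model the second call returns 192.168.1.1 although that address is not in the list of present addresses, which is the situation in which a later bind to it would fail with "Cannot assign requested address".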
> I added some debug code to the sections where the BFD protocol binds sockets to catch when the bind happens and the error occurs (to support my theory), however I haven't been able to replicate the problem yet.
It may also be useful to enable debug { events, interfaces } for the device and BFD protocols.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."