Hi,

This patch fixes a situation where an interface enters TMPDOWN state and doesn't leave it until a device scan is triggered by timer.

This seems to happen whenever the interface is up but with no IP address configured. BIRD will then put the interface in TMPDOWN state (correctly), but the state is not left even though an IP address is assigned later. The problem in turn appears to be the way IF_UPDATE flags are used to pass state information between if_update and ifa_update calls. A scan will make it right, but simply processing RTM messages won't.

A workaround to this issue is to scan devices frequently, which might be not acceptable if reactions to RTM_NEWADDR/RTM_DELADDR/etc. are expected to happen without much average delay.

As you'll see, this patch is probably overkill.

Thanks,
Pierluigi

diff -a -u -r a/sysdep/linux/netlink.c b/sysdep/linux/netlink.c
--- a/sysdep/linux/netlink.c 2013-11-25 06:19:10.000000000 -0800
+++ b/sysdep/linux/netlink.c 2013-12-05 18:33:09.282768865 -0800
@@ -963,11 +963,24 @@
     case RTM_DELLINK:
       DBG("KRT: Received async link notification (%d)\n", h->nlmsg_type);
       nl_parse_link(h, 0);
+      kif_do_scan(NULL);
       break;
     case RTM_NEWADDR:
     case RTM_DELADDR:
       DBG("KRT: Received async address notification (%d)\n", h->nlmsg_type);
       nl_parse_addr(h);
+      /* XXX: this (and the one above) are to get unstuck when the if comes up
+       * with no IP address, then a route gets installed, then an IP address
+       * gets configured. The crux of the problem is that async message processing
+       * apparently never gets the interface unstuck from TMPDOWN state,
+       * but the scan will fix it.
+       *
+       * I believe that the problem is that BOTH if_update AND ifa_update must
+       * be called within the same scan (IF_UPDATE flags are relevant).
+       * This will scan ALL interface which is overkill. The parameter (currently
+       * unused) might be used to indicate which one to scan.
+       */
+      kif_do_scan(NULL);
       break;
     default:
       DBG("KRT: Received unknown async notification (%d)\n", h->nlmsg_type);
* Pierluigi Rolando
This patch fixes a situation where an interface enters TMPDOWN state and doesn't leave it until a device scan is triggered by timer.
This seems to happen whenever the interface is up but with no IP address configured. BIRD will then put the interface in TMPDOWN state (correctly)
Hi,

I question the correctness of putting an interface in any form of DOWN state even though there are no addresses assigned. There exists no requirement that an interface has addresses, even if it doesn't, it may still be UP and have active routes pointing to it. I therefore would ask you to consider fixing this in another way, by simply not requiring any IP addresses on the interface in order for it to be considered active.

FWIW, here's my use case for unnumbered interfaces:

$ ip a l dev nat64
3: nat64: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 500
    link/none
    inet 192.0.2.0/32 scope global nat64
    inet6 2001:db8::/128 scope global
       valid_lft forever preferred_lft forever
$ ip r l dev nat64
87.238.33.15 proto bird
100.64.0.0/10 proto bird
$ ip -6 r l dev nat64
2001:db8:: proto kernel metric 256
2a02:c0:0:0:64::/96 proto bird metric 1024

The "nat64" interface is a tun interface with a TAYGA process in the other end. The two addresses assigned to the interface (192.0.2.0/32 and 2001:db8::/128) serve absolutely no purpose except to persuade BIRD to consider the interface active - I'd rather get rid of them, if possible.

Tore
On 25 Feb 2014, at 06:37, Tore Anderson wrote:
I question the correctness of putting an interface in any form of DOWN state even though there are no addresses assigned. There exists no requirement that an interface has addresses, even if it doesn't, it may still be UP and have active routes pointing to it. I therefore would ask you to consider fixing this in another way, by simply not requiring any IP addresses on the interface in order for it to be considered active.
+1. Either this or reliably detect subsequent numbering of the interfaces.

--
Alex Bligh
-----Original Message-----
From: Tore Anderson [mailto:tore@fud.no]
Sent: Monday, February 24, 2014 10:38 PM
To: Pierluigi Rolando; bird-users@network.cz
Subject: Re: BIRD 1.4.0 bugfixes [1/2]
* Pierluigi Rolando
This patch fixes a situation where an interface enters TMPDOWN state and doesn't leave it until a device scan is triggered by timer.
This seems to happen whenever the interface is up but with no IP address configured. BIRD will then put the interface in TMPDOWN state (correctly)
Hi,
I question the correctness of putting an interface in any form of DOWN state even though there are no addresses assigned. There exists no requirement that an interface has addresses, even if it doesn't, it may still be UP and have active routes pointing to it. I therefore would ask you to consider fixing this in another way, by simply not requiring any IP addresses on the interface in order for it to be considered active.
Hi Tore,

I understand your use case and I am inclined to agree with you. However, as much as I'd like to help you, I think this is better left in the hands of the BIRD core developers. All I'm trying to do here is describe the bug we were having and show one (admittedly nasty) workaround.

I'll see if I can come up with something better, no promises though.

Thanks,
Pierluigi
I'll see if I can come up with something better, no promises though.
Quick follow-up for BIRD developers: what's the point of the IF_TMP_DOWN state in the first place? I think I understand its usage in the 'too much has changed' case of if_update, but not why it stays set for interfaces with no addresses.

Thanks,
Pierluigi
On Tue, Feb 25, 2014 at 08:56:52PM +0000, Pierluigi Rolando wrote:
I'll see if I can come up with something better, no promises though.
Quick follow-up for BIRD developers: what's the point of the IF_TMP_DOWN state in the first place? I think I understand its usage in the 'too much has changed' case of if_update, but not why it stays set for interfaces with no addresses.
Hi,

The reason for IF_TMP_DOWN during the initial scan is probably that interface notifications to protocols are grouped - protocols are notified about new interfaces after the initial scan is complete and the addresses of the interfaces are known.

Your analysis of the problem has one error - the interface is not kept in IF_TMP_DOWN state when it has no address. IF_TMP_DOWN is set during the initial scan, then it is cleared in if_end_update(), but an interface without an IP address is held down (no IF_UP) because of if_recalc_flags(). Then, when a new address appears, IF_TMP_DOWN is set again for the interface, probably because of ifa_recalc_primary() in ifa_update(), and now it is not cleared (which is a bug), because it was set during an asynchronous update, so there is no if_end_update() like in periodic scans.

A similar problem occurs when the IP address is there but the primary IP address changes; this also causes IF_TMP_DOWN until the next scan.

This could probably be fixed by calling if_end_partial_update() after the asynchronous notification for the related interfaces.

The mentioned issue that an interface without addresses shouldn't be artificially held down is true, but it is currently assumed by protocols, so it would require some further code revision to fix it.

--
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
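
The flag lifecycle described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not BIRD's code: every name with a toy_ or TOY_ prefix is hypothetical, the flag values are made up, and the rule that an asynchronous address event sets the hold flag simply mirrors the "probably because of ifa_recalc_primary()" guess in the message. It shows that the interface stays unusable after an asynchronous address event unless something plays the role that if_end_update() plays in a periodic scan.

```c
#include <assert.h>
#include <stddef.h>

/* Toy flags: the names echo the thread, the values are made up. */
enum {
  TOY_IF_ADMIN_UP = 1 << 0,  /* link is up */
  TOY_IF_UP       = 1 << 1,  /* interface is usable by protocols */
  TOY_IF_TMP_DOWN = 1 << 2   /* held down until the update is finished */
};

/* Hypothetical stand-in for an interface record. */
struct toy_iface {
  unsigned flags;
  int n_addrs;
};

/* Derive TOY_IF_UP: link up, at least one address, and not held down. */
static void toy_recalc_flags(struct toy_iface *i)
{
  assert(i != NULL);
  if ((i->flags & TOY_IF_ADMIN_UP) && i->n_addrs > 0 &&
      !(i->flags & TOY_IF_TMP_DOWN))
    i->flags |= TOY_IF_UP;
  else
    i->flags &= ~(unsigned) TOY_IF_UP;
}

/* Asynchronous address notification: assumed to hold the interface down
 * because its primary address changes. */
static void toy_async_addr_added(struct toy_iface *i)
{
  i->n_addrs++;
  i->flags |= TOY_IF_TMP_DOWN;
  toy_recalc_flags(i);
}

/* End of an update for a single interface: release the hold, recompute. */
static void toy_end_partial_update(struct toy_iface *i)
{
  i->flags &= ~(unsigned) TOY_IF_TMP_DOWN;
  toy_recalc_flags(i);
}

/* Returns 1 if the interface is usable after one asynchronous address
 * event, with or without the finishing step. */
static int toy_up_after_async_addr(int finish_update)
{
  struct toy_iface i = { TOY_IF_ADMIN_UP, 0 };  /* link up, no address */
  toy_recalc_flags(&i);
  toy_async_addr_added(&i);
  if (finish_update)
    toy_end_partial_update(&i);
  return (i.flags & TOY_IF_UP) != 0;
}
```

In this model toy_up_after_async_addr(0) reports the interface as still down, which corresponds to the stuck state until the next scan, while toy_up_after_async_addr(1) reports it as up. Clearing the hold for just the affected interface avoids rescanning every interface on each notification.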
On Mon, Feb 24, 2014 at 11:36:01PM +0000, Pierluigi Rolando wrote:
Hi,
This patch fixes a situation where an interface enters TMPDOWN state and doesn't leave it until a device scan is triggered by timer.
This seems to happen whenever the interface is up but with no IP address configured. BIRD will then put the interface in TMPDOWN state (correctly), but the state is not left even though an IP address is assigned later. The problem in turn appears to be the way IF_UPDATE flags are used to pass state information between if_update and ifa_update calls. A scan will make it right, but simply processing RTM messages won't.
A workaround to this issue is to scan devices frequently, which might be not acceptable if reactions to RTM_NEWADDR/RTM_DELADDR/etc. are expected to happen without much average delay.
As you'll see, this patch is probably overkill.
You could try this patch, which fixes the IF_TMP_DOWN flag.

--
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
participants (4)
- Alex Bligh
- Ondrej Zajicek
- Pierluigi Rolando
- Tore Anderson