On Sat, May 23, 2020 at 10:43:52AM +0000, Joakim Tjernlund wrote:
On Fri, 2020-05-22 at 22:59 +0200, Ondrej Zajicek wrote:
On Fri, May 22, 2020 at 07:14:44PM +0000, Kenth Eriksson wrote:
On Thu, 2020-05-21 at 12:43 +0200, Ondrej Zajicek wrote:
This patch should fix the issue, could you try it?
Looks promising, applied on top of 2.0.7, and a quick test on the 5 node setup looks correct. Will do some more testing.
We definitely need this fix in the pending 2.0.8 :-)
This issue has a long history. In 2012, we changed the data field for unnumbered PtP links from the iface id (specified by the RFC) to the IP address, based on reports of bugs in Quagga that required it, and we used out-of-band information to distinguish unnumbered PtPs with the same local IP address.
Then, with the OSPF graceful restart implementation, we found that we can no longer use out-of-band information and need to use only LSAdb info for routing table calculation, but I forgot to finish handling this case, so multiple unnumbered PtPs with the same local IP address were broken.
This patch puts the iface id back into the data field for unnumbered PtP links (i.e. it reverts the change from 2012), while doing the computation just from LSAdb info. It fixes your case (multiple unnumbered PtPs with the same local IP address) and is correct per the RFC, but it may trigger bugs in other implementations (like the one that led to the 2012 change).
Not sure I follow here, have you done away with rt_pos_to_ifa() and friends now and gone back to the old way?
Yes, it does not use rt_pos_to_ifa(). The approach with rt_pos_to_ifa() does not work with graceful restart: after a restart, the router learns its own LSAs (generated by the previous run) and needs to do the routing table calculation without stored pos info. And it is probably a bad idea to have different route calculation algorithms in these cases.
The old way had several drawbacks, one of them was this dependency on interface ID. Does current impl. depend on a well behaved neighbor too?
The current version (2.0.7) is broken with regard to multiple unnumbered PtPs with the same local IP address (as it uses only the IP address in the data field), but does not depend on well-behaved neighbors. The offered patch uses interface IDs, as described in the RFC. That patch is reliable (I do not see any issue with using interface IDs, what do you mean?), but depends on well-behaved neighbors. And it seems that there are significant badly behaved ones.
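To make the difference concrete, here is a small sketch of why the IP address alone cannot identify the interface while the interface ID can. It is a simplified model, not BIRD code: the `Iface` tuple, the helper names and the addresses are all made up for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified view of a local OSPF interface (not a BIRD structure).
Iface = namedtuple("Iface", "name iface_id local_ip")

# Two unnumbered PtP links that share the same local IP address.
ifaces = [
    Iface("ptp0", 3, "10.0.0.1"),
    Iface("ptp1", 4, "10.0.0.1"),
]

def candidates_by_ip(data):
    """Interfaces matching a 'data' field that carries the local IP address."""
    return [i.name for i in ifaces if i.local_ip == data]

def candidates_by_iface_id(data):
    """Interfaces matching a 'data' field that carries the interface ID."""
    return [i.name for i in ifaces if i.iface_id == data]

ambiguous = candidates_by_ip("10.0.0.1")  # both links match
unique = candidates_by_iface_id(4)        # exactly one link matches
```

With the IP address in the data field, both links are candidates for the same record; with the interface ID, the record maps to exactly one interface.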
Is it compatible with any other Bird release?
Yes, that is not an issue. BIRD (at least post-2012) does not use the 'data' field of PtP links from neighbor LSAs. This field is only relevant for the router that originated the LSA. It would work (for BIRD peers) even if we put some random numbers there.

My current idea for how to make it work without interface IDs and without stored pos info: The problem is to match Router-LSA records with the OSPF ifaces that generated them. Instead of using just the 'data' field, we can use all fields ('data' to match the local IP address, router-id to see if there is an established neighbor with that router-id, and the matching configured cost). And for the case with two parallel equal links that are described in the Router-LSA by two equal records, we would have a flag (in ospf_iface) that ensures one OSPF iface is matched with at most one record, so the first record is matched with the first matching iface and the second record is matched with the second matching iface.

I would be glad to hear any comments on this idea or suggestions of other ideas for how to solve it.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
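The matching idea described in the message above (compare all fields of a Router-LSA PtP record against each OSPF iface, and use a per-iface flag so that equal records pair up with distinct ifaces in order) could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Iface` and `PtpRecord` classes, their field names and `match_records()` are hypothetical and do not correspond to BIRD's ospf_iface or to any existing BIRD function.

```python
from dataclasses import dataclass

@dataclass
class Iface:
    """Hypothetical stand-in for an OSPF interface (not BIRD's ospf_iface)."""
    name: str
    local_ip: str
    cost: int
    full_neighbors: set    # router-ids of established neighbors on this iface
    matched: bool = False  # the proposed flag: iface already paired with a record

@dataclass
class PtpRecord:
    """Hypothetical PtP link record taken from our own Router-LSA."""
    neighbor_id: str  # router-id of the neighbor
    data: str         # local IP address
    metric: int       # configured cost

def match_records(records, ifaces):
    """Pair each record with the first still-unmatched iface that agrees on
    local IP, established neighbor and cost; None if no iface fits."""
    for ifa in ifaces:
        ifa.matched = False
    result = []
    for rec in records:
        found = None
        for ifa in ifaces:
            if (not ifa.matched
                    and ifa.local_ip == rec.data
                    and rec.neighbor_id in ifa.full_neighbors
                    and ifa.cost == rec.metric):
                ifa.matched = True  # one iface is used for at most one record
                found = ifa.name
                break
        result.append(found)
    return result

# Two parallel equal links to 192.0.2.2, plus one link to 192.0.2.3.
ifaces = [
    Iface("ptp0", "10.0.0.1", 10, {"192.0.2.2"}),
    Iface("ptp1", "10.0.0.1", 10, {"192.0.2.2"}),
    Iface("ptp2", "10.0.0.1", 20, {"192.0.2.3"}),
]
records = [
    PtpRecord("192.0.2.2", "10.0.0.1", 10),
    PtpRecord("192.0.2.2", "10.0.0.1", 10),
    PtpRecord("192.0.2.3", "10.0.0.1", 20),
]
pairs = match_records(records, ifaces)
```

In this sketch the two identical records end up on ptp0 and ptp1 respectively purely because of the flag; which of the two parallel links gets which record is arbitrary, which should be harmless as long as the links really are equal.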