Hi!

We are having a problem where the bird OSPF neighbor state machine sometimes becomes stuck in the 2-Way state. This happens when two broadcast interfaces are connected, one end running quagga and the other end running bird. The quagga router has prio 0 and bird prio 5, i.e. quagga is not eligible to become DR.

Why does bird not transition from 2-Way to ExStart?

I believe the following lines cause me.dr to become 0.0.0.0. Bird adds itself as an eligible router as described in section 10.4, but with the address of zero?

    me.dr = ospf_is_v2(p) ? ipa_to_u32(ifa->drip) : ifa->drid;
    me.bdr = ospf_is_v2(p) ? ipa_to_u32(ifa->bdrip) : ifa->bdrid;
    me.iface_id = ifa->iface_id;
    add_tail(&ifa->neigh_list, NODE & me);

    nbdr = elect_bdr(p, ifa->neigh_list);
    ndr = elect_dr(p, ifa->neigh_list);

Due to the above, I believe that later, in "can_do_adj", the following condition will not become true:

    case OSPF_IS_DROTHER:
      if (((n->rid == ifa->drid) || (n->rid == ifa->bdrid))
          && (n->state >= NEIGHBOR_2WAY))
        i = 1;

The quagga router says that bird is the DR; see the bird output and the tcpdump below.

    bird> show ospf interface
    ospfv2_1:
    Interface eth1 (10.210.138.68/30)
        Type: broadcast
        Area: 2.2.2.2 (33686018)
        State: DROther
        Priority: 5
        Cost: 10
        Hello timer: 10
        Wait timer: 40
        Dead timer: 40
        Retransmit timer: 5
        Designated router (ID): 0.0.0.0
        Designated router (IP): 0.0.0.0
        Backup designated router (ID): 0.0.0.0
        Backup designated router (IP): 0.0.0.0

    bird> show ospf neighbors
    ospfv2_1:
    Router ID       Pri  State        DTime   Interface  Router IP
    10.210.138.70   0    2-Way/Other  35.744  eth1       10.210.138.70

Below is a tcpdump showing the OSPF packets exchanged between the two routers. 10.210.138.69 is running bird and 10.210.138.70 is running quagga.

    # tcpdump -r ospf_bird.pcap -Z root -v
    reading from file ospf_bird.pcap, link-type EN10MB (Ethernet)
    09:16:31.311431 IP (tos 0xc0, ttl 1, id 44687, offset 0, flags [none], proto OSPF (89), length 52)
        10.210.138.70 > 10.210.138.69: OSPFv2, Database Description, length: 32
            Router-ID: 10.210.138.70, Area 2.2.2.2, Authentication Type: none (0)
            Options: [External], DD Flags: [Init, More, Master]
    09:16:34.705820 IP (tos 0xc0, ttl 1, id 55304, offset 0, flags [none], proto OSPF (89), length 68)
        10.210.138.69 > 224.0.0.5: OSPFv2, Hello, length: 48
            Router-ID: 10.210.138.69, Area 2.2.2.2, Authentication Type: none (0)
            Options: [External]
              Hello Timer: 10s, Dead Timer 40s, Mask: 255.255.255.252, Priority: 5
              Neighbor List:
                10.210.138.70
    09:16:36.292044 IP (tos 0xc0, ttl 1, id 44688, offset 0, flags [none], proto OSPF (89), length 68)
        10.210.138.70 > 224.0.0.5: OSPFv2, Hello, length: 48
            Router-ID: 10.210.138.70, Area 2.2.2.2, Authentication Type: none (0)
            Options: [External]
              Hello Timer: 10s, Dead Timer 40s, Mask: 255.255.255.252, Priority: 0
              Designated Router 10.210.138.69, Backup Designated Router 10.210.138.69
              Neighbor List:
                10.210.138.69
    09:16:36.311317 IP (tos 0xc0, ttl 1, id 44689, offset 0, flags [none], proto OSPF (89), length 52)
        10.210.138.70 > 10.210.138.69: OSPFv2, Database Description, length: 32
            Router-ID: 10.210.138.70, Area 2.2.2.2, Authentication Type: none (0)
            Options: [External], DD Flags: [Init, More, Master]
    #
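The adjacency rule that the quoted "can_do_adj" fragment refers to (RFC 2328, section 10.4, for broadcast networks) can be sketched as follows. This is a minimal illustration, not bird's actual code: the types `iface_view` and `nbr_view`, the enum values, and the function `should_form_adjacency` are hypothetical names introduced only for this example. It shows why an interface in DROther state with DR and BDR both 0.0.0.0 never leaves 2-Way.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified types; these are NOT bird's real structures. */
enum iface_state { IS_WAITING, IS_DROTHER, IS_BACKUP, IS_DR };
enum nbr_state { NBR_DOWN, NBR_INIT, NBR_2WAY, NBR_EXSTART };

struct iface_view {
  enum iface_state state;
  uint32_t drid;   /* router ID of the elected DR, 0 if none */
  uint32_t bdrid;  /* router ID of the elected BDR, 0 if none */
};

struct nbr_view {
  uint32_t rid;          /* neighbor's router ID (assumed non-zero) */
  enum nbr_state state;
};

/* Decide whether to start forming an adjacency on a broadcast network:
 * yes if we are DR or BDR, or if the neighbor is the DR or BDR. */
static bool should_form_adjacency(const struct iface_view *ifa,
                                  const struct nbr_view *n)
{
  if (n->state < NBR_2WAY)
    return false;

  switch (ifa->state) {
  case IS_DR:
  case IS_BACKUP:
    return true;
  case IS_DROTHER:
    return (n->rid == ifa->drid) || (n->rid == ifa->bdrid);
  default:
    return false;  /* e.g. still Waiting: no election result yet */
  }
}
```

With `state = IS_DROTHER` and `drid = bdrid = 0`, as in the `show ospf interface` output above, the function returns false for the quagga neighbor, which matches the observed 2-Way/Other state.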
On Thu, Oct 03, 2019 at 01:49:23PM +0000, Kenth Eriksson wrote:
> I believe the following lines cause me.dr to become 0.0.0.0. Bird adds itself as an eligible router as described in section 10.4, but with the address of zero?
>
>     me.dr = ospf_is_v2(p) ? ipa_to_u32(ifa->drip) : ifa->drid;
>     me.bdr = ospf_is_v2(p) ? ipa_to_u32(ifa->bdrip) : ifa->bdrid;
Hi

I do not get why you think that it is added with the address of zero. There is a line above specifying that the local address is used:

    me.ip = ifa->addr->ip;

It seems to me that quagga elected bird, but bird for some reason does not elect itself.

How often does it happen in that configuration? The election should be relatively deterministic; it should depend only on the order of wait timer expiration and hello packet reception.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
On Thu, 2019-10-03 at 16:46 +0200, Ondrej Zajicek wrote:
> Hi
>
> I do not get why you think that it is added with the address of zero. There is a line above specifying that the local address is used:
>
>     me.ip = ifa->addr->ip;

me.dr is 0 because ifa->drip is 0, but should it have declared itself as an eligible DR?

> It seems to me that quagga elected bird, but bird for some reason does not elect itself.

Correct.

> How often does it happen in that configuration? The election should be relatively deterministic; it should depend only on the order of wait timer expiration and hello packet reception.

This does not happen every time. Not sure at this point how frequent it is. But when it happens it is stuck there forever.
On Thu, Oct 03, 2019 at 03:28:45PM +0000, Kenth Eriksson wrote:
> me.dr is 0 because ifa->drip is 0, but should it have declared itself as an eligible DR?

Oh, you meant DR IP, not neighbor IP. I think that it is correct: the process should start with the node's own idea of the DR IP, which is initially zero (RFC 2328, section 9.4, first paragraph). It should also be fixed by the second election:

    /* 9.4. (4) */
    if (((ifa->drid == myid) && (ndr != &me))
        || ((ifa->drid != myid) && (ndr == &me))
        || ((ifa->bdrid == myid) && (nbdr != &me))
        || ((ifa->bdrid != myid) && (nbdr == &me)))
    {
      me.dr = ndr ? neigh_get_id(p, ndr) : 0;
      me.bdr = nbdr ? neigh_get_id(p, nbdr) : 0;

      nbdr = elect_bdr(p, ifa->neigh_list);
      ndr = elect_dr(p, ifa->neigh_list);

      if (ndr == NULL)
        ndr = nbdr;
    }
On Thu, Oct 03, 2019 at 06:36:05PM +0200, Ondrej Zajicek wrote:
>> me.dr is 0 because ifa->drip is 0, but should it have declared itself as an eligible DR?
>
> Oh, you meant DR IP, not neighbor IP. I think that it is correct: the process should start with the node's own idea of the DR IP, which is initially zero (RFC 2328, section 9.4, first paragraph).
I just reviewed the election code and it seems consistent with the algorithm in RFC 2328. If there is no electable neighbor and me.dr / me.bdr are initially zero, then it should elect itself as BDR in the first round (and also as DR by the ndr == NULL condition), and elect itself as DR in the second election.

Could you add some debug output to check the values of DR and BDR (both ID and IP) after the first and possibly the second round?
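The two-round behaviour described above can be illustrated with a small, simplified sketch of the RFC 2328 section 9.4 election. It is not bird's implementation: the `cand` struct and the functions `better`, `elect_bdr_sketch`, `elect_dr_sketch` and `run_election` are hypothetical, and a single 32-bit identifier stands in for both router ID and interface IP. Neighbor state checks are omitted; all candidates are assumed to be at least 2-Way and to have non-zero identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical candidate record; c[0] is always the local router. */
struct cand {
  uint32_t id;       /* identifier, assumed non-zero */
  uint8_t priority;  /* 0 means not eligible */
  uint32_t dr;       /* whom this router declares as DR, 0 if none */
  uint32_t bdr;      /* whom this router declares as BDR, 0 if none */
};

/* Higher priority wins; ties are broken by the higher identifier. */
static int better(const struct cand *a, const struct cand *b)
{
  if (b == NULL) return 1;
  if (a->priority != b->priority) return a->priority > b->priority;
  return a->id > b->id;
}

/* Step (2): routers declaring themselves DR are excluded; routers that
 * declare themselves BDR are preferred over the rest. */
static const struct cand *elect_bdr_sketch(const struct cand *c, size_t n)
{
  const struct cand *declared = NULL, *any = NULL;
  for (size_t i = 0; i < n; i++) {
    if (c[i].priority == 0 || c[i].dr == c[i].id)
      continue;
    if (c[i].bdr == c[i].id && better(&c[i], declared))
      declared = &c[i];
    if (better(&c[i], any))
      any = &c[i];
  }
  return declared ? declared : any;
}

/* Step (3): best eligible router that declares itself DR, if any. */
static const struct cand *elect_dr_sketch(const struct cand *c, size_t n)
{
  const struct cand *best = NULL;
  for (size_t i = 0; i < n; i++)
    if (c[i].priority > 0 && c[i].dr == c[i].id && better(&c[i], best))
      best = &c[i];
  return best;
}

/* Runs the election from the local router's point of view and stores
 * the elected DR and BDR identifiers (0 if none). */
static void run_election(struct cand *c, size_t n,
                         uint32_t *dr_out, uint32_t *bdr_out)
{
  struct cand *me = &c[0];
  int was_dr = (me->dr == me->id), was_bdr = (me->bdr == me->id);

  const struct cand *nbdr = elect_bdr_sketch(c, n);
  const struct cand *ndr = elect_dr_sketch(c, n);
  if (ndr == NULL) ndr = nbdr;

  /* Step (4): if our own role changed, declare the result and repeat. */
  if (was_dr != (ndr == me) || was_bdr != (nbdr == me)) {
    me->dr = ndr ? ndr->id : 0;
    me->bdr = nbdr ? nbdr->id : 0;
    nbdr = elect_bdr_sketch(c, n);
    ndr = elect_dr_sketch(c, n);
    if (ndr == NULL) ndr = nbdr;
  }

  *dr_out = ndr ? ndr->id : 0;
  *bdr_out = nbdr ? nbdr->id : 0;
}
```

With a local router of priority 5 and a single neighbor of priority 0, this sketch ends with the local router as DR and no BDR. If the local priority is 0 as well, nobody is eligible and both results stay 0.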
On Thu, 2019-10-03 at 19:04 +0200, Ondrej Zajicek wrote:
> I just reviewed the election code and it seems consistent with the algorithm in RFC 2328. If there is no electable neighbor and me.dr / me.bdr are initially zero, then it should elect itself as BDR in the first round (and also as DR by the ndr == NULL condition), and elect itself as DR in the second election.
>
> Could you add some debug output to check the values of DR and BDR (both ID and IP) after the first and possibly the second round?

Will do. I will post the results once the error has been reproduced.
On Thu, 2019-10-03 at 19:04 +0200, Ondrej Zajicek wrote:
> Could you add some debug output to check the values of DR and BDR (both ID and IP) after the first and possibly the second round?

I collected some more logs from when the state machine becomes stuck. It appears that eth1 had previously been DR when this stuck state happens. Note that the interface transitions from Down to DROther in one step and then keeps that state even after the prio is changed from 0 to 5.

    2019-10-04 19:34:17.443 <TRACE> ospfv2_1 < interface eth1 changes link
    2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Interface eth1 changed state from DR to Loopback
    2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Removing interface eth1 (10.210.138.68/30) from area 2.2.2.2
    2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Full to Down
    2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 removed
    ...
    2019-10-04 19:34:17.443 <TRACE> kernel1 < interface eth1 changes link
    2019-10-04 19:34:17.443 <TRACE> kernel2 < interface eth1 changes link
    2019-10-04 19:34:17.443 <TRACE> direct1 < interface eth1 changes link
    ...
    2019-10-04 19:34:22.562 <TRACE> ospfv2_1: Interface eth1 changed state from Loopback to Waiting
    2019-10-04 19:34:22.611 <TRACE> ospfv2_1: HELLO packet received from nbr 10.210.138.70 on eth1
    2019-10-04 19:34:22.611 <TRACE> ospfv2_1: New neighbor 10.210.138.70 on eth1, IP address 10.210.138.70
    2019-10-04 19:34:22.611 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Down to Init
    2019-10-04 19:34:22.611 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Init to 2-Way
    ...
    2019-10-04 19:34:32.564 <TRACE> ospfv2_1: HELLO packet sent via eth1
    2019-10-04 19:34:32.570 <TRACE> ospfv2_1: DBDES packet received from nbr 10.210.138.70 on eth1
    2019-10-04 19:34:32.570 <TRACE> ospfv2_1: length 32
    2019-10-04 19:34:32.570 <TRACE> ospfv2_1: router 10.210.138.70
    2019-10-04 19:34:32.570 <TRACE> ospfv2_1: mtu 1500
    2019-10-04 19:34:32.570 <TRACE> ospfv2_1: imms I M MS
    2019-10-04 19:34:32.570 <TRACE> ospfv2_1: ddseq 948166447
    2019-10-04 19:34:32.570 <TRACE> ospfv2_1: DBDES packet ignored - lesser state than ExStart
    ...
    2019-10-04 19:34:47.370 <TRACE> ospfv2_1: HELLO packet sent via eth1
    2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Interface eth1 changed state from Waiting to Down
    2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Removing interface eth1 (10.210.138.68/30) from area 2.2.2.2
    2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from 2-Way to Down
    2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 removed
    ...
    2019-10-04 19:35:40.066 <TRACE> ospfv2_1: Interface eth1 changed state from Down to DROther
    2019-10-04 19:35:40.066 <TRACE> ospfv2_1: HELLO packet sent via eth1
    2019-10-04 19:35:40.076 <TRACE> ospfv2_1: HELLO packet received from nbr 10.210.138.70 on eth1
    2019-10-04 19:35:40.076 <TRACE> ospfv2_1: New neighbor 10.210.138.70 on eth1, IP address 10.210.138.70
    2019-10-04 19:35:40.076 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Down to Init
    2019-10-04 19:35:40.076 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Init to 2-Way
    2019-10-04 19:35:40.109 <INFO> Reconfiguring
    2019-10-04 19:35:40.109 <TRACE> ospfv2_1: Changing priority of eth1 from 0 to 5
    2019-10-04 19:35:40.109 <TRACE> ospfv2_1: Reconfigured
    ...
    2019-10-04 19:35:50.066 <TRACE> ospfv2_1: HELLO packet sent via eth1
    2019-10-04 19:35:50.072 <TRACE> ospfv2_1: DBDES packet received from nbr 10.210.138.70 on eth1
    2019-10-04 19:35:50.072 <TRACE> ospfv2_1: length 32
    2019-10-04 19:35:50.072 <TRACE> ospfv2_1: router 10.210.138.70
    2019-10-04 19:35:50.072 <TRACE> ospfv2_1: mtu 1500
    2019-10-04 19:35:50.072 <TRACE> ospfv2_1: imms I M MS
    2019-10-04 19:35:50.072 <TRACE> ospfv2_1: ddseq 948166575
    2019-10-04 19:35:50.072 <TRACE> ospfv2_1: DBDES packet ignored - lesser state than ExStart

Next, some more logs collected with LOCAL_DEBUG. The final call of ospf_dr_election results in both DR and BDR being 0. From that point on, no more election attempts are made.

    ospfv2_1: Neighbor? on iface eth1
    ospfv2_1: Iface eth1 can_do_adj=0
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69)
    Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived
    SM on eth1. Event is 'WaitTimer'
    (B)DR election.
    1: ndr 00000000, bdr bfbb8100
    2: ndr bfbb8100, bdr 00000000
    DR=10.210.138.69, BDR=0.0.0.0
    Neighbor state machine for 10.210.138.70 on eth1, event AdjOK?
    ospfv2_1: Iface eth1 can_do_adj=1
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69)
    Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived
    Neighbor state machine for 10.210.138.70 on eth1, event NegotiationDone
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69)
    Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived
    Neighbor state machine for 10.210.138.70 on eth1, event ExchangeDone
    ...
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.6)
    Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived
    Deleting LSA (Type: 2001 Id: 10.210.138.69 Rt: 10.210.138.69) from lsrtl for neighbor 10.210.138.70
    Deleting LSA (Type: 2002 Id: 10.210.138.69 Rt: 10.210.138.69) from lsrtl for neighbor 10.210.138.70
    SM on eth1. Event is 'LoopInd'
    Neighbor state machine for 10.210.138.70 on eth1, event KillNbr
    SM on eth1. Event is 'NeighborChange'
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69)
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.5)
    SM on eth1. Event is 'UnloopInd'
    SM on eth1. Event is 'InterfaceUp'
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.5)
    Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high
    Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high
    Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived
    Neighbor state machine for 10.210.138.70 on eth1, event 2-WayReceived
    SM on eth1. Event is 'NeighborChange'
    ospfv2_1: Neighbor? on iface eth1
    ospfv2_1: Iface eth1 can_do_adj=0
    SM on eth1. Event is 'InterfaceDown'
    Neighbor state machine for 10.210.138.70 on eth1, event KillNbr
    SM on eth1. Event is 'NeighborChange'
    SM on eth1. Event is 'InterfaceUp'
    OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.5)
    Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high
    Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high
    Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived
    Neighbor state machine for 10.210.138.70 on eth1, event 2-WayReceived
    SM on eth1. Event is 'NeighborChange'
    (B)DR election.
    1: ndr 00000000, bdr 00000000
    2: ndr 00000000, bdr 00000000
    DR=0.0.0.0, BDR=0.0.0.0
    ospfv2_1: Iface eth1 can_do_adj=0
On Mon, Oct 07, 2019 at 12:23:56PM +0000, Kenth Eriksson wrote:
> I collected some more logs from when the state machine becomes stuck. It appears that eth1 had previously been DR when this stuck state happens. Note that the interface transitions from Down to DROther in one step and then keeps that state even after the prio is changed from 0 to 5.
>
>     2019-10-04 19:35:40.109 <TRACE> ospfv2_1: Changing priority of eth1 from 0 to 5

Thanks. Not sure why there is the prio change. In the first mail you wrote that BIRD has 5 and Quagga has 0.
On Mon, 2019-10-07 at 14:44 +0200, Ondrej Zajicek wrote:
> Thanks. Not sure why there is the prio change. In the first mail you wrote that BIRD has 5 and Quagga has 0.

The prio change is user driven: the user changed the prio from 0 to 5 and then reconfigured. So now bird has prio 5 and quagga 0. Initially both ends had prio 0.
On Mon, 2019-10-07 at 12:50 +0000, Kenth Eriksson wrote:
> The prio change is user driven: the user changed the prio from 0 to 5 and then reconfigured. So now bird has prio 5 and quagga 0. Initially both ends had prio 0.

Shouldn't the interface state machine be kicked when the interface priority is changed? E.g. from ospf_iface_reconfigure, invoke ospf_iface_sm with the ISM_NEICH event?
On Mon, Oct 07, 2019 at 02:38:13PM +0000, Kenth Eriksson wrote:
> The prio change is user driven: the user changed the prio from 0 to 5 and then reconfigured. So now bird has prio 5 and quagga 0. Initially both ends had prio 0.
>
> Shouldn't the interface state machine be kicked when the interface priority is changed? E.g. from ospf_iface_reconfigure, invoke ospf_iface_sm with the ISM_NEICH event?

See commit fa1e0ba35416561bda3708ec808d24641dd8995f (fixed in 2.0.5).

If your issue is related to changing the prio from 0 to 5 and you have an older version, then it might be this.
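The idea proposed in the quoted question, re-running the election when the local priority changes during reconfiguration, can be sketched as below. This is only an illustration of the proposal and makes no claim about what the referenced commit actually contains. All names (`iface_sketch`, `iface_cfg`, `iface_sm_sketch`, `iface_reconfigure_sketch`, `EV_NEIGHBOR_CHANGE`) are hypothetical stand-ins, and the state machine is reduced to recording the last event it received.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event set; EV_NEIGHBOR_CHANGE stands in for the
 * NeighborChange event of the interface state machine. */
enum ism_event { EV_NONE, EV_NEIGHBOR_CHANGE };

struct iface_cfg {
  uint8_t priority;  /* priority from the new configuration */
};

struct iface_sketch {
  uint8_t priority;           /* priority currently in use */
  enum ism_event last_event;  /* recorded only for illustration */
};

/* Stand-in for the interface state machine: a real implementation
 * would rerun the DR/BDR election on NeighborChange. */
static void iface_sm_sketch(struct iface_sketch *ifa, enum ism_event ev)
{
  ifa->last_event = ev;
}

/* Apply a new configuration. Returns true if the priority changed,
 * in which case the state machine is kicked so that the election
 * sees the new eligibility. */
static bool iface_reconfigure_sketch(struct iface_sketch *ifa,
                                     const struct iface_cfg *new_cfg)
{
  if (ifa->priority == new_cfg->priority)
    return false;

  ifa->priority = new_cfg->priority;
  iface_sm_sketch(ifa, EV_NEIGHBOR_CHANGE);
  return true;
}
```

In the scenario from the logs, a change from priority 0 to 5 would trigger the event, whereas a reconfiguration that leaves the priority untouched would not.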
On Mon, 2019-10-07 at 18:34 +0200, Ondrej Zajicek wrote:
> See commit fa1e0ba35416561bda3708ec808d24641dd8995f (fixed in 2.0.5).

That fix is included in 2.0.5 but not in 2.0.6. Did you revert the fix after that? The fix looks identical to what I proposed a few lines up in this thread.

> If your issue is related to changing the prio from 0 to 5 and you have an older version, then it might be this.
On Tue, 2019-10-08 at 06:45 +0000, Kenth Eriksson wrote:
>> See commit fa1e0ba35416561bda3708ec808d24641dd8995f (fixed in 2.0.5).
>
> That fix is included in 2.0.5 but not in 2.0.6. Did you revert the fix after that?

I noticed now that the fix is included in your gitlab version of 2.0.6. Did you by any chance move the 2.0.6 tag?

> The fix looks identical to what I proposed a few lines up in this thread.
>
>> If your issue is related to changing the prio from 0 to 5 and you have an older version, then it might be this.
On Tue, Oct 08, 2019 at 07:53:53AM +0000, Kenth Eriksson wrote:
> I noticed now that the fix is included in your gitlab version of 2.0.6. Did you by any chance move the 2.0.6 tag?

If I remember correctly, we mistakenly released the v2.0.6 tag pointing to some older release (2.0.4?), then fixed it. The proper 2.0.6 release commit is 5235c3f78da15826b0654ba68dc7a897faa42c98.

What commit do you use? You can also check NEWS to see the last version.

--
Ondrej Santiago Zajicek
On Tue, 2019-10-08 at 11:34 +0200, Ondrej Zajicek wrote:
> If I remember correctly, we mistakenly released the v2.0.6 tag pointing to some older release (2.0.4?), then fixed it.
>
> The proper 2.0.6 release commit is 5235c3f78da15826b0654ba68dc7a897faa42c98.
>
> What commit do you use? You can also check NEWS to see the last version.

The broken 2.0.6 tag I had locally was 3a22a6e858cd703d254ab331183ccd56fe195c6b, which is only six commits after 2.0.4. But I have now deleted that erroneous tag and fetched again.

Thanks for confirming my suspicion about a faulty 2.0.6 tag, but next time please try to avoid rebasing/rewriting git history... I recall I had a similar problem when you moved from github to gitlab.
On Tue, Oct 08, 2019 at 09:45:54AM +0000, Kenth Eriksson wrote:
> The broken 2.0.6 tag I had locally was 3a22a6e858cd703d254ab331183ccd56fe195c6b, which is only six commits after 2.0.4. But I have now deleted that erroneous tag and fetched again.
>
> Thanks for confirming my suspicion about a faulty 2.0.6 tag, but next time please try to avoid rebasing/rewriting git history... I recall I had a similar problem when you moved from github to gitlab.

I agree that we should avoid rebasing/rewriting git history, but this was not an intentional rebase; it was a fix for a badly generated tag. If we had kept the old broken tag v2.0.6 and generated a new tag v2.0.6-fixed, that would likely have caused much more confusion.

BTW, we did not move from Github to Gitlab. We never used Github. Before Gitlab, we used a regular git server.

--
Ondrej Santiago Zajicek