Hi,
I have two Ubuntu 14.04 physical machines running BIRD 1.4.5.
I am noticing that the OSPF neighbor relationship keeps dying. It appears that hello packets are not being received, so the dead timer expires.
They are directly connected to each other over an LACP bond. There is no issue with the physical connection, and layer 3 connectivity over the interconnect stays up when the problem occurs. The impact is that OSPF advertises the loopback addresses used by iBGP, and these are lost whenever the OSPF adjacency fails.
2015-04-20 09:44:17 <TRACE> OSPF_1: Inactivity timer fired on interface bond0 for neighbor x.x.x.x.
2015-04-20 09:44:17 <TRACE> OSPF_1: Neighbor x.x.x.x changes state from " full" to " down".
2015-04-20 09:44:17 <TRACE> OSPF_1: Scheduling router-LSA origination for area 0.0.0.0
2015-04-20 09:44:17 <TRACE> OSPF_1: Scheduling network-LSA origination for iface bond0
2015-04-20 09:44:17 <TRACE> OSPF_1: Deleting neigbor.
2015-04-20 09:44:17 <TRACE> OSPF_1: New neighbor found: x.x.x.x on bond0
2015-04-20 09:44:17 <TRACE> OSPF_1: Neighbor x.x.x.x changes state from " down" to " init".
2015-04-20 09:44:17 <TRACE> OSPF_1: Neighbor x.x.x.x changes state from " init" to " 2way".
2015-04-20 09:44:17 <TRACE> OSPF_1: Neighbor x.x.x.x changes state from " 2way" to " exstart".
2015-04-20 09:44:19 <TRACE> OSPF_1: Originating router-LSA for area 0.0.0.0
2015-04-20 09:44:19 <TRACE> OSPF_1: Scheduling routing table calculation
2015-04-20 09:44:19 <TRACE> OSPF_1: Starting routing table calculation
2015-04-20 09:44:19 <TRACE> OSPF_1: Starting routing table calculation for area 0.0.0.0
2015-04-20 09:44:19 <TRACE> OSPF_1: Starting routing table calculation for inter-area (area 0.0.0.0)
2015-04-20 09:44:19 <TRACE> OSPF_1: Starting routing table calculation for ext routes
2015-04-20 09:44:19 <TRACE> OSPF_1: Starting routing table synchronisation
2015-04-20 09:44:24 <TRACE> OSPF_1: Neighbor x.x.x.x changes state from " exstart" to "exchange".
2015-04-20 09:44:26 <TRACE> OSPF_1: Neighbor x.x.x.x changes state from "exchange" to " loading".
2015-04-20 09:44:27 <TRACE> OSPF_1: Neighbor x.x.x.x changes state from " loading" to " full".
2015-04-20 09:44:27 <TRACE> OSPF_1: Scheduling router-LSA origination for area 0.0.0.0
2015-04-20 09:44:27 <TRACE> OSPF_1: Scheduling network-LSA origination for iface bond0
2015-04-20 09:44:27 <TRACE> OSPF_1: Scheduling routing table calculation
Configuration:
# OSPF process 1
protocol ospf OSPF_1 {
    description "OSPF area 0";
    area 0 {
        interface "bond0" { # To x-x-x-01
            hello 1;
            dead count 4;
            type ptp;
            cost 1;
        };
        stubnet x.x.x.x/29 { # bond2.51
            cost 0;
        };
        stubnet x.x.x.x/32 { # lo:1
            cost 0;
        };
        stubnet x.x.x.x/31 { # bond1.31
            cost 0;
        };
        stubnet x.x.x.x/31 { # bond1.33
            cost 0;
        };
    };
    #import filter import_OSPF;
    #export filter export_OSPF;
}
Am I doing anything out of the ordinary here? Has anyone else had problems with OSPF adjacencies dropping? This happens 5-10 times a day. I see no CPU utilisation issues on either of these hosts.
As workarounds I have used temporary static routes, and I also found that raising the dead count from 4 to 8 has kept the adjacency stable for ~2 days so far.
2015-04-20 10:26:53 <TRACE> OSPF_1: Changing dead interval on interface bond0 from 4 to 8
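For reference, the dead count workaround amounts to roughly the following change to the interface block. This is only a sketch: the stubnet entries are omitted and assumed unchanged. With "hello 1", the dead interval is hello * dead count, so it goes from 4 to 8 seconds, which matches the log line above.

```
protocol ospf OSPF_1 {
    area 0 {
        interface "bond0" {
            hello 1;
            dead count 8;   # was 4; dead interval = hello * dead count = 8 s
            type ptp;
            cost 1;
        };
        # stubnet entries unchanged, omitted here
    };
}
```

This only widens the window in which a hello must arrive; it does not explain why hellos are being missed in the first place.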
Thanks,
Tom.