On Wed, Nov 30, 2016 at 10:35:45AM +0530, naveen chowdary Yerramneni wrote:
Hi,
*Issue Description*: LSA ID collision issue is seen with OSPF *stub-networks* configured and BIRD is acting as ABR
Hi, thanks for the bug report and analysis. My comments and questions are below.
*Code Flow*:
1. Stubnets are advertised to area-0 by generating router LSA (LSA_T_RT).
   o ospf_disp() -> ospf_update_topology() -> ospf_originate_rt_lsa() -> ospf_originate_lsa()
2. Now, stubnets are added to top graph table (p->gr) with LSA type LSA_T_RT
3. When creating routing table, these stubnets are added to FIB (p->rtf).
   o ospf_disp() -> ospf_rt_spf() -> ospf_rt_spfa() -> spfa_process_rt() -> add_network() -> ri_install_net()
4. When BIRD is acting as ABR then, walk through FIB (p->rtf) and send summary LSA (LSA_T_SUM_NET) with LSA mode as LSA_M_RTCALC. Also, nf pointer is set (stores fib node address) in top_hash_entry.
   o ospf_disp() -> ospf_rt_abr2() -> check_sum_net_lsa() -> ospf_originate_sum_net_lsa() -> ospf_originate_lsa()
5. Now, stubnets are added to top graph table (p->gr) with LSA type LSA_T_SUM_NET
Until this it looks OK.
6. Stubnets are removed from FIB (p->rtf)
   o ospf_disp() -> rt_sync() -> fib_delete()
This seems strange. If these networks are part of the topology, they should be in the FIB (p->rtf). Do you mean that the appropriate record in the FIB is removed automatically as a consequence of this code sequence, or is there an external change (like stubnet removal) that causes it to be removed?
7. Now, stubnets entries are still present in top graph table (Note: fib_delete() doesn't free the node, it just moves the fib node to free pool, fib_node pointer is still valid)
That looks like a dangerous bug. Although fib_delete() just moves the fib node back to the slab pool, the pointer must be considered freed and invalid afterwards (the slab pool is just an optimization).
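To illustrate why a pointer to a node that went back to a pool cannot be trusted, here is a minimal sketch. It is not BIRD's slab allocator or FIB code: the `toy_*` names, the fixed-size pool, and the LIFO free list are all assumptions made for the example. It only shows that a pointer cached before the delete can end up referring to a different, later-allocated node.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature pool, NOT BIRD's slab: deleted nodes are pushed
 * onto a free list and handed out again by the next allocation. */
#define TOY_POOL_SIZE 4

struct toy_node {
  char prefix[32];
  struct toy_node *next_free;
};

static struct toy_node toy_storage[TOY_POOL_SIZE];
static struct toy_node *toy_free_list;

/* Put every slot of the static storage on the free list. */
static void toy_pool_init(void)
{
  toy_free_list = NULL;
  for (int i = TOY_POOL_SIZE - 1; i >= 0; i--) {
    toy_storage[i].next_free = toy_free_list;
    toy_free_list = &toy_storage[i];
  }
}

/* Take a node from the free list and label it with a prefix string. */
static struct toy_node *toy_node_new(const char *prefix)
{
  struct toy_node *n = toy_free_list;
  if (n == NULL)
    return NULL;
  toy_free_list = n->next_free;
  n->next_free = NULL;
  strncpy(n->prefix, prefix, sizeof(n->prefix) - 1);
  n->prefix[sizeof(n->prefix) - 1] = '\0';
  return n;
}

/* "Delete": the memory stays mapped, but the node is free for reuse. */
static void toy_node_delete(struct toy_node *n)
{
  n->next_free = toy_free_list;
  toy_free_list = n;
}

/* Returns 1 if a pointer cached before the delete now aliases a node
 * that was allocated afterwards for a different prefix. */
static int toy_stale_pointer_aliases(void)
{
  toy_pool_init();

  struct toy_node *stubnet = toy_node_new("10.0.1.0/24");
  assert(stubnet != NULL);
  struct toy_node *cached = stubnet;  /* e.g. remembered by another table */

  toy_node_delete(stubnet);           /* cached is now logically invalid */

  struct toy_node *other = toy_node_new("10.0.2.0/24");
  assert(other != NULL);

  return cached == other && strcmp(cached->prefix, "10.0.2.0/24") == 0;
}
```

In this toy model the stale pointer is still dereferenceable, which is exactly what makes the bug hard to notice: it silently points at whatever node was allocated next.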
8. With any change in network, ospf_rt_spf() is called. In ospf_rt_reset(), LSA mode is updated from LSA_M_RTCALC to LSA_M_STALE.
9. Again, steps 3-6 are repeated. In step-3, fib node pointer is changed and in step-4, fib node pointer comparison fails in ospf_originate_lsa() which is leading to LSA id collision.
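The failure mode described in steps 8 and 9 can be sketched roughly as follows. This is a simplified model, not the real ospf_originate_lsa(): the `toy_*` types and the exact form of the check are assumptions, based only on the description above that the stored fib node pointer is compared against the one passed in for the same LSA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a FIB node and a topology-table entry. */
struct toy_fib_node {
  unsigned prefix;
  unsigned pxlen;
};

struct toy_lsa_entry {
  unsigned lsa_id;
  int has_body;                    /* 0 until the LSA is first originated */
  const struct toy_fib_node *nf;   /* owner remembered at origination */
};

/* Returns 0 on success, -1 when the entry is already owned by a
 * different node pointer (reported as an LSA ID collision). */
static int toy_originate(struct toy_lsa_entry *en,
                         const struct toy_fib_node *nf)
{
  assert(en != NULL && nf != NULL);

  if (!en->has_body)
    en->nf = nf;                   /* first origination: remember owner */

  if (en->nf != nf)
    return -1;                     /* same LSA ID, different owner pointer */

  en->has_body = 1;
  return 0;
}

/* Same prefix, but the node was deleted and re-created at another
 * address between two routing table calculations. Returns 1 if the
 * second origination is rejected as a collision. */
static int toy_collision_after_reinsert(void)
{
  struct toy_fib_node first  = { 0x0a000100u, 24 };
  struct toy_fib_node second = { 0x0a000100u, 24 };
  struct toy_lsa_entry en = { 0x0a000100u, 0, NULL };

  int r1 = toy_originate(&en, &first);
  int r2 = toy_originate(&en, &second);

  return r1 == 0 && r2 == -1;
}
```

The prefix is identical in both calls; only the node address differs, so an identity check on the pointer reports a collision even though nothing in the network conflicts.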
*Issue is resolved with below code change.* Please review the change and provide your comments. Also, please let me know if any other information is required.
Well, it seems to me that the real cause of the bug is in step 6. BTW, are your stubnets explicitly configured ones, or are they based on prefixes of stub interfaces?

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."