On 14.08.2013 14:18, Ondrej Zajicek wrote:
On Wed, Aug 14, 2013 at 03:35:07AM +0400, Alexander V. Chernikov wrote:
Hello list!
Currently BIRD performs premature LSA aging in a very strange way, which sometimes upsets Quagga (up to SIGSEGV) and Cisco. Every time, aging is done by setting the seq number to LSA_MAXSEQNO.
RFC 2328, by contrast, shows only 2 cases where MaxSequenceNumber is used in outgoing LSAs:
1) Originating a self LSA with that seq number (very rare event)
2) [Premature] aging of the LSA from 1) because of a) a disappeared net and b) a seq number increase (12.1.6)
In all other cases the seq number should be either increased or left intact.
BIRD, however, insists on setting LSA_MAXSEQNO:
1) When receiving a self-originated LSA which is not in the local LSDB ( ospf_lsupd_receive ), which is not a special case according to 14.1
2) In all other places, like handling if-down, external route disappearance, etc. ( ospf_lsupd_flush_nlsa )

You are right that BIRD uses LSA_MAXSEQNO in more cases than prescribed by RFC 2328 (essentially in most cases when a local LSA is flushed), but these changes were introduced mostly as a reaction to problems with the usual flushing (just premature aging), where BIRD sends the LSA with MaxAge, then forgets the LSA and, when the LSA is reintroduced later, starts with InitSeqNum. In cases like external route flaps (and similar events for other kinds of LSAs), the result was that the newer LSA met the MaxAge LSA during flooding, was considered older, and was removed (and combined with other minor inconsistencies in router behavior, this would lead to all kinds of strange events).

Let me return to this old, but still relevant topic :) Two patches are attached; they unify the handling of the sn/age LSA fields and add sn keeping.
The first one replaces most of the initial LSA header setup with a single fill_lsa_header() function. It is also responsible for (unique) LSA id generation. The second one handles sequence number keeping for all LSA types, using a per-area FIB for most LSAs and a per-proto FIB for AS-boundary ones (type5/7/0x2007/0x4005). I'd prefer to use fib2_init() for lookups, but fib_init() seems to be sufficient, too. We definitely have to convert ipa_to_rid() to something stateful, at least in the IPv6 case, but that's the next step. We also need to deal more accurately with LSA_MAXSEQ.
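To illustrate the sequence number keeping idea only: below is a minimal sketch, not the code from the patches. It assumes a flat array instead of BIRD's FIB, and every name in it (sn_table, sn_entry, sn_next, the SK_* constants) is hypothetical. The table remembers the last seq number used per (type, id), so a flushed and later reintroduced LSA continues from its old seq number instead of restarting at InitSeqNum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from RFC 2328; the SK_* names are local to this sketch. */
#define SK_INITSEQNO ((int32_t) -0x7fffffff)  /* 0x80000001 */
#define SK_MAXSEQNO  ((int32_t)  0x7fffffff)
#define SK_TABLE_SIZE 64

/* Hypothetical record: last seq number used for one (type, id) pair. */
struct sn_entry { int used; uint16_t type; uint32_t id; int32_t sn; };
struct sn_table { struct sn_entry e[SK_TABLE_SIZE]; };

static struct sn_entry *sn_find(struct sn_table *t, uint16_t type, uint32_t id)
{
  for (size_t i = 0; i < SK_TABLE_SIZE; i++)
    if (t->e[i].used && t->e[i].type == type && t->e[i].id == id)
      return &t->e[i];
  return NULL;
}

/*
 * Pick the seq number for the next origination of (type, id).
 * Returns  0 and stores the number in *out on success,
 *         -1 if the old number was MaxSequenceNumber, so that instance
 *            has to be flushed first (RFC 2328, 12.1.6); the entry is
 *            dropped, so the next call starts again from InitSeqNum,
 *         -2 if the table is full.
 */
static int sn_next(struct sn_table *t, uint16_t type, uint32_t id, int32_t *out)
{
  assert(t != NULL && out != NULL);

  struct sn_entry *e = sn_find(t, type, id);
  if (e == NULL)
  {
    /* Never originated before: take a free slot, start at InitSeqNum. */
    for (size_t i = 0; i < SK_TABLE_SIZE; i++)
      if (!t->e[i].used)
      {
        t->e[i] = (struct sn_entry) { 1, type, id, SK_INITSEQNO };
        *out = SK_INITSEQNO;
        return 0;
      }
    return -2;
  }

  if (e->sn == SK_MAXSEQNO)
  {
    e->used = 0;
    return -1;
  }

  *out = ++e->sn;
  return 0;
}
```

Flushing an LSA does not touch the table in this sketch, which is the whole point: the next sn_next() call for the same (type, id) continues from the remembered number.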
The idea behind the behavior is that you should either keep the old seqnum for your LSA when it is flushed (and continue from it when you reintroduce it), or you should use LSA_MAXSEQNO to ensure that the old seqnum is eliminated from the OSPF domain, so you can forget the seqnum and later start from InitSeqNum. The behavior is not according to RFC 2328, but it is compatible with it. And I have work-in-progress code that fixes several issues in the flooding code together with this.
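For readers following the thread, the route flap problem mentioned earlier can be reproduced with the "which instance is more recent" rules from RFC 2328, section 13.1. What follows is a minimal sketch of those rules, not BIRD's actual comparison code; the struct and function names are made up for illustration and the constants are defined locally.

```c
#include <assert.h>
#include <stdint.h>

/* Constants from RFC 2328, Appendix B; defined locally for this sketch. */
#define LSA_INITSEQNO  ((int32_t) -0x7fffffff)  /* 0x80000001 */
#define LSA_MAXSEQNO   ((int32_t)  0x7fffffff)
#define LSA_MAXAGE     3600  /* seconds */
#define LSA_MAXAGEDIFF 900   /* seconds */

/* Hypothetical header holding only the fields the comparison needs. */
struct lsa_hdr_sketch {
  uint16_t age;
  int32_t  sn;        /* seq numbers are compared as signed 32-bit */
  uint16_t checksum;
};

/*
 * RFC 2328, section 13.1.
 * Returns > 0 if a is more recent, < 0 if b is more recent,
 * 0 if both are considered the same instance.
 */
static int lsa_newer_sketch(const struct lsa_hdr_sketch *a,
                            const struct lsa_hdr_sketch *b)
{
  assert(a != 0 && b != 0);

  /* 1. The greater seq number wins. */
  if (a->sn != b->sn)
    return (a->sn > b->sn) ? 1 : -1;

  /* 2. Same seq number: the larger checksum wins. */
  if (a->checksum != b->checksum)
    return (a->checksum > b->checksum) ? 1 : -1;

  /* 3. Same checksum: an instance at MaxAge wins. */
  int a_max = (a->age == LSA_MAXAGE);
  int b_max = (b->age == LSA_MAXAGE);
  if (a_max != b_max)
    return a_max ? 1 : -1;

  /* 4. Ages differing by more than MaxAgeDiff: the younger wins. */
  int diff = (int) a->age - (int) b->age;
  if (diff > LSA_MAXAGEDIFF)
    return -1;
  if (diff < -LSA_MAXAGEDIFF)
    return 1;

  return 0;
}
```

With these rules, a flushed copy { age = LSA_MAXAGE, sn = LSA_INITSEQNO + 5 } still beats a reintroduced copy { age = 0, sn = LSA_INITSEQNO }, because the seq number is checked first. A reintroduced copy with sn = LSA_INITSEQNO + 6 wins, and a flush sent with LSA_MAXSEQNO beats any older instance still being flooded.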