On Thu, May 05, 2016 at 03:02:55PM +0200, Juliusz Chroboczek wrote:
What you describe is perfectly correct.
I have two questions w.r.t. this sequence of events:
1) How are router restart and seqnos supposed to be handled without waiting for the route timeout?
It's worse than that, actually -- it's not the route timeout, it's the source GC time.
Yes, you are right, I missed this.
The issue is a consequence of having a stateful loop-avoidance algorithm: if the state is lost, the loop-avoidance algorithm gets confused, and only recovers after the state has expired.
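To make the failure mode concrete, here is a minimal Python sketch of a Babel-style feasibility check. It assumes 16-bit seqnos compared modulo 2^16 and a remembered (seqno, metric) pair per source, as described in the Babel specification. The function names and sample values are made up for illustration, retractions are ignored, and this is not babeld's actual code.

```python
SEQNO_MOD = 1 << 16  # assumption: 16-bit sequence numbers


def seqno_less(a, b):
    """Return True if seqno a is older than seqno b, modulo 2^16."""
    return 0 < ((b - a) % SEQNO_MOD) < SEQNO_MOD // 2


def is_feasible(update, source_entry):
    """Decide whether an update may be accepted.

    update and source_entry are (seqno, metric) pairs; source_entry is
    None when no state is remembered for that source.
    """
    if source_entry is None:
        return True
    seqno, metric = update
    fd_seqno, fd_metric = source_entry
    if seqno_less(fd_seqno, seqno):
        return True  # strictly newer seqno
    return seqno == fd_seqno and metric < fd_metric


# A neighbour still remembers (seqno 42, metric 96) for some router-id.
remembered = (42, 96)
# The router restarts, loses its state, and announces seqno 0 again.
after_restart = (0, 96)

rejected_until_gc = not is_feasible(after_restart, remembered)
accepted_after_gc = is_feasible(after_restart, None)
```

In this sketch the restarted router's announcement stays infeasible for as long as the neighbour keeps the old source entry, and becomes feasible only once that entry has been garbage-collected.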
Babeld currently has two workarounds:
- it stores the current seqno on disk when it shuts down, so that it can use the same seqno when it restarts;
- it can optionally draw a random router-id at startup, so that the old and new states don't interfere.
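Both workarounds can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state file format, the helper names, and the temporary path are hypothetical and do not reflect how babeld actually stores its state; the 8-octet router-id and 16-bit seqno sizes are taken from the Babel specification.

```python
import os
import secrets
import tempfile


def save_seqno(path, seqno):
    """Write the current seqno to disk at shutdown (hypothetical format)."""
    with open(path, "w") as f:
        f.write(f"{seqno}\n")


def load_seqno(path, default=0):
    """Read the stored seqno at startup, or fall back to default."""
    try:
        with open(path) as f:
            return int(f.read().strip()) % 65536
    except (OSError, ValueError):
        return default


def random_router_id():
    """Draw a fresh 8-octet router-id so old and new state cannot collide."""
    return secrets.token_bytes(8)


# Workaround 1: the seqno survives a restart through the state file.
state_file = os.path.join(tempfile.mkdtemp(), "seqno-state")
save_seqno(state_file, 1234)
restored = load_seqno(state_file)

# Workaround 2: a new identity per run, so stale source entries are irrelevant.
router_id = random_router_id()
```

The first approach only helps after a clean shutdown, since a crash leaves no up-to-date file behind; the second trades a stable router-id for independence from whatever state the neighbours still hold.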
It would be great to design a procedure to recover from this case without a timeout, but I haven't given it much thought yet. So for now consider it as a flaw in the protocol.
Using a random router-id seems like a good idea. Perhaps even a TLV that describes the 'nominal' configured router-id, so that the regular router-id could be random but routes could still carry the configured router-id for admin purposes. Unfortunately, Babel does not have support for something like Opaque-LSA.

Would it not help with this issue simply to allow increasing the seqno by more than 1 in reaction to a received seqno request (3.8.1.2), to the value rcv_seqno+1?

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."