Ondrej Zajicek <santiago@crfreenet.org> writes:
On Wed, Apr 20, 2022 at 01:43:21AM +0200, Toke Høiland-Jørgensen wrote:
When shutting down a Babel instance we send a wildcard retraction to make sure all peers can quickly switch to other route origins. Add another small optimisation borrowed from babeld: sending a Hello message (along with the retraction) with a very low interval.
This will cause neighbours to modify their expiry timers for the node's state to quickly time it out, thus conserving resources in the network.
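For illustration, a minimal sketch of what such a final Hello could look like on the wire. It assumes the Hello TLV layout from RFC 8966 (type 4, then 16-bit flags, seqno and interval, with the interval in centiseconds). The value given to BABEL_MIN_INTERVAL here is an assumption for the example, and encode_hello() and encode_shutdown_hello() are hypothetical helpers, not functions from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define BABEL_TLV_HELLO     4   /* Hello TLV type (RFC 8966) */
#define BABEL_MIN_INTERVAL  1   /* assumed value: one centisecond, the smallest nonzero interval */

/* Write a 16-bit value in network byte order. */
static void put_u16(uint8_t *p, uint16_t v)
{
  p[0] = (uint8_t)(v >> 8);
  p[1] = (uint8_t)(v & 0xff);
}

/*
 * Encode a multicast Hello TLV: type, length, flags, seqno, interval.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
size_t encode_hello(uint8_t *buf, size_t len, uint16_t seqno, uint16_t interval_cs)
{
  if (len < 8)
    return 0;

  buf[0] = BABEL_TLV_HELLO;
  buf[1] = 6;                     /* body length: flags + seqno + interval */
  put_u16(buf + 2, 0);            /* flags: 0 means a multicast Hello */
  put_u16(buf + 4, seqno);
  put_u16(buf + 6, interval_cs);  /* interval in centiseconds */
  return 8;
}

/* Hypothetical shutdown helper: a last Hello advertising the shortest interval. */
size_t encode_shutdown_hello(uint8_t *buf, size_t len, uint16_t seqno)
{
  return encode_hello(buf, len, seqno, BABEL_MIN_INTERVAL);
}
```

A neighbour that derives its Hello timeout from the advertised interval would then expire the sender's state shortly after this packet, rather than waiting out the previously advertised interval.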
Hi
Thanks, merged. Just changed BABEL_TIME_UNITS to BABEL_MIN_INTERVAL.
Awesome! I noticed we had that define as well and did the same rename in my local tree :)
BTW, when we added CI tests for Babel authentication, we noticed that it has rather slow convergence after reconfiguration. The reason is that when authentication changed to become non-matching, it took many missed (misauthenticated) hellos to sufficiently clean up hello_map for the neighbor to go down.
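To make the slow decay concrete, here is a rough sketch that models hello_map as a 16-bit shift register of recent Hello intervals. The liveness rule (at least `limit` of the last 16 Hellos received) and all function names are assumptions for illustration, not the actual BIRD code.

```c
#include <stdint.h>

/* Count the set bits in a 16-bit Hello history. */
static int popcount16(uint16_t v)
{
  int n = 0;
  for (; v; v &= (uint16_t)(v - 1))
    n++;
  return n;
}

/* Record one Hello interval: shift the history, set the low bit if a valid Hello arrived. */
uint16_t hello_map_update(uint16_t map, int received)
{
  return (uint16_t)((map << 1) | (received ? 1 : 0));
}

/* Assumed rule: the neighbour is up while at least `limit` of the last 16 Hellos were seen. */
int neighbour_is_up(uint16_t map, int limit)
{
  return popcount16(map) >= limit;
}

/*
 * Count consecutive missed Hellos until the neighbour is considered down.
 * `limit` should be at least 1; the loop is capped at 16 because the
 * history only holds 16 bits.
 */
int misses_until_down(uint16_t map, int limit)
{
  int misses = 0;
  while (misses < 16 && neighbour_is_up(map, limit)) {
    map = hello_map_update(map, 0);
    misses++;
  }
  return misses;
}
```

Under these assumptions, a neighbour with a full history (0xFFFF) and a limit of 12 needs five consecutive missed Hellos before it goes down, and with a limit of 1 it needs all sixteen.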
Just to make sure I'm understanding the scenario correctly: A node is reconfigured to turn on authentication, but not all peers use it; so now some peers are essentially cut off (basically as if they had just dropped off the network). However, their hello history remains, so their routes stay active until they time out. Right?
Perhaps iface reconfiguration could decide whether the change is significant enough to affect reachability of neighbors, and in such a case deprecate some/most items in hello_map.
Hmm, yeah, we could do something like that I suppose. I'm wondering if it should really be stronger, though? Enabling auth on an already-running instance is an increase in the "security level" of the interface, so should we really be keeping unauthenticated data around at all? I.e., maybe we should simply flush all neighbour entries on an interface when enabling auth on that interface?
Or another, slightly less disruptive, option is to flush a neighbour if we receive a packet from it that fails auth, and that neighbour doesn't have the 'auth_passed' flag set? For existing neighbours that succeed auth, that flag should be set immediately on the next packet we receive from that neighbour, whereas this would quickly clear out neighbours that fail it.
We could maybe speed things up further by immediately issuing auth challenges to all neighbours when the config changes?
-Toke
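A rough sketch of the "flush on failed auth" option, to make the proposed logic concrete. The struct, enum and function names are hypothetical; only the auth_passed flag and hello_map come from the discussion. Clearing the flag on reconfiguration is an extra assumption, added so that the key-mismatch case is covered as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified neighbour state for illustration. */
struct neighbour {
  bool auth_passed;    /* set once a packet from this neighbour passed auth */
  uint16_t hello_map;  /* recent Hello history */
  bool flushed;        /* stands in for actually removing the neighbour */
};

enum auth_result { AUTH_OK, AUTH_FAIL };

/*
 * Apply the result of the auth check for a packet from neighbour n.
 * Returns true if the packet should be processed further.
 */
bool handle_auth_result(struct neighbour *n, enum auth_result res)
{
  if (res == AUTH_OK) {
    n->auth_passed = true;
    return true;
  }

  /* Failed auth: always drop the packet. Flush the neighbour only if it
   * has never authenticated, so a forged packet cannot evict a good peer. */
  if (!n->auth_passed) {
    n->hello_map = 0;
    n->flushed = true;
  }
  return false;
}

/* Assumption: on an auth-related config change, make every neighbour prove itself again. */
void reset_auth_state(struct neighbour *n)
{
  n->auth_passed = false;
}
```

With this shape, a neighbour that keeps authenticating correctly is untouched, while one that fails after a reconfiguration is cleared on its first bad packet instead of lingering until its Hello history drains.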