Ondrej Zajicek <santiago@crfreenet.org> writes:
Hi Toke (and others)
I updated Babel to use microsecond timers [1] and during testing i noticed several unrelated issues.
Ah, awesome, thanks for doing that! :)
1) Hello expiration was not properly implemented, as i understand RFC, it should reset the timer based on last packet hello interval and time-out repeatedly until hello map is zero. Fixed in [2].
Yeah, I think you are right. However, your fix doesn't apply BABEL_HELLO_EXPIRY_FACTOR to the value stored in last_hello_int, which is probably a bug, right?
2) In some setups unicast seqno requests were sent repeatedly and seqno was increased indefinitely due to last case in babel_handle_update(), when the RFC specifies it should be done only if unfeasible route would otherwise become best. Fixed in [3], by this condition handling all unicast seqno requests:
if (!feasible && (metric != BABEL_INFINITY) && (!best || (r == best) || (metric < best->metric))) babel_unicast_seqno_request(e, s, nbr);
It also changes the behavior by sending seqno request when receiving infeasible update if there is no previous route from that neighbor.
Yup, this seams reasonable as well.
3) Seems like the fix of hello expiration caused that when neighbor disappears, it is no longer removed timely and routes have to wait for route timeout, which is about a minute for 4 s hello interval.
It seems that by RFC, we should recompute route metrics whanever rxcost reported by IHU is changed, so IHU expiration should be enough to make all related routes unreachable. Apparently, this is not done in BIRD.
Hmm, yeah, babel_expire_ihu() should probably do something similar to babel_flush_neighbor(), i.e., loop through all selected routes from that neighbor and re-run the route selection on them?
4) Seems like we do not trigger updates when a route update with just increased seqno due to forwarded unicast request is received. That also slows convergence. IMHO we should do in babel_handle_update() something like:
if (route_is_selected() && matching_entry_in_seqno_request_cache()) babel_trigger_update()
Not sure we should use the request cache for this? We might not have seen the request, but a downstream still needs the updated seqno; so maybe just do something like: diff --git a/proto/babel/babel.c b/proto/babel/babel.c index c5d695c9..06e1d3ad 100644 --- a/proto/babel/babel.c +++ b/proto/babel/babel.c @@ -1219,6 +1219,11 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa) } else { + /* Trigger an update if we get a seqno increase for a selected route, to + improve convergence time */ + if (r->seqno < msg->seqno && r == best) + babel_trigger_update(p); + /* Last paragraph above - update the entry */ r->advert_metric = msg->metric; r->metric = metric;
Do you think that my interpretation of these issues and the fixes are correct?
Think so, yeah; nice job. And thanks for taking the time to go over this :) -Toke