Issues in Babel

Toke Høiland-Jørgensen toke at toke.dk
Thu Oct 19 13:59:49 CEST 2017


Ondrej Zajicek <santiago at crfreenet.org> writes:

> On Thu, Oct 19, 2017 at 01:23:45PM +0200, Toke Høiland-Jørgensen wrote:
>> Ondrej Zajicek <santiago at crfreenet.org> writes:
>>
>> > 1) Hello expiration was not properly implemented, as i understand
>> > RFC, it should reset the timer based on last packet hello interval
>> > and time-out repeatedly until hello map is zero. Fixed in [2].
>> 
>> Yeah, I think you are right. However, your fix doesn't apply
>> BABEL_HELLO_EXPIRY_FACTOR to the value stored in last_hello_int, which
>> is probably a bug, right?
>
> I think it is correct, see section A.1:
>
>   If the Hello history is empty (it contains 0 bits only), the neighbour
>   entry is flushed; otherwise, it resets the neighbour's Hello timer to
>   the value advertised in the last Hello received from this neighbour (no
>   extra margin is necessary in this case).
>
> The BABEL_HELLO_EXPIRY_FACTOR is the extra margin for jitter.

Ah, right, gotcha.

>> > 3) Seems like the fix of hello expiration caused that when neighbor
>> > disappears, it is no longer removed timely and routes have to wait
>> > for route timeout, which is about a minute for 4 s hello interval.
>> >
>> > It seems that by RFC, we should recompute route metrics whanever rxcost
>> > reported by IHU is changed, so IHU expiration should be enough to make
>> > all related routes unreachable. Apparently, this is not done in BIRD.
>> 
>> Hmm, yeah, babel_expire_ihu() should probably do something similar to
>> babel_flush_neighbor(), i.e., loop through all selected routes from that
>> neighbor and re-run the route selection on them?
>
> I think route selection re-run should be done when IHU changes rxcost
> (either explicitly, or by expiration). It is explained in section 3.4.2.:
>
>   After updating a neighbour's txcost, the receiving node recomputes
>   the neighbour's cost (Section 3.4.3) and runs the route selection
>   procedure (Section 3.6).
>
> And it seems that BIRD should re-run the route selection even if
> effective cost is changed by Hello receipt or expiration:
>
>   After receiving a Hello, or determining that it has missed one, the
>   node recomputes the association's cost (Section 3.4.3) and runs the
>   route selection procedure (Section 3.6).

Ah, right, of course it should be done when the metric changes in either
direction; silly me. Well, a new babel_neighbor_update_routes() function
is warranted, I guess...

>> > 4) Seems like we do not trigger updates when a route update with just
>> > increased seqno due to forwarded unicast request is received. That also
>> > slows convergence. IMHO we should do in babel_handle_update() something
>> > like:
>> >
>> > if (route_is_selected() && matching_entry_in_seqno_request_cache())
>> >   babel_trigger_update()
>> 
>> Not sure we should use the request cache for this? We might not have
>> seen the request, but a downstream still needs the updated seqno; so
>> maybe just do something like:
>
> IMHO the fast propagation is necessary just for the originator of the
> seqno request and there exist at least one path of routers that
> propagates such request. Others learn updated seqno during regular
> updates. It is mentioned in section 3.7.2:
>
>   A node SHOULD NOT send triggered updates for other reasons, such as
>   when there is a minor fluctuation in a route's metric, when the
>   selected next hop changes, or to propagate a new sequence number
>   (except to satisfy a request, as specified in Section 3.8).

My thinking was that there may be a case where the request made it back
to the source in a roundabout way and that we could be helping by just
flushing the route with the updated seqno as soon as possible. But OK,
if the RFC says not to do that, better check the request cache first ;)

-Toke



More information about the Bird-users mailing list