Babel: Clarifications on seqno request handling in bird
Hi Ondrej, Toke, Juliusz and lists, I'm working on v4 of my bird route selection patch[1] and I just have a couple babel spec and bird implementation questions. [1]: http://trubka.network.cz/pipermail/bird-users/2023-January/016621.html Just a quick recap for anyone not following my "Replace internal route selection" patch: the idea is to install every feasible route into the bird core and let it give us a callback (rt_notify) with the optimal route it picked instead of doing this in the babel protocol. As Toke noted up to my v3 patch I break sending seqno requests when we loose the last feasible route (RFC8966 section 3.8.2.1.) because I accidentally removed the code which deals with that together with the babel_select_route function. To fix this I tried to move that code to our rt_notify callback instead. In testing I've found a major problem with this approach however. Say the babel proto is currently exporting a route to the core, then a more preferred route is exported by some other protocol (say "static") and then the babel route expires. In this case we ought to send a seqno request when the babel routes go away but rt_notify doesn't get called since the selected (static) route didn't change at all. I don't think RFC8966 is really framed in bird's "multi protocol" mindset so it's unclear to me whether this is something we have to fix or not. Section 3.8.2.1. says:
A node that has lost all feasible routes to a given destination but still has unexpired unfeasible routes to that destination MUST send a seqno request;
I could for example read this as the above mentioned static route constituting a feasible route received from a bird "internal" babel neighbour which would make the behaviour described above perfectly fine, no? @Juliusz what's your opinion of this interpretation? Any suggestions on how to deal with this? @Bird folks: can anyone think of a way to be notified of any and all changes in rt_notify? I assume it's not possible from some light reading of the nest code but figured I might as well ask. If all else fails I can reinstate parts of the babel proto internal route selection or try to simply count available feasible routes instead of doing the full selection, just want to keep things as simple as possible. -- Further, looking at the code implementing RFC section 3.8.2.2 in babel_handle_update: if (!feasible && (metric != BABEL_INFINITY) && (!best || (r == best) || (metric < best->metric))) /*^ */ babel_generate_seqno_request(p, e, s->router_id, s->seqno + 1, nbr); It's unclear to me what the (!best) condition corresponds to from the RFC text. The r==best check clearly corresponds to "... when it receives an unfeasible update for a route that is currently selected" and (metric < best->metric) is for "... that is preferable to the currently selected route". But why do we always send a seqno request when there isn't a selected route when that isn't called for in the RFC? Thanks, --Daniel
Hello! On 2/26/23 19:17, dxld@darkboxed.org wrote:
Hi Ondrej, Toke, Juliusz and lists,
I'm working on v4 of my bird route selection patch[1] and I just have a couple babel spec and bird implementation questions.
[1]: http://trubka.network.cz/pipermail/bird-users/2023-January/016621.html
Just a quick recap for anyone not following my "Replace internal route selection" patch: the idea is to install every feasible route into the bird core and let it give us a callback (rt_notify) with the optimal route it picked instead of doing this in the babel protocol.
As Toke noted up to my v3 patch I break sending seqno requests when we loose the last feasible route (RFC8966 section 3.8.2.1.) because I accidentally removed the code which deals with that together with the babel_select_route function.
To fix this I tried to move that code to our rt_notify callback instead. In testing I've found a major problem with this approach however. Say the babel proto is currently exporting a route to the core, then a more preferred route is exported by some other protocol (say "static") and then the babel route expires.
In this case we ought to send a seqno request when the babel routes go away but rt_notify doesn't get called since the selected (static) route didn't change at all.
I don't think RFC8966 is really framed in bird's "multi protocol" mindset so it's unclear to me whether this is something we have to fix or not. Section 3.8.2.1. says:
A node that has lost all feasible routes to a given destination but still has unexpired unfeasible routes to that destination MUST send a seqno request;
I could for example read this as the above mentioned static route constituting a feasible route received from a bird "internal" babel neighbour which would make the behaviour described above perfectly fine, no?
From my point of view, this is perfectly fine.
@Bird folks: can anyone think of a way to be notified of any and all changes in rt_notify? I assume it's not possible from some light reading of the nest code but figured I might as well ask.
You may set channel ra_mode to RA_ANY, getting called rt_notify() for all route updates. Or in BIRD 3, you can override the channel logic at all and if you don't mind calling your filters yourself, you can just request "on any change, give me all the routes for one prefix at once" and do whatever you want with the routes. Yet these changes most probably won't get backported to BIRD 2. Maria
Hi Maria, On Sun, Feb 26, 2023 at 07:34:06PM +0100, Maria Matejka wrote:
I don't think RFC8966 is really framed in bird's "multi protocol" mindset so it's unclear to me whether this is something we have to fix or not. Section 3.8.2.1. says:
A node that has lost all feasible routes to a given destination but still has unexpired unfeasible routes to that destination MUST send a seqno request;
I could for example read this as the above mentioned static route constituting a feasible route received from a bird "internal" babel neighbour which would make the behaviour described above perfectly fine, no?
From my point of view, this is perfectly fine.
I did just do another reivew on the knock on effects of interpreting it like that and I do tend to agree it all looks fine. The (!best) thing I mentioned actually works in our favor here since getting rt_notifi'ed of another protocols route will then still trigger the 3.8.2.2, as it should, if we receive an unfeasible update.
@Bird folks: can anyone think of a way to be notified of any and all changes in rt_notify? I assume it's not possible from some light reading of the nest code but figured I might as well ask.
You may set channel ra_mode to RA_ANY, getting called rt_notify() for all route updates.
Or in BIRD 3, you can override the channel logic at all and if you don't mind calling your filters yourself, you can just request "on any change, give me all the routes for one prefix at once" and do whatever you want with the routes. Yet these changes most probably won't get backported to BIRD 2.
Hmm RA_ANY might actually be just fine if it comes to that, I'd still prefer not having to do it tho :) Thanks, --Daniel
I don't think RFC8966 is really framed in bird's "multi protocol" mindset
See the beginning of Section 3.7, which describes how a route redistributed from another protocol has router-id set to the local router's id. Babel updates for the same prefix are processed as usual, with the routes inserted and maintained in Babel's route table but not necessarily installed into the kernel's forwarding table.
so it's unclear to me whether this is something we have to fix or not.
I'm not familiar with BIRD's internals, so I don't know. There are two possible approaches: - drop all Babel routes for the static prefix ; or - maintain routes as usual, but don't select them. The latter is what babeld does (see xroute.c).
A node that has lost all feasible routes to a given destination but still has unexpired unfeasible routes to that destination MUST send a seqno request;
I could for example read this as the above mentioned static route constituting a feasible route received from a bird "internal" babel neighbour which would make the behaviour described above perfectly fine, no?
It is always safe to send a seqno request. However, unless it is known that there is a Babel speaker at that address, it's pointless: a non-Babel node will not know how to react to a Babel message. However, if there's a static route that's selected, there's not much point maintaining the feasibility of the Babel route; just let it become unfeasible, and optionally send a request when the static route is retracted.
Further, looking at the code implementing RFC section 3.8.2.2 in babel_handle_update:
if (!feasible && (metric != BABEL_INFINITY) && (!best || (r == best) || (metric < best->metric))) /*^ */ babel_generate_seqno_request(p, e, s->router_id, s->seqno + 1, nbr);
It's unclear to me what the (!best) condition corresponds to from the RFC text.
That used to be clear in RFC 6126 (Section 3.8.2.2): Additionally, a node SHOULD send a unicast seqno request whenever it receives an unfeasible update from a currently unselected neighbour that is "good enough", i.e., that would lead to the received route becoming selected were it feasible. It became less clear in RFC 8966: Additionally, since metric computation does not necessarily coincide with the delay in propagating updates, a node might receive an unfeasible update from a currently unselected neighbour that is preferable to the currently selected route (e.g., because it has a much smaller metric); in that case, the node SHOULD send a unicast seqno request to the neighbour that advertised the preferable update. It is definitely a good idea to send a request in this case, especially if you ignore unfeasible updates for unknown routes (first clause of Section 3.5.3). -- Juliusz
Hi Juliusz, On Mon, Feb 27, 2023 at 01:58:21PM +0100, Juliusz Chroboczek wrote:
I don't think RFC8966 is really framed in bird's "multi protocol" mindset
See the beginning of Section 3.7, which describes how a route redistributed from another protocol has router-id set to the local router's id. Babel updates for the same prefix are processed as usual, with the routes inserted and maintained in Babel's route table but not necessarily installed into the kernel's forwarding table.
Right, what was missing from my thinking was the realization that sending seqno requests for routes originated by us in this way doesn't really make sense :)
so it's unclear to me whether this is something we have to fix or not.
I'm not familiar with BIRD's internals, so I don't know. There are two possible approaches:
- drop all Babel routes for the static prefix ; or - maintain routes as usual, but don't select them.
The latter is what babeld does (see xroute.c).
I belive my current rt_notify approach also has this behaviour, so that should be fine then.
A node that has lost all feasible routes to a given destination but still has unexpired unfeasible routes to that destination MUST send a seqno request;
I could for example read this as the above mentioned static route constituting a feasible route received from a bird "internal" babel neighbour which would make the behaviour described above perfectly fine, no?
It is always safe to send a seqno request. However, unless it is known that there is a Babel speaker at that address, it's pointless: a non-Babel node will not know how to react to a Babel message.
However, if there's a static route that's selected, there's not much point maintaining the feasibility of the Babel route; just let it become unfeasible, and optionally send a request when the static route is retracted.
When the static route is retracted we'll only send a seqno request when we receive an unfeasable update. The problem in the implementation is we don't know the router-id/seqno for the request anymore at this point. I think this is fine since we're still sending seqno requests for any unfeasable updates received while the static route is installed, so sending another one when that is retracted shouldn't make a difference.
Further, looking at the code implementing RFC section 3.8.2.2 in babel_handle_update:
if (!feasible && (metric != BABEL_INFINITY) && (!best || (r == best) || (metric < best->metric))) /*^ */ babel_generate_seqno_request(p, e, s->router_id, s->seqno + 1, nbr);
It's unclear to me what the (!best) condition corresponds to from the RFC text.
That used to be clear in RFC 6126 (Section 3.8.2.2):
Additionally, a node SHOULD send a unicast seqno request whenever it receives an unfeasible update from a currently unselected neighbour that is "good enough", i.e., that would lead to the received route becoming selected were it feasible.
It became less clear in RFC 8966:
Additionally, since metric computation does not necessarily coincide with the delay in propagating updates, a node might receive an unfeasible update from a currently unselected neighbour that is preferable to the currently selected route (e.g., because it has a much smaller metric); in that case, the node SHOULD send a unicast seqno request to the neighbour that advertised the preferable update.
It is definitely a good idea to send a request in this case, especially if you ignore unfeasible updates for unknown routes (first clause of Section 3.5.3).
That's a bug in the new RFC text then ;) Actually I also missed another reason for the !best condition: to prevent (metric < best->metric) from segfaulting. Thanks, --Daniel
That's a bug in the new RFC text then ;)
Agreed. https://www.rfc-editor.org/how-to-report/ -- Juliusz
On Mon, Feb 27, 2023 at 07:25:13PM +0100, Juliusz Chroboczek wrote:
That's a bug in the new RFC text then ;)
Done :) --Daniel
participants (4)
-
Daniel Gröber -
dxld@darkboxed.org -
Juliusz Chroboczek -
Maria Matejka