[PATCH] babel: Initialise source seqno from incoming message
When creating a new babel_source object we initialise the seqno to 0. The caller will update the source object with the right metric and seqno value, for both newly created and old source objects. However if we initialise the source object seqno to 0 that may actually turn out to be a valid (higher) seqno than the one in the routing table, because of seqno wrapping. In this case the source metric will not be set properly, which breaks feasibility tracking for subsequent updates. To fix this, add a new initial_seqno argument to babel_get_source() which is used when allocating a new object, and set that to the seqno value of the update we're sending. Reported-by: Juliusz Chroboczek <jch@irif.fr> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> --- proto/babel/babel.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/proto/babel/babel.c b/proto/babel/babel.c index 25081c3bde2f..b7bb374206dc 100644 --- a/proto/babel/babel.c +++ b/proto/babel/babel.c @@ -105,7 +105,8 @@ babel_find_source(struct babel_entry *e, u64 router_id) } static struct babel_source * -babel_get_source(struct babel_proto *p, struct babel_entry *e, u64 router_id) +babel_get_source(struct babel_proto *p, struct babel_entry *e, u64 router_id, + u16 initial_seqno) { struct babel_source *s = babel_find_source(e, router_id); @@ -115,7 +116,7 @@ babel_get_source(struct babel_proto *p, struct babel_entry *e, u64 router_id) s = sl_allocz(p->source_slab); s->router_id = router_id; s->expires = current_time() + BABEL_GARBAGE_INTERVAL; - s->seqno = 0; + s->seqno = initial_seqno; s->metric = BABEL_INFINITY; add_tail(&e->sources, NODE s); @@ -1014,7 +1015,7 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable) /* Update feasibility distance for redistributed routes */ if (e->router_id != p->router_id) { - struct babel_source *s = babel_get_source(p, e, e->router_id); + struct babel_source *s = babel_get_source(p, e, e->router_id, msg.update.seqno); s->expires = current_time() + BABEL_GARBAGE_INTERVAL; if (gt_mod64k(msg.update.seqno, s->seqno) || -- 2.39.1
On Tue, Jan 31, 2023 at 11:55:50AM +0100, Toke Høiland-Jørgensen via Bird-users wrote:
When creating a new babel_source object we initialise the seqno to 0. The caller will update the source object with the right metric and seqno value, for both newly created and old source objects. However if we initialise the source object seqno to 0 that may actually turn out to be a valid (higher) seqno than the one in the routing table, because of seqno wrapping. In this case the source metric will not be set properly, which breaks feasibility tracking for subsequent updates.
To fix this, add a new initial_seqno argument to babel_get_source() which is used when allocating a new object, and set that to the seqno value of the update we're sending.
Thanks, merged: https://gitlab.nic.cz/labs/bird/-/commit/dc4c5f51f83f97100b207136ecfde8ff94e... The find/get function pair pattern is often used in BIRD, it has some advantages in keeping invariants, but it is cumbersome when nontrivial initialization of new objects is used. Sometimes it leads to branching in caller and doing additional initialization there, in such cases it make sense to switch to find/add function pair. (just a general remark) btw, the section 3.7.3 says: "Before sending an update (prefix, plen, router-id, seqno, metric) with finite metric (i.e., not a route retraction)" and "Note that the garbage-collection timer is not reset when a retraction is sent." But the whole section 'Update feasibility distance for redistributed routes' is done in all cases? Also, did you review BIRD Babel code whether we could switch RFC references from RFC 6126 to RFC 8966? I vaguely remember that there were such a patch. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> writes:
On Tue, Jan 31, 2023 at 11:55:50AM +0100, Toke Høiland-Jørgensen via Bird-users wrote:
When creating a new babel_source object we initialise the seqno to 0. The caller will update the source object with the right metric and seqno value, for both newly created and old source objects. However if we initialise the source object seqno to 0 that may actually turn out to be a valid (higher) seqno than the one in the routing table, because of seqno wrapping. In this case the source metric will not be set properly, which breaks feasibility tracking for subsequent updates.
To fix this, add a new initial_seqno argument to babel_get_source() which is used when allocating a new object, and set that to the seqno value of the update we're sending.
Thanks, merged:
https://gitlab.nic.cz/labs/bird/-/commit/dc4c5f51f83f97100b207136ecfde8ff94e...
The find/get function pair pattern is often used in BIRD, it has some advantages in keeping invariants, but it is cumbersome when nontrivial initialization of new objects is used. Sometimes it leads to branching in caller and doing additional initialization there, in such cases it make sense to switch to find/add function pair. (just a general remark)
Noted. I think find/get works fine here, though :)
btw, the section 3.7.3 says:
"Before sending an update (prefix, plen, router-id, seqno, metric) with finite metric (i.e., not a route retraction)" and "Note that the garbage-collection timer is not reset when a retraction is sent."
But the whole section 'Update feasibility distance for redistributed routes' is done in all cases?
This has been clarified in RFC8966 as: "Note that the feasibility distance is not updated and the garbage-collection timer is not reset when a retraction (an update with infinite metric) is sent." The feasibility distance is only updated if the metric is lower, which is never true for an (infinite-metric) retraction, so that's kinda implicit anyway. We do actually update the garbage collection time regardless of the route metric in the regular update function (but not when sending an explicit retraction). I think that's OK, though?
Also, did you review BIRD Babel code whether we could switch RFC references from RFC 6126 to RFC 8966? I vaguely remember that there were such a patch.
Yeah, it should be updated to RFC8966. We implement sub-TLVs as specified in that, and the extensions (source-specific routing, MAC) are not compatible with a pure RFC6126 implementation anyway. -Toke
This has been clarified in RFC8966 as: "Note that the feasibility distance is not updated and the garbage-collection timer is not reset when a retraction (an update with infinite metric) is sent."
The feasibility distance is only updated if the metric is lower, which is never true for an (infinite-metric) retraction, so that's kinda implicit anyway.
We do actually update the garbage collection time regardless of the route metric in the regular update function (but not when sending an explicit retraction). I think that's OK, though?
Does that mean that a retracted route never expires? -- Juliusz
Juliusz Chroboczek <jch@irif.fr> writes:
This has been clarified in RFC8966 as: "Note that the feasibility distance is not updated and the garbage-collection timer is not reset when a retraction (an update with infinite metric) is sent."
The feasibility distance is only updated if the metric is lower, which is never true for an (infinite-metric) retraction, so that's kinda implicit anyway.
We do actually update the garbage collection time regardless of the route metric in the regular update function (but not when sending an explicit retraction). I think that's OK, though?
Does that mean that a retracted route never expires?
Hmm, no, this is the garbage collection timer in the source table only. There are separate expiry timers on the route objects themselves, so all this means is that the source table entry will stick around slightly longer, and it may delay the garbage collection of a retracted route internally in Bird slightly... -Toke
We do actually update the garbage collection time regardless of the route metric in the regular update function (but not when sending an explicit retraction). I think that's OK, though?
Does that mean that a retracted route never expires?
Hmm, no, this is the garbage collection timer in the source table only.
Ah, okay then.
participants (3)
-
Juliusz Chroboczek -
Ondrej Zajicek -
Toke Høiland-Jørgensen