[PATCH] Babel: Remove unecessary FIB_ITERATE restart

Toke Høiland-Jørgensen toke at toke.dk
Mon Jan 30 23:36:19 CET 2023


dxld at darkboxed.org writes:

> Hi Toke,
>
> On Mon, Jan 30, 2023 at 10:50:14PM +0100, Toke Høiland-Jørgensen wrote:
>> Daniel Gröber <dxld at darkboxed.org> writes:
>> 
>> > The route expiration code appears to have been stolen from rip.c, in that
>> > code the rt_notify function actually does modify the rtable fib by calling
>> > fib_get. The babel code however does no such thing, so this inefficient
>> > restart is just entirely uneccesarry.
>> 
>> Erm, huh? The babel rt_notify() function calls babel_get_entry(), which
>> calls fib_get()? It's quite similar to the rip one, in fact (which, as
>> you note, is no coincidence).
>
> Rats, looks like I missed that codepath in my review. Interesting that my
> test code never blew up because of it though.
>
> Looking at it again it should blow up when rt_notify gets a route that's
> announced from another protocol, not babel, as babel_announce_rte will
> always already have a babel_entry in hand.
>
> Is it perhaps that the problematic sync callchain only happens with babel
> routes and so we always already have an entry?

Well, assuming babel_select_route()->babel_announce_rte()->rt_notify()
only ever results in rt_notify() receiving (or not, in case of filters)
its own routes back, I guess that fib_get() shouldn't actually modify
anything. However, that's an awful lot of assumptions to base things
on, so that doesn't quite seem safe... Especially since "blow up" here
probably means subtle data corruption rather than an outright crash?

>> > To prove this is true I add a bunch of checking code that can be removed
>> > later.
>> 
>> That doesn't really prove anything, though, unless you can also show
>> that you did an exhaustive torture test of all possible route
>> expiry/retraction possibilities? :)
>
> Prove is perhaps the wrong word, I just wanted to see if it blows up in my
> face haha.
>
>> More to the point, what problem are you really trying to solve here? Did
>> you observe any particular performance issue with the current code, for
>> instance?
>
> With my recent "announce all possible routes to the core" changes I'm
> concerned about cubic blowup here. Since now every expired route (not just
> the selected one) causes a restart.
>
> We can always go back to the solution I had before this: keep a
> singly-linked list of routes to be expired and do them all after
> FIB_ITERATE_PUT.

Yeah, that seems like a better solution, cf. the above :)

-Toke



More information about the Bird-users mailing list