Swapping routes without deletion

Pim van Pelt pim at ipng.nl
Sun May 21 12:23:25 CEST 2023


Hoi,

To close out my monologue -- I sent https://gerrit.fd.io/r/c/vpp/+/38854 to
make VPP's Linux Controlplane plugin aware of NLM_F_REPLACE messages.
Rolled that out at AS8283 this morning, and our duplicate FIB entry issue
is gone. Nothing to see here, moving along :)
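
For anyone writing their own Netlink consumer, here is a minimal sketch of
why ignoring NLM_F_REPLACE leaves a duplicate entry behind. It is a toy
model, not the VPP change itself: apply_route_msg(), replay(), the
honor_replace switch and the dict-based FIB are all made up for
illustration, and matching on the destination only is a simplification.
The constants are the values from linux/netlink.h and linux/rtnetlink.h.

```python
# Flag and message-type values from <linux/netlink.h> and <linux/rtnetlink.h>.
NLM_F_REPLACE = 0x100
NLM_F_CREATE = 0x400
RTM_NEWROUTE = 24
RTM_DELROUTE = 25


def apply_route_msg(fib, msg_type, flags, prefix, nexthop, honor_replace=True):
    """Apply one route message to a toy FIB (dict: prefix -> list of nexthops).

    Hypothetical helper for illustration only; a real consumer would also
    match on table, metric and other route attributes.
    """
    paths = fib.setdefault(prefix, [])
    if msg_type == RTM_NEWROUTE:
        # A replace-aware consumer drops the old paths before adding the new one.
        if honor_replace and flags & NLM_F_REPLACE:
            paths.clear()
        if nexthop not in paths:
            paths.append(nexthop)
    elif msg_type == RTM_DELROUTE:
        if nexthop in paths:
            paths.remove(nexthop)
    # Do not keep empty entries around.
    if not paths:
        del fib[prefix]
    return fib


def replay(honor_replace):
    """Replay an add followed by a replace, with no delete in between."""
    fib = {}
    prefix = "2a02:898:0:300::3"
    apply_route_msg(fib, RTM_NEWROUTE, NLM_F_CREATE,
                    prefix, "enp1s0f3", honor_replace)
    apply_route_msg(fib, RTM_NEWROUTE, NLM_F_CREATE | NLM_F_REPLACE,
                    prefix, "bond0.130", honor_replace)
    return fib


if __name__ == "__main__":
    print("ignoring NLM_F_REPLACE:", replay(honor_replace=False))
    print("honoring NLM_F_REPLACE:", replay(honor_replace=True))
```

With honor_replace=False the toy FIB ends up with both enp1s0f3 and
bond0.130 for the same destination, which is the duplicate I was seeing;
with honor_replace=True only bond0.130 remains.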

groet,
Pim

On Sat, May 20, 2023 at 11:50 PM Pim van Pelt <pim at ipng.nl> wrote:

> Hoi,
>
> I think I've found the answer to my question by taking a look at git
> history in netlink handling.
>
> This commit:
>
> commit 8235c4747dcc92de2ea991f78cdf9c6b8fa7f522
> Author: Ondrej Zajicek (work) <santiago at crfreenet.org>
> Date:   Mon Jul 15 16:23:18 2019 +0200
>
>     Netlink: Use route replace for IPv4
>
> started using NL_OP_REPLACE for IPv4 but kept it disabled for IPv6,
> and then this commit:
>
> commit 722daa950046a7ad307fd7aca8e0506f30b3d000
> Author: Ondrej Zajicek <santiago at crfreenet.org>
> Date:   Mon Jul 25 00:11:40 2022 +0200
>
>     Netlink: Simplify handling of IPv6 ECMP routes
>
> started using it for IPv6 as well, and finally this commit:
>
> commit ddb1bdf2819ce69248d5a51e71d803f13548b217
> Author: Ondrej Zajicek <santiago at crfreenet.org>
> Date:   Tue Jul 26 18:45:20 2022 +0200
>
>     Netlink: Restrict route replace for IPv6
>
> added a nice guard in nl_allow_replace(). This explains the replace
> semantics (which 'ip monitor route' does not show), and answers my
> question. For my application, I'll have to take a good look at consuming
> messages with the flags NLM_F_CREATE|NLM_F_REPLACE set; otherwise, perhaps
> add the ability to Bird2/Bird3 to hold back and issue
> NL_OP_DELETE + NL_OP_ADD instead.
>
> For the curious, the application is Vector Packet Processing [ref
> <https://ipng.ch/s/articles/2021/09/02/vpp-5.html>] which consumes
> Netlink messages from the Linux kernel, and uses them to program a
> userspace dataplane, see [Linux Control Plane
> <https://s3-docs.fd.io/vpp/23.06/developer/plugins/lcp.html>] for
> details. Until now, this system has consumed RTM_NEWROUTE and
> RTM_DELROUTE, but it is not yet capable of consuming this replace logic.
> I'll take a look at adding that.
>
> groet,
> Pim
>
>
>
> On Sat, May 20, 2023 at 11:10 PM Pim van Pelt <pim at ipng.nl> wrote:
>
>> Hoi,
>>
>> As a quick follow-up on why I'm asking about versions: on Bird 2.0.7, I
>> do see the delete-before-insert:
>>
>> root@chgtg0:~# ip -6 monitor route | grep 2001:678:d78::6
>>
>> # Raise OSPFv3 cost to prefer tf0-0
>> Deleted 2001:678:d78::6 via fe80::21b:21ff:febd:c718 dev xe0-3.3102.20 proto bird metric 32 pref medium
>> 2001:678:d78::6 via fe80::6eb3:11ff:fe20:e0c4 dev tf0-0 proto bird metric 32 pref medium
>>
>> # Lower OSPFv3 cost to prefer xe0-3.3102.20 again
>> Deleted 2001:678:d78::6 via fe80::6eb3:11ff:fe20:e0c4 dev tf0-0 proto bird metric 32 pref medium
>> 2001:678:d78::6 via fe80::21b:21ff:febd:c718 dev xe0-3.3102.20 proto bird metric 32 pref medium
>>
>> groet,
>> Pim
>>
>> On Sat, May 20, 2023 at 10:51 PM Pim van Pelt <pim at ipng.nl> wrote:
>>
>>> Hoi folks,
>>>
>>> At Coloclue AS8283, we upgraded from Bird 1.6.8 to Bird 2.0.12 this
>>> week. We use two separate processes, one for IPv4 and one for IPv6.
>>> The 2.0.7 in Debian is missing the ability to select 'accept ipv4' and
>>> 'accept ipv6' in BFD, so we installed version 2.0.12 from backports.
>>>
>>> I am wondering if Bird2 later than 2.0.7 perhaps has an optimization
>>> when swapping routes? I would expect a swap to be "delete + add" but I am
>>> seeing only "add with new nexthop" appear in Netlink.
>>>
>>> Considering the following topology with link names and OSPFv3 costs
>>> associated:
>>>
>>>   dcg-1  bond0.130 ---- bond0.130 eun-2
>>>    |               2000             |
>>> enp1s0f3                         enp1s0f2
>>>    |                                |
>>>    | 10                          10 |
>>>    |                                |
>>> enp1s0f3                         enp1s0f3
>>>    |               1000             |
>>>   dcg-2  eno2.3469 ---- eno2.3469 eun-3
>>>
>>> If I restart the OSPFv3 protocol, I see that the topology settles in the
>>> expected way. What I observed with Bird 2.0.12 is a deletion of the
>>> currently selected route followed by one addition, once the shortest
>>> path is revealed (dcg-1 - dcg-2 - eun-3 - eun-2, ospf_metric1 is 1020,
>>> this is fine):
>>>
>>> root@dcg-1:~# birdc -s /run/bird/bird6.ctl restart ospf1
>>> root@dcg-1:~# ip -6 monitor route | grep 2a02:898:0:300::3
>>> Deleted 2a02:898:0:300::3 via fe80::669d:99ff:feb1:31af dev bond0.130 proto bird metric 32 pref medium
>>> 2a02:898:0:300::3 via fe80::669d:99ff:feb1:3910 dev enp1s0f3 proto bird metric 32 pref medium
>>>
>>> Now I lower the cost of the dcg-1 -- eun-2 link from 2000 to 100, so
>>> that it becomes preferred (cost ospf_metric is 120):
>>>
>>> root@dcg-1:~# birdc -s /run/bird/bird6.ctl reconfigure ospf1
>>> root@dcg-1:~# ip -6 monitor route | grep 2a02:898:0:300::3
>>> [[ HERE ]]
>>> 2a02:898:0:300::3 via fe80::669d:99ff:feb1:31af dev bond0.130 proto bird metric 32 pref medium
>>>
>>> I would expect this new addition of the installed route on bond0.130 to
>>> be preceded by a deletion of the previous route via enp1s0f3, but
>>> this is not the case (marked with [[ HERE ]]).
>>>
>>> To anyone's knowledge: *Has this behavior changed between 2.0.7 and
>>> 2.0.12 ?*
>>>
>>> groet,
>>> Pim
>>> --
>>> Pim van Pelt <pim at ipng.nl>
>>> PBVP1-RIPE - http://www.ipng.nl/


-- 
Pim van Pelt <pim at ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/