Hoi,

I think I've found the answer to my question by taking a look at the git history of the netlink handling. This commit:

commit 8235c4747dcc92de2ea991f78cdf9c6b8fa7f522
Author: Ondrej Zajicek (work) <santiago@crfreenet.org>
Date:   Mon Jul 15 16:23:18 2019 +0200

    Netlink: Use route replace for IPv4

started using NL_OP_REPLACE for IPv4 but kept it disabled for IPv6, and then this commit:

commit 722daa950046a7ad307fd7aca8e0506f30b3d000
Author: Ondrej Zajicek <santiago@crfreenet.org>
Date:   Mon Jul 25 00:11:40 2022 +0200

    Netlink: Simplify handling of IPv6 ECMP routes

started using it for IPv6 as well, after which this commit:

commit ddb1bdf2819ce69248d5a51e71d803f13548b217
Author: Ondrej Zajicek <santiago@crfreenet.org>
Date:   Tue Jul 26 18:45:20 2022 +0200

    Netlink: Restrict route replace for IPv6

added a nice guard in nl_allow_replace(). This explains the replace semantics (which 'ip monitor route' does not show), and answers my question.

For my application, I'll have to take a good look at consuming messages with the flags NLM_F_CREATE|NLM_F_REPLACE set; failing that, perhaps add the ability to Bird2/Bird3 to hold back and issue NL_OP_DELETE + NL_OP_ADD instead.

For the curious, the application is Vector Packet Processing [ref <https://ipng.ch/s/articles/2021/09/02/vpp-5.html>], which consumes Netlink messages from the Linux kernel and uses them to program a userspace dataplane; see [Linux Control Plane <https://s3-docs.fd.io/vpp/23.06/developer/plugins/lcp.html>] for details. Until now, this system consumes RTM_NEWROUTE and RTM_DELROUTE but is not yet capable of consuming this replace logic. I'll take a look at adding that.

groet,
Pim

On Sat, May 20, 2023 at 11:10 PM Pim van Pelt <pim@ipng.nl> wrote:
Hoi,
As a quick follow-up on why I'm asking about versions -- on Bird 2.0.7, I do see the delete-before-insert:
root@chgtg0:~# ip -6 monitor route | grep 2001:678:d78::6
# Raise OSPFv3 cost to prefer tf-0-0
*Deleted* 2001:678:d78::6 via fe80::21b:21ff:febd:c718 dev xe0-3.3102.20 proto bird metric 32 pref medium
2001:678:d78::6 via fe80::6eb3:11ff:fe20:e0c4 dev tf0-0 proto bird metric 32 pref medium
# Lower OSPFv3 cost to prefer xe0-3.3102.20 again
*Deleted* 2001:678:d78::6 via fe80::6eb3:11ff:fe20:e0c4 dev tf0-0 proto bird metric 32 pref medium
2001:678:d78::6 via fe80::21b:21ff:febd:c718 dev xe0-3.3102.20 proto bird metric 32 pref medium
groet, Pim
On Sat, May 20, 2023 at 10:51 PM Pim van Pelt <pim@ipng.nl> wrote:
Hoi folks,
At Coloclue AS8283, we upgraded from Bird1.6.8 to Bird2.0.12 this week. (We use two separate processes, one for IPv4 and one for IPv6, and 2.0.7 in Debian is missing the ability to select 'accept ipv4' and 'accept ipv6' in BFD, so we installed backports and version 2.0.12.)
I am wondering if Bird2 later than 2.0.7 perhaps has an optimization when swapping routes? I would expect a swap to be "delete + add" but I am seeing only "add with new nexthop" appear in Netlink.
Considering the following topology with link names and OSPFv3 costs associated:
dcg-1  bond0.130 ---- bond0.130  eun-2
  |              2000              |
enp1s0f3                      enp1s0f2
  |                                |
  | 10                          10 |
  |                                |
enp1s0f3                      enp1s0f3
  |              1000              |
dcg-2  eno2.3469 ---- eno2.3469  eun-3
If I restart the OSPFv3 protocol, I see that the topology settles in the expected way. What I observed with Bird 2.0.12 is that there is a deletion of the currently selected route followed by one addition when the shortest path is revealed (dcg-1 -> dcg-2 -> eun-3 -> eun-2; ospf_metric1 is 1020, which is fine):
root@dcg-1:~# birdc -s /run/bird/bird6.ctl restart ospf1
root@dcg-1:~# ip -6 monitor route | grep 2a02:898:0:300::3
Deleted 2a02:898:0:300::3 via fe80::669d:99ff:feb1:31af dev bond0.130 proto bird metric 32 pref medium
2a02:898:0:300::3 via fe80::669d:99ff:feb1:3910 dev enp1s0f3 proto bird metric 32 pref medium
Now I lower the cost of the dcg-1 -- eun-2 link from 2000 to 100, so that it becomes preferred (cost ospf_metric is 120):
root@dcg-1:~# birdc -s /run/bird/bird6.ctl reconfigure ospf1
root@dcg-1:~# ip -6 monitor route | grep 2a02:898:0:300::3
*[[ HERE ]]*
2a02:898:0:300::3 via fe80::669d:99ff:feb1:31af dev bond0.130 proto bird metric 32 pref medium
I would expect this new addition of the installed route on bond0.130 to be *preceded by a deletion* of the previous route from enp1s0f3, but this is not the case (marked in red with [[ HERE ]]).
To anyone's knowledge: *Has this behavior changed between 2.0.7 and 2.0.12 ?*
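
For illustration only, here is a minimal sketch of how a Netlink consumer could apply the two kinds of update to its own table. It is not the VPP or Bird code: apply_route_event() and the dict-of-sets fib are hypothetical names made up for this example, and it assumes the RTM_NEWROUTE notification carries NLM_F_REPLACE in its header flags. The numeric constants are the ones from <linux/netlink.h> and <linux/rtnetlink.h>.

```python
# Constants as defined in <linux/rtnetlink.h> and <linux/netlink.h>.
RTM_NEWROUTE = 24
RTM_DELROUTE = 25
NLM_F_REPLACE = 0x100
NLM_F_CREATE = 0x400


def apply_route_event(fib, msg_type, flags, prefix, nexthop):
    """Apply one route event to fib (hypothetical model: prefix -> set of nexthops)."""
    paths = fib.setdefault(prefix, set())
    if msg_type == RTM_DELROUTE:
        paths.discard(nexthop)
    elif msg_type == RTM_NEWROUTE:
        if flags & NLM_F_REPLACE:
            # Replace: the new nexthop supersedes whatever was installed,
            # so no separate delete message is needed (or sent).
            paths.clear()
        paths.add(nexthop)
    if not paths:
        # Drop prefixes that no longer have any nexthop.
        del fib[prefix]
    return fib


# Replay of the sequence seen on dcg-1: the initial route via enp1s0f3,
# then a single replace towards bond0.130 with no preceding delete.
fib = {}
prefix = "2a02:898:0:300::3/128"
via_enp = ("fe80::669d:99ff:feb1:3910", "enp1s0f3")
via_bond = ("fe80::669d:99ff:feb1:31af", "bond0.130")

apply_route_event(fib, RTM_NEWROUTE, NLM_F_CREATE, prefix, via_enp)
apply_route_event(fib, RTM_NEWROUTE, NLM_F_CREATE | NLM_F_REPLACE, prefix, via_bond)
print(fib)
```

A consumer that ignores the flag and treats every RTM_NEWROUTE as a plain add would keep the stale enp1s0f3 entry in this model, which is the behavior the missing delete exposes.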
groet,
Pim

--
Pim van Pelt <pim@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/