handling of the onlink next hop
Hi team,

We've just received the BIRD update (bird-2.16.2-1.el9.x86_64 -> bird-3.1.1-1.el9.x86_64) via RPM updates, and it broke our configuration with the following error message:

`/etc/bird.conf:7:5 syntax error, unexpected ONLINK`

From the migration guide, I see that it was a deliberate choice to stop supporting this attribute: "The `onlink` route attribute has been temporarily disabled until we find out how to implement it properly."

We used the functionality implemented in 7144c9ca46f092da33a4e051bbce8f973a3bd8c4. In particular, with the following configuration:

```
ifname = "tunl0";
onlink = true;
gw = from;
```

Is there a plan to bring back support for it in the near future?

If not, I will try to get in touch with the maintainer of the EPEL package to see whether it's possible to submit and maintain a "bird2" branch of this package in EPEL.

Regards,
Radu
Hello Radu,

yes, there are plans to fix it; unfortunately, we haven't had time to do it yet. The actual problem is the semantics of the nexthop attribute with respect to the configured VRFs, but thinking about it again, I'm almost tempted to say that all the original reasons are gone. So we'll check and let you know very soon.

Thank you for the heads-up, and sorry for breaking this.

Maria

-- Maria Matejka (she/her) | BIRD Team Leader | CZ.NIC, z.s.p.o.
Hello Radu,
Well, could you please try the current thread-next branch?

<https://gitlab.nic.cz/labs/bird/-/tree/thread-next>

Unfortunately, you'll have to build it yourself for now. We are also in the process of implementing Rocky Linux package autobuild, so I would expect that you may soon get a package to test. (Not now, though.)

Have a nice day!
Maria

-- Maria Matejka (she/her) | BIRD Team Leader | CZ.NIC, z.s.p.o.
Hello Radu,

On Tue, 27 May 2025, Maria Matejka via Bird-users wrote:
> Well, could you please try the current thread-next branch?
> <https://gitlab.nic.cz/labs/bird/-/tree/thread-next>
> Unfortunately, you'll have to build it yourself for now.

https://koji.fedoraproject.org/koji/taskinfo?taskID=133248567 contains the following commits from the thread-next branch:

- https://gitlab.nic.cz/labs/bird/-/commit/36a4b74c1195b1a7e1630e0afd7b2e3c60d...
- https://gitlab.nic.cz/labs/bird/-/commit/c1957b2ae0ee596965fed5d7a5a74e78e49...

Please let me know if it works for you; if so (and if Maria doesn't object), I'm happy to add both commits to a regular Fedora/EPEL build.

Regards,
Robert
Hello Maria, Robert,

On 5/27/25 12:44 PM, Maria Matejka wrote:
> Well, could you please try the current thread-next branch?
> <https://gitlab.nic.cz/labs/bird/-/tree/thread-next>

Thank you for the very quick patch and the RPM package. Unfortunately, we get the following error

```
bird[3287848]: filters, line 8: Argument 1 of RTA_SET must be of type ip, got type void
bird[3287848]: Netlink: Network is unreachable
bird[3287848]: Netlink: Network is unreachable
bird[3287848]: ...
```

when running with the following configuration, which works perfectly on bird-2.16.2:

```
 1
 2 router id from "-tun*", "*";
 3
 4 filter from_kubernetes {
 5   if ( net ~ 10.250.1.0/24 || net ~ 10.110.1.0/24 ) then {
 6     ifname = "tunl0";
 7     onlink = true;
 8     gw = from;
 9     accept;
10   }
11   reject;
12 }
.....
```

Regards,
Radu
Hello Radu,
Which protocol originates the routes which are filtered by `from_kubernetes`? It looks like the `from` attribute is not set (which is probably another bug).

Thanks,
Maria

-- Maria Matejka (she/her) | BIRD Team Leader | CZ.NIC, z.s.p.o.
Hi Maria,

Hereafter is the (very slightly modified) full configuration. Please tell me if you need any more context information about it.

Regards,
Radu

On 5/28/25 3:04 PM, Maria Matejka wrote:
> Which protocol originates the routes which are filtered by `from_kubernetes`? It looks like the `from` attribute is not set (which is probably another bug).

```
router id from "-tun*", "*";

filter from_kubernetes {
  if ( net ~ 10.0.0.0/16 ) then {
    ifname = "tunl0";
    onlink = true;
    gw = from;
    accept;
  }
  reject;
}

filter direct_tunl0 {
  if ( net ~ 10.0.0.0/16 && source = RTS_DEVICE ) then {
    accept;
  }
  reject;
};

protocol direct {
  debug { states };
  ipv4;
  interface "tunl0";
}

protocol kernel {
  merge paths on limit 32;
  learn;
  persist;
  scan time 2;
  ipv4 {
    table master4;
    import all;
    export filter from_kubernetes;
  };
  graceful restart;
}

protocol device {
  debug { states };
  scan time 2;
}

template bgp bgp_template {
  passive on; # Kubernetes nodes will connect to us.
  debug { states };
  password "password";
  local as 64512;
  ipv4 {
    import filter from_kubernetes;
    export filter direct_tunl0;
    add paths on;
  };
  graceful restart;
  connect delay time 2;
  connect retry time 5;
  error wait time 5,30;
}

protocol bgp kube_nodes from bgp_template {
  neighbor range 10.0.0.0/16 internal;
}

log syslog all;
```
Thanks, Radu!

If I interpret this correctly, you run `from_kubernetes` in two places:

- BGP import
- kernel export

I expect that in the table, there are routes from:

- BGP
- Direct
- Kernel

While the BGP routes (should) have the `from` attribute properly set (and therefore the `gw` setting works), the Direct and Kernel routes have no such attribute, and on export to kernel, the filter complains.

There is actually an undocumented difference between BIRD 2 and 3 (mea culpa, gonna fix that soon in the documentation): in BIRD 2, all routes have the `from` attribute available, but only BGP, Babel and RIP set it to a meaningful value. Notably, Direct and Kernel set `::` there. In BIRD 3, only BGP, Babel and RIP routes have the `from` attribute; in the others, the attribute is undefined, and therefore the filter complains.

Because the filter is run on export to the kernel, it also runs on the Direct routes, and, as I checked with our test setup, the final behavior of the filter is the same in v2 and v3: it lets BGP routes pass (because they have `from` set) and rejects Direct routes. The only difference is that with Direct routes, the `::` nexthop is considered invalid in BIRD 2 (yielding the "Invalid gw address" error), whereas in BIRD 3, it's caught one step earlier and the report is more cryptic.

Both the occasional "Invalid gw address" error in the logs and the new error message should be possible to tame with this piece of filter:

```
if ! defined(from) || (from = ::) then reject "from not set for ", net;
```

… or just `… then reject;` if you want it silent.

Now I hope that I have deciphered your situation correctly and that this helps. I may obviously be wrong and there may be a bug, but to help locate it, I'll need a log with `debug channels all;` at the top level of your config, and logs including the `trace` level. This will pin the error message to the actual offending route, and in turn to the source of that route, which we can then investigate for the missing `from` attribute.

Have a nice day!
Maria

-- Maria Matejka (she/her) | BIRD Team Leader | CZ.NIC, z.s.p.o.
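
For readers landing on this thread, here is a rough sketch of how the guard suggested in Maria's last message could be folded into the `from_kubernetes` filter from Radu's configuration. It assumes a BIRD 3 build in which `onlink` is accepted again (for example, one built from thread-next). Placing the guard inside the `net` check and combining it with the debug lines are editorial assumptions; this exact fragment was not tested in the thread.

```
# Top-level debugging, as requested above for locating a possible bug.
debug channels all;
log syslog all;   # already present in the posted config; "all" includes the trace class

filter from_kubernetes {
  if ( net ~ 10.0.0.0/16 ) then {
    # Direct and Kernel routes carry no usable `from`
    # (undefined in BIRD 3, `::` in BIRD 2), so reject them
    # before `gw = from` is evaluated.
    if ! defined(from) || (from = ::) then
      reject "from not set for ", net;

    ifname = "tunl0";
    onlink = true;
    gw = from;
    accept;
  }
  reject;
}
```

With the guard in place, the filter's accept/reject result should stay the same as described in the thread (BGP routes pass, Direct routes are rejected); only the error reports are replaced by the explicit reject message. Use a bare `reject;` in the guard to keep the log quiet.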
Participants (3): Maria Matejka, Radu CARPA, Robert Scheck