Re: IPv6 routes not imported into Kernel
Hi Robert,

On 15/11/2023 22:58, Robert Finze wrote:
The Bird config on both systems is nearly identical (only IPs differ) and also the systems are setup in a similar manner.
It would be good to have a dump of the configuration of the non-working system (redact sensitive information such as passwords etc, but leave other information intact).
The routes are correctly learned from upstream and exported to the kernel, but the kernel is not "learning" them.
Interesting. The following dumps you sent might further help debugging the problem.
Netlink route
0000  00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00  ...8............
0010  68 00 00 00 18 00 05 05 13 0d 66 02 00 00 00 00  h.........f.....
0020  0a 28 00 00 fe 0c 00 01 00 00 00 00 14 00 01 00  .(..............
0030  26 07 ff 00 0b 00 00 00 00 00 00 00 00 00 00 00  &...............
0040  08 00 06 00 20 00 00 00 14 00 07 00 2a 0e 39 40  .... .......*.9@
0050  10 00 00 00 00 00 00 00 00 00 00 02 08 00 04 00  ................
0060  02 00 00 00 14 00 05 00 2a 0e 39 40 de ad 00 00  ........*.9@....
0070  00 00 00 00 00 00 00 01                          ........
This decodes to (Wireshark supports "Import from hexdump", as I found out):

Linux rtnetlink (route netlink) protocol
    Netlink message header (type: Add network route)
        Length: 104
        Message type: Add network route (24)
        Flags: 0x0505
        Flags: 0x0505
        Sequence: 40242451
        Port ID: 0
    Address family: AF_INET6 (10)
    Length of destination: 40
    Length of source: 0
    TOS filter: 0x00
    Routing table ID: 254
    Routing protocol: BIRD (0x0c)
    Route origin: global route (0x00)
    Route type: Gateway or direct route (0x01)
    Route flags: 0x00000000
    Attribute: Route destination address
        Len: 20
        Type: 0x0001, Route destination address (1)
        Data: 2607ff000b0000000000000000000000
    Attribute: RTA_PRIORITY
        Len: 8
        Type: 0x0006, RTA_PRIORITY (6)
        Data: 20000000
    Attribute: RTA_PREFSRC
        Len: 20
        Type: 0x0007, RTA_PREFSRC (7)
        Data: 2a0e3940100000000000000000000002
    Attribute: Output interface index: 2
        Len: 8
        Type: 0x0004, Output interface index (4)
        Output interface index: 2
    Attribute: Gateway of the route
        Len: 20
        Type: 0x0005, Gateway of the route (5)
        Data: 2a0e3940dead00000000000000000001
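The same decode can be reproduced without Wireshark. The following is a minimal Python sketch, not part of the original exchange. It assumes the capture comes from a little-endian host (the machines in this thread are x86_64) and that the 16-byte cooked-capture header at offsets 0x0000 to 0x000f has been stripped. The names `decode_route`, `RTA_NAMES`, `RAW` and `ROUTE` are made up for illustration; the attribute numbers are the ones Wireshark shows above.

```python
import ipaddress
import struct

# Netlink payload of the first dump (offsets 0x0010-0x0077).
# Assumption: the leading 16-byte cooked-capture header is already removed.
NLMSG_HEX = """
68 00 00 00 18 00 05 05 13 0d 66 02 00 00 00 00
0a 28 00 00 fe 0c 00 01 00 00 00 00 14 00 01 00
26 07 ff 00 0b 00 00 00 00 00 00 00 00 00 00 00
08 00 06 00 20 00 00 00 14 00 07 00 2a 0e 39 40
10 00 00 00 00 00 00 00 00 00 00 02 08 00 04 00
02 00 00 00 14 00 05 00 2a 0e 39 40 de ad 00 00
00 00 00 00 00 00 00 01
"""

# Attribute type numbers as shown in the Wireshark decode.
RTA_NAMES = {
    1: "RTA_DST",
    4: "RTA_OIF",
    5: "RTA_GATEWAY",
    6: "RTA_PRIORITY",
    7: "RTA_PREFSRC",
}


def decode_route(raw: bytes) -> dict:
    """Decode one rtnetlink route message (illustrative helper)."""
    # struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
    length, msg_type, flags, seq, pid = struct.unpack_from("<IHHII", raw, 0)
    # struct rtmsg: eight u8 fields followed by u32 flags
    (family, dst_len, src_len, tos, table,
     protocol, scope, rtype, rtm_flags) = struct.unpack_from("<8BI", raw, 16)
    route = {
        "nlmsg_type": msg_type,
        "seq": seq,
        "family": family,
        "dst_len": dst_len,
        "table": table,
        "protocol": protocol,
    }
    offset = 16 + 12  # attributes follow nlmsghdr + rtmsg
    while offset + 4 <= length:
        rta_len, rta_type = struct.unpack_from("<HH", raw, offset)
        if rta_len < 4:
            break  # malformed attribute, stop instead of looping forever
        payload = raw[offset + 4:offset + rta_len]
        name = RTA_NAMES.get(rta_type, f"RTA_{rta_type}")
        if len(payload) == 16:
            # 16-byte payloads in this message are IPv6 addresses
            route[name] = str(ipaddress.IPv6Address(payload))
        elif len(payload) == 4:
            # 4-byte payloads are host-order u32 values (assumed little-endian)
            route[name] = struct.unpack("<I", payload)[0]
        else:
            route[name] = payload.hex()
        offset += (rta_len + 3) & ~3  # attributes are 4-byte aligned
    return route


RAW = bytes.fromhex("".join(NLMSG_HEX.split()))
ROUTE = decode_route(RAW)
for key, value in ROUTE.items():
    print(f"{key}: {value}")
```

Read this way, the RTA_PRIORITY bytes `20 00 00 00` are a little-endian 32-bit integer, i.e. a metric of 32. The destination bytes come out as `2607:ff00:b00::` with a prefix length of 40, which is worth keeping in mind when the message is rebuilt by hand as an `ip` command.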
0000  00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00  ...8............
0010  7c 00 00 00 02 00 00 00 13 0d 66 02 7a 31 09 81  |.........f.z1..
0020  ea ff ff ff 68 00 00 00 18 00 05 05 13 0d 66 02  ....h.........f.
0030  00 00 00 00 0a 28 00 00 fe 0c 00 01 00 00 00 00  .....(..........
0040  14 00 01 00 26 07 ff 00 0b 00 00 00 00 00 00 00  ....&...........
0050  00 00 00 00 08 00 06 00 20 00 00 00 14 00 07 00  ........ .......
0060  2a 0e 39 40 10 00 00 00 00 00 00 00 00 00 00 02  *.9@............
0070  08 00 04 00 02 00 00 00 14 00 05 00 2a 0e 39 40  ............*.9@
0080  de ad 00 00 00 00 00 00 00 00 00 01              ............
decodes as:

Netlink message
    Netlink message header (type: Error)
        Length: 124
        Message type: Error (0x0002)
        Flags: 0x0000
        Sequence: 40242451
        Port ID: 2164863354
    Error code: Invalid argument (-EINVAL) (-22)
    Netlink message header (type: 0x0018)
        Length: 104
        Message type: Protocol-specific (0x0018)
        Flags: 0x0505
        Flags: 0x0505
        Sequence: 40242451
        Port ID: 0

The first message could probably be replicated by running:

    ip -6 route add 2607:ff00:b::/40 via 2a0e:3940:dead::1 table 254 protocol bird scope global src 2a0e:3940:1000::2 dev 2

- where dev 2 indicates the network interface with index 2, this is probably ens20 in your setup?
- table 254 is most likely the main table (see /etc/iproute2/rt_tables)

I'm unsure how to decode RTA_PRIORITY correctly here. Regardless, you could run this command on the non-working host. Perhaps `ip route` can tell you a bit more information. In a slightly modified case (I've replaced the `via ...` with a known gateway), I get: "Error: Invalid source address." (with: iproute2-6.5.0)

My current hunch is that `src 2a0e:3940:1000::2` is not a valid address on your system. A closer read on your earlier comment:
The Bird config on both systems is nearly identical (only IPs differ)
suggests looking in this direction.

Best regards,
Gerdriaan Mulder
Hi Gerdriaan,

thanks a lot for your input! I haven't had much time to continue on this until now. Please see my replies inline:

On 01.01.24 19:15, Gerdriaan Mulder wrote:
Hi Robert,
On 15/11/2023 22:58, Robert Finze wrote:
The Bird config on both systems is nearly identical (only IPs differ) and also the systems are setup in a similar manner.
It would be good to have a dump of the configuration of the non-working system (redact sensitive information such as passwords etc, but leave other information intact).
I've attached the config.
The routes are correctly learned from upstream and exported to the kernel, but the kernel is not "learning" them.
Interesting. The following dumps you sent might further help debugging the problem.
Netlink route
0000  00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00  ...8............
0010  68 00 00 00 18 00 05 05 13 0d 66 02 00 00 00 00  h.........f.....
0020  0a 28 00 00 fe 0c 00 01 00 00 00 00 14 00 01 00  .(..............
0030  26 07 ff 00 0b 00 00 00 00 00 00 00 00 00 00 00  &...............
0040  08 00 06 00 20 00 00 00 14 00 07 00 2a 0e 39 40  .... .......*.9@
0050  10 00 00 00 00 00 00 00 00 00 00 02 08 00 04 00  ................
0060  02 00 00 00 14 00 05 00 2a 0e 39 40 de ad 00 00  ........*.9@....
0070  00 00 00 00 00 00 00 01                          ........
This decodes to (Wireshark supports "Import from hexdump", as I found out):
Linux rtnetlink (route netlink) protocol
    Netlink message header (type: Add network route)
        Length: 104
        Message type: Add network route (24)
        Flags: 0x0505
        Flags: 0x0505
        Sequence: 40242451
        Port ID: 0
    Address family: AF_INET6 (10)
    Length of destination: 40
    Length of source: 0
    TOS filter: 0x00
    Routing table ID: 254
    Routing protocol: BIRD (0x0c)
    Route origin: global route (0x00)
    Route type: Gateway or direct route (0x01)
    Route flags: 0x00000000
    Attribute: Route destination address
        Len: 20
        Type: 0x0001, Route destination address (1)
        Data: 2607ff000b0000000000000000000000
    Attribute: RTA_PRIORITY
        Len: 8
        Type: 0x0006, RTA_PRIORITY (6)
        Data: 20000000
    Attribute: RTA_PREFSRC
        Len: 20
        Type: 0x0007, RTA_PREFSRC (7)
        Data: 2a0e3940100000000000000000000002
    Attribute: Output interface index: 2
        Len: 8
        Type: 0x0004, Output interface index (4)
        Output interface index: 2
    Attribute: Gateway of the route
        Len: 20
        Type: 0x0005, Gateway of the route (5)
        Data: 2a0e3940dead00000000000000000001
0000  00 04 03 38 00 00 00 00 00 00 00 00 00 00 00 00  ...8............
0010  7c 00 00 00 02 00 00 00 13 0d 66 02 7a 31 09 81  |.........f.z1..
0020  ea ff ff ff 68 00 00 00 18 00 05 05 13 0d 66 02  ....h.........f.
0030  00 00 00 00 0a 28 00 00 fe 0c 00 01 00 00 00 00  .....(..........
0040  14 00 01 00 26 07 ff 00 0b 00 00 00 00 00 00 00  ....&...........
0050  00 00 00 00 08 00 06 00 20 00 00 00 14 00 07 00  ........ .......
0060  2a 0e 39 40 10 00 00 00 00 00 00 00 00 00 00 02  *.9@............
0070  08 00 04 00 02 00 00 00 14 00 05 00 2a 0e 39 40  ............*.9@
0080  de ad 00 00 00 00 00 00 00 00 00 01              ............
decodes as:
Netlink message
    Netlink message header (type: Error)
        Length: 124
        Message type: Error (0x0002)
        Flags: 0x0000
        Sequence: 40242451
        Port ID: 2164863354
    Error code: Invalid argument (-EINVAL) (-22)
    Netlink message header (type: 0x0018)
        Length: 104
        Message type: Protocol-specific (0x0018)
        Flags: 0x0505
        Flags: 0x0505
        Sequence: 40242451
        Port ID: 0
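The kernel's reply can be checked by hand in the same way. This is a small Python sketch under the same assumptions as before (little-endian host, cooked-capture header removed); `decode_nlmsgerr`, `ERR_HEX` and `ERR` are illustrative names. It reads the netlink header and the signed 32-bit error code that follows it, and leaves the echoed request alone.

```python
import errno
import struct

# First 20 bytes of the reply (offsets 0x0010-0x0023 of the second dump):
# the netlink header followed by the signed error code.
ERR_HEX = "7c 00 00 00 02 00 00 00 13 0d 66 02 7a 31 09 81 ea ff ff ff"


def decode_nlmsgerr(raw: bytes) -> dict:
    """Decode the header and error code of a netlink error reply (illustrative)."""
    # struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
    length, msg_type, flags, seq, pid = struct.unpack_from("<IHHII", raw, 0)
    # The error field is a signed 32-bit negative errno value
    (error,) = struct.unpack_from("<i", raw, 16)
    return {
        "length": length,
        "type": msg_type,
        "seq": seq,
        "pid": pid,
        "error": error,
        "errno_name": errno.errorcode.get(-error, "unknown"),
    }


ERR = decode_nlmsgerr(bytes.fromhex("".join(ERR_HEX.split())))
print(ERR)
```

The sequence number in the reply equals the one in the request, which ties the EINVAL to that specific route addition and not to some other message on the socket.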
The first message could probably be replicated by running:
ip -6 route add 2607:ff00:b::/40 via 2a0e:3940:dead::1 table 254 protocol bird scope global src 2a0e:3940:1000::2 dev 2
this returns: RTNETLINK answers: No route to host
- where dev 2 indicates the network interface with index 2, this is probably ens20 in your setup?
It should be ens19. I'm currently not sure how to verify that. "ip a" shows:

1: lo
2: ens18
3: ens19
4: ens20
5: dummy0
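One way to check which interface an index refers to is to turn the `ip a` listing into a lookup table. The snippet below is a minimal sketch that assumes the abbreviated listing quoted above; `index_to_name` and `IP_A_SUMMARY` are illustrative names, and the regular expression is only meant for this short form, not for arbitrary `ip` output.

```python
import re

# Abbreviated "ip a" listing as quoted in the message above.
IP_A_SUMMARY = "1: lo 2: ens18 3: ens19 4: ens20 5: dummy0"


def index_to_name(listing: str) -> dict:
    """Map interface index to interface name from 'N: name' pairs."""
    pairs = re.findall(r"(\d+):\s+([^\s:@]+)", listing)
    return {int(index): name for index, name in pairs}


IFACES = index_to_name(IP_A_SUMMARY)
print(IFACES[2])
```

On the host itself, Python's `socket.if_indextoname(2)` asks the kernel directly and avoids parsing text.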
- table 254 is most likely the main table (see /etc/iproute2/rt_tables)
Correct, this is 'main'.
I'm unsure how to decode RTA_PRIORITY correctly here. Regardless, you could run this command on the non-working host. Perhaps `ip route` can tell you a bit more information. In a slightly modified case (I've replaced the `via ...` with a known gateway), I get: "Error: Invalid source address." (with: iproute2-6.5.0)
My current hunch is that `src 2a0e:3940:1000::2` is not a valid address on your system. A closer read on your earlier comment:
This IP is bound on the dummy0 interface:

5: dummy0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether ba:bc:b1:56:92:71 brd ff:ff:ff:ff:ff:ff
    inet 45.95.204.2/32 scope global dummy0
       valid_lft forever preferred_lft forever
    inet6 2a0e:3940:1000::2/128 scope global tentative

One difference from the system running 20.04 is the state of the dummy interface, which is shown there as:

8: dummy0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000

Both interfaces are configured using Netplan and the config there is the same (apart from the IP address).
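The dummy0 output above lists the IPv6 address with the `tentative` flag. Whether that flag is what makes the kernel reject the source address is not established in this thread, but it is a concrete difference that is easy to check on both hosts. The sketch below scans `ip addr` text for IPv6 addresses still flagged as tentative; `tentative_addresses` and `DUMMY0_OUTPUT` are illustrative names, and the sample text is the output quoted above.

```python
# Sample "ip addr" output for dummy0, copied from the message above.
DUMMY0_OUTPUT = """\
5: dummy0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether ba:bc:b1:56:92:71 brd ff:ff:ff:ff:ff:ff
    inet 45.95.204.2/32 scope global dummy0
       valid_lft forever preferred_lft forever
    inet6 2a0e:3940:1000::2/128 scope global tentative
"""


def tentative_addresses(ip_addr_output: str) -> list:
    """Return IPv6 addresses whose 'inet6' line carries the tentative flag."""
    found = []
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # An address line looks like: inet6 <addr>/<len> scope <scope> [flags...]
        if len(fields) >= 2 and fields[0] == "inet6" and "tentative" in fields[2:]:
            found.append(fields[1])
    return found


TENTATIVE = tentative_addresses(DUMMY0_OUTPUT)
print(TENTATIVE)
```

Running the same check against the `ip addr` output of the working 20.04 host would show whether the address is flagged there as well.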
The Bird config on both systems is nearly identical (only IPs differ)
suggests looking in this direction.
Best regards, Gerdriaan Mulder
Not sure if this helps, but this is the current IPv6 routing table:

# ip -6 r
::1 dev lo proto kernel metric 256 pref medium
2a0e:3940:1000::2 dev dummy0 proto kernel metric 256 linkdown pref medium
2a0e:3940:1000::/36 dev ens19 proto bird metric 32 pref medium
2a0e:3940:2000::/36 dev ens19 proto bird metric 32 pref medium
2a0e:3940:dead::/64 dev ens18 proto kernel metric 256 pref medium
fe80::/64 dev ens20 proto kernel metric 256 pref medium
fe80::/64 dev ens19 proto kernel metric 256 pref medium
fe80::/64 dev ens18 proto kernel metric 256 pref medium

In the meantime I've set up a clean new VM with Ubuntu 22.04 and the same issues occurred. I've upgraded that new VM to 24.04 and it's still the same.

Next I want to try a fresh 20.04 install and see what happens. Maybe I'll also give the 3.0alpha a shot.

To be honest, I'm not even sure if this is a BIRD issue or a "Linux" issue. But starting to debug this from the BIRD side seems sensible to me.

Thanks a lot for the support!

Best,
Robert
Hi Robert,

On 27/02/2024 22:58, Robert Finze wrote:
In the meantime I've set up a clean new VM with Ubuntu 22.04 and the same issues occurred. I've upgraded that new VM to 24.04 and it's still the same.
Next I want to try a fresh 20.04 install and see what happens.
I would try a fresh install of Ubuntu 20.04 with the same kernel as the machine that currently works, indeed. If the problem goes away, it might be an issue between Ubuntu 20.04 and 22.04. If the problem persists, it might be some subtle configuration difference. I wouldn't yet upgrade to BIRD 3.0alpha because that changes too many variables in order to debug the problem.
ip -6 route add 2607:ff00:b::/40 via 2a0e:3940:dead::1 table 254 protocol bird scope global src 2a0e:3940:1000::2 dev 2

- where dev 2 indicates the network interface with index 2, this is probably ens20 in your setup?
It should be ens19. I'm currently not sure how to verify that. "ip a" shows:
1: lo 2: ens18 3: ens19 4: ens20 5: dummy0
The number before ":" is the interface index. It seems BIRD wants to add the route on device ens18 (at least, at the time).

Besides, in your initial post, you pasted a few routes from BIRD that were using protocols "upstream_1v6" and "upstream_2v6". They seem to be missing from the bird.conf you posted. The route addition in the netlink dump is different from the routes you showed in BIRD, which makes it more difficult to pinpoint the problem.

I think it's a good idea to focus on getting just one route exported from BIRD to the kernel successfully. If it's possible in your setup, perhaps just configure 1 upstream, and only import 1 route from that upstream in BIRD, and export the same route through the kernel protocol.

Best regards,
Gerdriaan Mulder
Hi Gerdriaan,

I've followed your advice and set up 2 VMs for testing.

On 28.02.24 12:02, Gerdriaan Mulder wrote:
Next I want to try a fresh 20.04 install and see what happens.
I would try a fresh install of Ubuntu 20.04 with the same kernel as the machine that currently works, indeed. If the problem goes away, it might be an issue between Ubuntu 20.04 and 22.04. If the problem persists, it might be some subtle configuration difference. I wouldn't yet upgrade to BIRD 3.0alpha because that changes too many variables in order to debug the problem.

A) Ubuntu 24.04
   Linux moon2 6.6.0-14-generic #14-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 30 10:27:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
   Bird 3.0alpha2
   Bird 2.14
   Bird 2.13

B) Ubuntu 20.04
   Linux moon3 5.4.0-172-generic #190-Ubuntu SMP Fri Feb 2 23:24:22 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
   Bird 2.14

C) Ubuntu 20.04
   Linux star 5.4.0-172-generic #190-Ubuntu SMP Fri Feb 2 23:24:22 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
   Bird 2.14

VM C is my current router, which is working fine and from which I'm exporting one route towards A and B.

On VM A I've tested different BIRD versions and all show the same behaviour. Before upgrading to 24.04 I ran the tests on 22.04, with the same results.

The route that VM C exports towards A and B is accepted by BIRD on both, but on A it doesn't end up in the kernel. On B there are no issues and everything is working as expected.

It seems that there is indeed a difference between 20.04 and 22.04 (and newer).

I'm a bit stuck here. For now I'd be fine with running 20.04 on all routers, but eventually it'd be nice to upgrade.
Besides, in your initial post, you pasted a few routes from BIRD that were using protocols "upstream_1v6" and "upstream_2v6". They seem to be missing from the bird.conf you posted. The route addition in the netlink dump is different from the routes you showed in BIRD, which makes it more difficult to pinpoint the problem.
Apologies for this. Yes, there are 2 more upstreams configured, but they are shut down so that it's easier to troubleshoot.
I think it's a good idea to focus on getting just one route exported from BIRD to the kernel successfully. If it's possible in your setup, perhaps just configure 1 upstream, and only import 1 route from that upstream in BIRD, and export the same route through the kernel protocol.
Best regards, Gerdriaan Mulder
Cheers, Robert