link-local IPv6 address in BGP.next_hop
Hello all!

I have an interesting case of a link-local IPv6 address in BGP.next_hop, and I would like to know your opinion because I cannot tell with 100% confidence whether it’s a bug or a feature. These link-local addresses cause interoperability issues between Bird and FRR. I have a separate discussion about that with the FRR folks; here I would like to know the Bird perspective. Details below.

On a single router with Bird 2.15 I have multiple IPv4 and IPv6 eBGP sessions, which receive prefixes from the Internet, and an IPv4 iBGP session, which forwards these prefixes to a BGP collector running FRR, a separate server somewhere in the Internet many hops away in a separate ASN. The session with the BGP collector uses both ipv4 and ipv6 channels to send both IPv4 and IPv6 prefixes. IPv6 prefixes received via eBGP have both a global IPv6 address and a link-local IPv6 address, as in the example below:

::/0 unicast [2600:1488:6080::8__r01.fra03.ien 2024-12-06] * (100) [AS3356i]
    via 2600:1488:6080::8 on ae2
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path: 3356
    BGP.next_hop: 2600:1488:6080::8 fe80::7a4f:9bff:fed1:2e0d
    BGP.med: 4294967294
    BGP.local_pref: 60
    BGP.community: (3356,2) (3356,501) (3356,601) (3356,2065) (20940,30403) (65502,3356)

However, prefixes forwarded via iBGP to the BGP collector also have both global and link-local addresses, as below:

::/0 unicast [2600:1488:6080::8__r01.fra03.ien 2024-12-06] * (100) [AS3356i]
    via 2600:1488:6080::8 on ae2
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path: 3356
    BGP.next_hop: 2600:1488:6080::8 fe80::7a4f:9bff:fed1:2e0d
    BGP.med: 4294967294
    BGP.local_pref: 60
    BGP.community: (3356,2) (3356,501) (3356,601) (3356,2065) (20940,30403) (65502,3356) (21357,600)

On one hand, as per RFC 4271, NEXT_HOP is not changed when a prefix is passed from eBGP to iBGP, so what we see above is expected. But on the other hand, as per RFC 2545, the link-local address must not be there because the two sides of the iBGP session don’t share the same IPv6 subnet:

“””
The link-local address shall be included in the Next Hop field if and
only if the BGP speaker shares a common subnet with the entity
identified by the global IPv6 address carried in the Network Address
of Next Hop field and the peer the route is being advertised to.
In all other cases a BGP speaker shall advertise to its peer in the
Network Address field only the global IPv6 address of the next hop
(the value of the Length of Network Address of Next Hop field shall
be set to 16).
“””

Who is right here? As far as I know, both documents are still current standards, and both are implemented by Bird. I don’t see any clear guidelines on how to make a judgement here. Personally, I would say that RFC 4271 should be treated as the general rule and RFC 2545 as the more specific rule, so in the end the link-local address should be removed. After all, link-local addresses do not make sense for multihop sessions. However, these documents don’t refer to each other, and I don’t know whether the authors of each were aware of the other’s statements. What do you think?

Anyway, when the BGP collector with FRR 8.5.2 receives a BGP UPDATE for a route like the one presented above, FRR rejects the UPDATE with the treat-as-withdraw approach but also triggers an additional error about an invalid prefix length for AFI 1, which finally causes a NOTIFICATION (UPDATE Message Error/Invalid Network Field), and the session goes down. I cannot rule out an implementation bug in the FRR version that I use, and I am already discussing it with the FRR folks.

A working workaround that I tested is to apply `next hop self` on the Bird side. Probably `bgp_next_hop = bgp_next_hop` in Bird’s export policy will also work, but I have yet to test it.

What do you think? Is it a bug or a feature?

Regards,
Grzegorz
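For readers following the RFC 2545 argument, the quoted rule can be written out as a small Python sketch. This is only an illustration of the rule's wording, not Bird's or FRR's actual logic. The function name, its parameters, the `/64` prefix length, the `192.0.2.10` collector address and the on-link peer `2600:1488:6080::9` are all assumptions made for the example; only the two next-hop addresses come from the route dump above.

```python
import ipaddress


def next_hop_field(global_nh, link_local, peer, connected_subnets):
    """Build the MP_REACH_NLRI Next Hop bytes following the RFC 2545 rule.

    All names here are hypothetical; connected_subnets stands in for
    whatever interface information a real BGP speaker would consult.
    """
    nh = ipaddress.IPv6Address(global_nh)
    peer_addr = ipaddress.ip_address(peer)
    subnets = [ipaddress.ip_network(s) for s in connected_subnets]

    # "Shares a common subnet": the next hop and the peer both sit in one
    # directly connected subnet. Mixed IPv4/IPv6 membership tests are False.
    shared = any(nh in net and peer_addr in net for net in subnets)

    field = nh.packed  # 16 bytes: global address only
    if shared and link_local is not None:
        ll = ipaddress.IPv6Address(link_local)
        if not ll.is_link_local:
            raise ValueError(f"{ll} is not a link-local address")
        field += ll.packed  # 32 bytes: global address + link-local address
    return field


GLOBAL_NH = "2600:1488:6080::8"
LINK_LOCAL = "fe80::7a4f:9bff:fed1:2e0d"
SUBNET = "2600:1488:6080::/64"  # assumed prefix length, not from the thread

# Multihop peer outside the subnet (documentation address): global only.
multihop = next_hop_field(GLOBAL_NH, LINK_LOCAL, "192.0.2.10", [SUBNET])
# Hypothetical peer on the same subnet as the next hop: both addresses.
on_link = next_hop_field(GLOBAL_NH, LINK_LOCAL, "2600:1488:6080::9", [SUBNET])
print(len(multihop), len(on_link))
```

Under this reading, the multihop session to the collector would carry a 16-byte next hop, which is what the `next hop self` workaround also achieves by a different route (it replaces the next hop rather than trimming it).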
Hi Grzegorz,

I don’t quite understand what you mean, but maybe someone else can. It seems like it announced a non-existent link-local address to another router?

Best,
Brandon Z.
HUIZE LTD
I see I missed one detail which can be confusing. The problem is with Bird sending the link-local address to the BGP speaker on the remote side; this link-local address doesn’t make sense for the remote side because the two don’t share a common subnet. Below is how it looks from the FRR perspective.

FRR: 2025/01/17 23:27:48 BGP: [PS8NX-WWXPH] 23.33.236.254 sent a v6 LL next-hop and there's no peer interface information. Hence, withdrawing
FRR: 2025/01/17 23:27:48 BGP: [RWQFK-BA2JR][EC 33554488] 23.33.236.254: Attribute MP_REACH_NLRI, parse error - treating as withdrawal
FRR: 2025/01/17 23:27:48 BGP: [QWG8G-NT6EJ][EC 33554455] 23.33.236.254(Unknown) rcvd UPDATE with errors in attr(s)!! Withdrawing route.
FRR: 2025/01/17 23:27:48 BGP: [XC334-3GAQ8][EC 33554455] 23.33.236.254 [Error] Update packet error (wrong prefix length 64 for afi 1)
FRR: 2025/01/17 23:27:48 BGP: [HJP7M-20X19][EC 33554455] 23.33.236.254 [Error] Error parsing NLRI
FRR: 2025/01/17 23:27:48 BGP: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 23.33.236.254 3/10 (UPDATE Message Error/Invalid Network Field) 0 bytes

Regards,
Grzegorz
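To make the first FRR log line more concrete, here is a rough Python sketch of how a receiver could pull the next hop out of an MP_REACH_NLRI attribute value and notice a link-local address in it. The field layout follows RFC 4760; everything else is an assumption for illustration. The function name is hypothetical, the sample bytes are reconstructed from the route dump rather than captured from the wire, and this is not FRR's parser, so it says nothing about the later "wrong prefix length 64 for afi 1" error.

```python
import ipaddress
import struct


def parse_mp_reach_next_hop(value):
    """Split the next-hop part out of an MP_REACH_NLRI attribute value.

    Layout per RFC 4760: AFI (2 bytes), SAFI (1 byte), next-hop length
    (1 byte), next hop, one reserved byte, then the NLRI.
    """
    afi, safi, nh_len = struct.unpack_from("!HBB", value, 0)
    next_hop = value[4:4 + nh_len]
    if len(next_hop) != nh_len:
        raise ValueError("truncated next hop")
    if afi != 2 or nh_len not in (16, 32):
        raise ValueError("sketch handles only IPv6 next hops of 16 or 32 bytes")
    global_nh = ipaddress.IPv6Address(next_hop[:16])
    link_local = ipaddress.IPv6Address(next_hop[16:]) if nh_len == 32 else None
    nlri = value[4 + nh_len + 1:]  # skip the reserved byte
    return global_nh, link_local, nlri


# Reconstructed attribute value for ::/0 with a 32-byte next hop (assumed,
# not a packet capture): AFI 2 (IPv6), SAFI 1 (unicast), length 32.
sample = (
    struct.pack("!HBB", 2, 1, 32)
    + ipaddress.IPv6Address("2600:1488:6080::8").packed
    + ipaddress.IPv6Address("fe80::7a4f:9bff:fed1:2e0d").packed
    + b"\x00"  # reserved
    + b"\x00"  # NLRI: prefix length 0, i.e. ::/0
)

global_nh, link_local, nlri = parse_mp_reach_next_hop(sample)
# On a multihop session the receiver has no interface to scope this to.
has_link_local = link_local is not None and link_local.is_link_local
print(global_nh, link_local, has_link_local)
```

A receiver on a multihop session has no interface to which the second address could be scoped, which matches the "no peer interface information" wording in the log.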
All,

this seems like a somewhat related topic. It would be worth implementing https://datatracker.ietf.org/doc/html/draft-white-linklocal-capability for Bird (FRRouting already has it), which might also be a step forward.

Thanks!
-- Donatas
draft-white-linklocal-capability describes a different scenario, where BGP on point-to-point links can use only link-local addresses in NEXT_HOP without the need for global addresses. This is a nice scenario for data centers. However, the scenario I describe involves multihop sessions across the Internet, so we don’t have point-to-point links and link-local addresses don’t make sense in NEXT_HOP which is highlighted in RFC 2545.

Regards,
Grzegorz
> link-local addresses don’t make sense in NEXT_HOP which is highlighted in RFC 2545.

Correct.
draft-white-linklocal-capability describes different scenario where BGP for point-to-point links can use only link-local addresses in NEXT_HOP without need for global addresses. This is nice scenario for data centers. However, scenario which I describe refers to multihop sessions across the Internet so we don’t have point-to-point links and link-local addresses don’t make sense in NEXT_HOP which is highlighted in RFC 2545.
Regards,
Grzegorz
*From: *Donatas Abraitis <donatas.abraitis@gmail.com> *Date: *Tuesday, 28 January 2025 at 14:38 *To: *"Ponikierski, Grzegorz" <gponikie@akamai.com> *Cc: *"Brandon Z." <Brandon@huize.asia>, bird-users <bird-users@network.cz
*Subject: *Re: link-local IPv6 address in BGP.next_hop
All, somehow related topic I see. . Would worth implementing https: //datatracker. ietf. org/doc/html/draft-white-linklocal-capability for Bird (FRRouting already has it), which might also be a step forward. Thanks! On Tue, Jan 28, 2025 at 3: 29
ZjQcmQRYFpfptBannerStart
*This Message Is From an Untrusted Sender *
You have not previously corresponded with this sender.
ZjQcmQRYFpfptBannerEnd
All,
somehow related topic I see.. Would worth implementing https://datatracker.ietf.org/doc/html/draft-white-linklocal-capability <https://urldefense.com/v3/__https:/datatracker.ietf.org/doc/html/draft-white-linklocal-capability__;!!GjvTz_vk!VZs3OqGsJcAkeO9NLgeZDp_mULn0oGTqNhX6xk59x0O19JLP4uHExzLrdDs2qtwGg4vy60dkYx7C96kmAp1QlM_D-g$> for Bird (FRRouting already has it), which might also be a step forward.
Thanks!
On Tue, Jan 28, 2025 at 3:29 PM Ponikierski, Grzegorz via Bird-users < bird-users@network.cz> wrote:
I see I missed one detail which can be confusing. Problem is with sending link-local address from Bird to BGP speaker on remote side and this link-local doesn’t make sense for remote side because they don’t share common subnet. Belove how it looks like from FRR perspective.
FRR: 2025/01/17 23:27:48 BGP: [PS8NX-WWXPH] 23.33.236.254 sent a v6 LL next-hop and there's no peer interface information. Hence, withdrawing
FRR: 2025/01/17 23:27:48 BGP: [RWQFK-BA2JR][EC 33554488] 23.33.236.254 <https://urldefense.com/v3/__http:/23.33.236.254__;!!GjvTz_vk!VZs3OqGsJcAkeO9NLgeZDp_mULn0oGTqNhX6xk59x0O19JLP4uHExzLrdDs2qtwGg4vy60dkYx7C96kmAp3uFzG3gg$>: Attribute MP_REACH_NLRI, parse error - treating as withdrawal
FRR: 2025/01/17 23:27:48 BGP: [QWG8G-NT6EJ][EC 33554455] 23.33.236.254(Unknown) rcvd UPDATE with errors in attr(s)!! Withdrawing route.
FRR: 2025/01/17 23:27:48 BGP: [XC334-3GAQ8][EC 33554455] 23.33.236.254 [Error] Update packet error (wrong prefix length 64 for afi 1)
FRR: 2025/01/17 23:27:48 BGP: [HJP7M-20X19][EC 33554455] 23.33.236.254 [Error] Error parsing NLRI
FRR: 2025/01/17 23:27:48 BGP: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 23.33.236.254 3/10 (UPDATE Message Error/Invalid Network Field) 0 bytes
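The first log line concerns the link-local half of the next hop. Per RFC 2545, the Next Hop field of an IPv6 MP_REACH_NLRI attribute is either 16 bytes (global address only) or 32 bytes (global address followed by link-local address). A minimal Python sketch of that split is shown here; the function name `split_mp_next_hop` is hypothetical, and this is not the parser used by FRR or Bird.

```python
import ipaddress


def split_mp_next_hop(raw: bytes):
    """Split an IPv6 next hop field into (global, link_local_or_None).

    Illustrative only: accepts the two lengths described in RFC 2545,
    16 bytes (global only) or 32 bytes (global + link-local).
    """
    if len(raw) == 16:
        return ipaddress.IPv6Address(raw), None
    if len(raw) == 32:
        return ipaddress.IPv6Address(raw[:16]), ipaddress.IPv6Address(raw[16:])
    raise ValueError(f"unexpected next hop length {len(raw)}")


# Next hop as seen in the route from this thread: global + link-local.
nh = (ipaddress.IPv6Address("2600:1488:6080::8").packed
      + ipaddress.IPv6Address("fe80::7a4f:9bff:fed1:2e0d").packed)
global_nh, link_local_nh = split_mp_next_hop(nh)
print(global_nh, link_local_nh)
```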
Regards,
Grzegorz
*From: *"Brandon Z." <Brandon@huize.asia> *Date: *Tuesday, 28 January 2025 at 01:58 *To: *"Ponikierski, Grzegorz" <gponikie@akamai.com> *Cc: *bird-users <bird-users@network.cz> *Subject: *Re: link-local IPv6 address in BGP.next_hop
Hi Grzegorz,
I don’t quite understand what you mean, but maybe someone else can. It seems like it announced a non-existent link-local address to another router?
Best,
*Brandon Z.*
HUIZE LTD
www.huize.asia | www.ixp.su | Twitter
On Tue, 28 Jan 2025 at 01:44, Ponikierski, Grzegorz via Bird-users < bird-users@network.cz> wrote:
Hello all!
I have an interesting case of a link-local IPv6 address in BGP.next_hop, and I would like to know your opinion because I cannot tell with 100% confidence whether it’s a bug or a feature. The presence of these link-local addresses causes interoperability issues between Bird and FRR. I have a separate discussion about that with the FRR folks. Here I would like to know the Bird perspective. Details below.
On a single router with Bird 2.15 I have multiple IPv4 and IPv6 eBGP sessions, which receive prefixes from the Internet, and an IPv4 iBGP session, which forwards these prefixes to a BGP collector running FRR, a separate server somewhere in the Internet many hops away in a separate ASN. The session with the BGP collector uses both the ipv4 and ipv6 channels to send both IPv4 and IPv6 prefixes. IPv6 prefixes received via eBGP have both a global IPv6 address and a link-local IPv6 address, as in the example below:
::/0 unicast [2600:1488:6080::8__r01.fra03.ien 2024-12-06] * (100) [AS3356i]
    via 2600:1488:6080::8 on ae2
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path: 3356
    BGP.next_hop: 2600:1488:6080::8 fe80::7a4f:9bff:fed1:2e0d
    BGP.med: 4294967294
    BGP.local_pref: 60
    BGP.community: (3356,2) (3356,501) (3356,601) (3356,2065) (20940,30403) (65502,3356)
However, prefixes forwarded via iBGP to BGP collector also have both global and link-local addresses like below:
::/0 unicast [2600:1488:6080::8__r01.fra03.ien 2024-12-06] * (100) [AS3356i]
    via 2600:1488:6080::8 on ae2
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path: 3356
    BGP.next_hop: 2600:1488:6080::8 fe80::7a4f:9bff:fed1:2e0d
    BGP.med: 4294967294
    BGP.local_pref: 60
    BGP.community: (3356,2) (3356,501) (3356,601) (3356,2065) (20940,30403) (65502,3356) (21357,600)
On one hand, as per RFC 4271, NEXT_HOP is not changed when a prefix is passed from eBGP to iBGP, so what we see above is expected. But on the other hand, as per RFC 2545, the link-local address must not be there because the two sides of the iBGP session don’t share the same IPv6 subnet:
“””
The link-local address shall be included in the Next Hop field if and
only if the BGP speaker shares a common subnet with the entity
identified by the global IPv6 address carried in the Network Address
of Next Hop field and the peer the route is being advertised to.
In all other cases a BGP speaker shall advertise to its peer in the
Network Address field only the global IPv6 address of the next hop
(the value of the Length of Network Address of Next Hop field shall
be set to 16).
“””
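The quoted rule can be read as a sender-side decision: append the link-local address only when the speaker, the global next hop and the peer are all on one subnet; otherwise send 16 bytes. The Python sketch below illustrates that reading only. The function name `next_hop_field`, the `/64` prefix and the peer addresses are assumptions made for the example, not values from this thread, and this is not how Bird implements it.

```python
import ipaddress


def next_hop_field(global_nh, link_local_nh, shared_subnet, peer_addr):
    """Build the next hop field following the quoted RFC 2545 rule.

    shared_subnet is the subnet the speaker is attached to (assumed known).
    The link-local address is appended only if both the global next hop
    and the peer address fall inside that subnet.
    """
    net = ipaddress.IPv6Network(shared_subnet)
    g = ipaddress.IPv6Address(global_nh)
    peer = ipaddress.IPv6Address(peer_addr)
    field = g.packed  # 16 bytes: global address only
    if link_local_nh is not None and g in net and peer in net:
        field += ipaddress.IPv6Address(link_local_nh).packed  # 32 bytes total
    return field


# Hypothetical subnet and peers, for illustration only.
subnet = "2600:1488:6080::/64"
on_link = next_hop_field("2600:1488:6080::8", "fe80::7a4f:9bff:fed1:2e0d",
                         subnet, "2600:1488:6080::9")
multihop = next_hop_field("2600:1488:6080::8", "fe80::7a4f:9bff:fed1:2e0d",
                          subnet, "2001:db8::1")
print(len(on_link), len(multihop))
```

Note that the check relies on the peer's session address, which for loopback-terminated iBGP sessions may say nothing about the subnets the peer is actually attached to.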
Who is right here? As far as I know, both documents are still current standards, and both are implemented by Bird. I don’t see any clear guidelines on how to judge this. Personally, I would say that RFC 4271 should be treated as the general rule and RFC 2545 as the more specific rule, so in the end the link-local address should be removed. After all, link-local addresses do not make sense for multihop sessions. However, these documents don’t refer to each other, and I don’t know whether their authors were aware of each other’s statements. What do you think?
Anyway, when the BGP collector with FRR 8.5.2 receives a BGP UPDATE for a route like the one presented above, FRR rejects the UPDATE with the treat-as-withdraw approach but also triggers an additional error about an invalid prefix length for AFI 1, which finally causes a NOTIFICATION (UPDATE Message Error/Invalid Network Field), and the session goes down. I cannot rule out an implementation bug in the FRR version that I use, and I am already discussing it with the FRR folks.
A workaround that I tested and that works is to apply `next hop self` on the Bird side. Probably `bgp_next_hop = bgp_next_hop` in Bird’s export policy will also work, but I have yet to test it.
What do you think? Is it a bug or a feature?
Regards,
Grzegorz
--
Donatas
On Mon, Jan 27, 2025 at 10:37:13PM +0000, Ponikierski, Grzegorz via Bird-users wrote:
On one hand, as per RFC 4271, NEXT_HOP is not changed when a prefix is passed from eBGP to iBGP, so what we see above is expected. But on the other hand, as per RFC 2545, the link-local address must not be there because the two sides of the iBGP session don’t share the same IPv6 subnet:
Who is right here? As far as I know, both documents are still current standards, and both are implemented by Bird. I don’t see any clear guidelines on how to judge this. Personally, I would say that RFC 4271 should be treated as the general rule and RFC 2545 as the more specific rule, so in the end the link-local address should be removed. After all, link-local addresses do not make sense for multihop sessions. However, these documents don’t refer to each other, and I don’t know whether their authors were aware of each other’s statements. What do you think?
Hello
BIRD prefers not to change the NEXT_HOP when a route is forwarded to IBGP. There are some reasons for this:
I think the condition in RFC 2545 makes sense for EBGP, but not for IBGP, as IBGP sessions are usually terminated on loopback addresses, so the BGP speaker cannot evaluate from the session endpoints whether the peer shares a common subnet with the next hop and the speaker.
The condition specifies 'if and only if', so not sending the link-local next hop to a peer that shares a common subnet is contrary to the condition in the same way as sending the link-local next hop to a peer that does not share a common subnet.
In practice, it is worse not to send the link-local next hop when it should be used than to send it when it should not. As the link-local next hop address is associated with the global next hop address, routers that do not share a common subnet would use the global next hop to resolve the next hop in the IGP routing table and ignore the link-local one; only the routers that share a common subnet would use the link-local address to construct the route gateway.
For example, let's assume we have routers R0, R1, and R2 on the same subnet, with R0 and R1 in the same AS and connected with IBGP on loopback addresses, while R0 and R2 have an EBGP session. Here, R1 should clearly receive the link-local next hop so that it can install the route to R2 with the link-local next hop in its routing table in the same manner as R0.
While a route reflector may be many hops away, not sharing the common subnet, and therefore clearly should not receive the link-local next hop according to the condition in RFC 2545, I think it is an oversight in RFC 2545 not to consider route reflectors, as the RR could send the route back to a router that shares a common subnet with the next hop.
Let's assume routers R0, R1, and R2 from the example above, but now, instead of an IBGP session between R0 and R1, they are connected through IBGP sessions to an RR several hops away. One could argue that R1 should get the same next hop for R2 as if it were connected directly to R0.
OTOH, it is a question whether a common subnet can be clearly identified from a global next hop address. I could imagine configurations where this is not true, but those would break even when the RFC 2545 condition is strictly adhered to, together with IBGP recursive next hop resolution.
--
Elen sila lumenn' omentielvo
Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
"To err is human -- to blame it on a computer is even more so."
Ondrej, does it mean that I can make a statement as below?
Statement: FRR should accept an update with both a global address and a link-local address in NEXT_HOP without any error and put it into Adj-RIB-In. If the link-local address is reachable (peer-to-peer link), then the link-local address should be used as the next hop in the RIB. Otherwise, the global address should be used. This logic can be reversed with the FRR route-map action “set ipv6 next-hop prefer-global”, which is the equivalent of the Bird channel option “next hop prefer global”.
Would you agree with such a statement? Or am I missing some nuances?
Regards,
Grzegorz
On Tue, Jan 28, 2025 at 06:17:10PM +0000, Ponikierski, Grzegorz wrote:
Ondrej, does it mean that I can make a statement as below?
Statement: FRR should accept an update with both a global address and a link-local address in NEXT_HOP without any error and put it into Adj-RIB-In. If the link-local address is reachable (peer-to-peer link), then the link-local address should be used as the next hop in the RIB. Otherwise, the global address should be used. This logic can be reversed with the FRR route-map action “set ipv6 next-hop prefer-global”, which is the equivalent of the Bird channel option “next hop prefer global”.
Would you agree with such a statement? Or am I missing some nuances?
The global next hop is used to resolve an IGP route, and only when it is resolved to a direct/interface route is it (or the link-local one) used as the next hop in the RIB; otherwise the next hop of the indirect route is used in the RIB. So I would formulate it this way:
FRR should accept an update with both a global address and a link-local address in NEXT_HOP without any error and put it into Adj-RIB-In. The global address should be used to resolve an IGP route. When it is resolved to a direct/interface route, the link-local address (or the global address [*]) should be used as the next hop in the RIB. When it is resolved to an indirect IGP route, the next hop from the IGP route should be used as the next hop in the RIB (and the link-local address in NEXT_HOP is ignored).
[*] The global address should be used as the next hop in the RIB when the link-local address is not available or when it is preferred with the FRR route-map action “set ipv6 next-hop prefer-global”, which is the equivalent of the Bird channel option “next hop prefer global”.
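The resolution order described above can be sketched in a few lines of Python. This only illustrates the stated logic and is not code from Bird or FRR; `IgpRoute`, `rib_next_hop` and the interface names are hypothetical, and the caller is assumed to have already looked up the IGP route that covers the global next hop.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IgpRoute:
    direct: bool                   # True for a direct/interface route
    iface: str                     # outgoing interface of the IGP route
    gateway: Optional[str] = None  # next hop of an indirect IGP route


def rib_next_hop(global_nh, link_local_nh, igp_route, prefer_global=False):
    """Return (gateway, interface) for the RIB, or None if unresolved.

    igp_route is the IGP route found by resolving global_nh.
    """
    if igp_route is None:
        return None  # global next hop is not resolvable
    if not igp_route.direct:
        # Indirect: use the IGP route's next hop, ignore the link-local one.
        return (igp_route.gateway, igp_route.iface)
    if link_local_nh is None or prefer_global:
        return (global_nh, igp_route.iface)
    return (link_local_nh, igp_route.iface)


g, ll = "2600:1488:6080::8", "fe80::7a4f:9bff:fed1:2e0d"
print(rib_next_hop(g, ll, IgpRoute(direct=True, iface="ae2")))
print(rib_next_hop(g, ll, IgpRoute(direct=False, iface="eth0", gateway="fe80::1")))
```

The first call models a router on the same subnet as the next hop; the second models a distant peer such as the collector, where the link-local address plays no role.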
I’m convinced by your statement. I will share it with my FRR friend and see if it is convincing to FRR as well. Thanks :)
Regards,
Grzegorz
participants (4)
- Brandon Z.
- Donatas Abraitis
- Ondrej Zajicek
- Ponikierski, Grzegorz