IPv6, route reflectors and link-local nexthops
Hi list,

maybe this question boils down to "How can I tell bird to always use global IPv6 addresses as nexthops". But let me give you a sketch of my problem:

    RR ---- R1 ---- R2

Addresses:

    RR  2001:db8:1::100, fe80:1::100%I1
    R1  2001:db8:1::1,   fe80:1::1%I1
        2001:db8:2::1,   fe80:2::1%I2
    R2  2001:db8:2::2,   fe80:2::2%I2

RR is a route reflector with
 * a direct iBGP session to R1 / 2001:db8:1::1, and
 * a multihop eBGP session to R2 / 2001:db8:2::2.

As RR isn't supposed to forward any traffic, it has "missing lladdr ignore" set on the iBGP session. R1 and R2 don't talk to each other.

Let's suppose R2 announces a route for 2001:db8:3::/64. RR sees this route:

    bird> show route all 2001:db8:3::/64
    2001:db8:3::/64      unicast [R2 2019-10-10 from 2001:db8:2::2] * (100/100) [AS4242424242?]
            via fe80:1::1 on I1
            Type: BGP univ
            BGP.origin: Incomplete
            BGP.as_path: 4242424242
            BGP.next_hop: 2001:db8:2::2
            BGP.local_pref: 100
The nexthop is correctly set to the link-local address of R1 that RR can reach. Notice the missing link-local BGP.next_hop, however. Now RR exports this route to R1, and this is where things go wrong:
    bird> show route all 2001:db8:3::/64
    2001:db8:3::/64      unicast [RR 2019-10-10 from 2001:db8:1::100] * (100/0) [AS4242424242?]
            via fe80:1::100 on I2
            Type: BGP univ
            BGP.origin: Incomplete
            BGP.as_path: 4242424242
            BGP.next_hop: 2001:db8:2::2
            BGP.local_pref: 100
R1 correctly identifies the BGP.next_hop as being on-link for I2, but uses the link-local address of RR on the interface to R2 as nexthop.

(Actually, it's even worse: R1 doesn't really use fe80:1::100 as nexthop, but the link-local address of the peer that was formerly in the place of RR. That address isn't even present anymore on any server anywhere!)

I understand that R1 has no way of knowing the link-local address of R2, because it was never part of the BGP.next_hop. But the link-local address of RR was never part of BGP.next_hop, either!

So, questions:
 * Why does bird use link-local nexthops at all, when all neighbors are configured using global addresses?
 * How is it possible that bird uses the link-local address of RR decoupled from its interface?
 * Is there anything I can do to make R1 use the global address of R2 (which is the only address in BGP.next_hop) as the nexthop?

Best regards,
Jan-Philipp Litza

--
Jan-Philipp Litza
PLUTEX GmbH
Hermann-Ritter-Str. 108
28197 Bremen
Hotline: 0800 100 400 800
Telefon: 0800 100 400 821
Telefax: 0800 100 400 888
E-Mail: support@plutex.de
Internet: http://www.plutex.de
USt-IdNr.: DE 815030856
Handelsregister: Amtsgericht Bremen, HRB 25144
Geschäftsführer: Torben Belz, Hendrik Lilienthal
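Editorial aside: the complaint above is that a link-local gateway shows up paired with the wrong interface. A link-local address is only meaningful together with the link it lives on, so "via fe80:1::100 on I2" cannot be right when that address was only ever known on I1. The following is a minimal Python sketch of that consistency check, not anything BIRD does internally. The address-to-interface table is copied from the sketch in the message; the function name `check_nexthop` and its return strings are made up for illustration.

```python
import ipaddress

# Link-local addresses and the interface they were listed on in the message
# above (the %I1 / %I2 zone suffixes). Keys are parsed address objects so the
# lookup does not depend on textual formatting.
LINK_LOCAL_ZONES = {
    ipaddress.IPv6Address("fe80:1::100"): "I1",  # RR
    ipaddress.IPv6Address("fe80:1::1"): "I1",    # R1
    ipaddress.IPv6Address("fe80:2::1"): "I2",    # R1
    ipaddress.IPv6Address("fe80:2::2"): "I2",    # R2
}


def check_nexthop(gateway: str, iface: str) -> str:
    """Hypothetical helper: sanity-check a 'via <gateway> on <iface>' pair."""
    addr = ipaddress.IPv6Address(gateway)
    if not addr.is_link_local:
        # Non-link-local gateways are not tied to a single link.
        return "ok"
    known_iface = LINK_LOCAL_ZONES.get(addr)
    if known_iface is None:
        return "unknown"
    return "ok" if known_iface == iface else "mismatch"


# The route as seen on RR, and the problematic route as seen on R1.
print(check_nexthop("fe80:1::1", "I1"))
print(check_nexthop("fe80:1::100", "I2"))
```

The second call reproduces the pairing reported on R1, where the gateway belongs to the I1 link but is installed on I2.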
On Thu, Oct 10, 2019 at 04:57:28PM +0200, Jan-Philipp Litza wrote:
> Hi list,
>
> maybe this question boils down to "How can I tell bird to always use global IPv6 addresses as nexthops". But let me give you a sketch of my problem:
>
>     RR ---- R1 ---- R2
>
> Addresses:
>
>     RR  2001:db8:1::100, fe80:1::100%I1
>     R1  2001:db8:1::1,   fe80:1::1%I1
>         2001:db8:2::1,   fe80:2::1%I2
>     R2  2001:db8:2::2,   fe80:2::2%I2
>
> RR is a route reflector with
>  * a direct iBGP session to R1 / 2001:db8:1::1, and
>  * a multihop eBGP session to R2 / 2001:db8:2::2.
>
> As RR isn't supposed to forward any traffic, it has "missing lladdr ignore" set on the iBGP session. R1 and R2 don't talk to each other.
>
> Now RR exports this route to R1, and this is where things go wrong:
>
>     bird> show route all 2001:db8:3::/64
>     2001:db8:3::/64      unicast [RR 2019-10-10 from 2001:db8:1::100] * (100/0) [AS4242424242?]
>             via fe80:1::100 on I2
>             Type: BGP univ
>             BGP.origin: Incomplete
>             BGP.as_path: 4242424242
>             BGP.next_hop: 2001:db8:2::2
>             BGP.local_pref: 100
>
> R1 correctly identifies the BGP.next_hop as being on-link for I2, but uses the link-local address of RR on the interface to R2 as nexthop.
>
> (Actually, it's even worse: R1 doesn't really use fe80:1::100 as nexthop, but the link-local address of the peer that was formerly in the place of RR. That address isn't even present anymore on any server anywhere!)
Hi

Not really sure how that might happen with direct session. What is your BIRD version and configs? What routes do you have in routing table?

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
Hi Ondrej,
> Not really sure how that might happen with direct session. What is your BIRD version and configs? What routes do you have in routing table?

I'm using a self-compiled bird 2.0.6. RR and R1 have full views (plus some internals), so I'm not sure what to answer to "what routes".

The complete configs are a bit complicated (lots of templates and dozens of other BGP protocols), but the essentials (only the relevant IPv6 protocols and filters) are pretty standard, I'd say.

RR's config:

    protocol device {
    }

    protocol bgp R1 {
        local as my_asn;
        neighbor as my_asn;
        direct;
        graceful restart yes;
        rr client;
        ipv6 {
            import keep filtered;
            gateway recursive;
            missing lladdr ignore;
            import none;
            export filter {
                if source != RTS_BGP then reject;
                if net_local() then reject;
                if net_martian() then reject;
                accept;
            };
        };
        neighbor 2001:db8:1::1;
    }

    protocol bgp R2 {
        local as my_asn;
        graceful restart yes;
        neighbor 2001:db8:2::2 as 4242424242;
        multihop 2;
        ipv6 {
            import keep filtered;
            import filter {
                bgp_community.add((65535, 65281));
                if net ~ [ 2001:db8::/32{64,128} ] then accept;
                reject;
            };
            export filter {
                if net.len == 0 then accept;
                reject;
            };
        };
    }

R1's config:

    protocol device {
    }

    protocol bgp RR {
        local as my_asn;
        neighbor as my_asn;
        direct;
        graceful restart yes;
        ipv6 {
            import keep filtered yes;
            next hop self;
            import all;
            export filter {
                if source = RTS_BGP then accept;
                reject;
            };
            gateway recursive;
        };
        neighbor 2001:db8:1::100;
    }

If this distillation is missing something, let me know; I can send you the complete configs privately. But I doubt that there's anything relevant in there.

Thanks for spending time on this!

Jan-Philipp Litza
On Fri, Oct 11, 2019 at 09:22:36AM +0200, Jan-Philipp Litza wrote:
> Hi Ondrej,
>
> > Not really sure how that might happen with direct session. What is your BIRD version and configs? What routes do you have in routing table?
>
> I'm using a self-compiled bird 2.0.6. RR and R1 have full-views (plus some internals), so I'm not sure what to answer to "what routes". The complete configs are a bit complicated (lots of templates and dozens of other BGP protocols), but the essentials (only the relevant IPv6 protocols and filters) are pretty standard I'd say:

Hi

Important info was 'gateway recursive' option on direct BGP sessions, so all three BGP sessions generate recursive routes.

> so I'm not sure what to answer to "what routes"

By that I meant the non-BGP routes that are used to resolve BGP next hops. Mainly 'show route for 2001:db8:2::2 all', to get the route for the 2001:db8:2::2 next hop from your first example.

--
Ondrej 'Santiago' Zajicek
> Important info was 'gateway recursive' option on direct BGP sessions, so all three BGP sessions generate recursive routes.

There are only two sessions involved. Or do you mean sessions as in "protocol configurations"?

And my understanding was that this setup cannot work without "gateway recursive". Can't read from your comment whether this is correct or not.

> > so I'm not sure what to answer to "what routes"
>
> For that i meant non-BGP routes that are used to resolve BGP next hops.
> Mainly 'show route for 2001:db8:2::2 all' to get route for 2001:db8:2::2 next hop from your first example.

Ah, sure:

    2001:db8:2::/64      unicast [direct1 2019-09-24] * (240)
            dev I2
            Type: device univ

--
Jan-Philipp Litza
On Fri, Oct 11, 2019 at 02:28:56PM +0200, Jan-Philipp Litza wrote:
> > Important info was 'gateway recursive' option on direct BGP sessions, so all three BGP sessions generate recursive routes.
>
> There are only two sessions involved. Or do you mean sessions as in "protocol configurations"?

Two, you are right.

> And my understanding was that this setup cannot work without "gateway recursive". Can't read from your comment whether this is correct or not.
>
> > > so I'm not sure what to answer to "what routes"
> >
> > For that i meant non-BGP routes that are used to resolve BGP next hops.
> > Mainly 'show route for 2001:db8:2::2 all' to get route for 2001:db8:2::2 next hop from your first example.
>
> Ah, sure:
>
>     2001:db8:2::/64      unicast [direct1 2019-09-24] * (240)
>             dev I2
>             Type: device univ

Hmm, I cannot imagine how you could end up with gateway fe80:1::100. In this setup it should be 2001:db8:2::2.

Don't you have e.g. a route for 2001:db8:2::2/128 with that gateway? Or any other route with gateway fe80:1::100?

Do you get the same result even if you disable and enable the RR-R1 session?

--
Ondrej 'Santiago' Zajicek
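Editorial aside: the expectation stated above (gateway 2001:db8:2::2, because the BGP next hop is covered only by a device route on I2) can be illustrated with a small model of recursive next-hop resolution. This is a simplified Python sketch under stated assumptions, not BIRD's actual implementation: it does a longest-prefix match and treats a route without a gateway as on-link. The `Route` class, the `resolve` function and the /128 "stale" route are hypothetical; the /128 only models the scenario Ondrej asks about and was not present in the reported table.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Route:
    prefix: ipaddress.IPv6Network
    iface: str
    # None models a device (on-link) route such as "dev I2".
    gateway: Optional[ipaddress.IPv6Address] = None


def resolve(bgp_next_hop: str,
            table: List[Route]) -> Optional[Tuple[ipaddress.IPv6Address, str]]:
    """Resolve a BGP next hop to (gateway, interface), or None if unreachable."""
    nh = ipaddress.IPv6Address(bgp_next_hop)
    candidates = [r for r in table if nh in r.prefix]
    if not candidates:
        return None
    # Longest-prefix match picks the most specific covering route.
    best = max(candidates, key=lambda r: r.prefix.prefixlen)
    # On a device route the next hop is directly reachable, so it is
    # its own gateway; otherwise inherit the gateway of the covering route.
    gateway = best.gateway if best.gateway is not None else nh
    return gateway, best.iface


# The device route reported in the thread.
DEVICE_ROUTE = Route(ipaddress.IPv6Network("2001:db8:2::/64"), "I2")
# Hypothetical more specific route, only to model Ondrej's question.
STALE_HOST_ROUTE = Route(ipaddress.IPv6Network("2001:db8:2::2/128"), "I2",
                         ipaddress.IPv6Address("fe80:1::100"))

expected = resolve("2001:db8:2::2", [DEVICE_ROUTE])
hypothetical = resolve("2001:db8:2::2", [DEVICE_ROUTE, STALE_HOST_ROUTE])
print(expected)
print(hypothetical)
```

With only the device route, the model yields 2001:db8:2::2 on I2. A more specific route carrying the old gateway would explain fe80:1::100, which is why the question about a /128 matters.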
> > > Mainly 'show route for 2001:db8:2::2 all' to get route for 2001:db8:2::2 next hop from your first example.
> >
> > Ah, sure:
> >
> >     2001:db8:2::/64      unicast [direct1 2019-09-24] * (240)
> >             dev I2
> >             Type: device univ
>
> Hmm, i cannot imagine how you could end with gateway fe80:1::100. In this setup it should be 2001:db8:2::2.

Me neither. :-D

> Don't you have e.g. a route for 2001:db8:2::2/128 with that gateway? Or any other route with gateway fe80:1::100?

The output in my last mail was for "show route for 2001:db8:2::2", so nope, nothing more specific than the /64 device route. Also, "show route where gw = fe80:1::100" yields exactly those routes that stem from R2 via RR.

> Do you get the same result even if you disable and enable RR-R1 session?

I assume this is equivalent to a "restart"? That didn't change anything. I even restarted bird on RR. Note, though, that RR is a redundant system, so there is always another peer RR' that has the exact same route.

Maybe the emphasis should be on the fact that fe80:1::100 isn't even the current link-local address of RR, but that of a former peer in that place. So neither RR nor its redundant partner RR' actually has fe80:1::100 as an address on any of its interfaces. And I checked via tcpdump: fe80:1::100 isn't contained in the BGP packets sent from RR (or RR').

So this address has to originate from somewhere inside bird, and it has to be cached, because it isn't even configured anywhere anymore.

My next (and only) idea would be to somehow inspect a coredump of bird to find where this address is stored. But I have no idea yet how that could work out.

--
Jan-Philipp Litza
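Editorial aside: the tcpdump check described above amounts to asking whether the 16 raw bytes of an IPv6 address occur anywhere in the captured payload. The following is a rough Python sketch of that idea, assuming the payload bytes have already been extracted from a capture by some other means. The function name `contains_address` and the sample bytes are made up for illustration; the sample is not a valid BGP message.

```python
import ipaddress


def contains_address(payload: bytes, addr: str) -> bool:
    """Return True if the 16-byte binary form of addr occurs in payload."""
    return ipaddress.IPv6Address(addr).packed in payload


# Made-up payload: a few filler bytes around the binary form of the
# global next hop 2001:db8:2::2.
sample = (bytes([0x02, 0x01, 0x10])
          + ipaddress.IPv6Address("2001:db8:2::2").packed
          + b"\x00")

print(contains_address(sample, "2001:db8:2::2"))
print(contains_address(sample, "fe80:1::100"))
```

A plain byte search like this can in principle match unrelated data, so a hit would still need to be confirmed against the decoded message; a miss, as reported in the thread, is the more conclusive result.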
On Mon, Oct 14, 2019 at 11:58:36AM +0200, Jan-Philipp Litza wrote:
> I assume this is equivalent to a "restart"? That didn't change anything. I even restarted bird on RR. Note, though, that RR is a redundant system, so there is always another peer RR' that has the exact same route.

And if you restart R1?

> Maybe the emphasis should be on the fact the fe80:1::100 isn't even the current link-local address of RR, but that of a former peer in that place. So neither RR nor its redundant partner RR' actually have fe80:1::100 as an address on any of its interfaces. And I checked via tcpdump, fe80:1::100 isn't contained in the BGP packets sent from RR (or RR').
>
> So this address has to originate from somewhere inside bird, and it has to be cached because it isn't even configured anywhere anymore.
>
> My next (and only) idea would be to somehow inspect a coredump of bird where this address is stored. But I have no idea yet how that could work out.

Could you try the current git master branch on R1? It has several fixes related to recursive routes. But it would be a good idea to first try just restarting R1 to see if the result is related to the code changes, or just the restart.

--
Ondrej 'Santiago' Zajicek
> Could you try the current git master branch on R1? It has several fixes related to recursive routes. But it would be a good idea to first try just restarting R1 to see if the result is related to the code changes, or just the restart.

If the master you were talking about is now part of 2.0.7, then it probably fixed the issue.

I didn't restart bird (this is a production system with much other traffic and routes) or even upgrade it yet. But I set up another, mostly identical machine with bird 2.0.7 and moved the routing for that particular interface to that machine. And everything works.

So thanks for the effort in fixing this (even if it wasn't caused by my mail)!

Best regards,
Jan-Philipp Litza
Participants (2): Jan-Philipp Litza, Ondrej Zajicek