On Thu, Feb 06, 2020 at 12:34:00PM +0000, Neil Jerram wrote:
> Good morning all!
>
> I'm debugging a situation where I'm seeing different IPv6 next hop
> behaviour in two setups with different versions of my team's software.
>
> In both setups:
> There are 3 routers A, B and C, all peered with another router X.
> They are all on the same L2 bridge, and have global IPv6 addresses in the
> 2001:20::/64 subnet.
> A, B and C all export a route for fd00:10:96::/112
> ...
> Any ideas? Can you advise where I should look or check next, to try to
> understand why the UPDATE message has two next hop addresses in one setup,
> but only one in the other?
Hi
Check the code in the IPv6 version of bgp_create_update(). It depends on
how the sender got the routes (local or received; whether they were
already received with a link-local next hop; whether the next hop was
modified), whether the session is IBGP or EBGP, and whether the next hop
is the same as the sender.
> Also, does the passing of two next hop addresses in setup #1 fully explain
> why the ECMP routes programmed into the kernel use link-local gateway
> addresses?
Yes, the link-local next hop is preferred as the direct gateway.
> Also, are the routes with global next hops more correct in some sense than
> those with link-local next hops; or vice versa? Would you expect them both
> to forward data correctly?
Well, it is a somewhat strange quirk of IPv6 BGP. In general, both global
and link-local next hops should be sent when the sender, the receiver and
the global next hop are on the same subnet. The global next hop is used
for recursive next hop evaluation, while the link-local one is used for
forwarding.
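
To make the two-next-hop case concrete, here is a minimal Python sketch of how a receiver could interpret the next hop field of an IPv6 UPDATE. It is not BIRD's code: the function names split_next_hop and forwarding_gateway are hypothetical, and the addresses in the usage lines are made-up examples. It assumes only the RFC 2545 encoding, where the field is 16 bytes (global address only) or 32 bytes (global address followed by link-local address), and the preference described above.

```python
import ipaddress


def split_next_hop(field: bytes):
    """Split an IPv6 next hop field into (global, link_local).

    A 16-byte field carries only the global address; a 32-byte field
    carries the global address followed by the link-local address.
    """
    if len(field) not in (16, 32):
        raise ValueError(f"unexpected next hop length: {len(field)}")
    global_nh = ipaddress.IPv6Address(field[:16])
    link_local = ipaddress.IPv6Address(field[16:]) if len(field) == 32 else None
    return global_nh, link_local


def forwarding_gateway(global_nh, link_local):
    """Prefer the link-local address as the direct gateway when present."""
    return link_local if link_local is not None else global_nh


# Hypothetical addresses, for illustration only.
field = (ipaddress.IPv6Address("2001:20::1").packed
         + ipaddress.IPv6Address("fe80::1").packed)
global_nh, link_local = split_next_hop(field)
print(forwarding_gateway(global_nh, link_local))
```

With a 16-byte field the same functions fall back to the global address, which would correspond to the setup where the kernel routes show global gateways.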
Thank you very much Ondrej for all this. I will work through understanding and checking the details that you have provided.
Best wishes,
Neil
Thanks again Ondrej, I found the root cause here, with your help. In both of my setups the peers were in fact directly connected, but one setup was configured with "direct;" and the other with "multihop;".
Best wishes,
Neil