I tried to add a goto to the recursive section, but got segfaults, so that approach may require more complex changes. I also tried another way: just setting the dest to unreachable. It looks like it works as expected for me. Do you think that is a reasonable "solution", and could we make an option for it?

Attached is a patch with a proof of concept and my test setup (it creates a couple of namespaces, links them and runs birds) to show what I am talking about.

With vanilla bird I get these routes on bird b:

10.0.11.0/24 unicast [static1 18:56:55.705] * (200)
    via 192.168.0.1 on eth0
    Type: static univ
10.0.12.0/24 unreachable [static1 18:56:55.705] * (200)
    Type: static univ
10.0.1.0/24 unicast [bgp1 18:57:00.085] * (100) [i]
    via 192.168.0.1 on eth0
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path:
    BGP.next_hop: 192.168.0.1
    BGP.local_pref: 100

Vanilla bird again, but with "gateway recursive" enabled in the channel. Route 10.0.2.0/24 appeared, but route 10.0.11.0/24 became unreachable because of double recursion:

10.0.11.0/24 unreachable [static1 18:59:44.735] * (200)
    Type: static univ
10.0.12.0/24 unreachable [static1 18:59:44.735] * (200)
    Type: static univ
10.0.1.0/24 unicast [bgp1 18:59:48.587] * (100) [i]
    via 192.168.0.1 on eth0
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path:
    BGP.next_hop: 192.168.0.1
    BGP.local_pref: 100
10.0.2.0/24 unreachable [bgp1 18:59:48.587 from 192.168.0.1] * (100) [i]
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path:
    BGP.next_hop: 192.168.1.1
    BGP.local_pref: 100

Patched bird with "gateway direct". Route 10.0.11.0/24 is reachable again, and 10.0.2.0/24 is still present even with a "bad" next_hop:

10.0.11.0/24 unicast [static1 18:58:03.050] * (200)
    via 192.168.0.1 on eth0
    Type: static univ
10.0.12.0/24 unreachable [static1 18:58:03.050] * (200)
    Type: static univ
10.0.1.0/24 unicast [bgp1 18:58:07.524] * (100) [i]
    via 192.168.0.1 on eth0
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path:
    BGP.next_hop: 192.168.0.1
    BGP.local_pref: 100
10.0.2.0/24 unreachable [bgp1 18:58:07.524 from 192.168.0.1] * (100) [i]
    Type: BGP univ
    BGP.origin: IGP
    BGP.as_path:
    BGP.next_hop: 192.168.1.1
    BGP.local_pref: 100

On Thu, Oct 1, 2020 at 11:19 AM Alexander Zubkov <green@qrator.net> wrote:
Hello,
We have some routes that are passed through transit bird BGP daemons without being installed in the kernel. For some of these routes we need to keep the next hop attribute, because it is meaningful on the destination host, even though it may be unreachable on the transit hosts. If the sessions with these transit hosts are "gateway recursive", such routes are accepted; they are marked unreachable, but we can still work with them. If we make the sessions "gateway direct", such routes are ignored by bird, with "Invalid NEXT_HOP attribute" messages in the log. It looks like this is caused by the following piece of code in proto/bgp/packets.c, in the function bgp_apply_next_hop():
if (!nbr || (nbr->scope == SCOPE_HOST)) WITHDRAW(BAD_NEXT_HOP);
That seems reasonable, of course, but it breaks things in our case, so I want to suggest some changes to make our setup possible. At the moment I am considering adding something like "gateway <direct_with_recursive_fallback>", which would behave like "gateway direct" but switch to the recursive logic when the neighbour is not reachable. What do you think? Is this a good idea, or can you suggest a better alternative?
We have a couple of reasons for wanting "gateway direct". First, we want to avoid double recursion, which is not allowed: we might have other recursive routes that resolve through the direct routes received in this session. We could set up separate sessions for direct and transit routes, but I'm not sure the whole thing would be simpler. Second, there is an issue with next-hop resolution when the same IPs are present on different interfaces. As I understand it, the bird daemon currently does not support that case.
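
To make the options discussed in this thread easier to compare, here is a minimal standalone sketch of the decision taken when a "gateway direct" session receives a next hop with no usable neighbour. It is only a model, not BIRD code: the types and names (nh_policy, nh_result, neighbor_model, resolve_direct) are hypothetical, and the two booleans stand in for the `!nbr` and `nbr->scope == SCOPE_HOST` checks quoted above.

```c
#include <stdbool.h>

/* Hypothetical model of the behaviours discussed in this thread.
 * None of these names exist in BIRD. */
enum nh_policy {
  NH_STRICT,             /* current behaviour: withdraw the route */
  NH_FALLBACK_RECURSIVE, /* proposed: fall back to recursive resolution */
  NH_KEEP_UNREACHABLE    /* proof of concept: keep route, mark unreachable */
};

enum nh_result { NH_DIRECT, NH_RECURSIVE, NH_UNREACHABLE, NH_WITHDRAW };

/* Stand-in for the neighbour lookup result. */
struct neighbor_model {
  bool found;      /* false models "!nbr" */
  bool scope_host; /* true models "nbr->scope == SCOPE_HOST" */
};

/* Decide what happens to a route received on a "gateway direct" session. */
enum nh_result resolve_direct(struct neighbor_model nbr, enum nh_policy policy)
{
  /* Usable direct next hop: a neighbour exists and is not a local address. */
  if (nbr.found && !nbr.scope_host)
    return NH_DIRECT;

  switch (policy) {
  case NH_FALLBACK_RECURSIVE:
    return NH_RECURSIVE;
  case NH_KEEP_UNREACHABLE:
    return NH_UNREACHABLE;
  case NH_STRICT:
  default:
    return NH_WITHDRAW;
  }
}
```

In this model only the policy changes the outcome for a next hop without a neighbour; a next hop with a usable neighbour is resolved directly under all three policies.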