[PATCH] Babel: add RFC9229 (v4 via v6) support
From: Andreas Rammhold <andreas@rammhold.de> This implements [RFC9229] an IPv4 via IPv6 extension to the Babel routing protocol that allows annoncing routes to an IPv4 prefix with an IPv6 next-hop, which makes it possible for IPv4 traffic to flow through interfaces that have not been assigned an IPv4 address. The implementation is compatible with the current Babeld version (the relevant changes can be seen in the [babeld PR]). I've verified this with a few VMs in the following setup: Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld Each routing daemon was running on their own VM and had L2 connectivity to only its direct neighbors. At the nodes at the edges v4 networks have been announced and full end-to-end communication was possible over the mixed AF network. The v6 only link between Babel and Bird (at the "center" of the above setup) did transport the v4 packets via the v6 link-local next hop addresses just as expected. Thanks to Toke Høiland-Jørgensen for early review on this work. [RFC9229]: https://www.ietf.org/rfc/rfc9229.txt [babeld PR]: https://github.com/jech/babeld/pull/56 Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> --- This is a rebase of the patches from the previous two RFC rounds on this ML: * https://trubka.network.cz/pipermail/bird-users/2022-April/016046.html <20201215020627.30942-1-andreas@rammhold.de> * http://trubka.network.cz/pipermail/bird-users/2020-December/015082.html <20201215020627.30942-1-andreas@rammhold.de> I'd like to drive this to the finish goal soon. I've been using it ever since and know of other users as well. Is there anything that should still be changed? doc/bird.sgml | 6 +++ proto/babel/babel.c | 28 ++++++++--- proto/babel/babel.h | 3 ++ proto/babel/config.Y | 5 +- proto/babel/packets.c | 106 +++++++++++++++++++++++++++++------------- 5 files changed, 109 insertions(+), 39 deletions(-) diff --git a/doc/bird.sgml b/doc/bird.sgml index 50657ebf..a7cedaee 100644 --- a/doc/bird.sgml +++ b/doc/bird.sgml @@ -1934,6 +1934,7 @@ protocol babel [<name>] { algorithm ( hmac sha1 | hmac sha256 | hmac sha384 | hmac sha512 | blake2s128 | blake2s256 | blake2b256 | blake2b512 ); }; + ipv4 via ipv6 <switch>; }; } </code> @@ -2042,6 +2043,11 @@ protocol babel [<name>] { protocol will only accept HMAC-based algorithms or one of the Blake algorithms, and the length of the supplied password string must match the key size used by the selected algorithm. + + <tag><label id="babel-ipv4-via-ipv6">ipv4 via ipv6 <m/switch/</tag> + If enabled, Bird will accept and emit IPv4-via-IPv6 routes when IPv4 + addresses are absent from the interface as described in RFC9229. + Default: yes. </descrip> <sect1>Attributes diff --git a/proto/babel/babel.c b/proto/babel/babel.c index ecfd2763..52a27256 100644 --- a/proto/babel/babel.c +++ b/proto/babel/babel.c @@ -981,8 +981,17 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable) msg.update.router_id = e->router_id; net_copy(&msg.update.net, e->n.addr); - msg.update.next_hop = ((e->n.addr->type == NET_IP4) ? - ifa->next_hop_ip4 : ifa->next_hop_ip6); + if (e->n.addr->type == NET_IP4) { + /* Always prefer v4 nexthop if set */ + if(!ipa_zero(ifa->next_hop_ip4)) + msg.update.next_hop = ifa->next_hop_ip4; + + /* Only send v4-via-v6 if enabled */ + else if (ifa->cf->ip4_via_ip6) + msg.update.next_hop = ifa->next_hop_ip6; + } else { + msg.update.next_hop = ifa->next_hop_ip6; + } /* Do not send route if next hop is unknown, e.g. no configured IPv4 address */ if (ipa_zero(msg.update.next_hop)) @@ -1277,6 +1286,12 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa) return; } + /* Reject IPv4 via IPv6 routes if disabled */ + if (!ifa->cf->ip4_via_ip6 && msg->net.type == NET_IP4 && ipa_is_ip6(msg->next_hop)) { + DBG("Babel: Ignoring disabled IPv4 via IPv6 route.\n"); + return; + } + /* Regular update */ e = babel_get_entry(p, &msg->net); r = babel_get_route(p, e, nbr); /* the route entry indexed by neighbour */ @@ -1682,7 +1697,7 @@ babel_iface_update_addr4(struct babel_iface *ifa) ip_addr addr4 = ifa->iface->addr4 ? ifa->iface->addr4->ip : IPA_NONE; ifa->next_hop_ip4 = ipa_nonzero(ifa->cf->next_hop_ip4) ? ifa->cf->next_hop_ip4 : addr4; - if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel) + if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel && !ifa->cf->ip4_via_ip6) log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); if (ifa->up) @@ -2074,8 +2089,8 @@ babel_show_interfaces(struct proto *P, const char *iff) } cli_msg(-1023, "%s:", p->p.name); - cli_msg(-1023, "%-10s %-6s %-5s %7s %6s %7s %-15s %s", - "Interface", "State", "Auth", "RX cost", "Nbrs", "Timer", + cli_msg(-1023, "%-10s %-6s %-5s %-12s %7s %6s %7s %-15s %s", + "Interface", "State", "Auth", "IPv4 via Ip6", "RX cost", "Nbrs", "Timer", "Next hop (v4)", "Next hop (v6)"); WALK_LIST(ifa, p->interfaces) @@ -2088,10 +2103,11 @@ babel_show_interfaces(struct proto *P, const char *iff) nbrs++; btime timer = MIN(ifa->next_regular, ifa->next_hello) - current_time(); - cli_msg(-1023, "%-10s %-6s %-5s %7u %6u %7t %-15I %I", + cli_msg(-1023, "%-10s %-6s %-5s %-12s %7u %6u %7t %-15I %I", ifa->iface->name, (ifa->up ? "Up" : "Down"), (ifa->cf->auth_type == BABEL_AUTH_MAC ? (ifa->cf->auth_permissive ? "Perm" : "Yes") : "No"), + (ifa->cf->ip4_via_ip6 ? "Yes" : "No"), ifa->cf->rxcost, nbrs, MAX(timer, 0), ifa->next_hop_ip4, ifa->next_hop_ip6); } diff --git a/proto/babel/babel.h b/proto/babel/babel.h index 8b6da3c8..034a17db 100644 --- a/proto/babel/babel.h +++ b/proto/babel/babel.h @@ -112,6 +112,7 @@ enum babel_ae_type { BABEL_AE_IP4 = 1, BABEL_AE_IP6 = 2, BABEL_AE_IP6_LL = 3, + BABEL_AE_IPV4_VIA_IPV6 = 4, BABEL_AE_MAX }; @@ -151,6 +152,8 @@ struct babel_iface_config { uint mac_num_keys; /* Number of configured HMAC keys */ uint mac_total_len; /* Total digest length for all configured keys */ list *passwords; /* Passwords for authentication */ + + u8 ip4_via_ip6; /* Enable IPv4 via IPv6 */ }; struct babel_proto { diff --git a/proto/babel/config.Y b/proto/babel/config.Y index 05210fa4..7190f099 100644 --- a/proto/babel/config.Y +++ b/proto/babel/config.Y @@ -25,7 +25,8 @@ CF_DECLS CF_KEYWORDS(BABEL, INTERFACE, METRIC, RXCOST, HELLO, UPDATE, INTERVAL, PORT, TYPE, WIRED, WIRELESS, RX, TX, BUFFER, PRIORITY, LENGTH, CHECK, LINK, NEXT, HOP, IPV4, IPV6, BABEL_METRIC, SHOW, INTERFACES, NEIGHBORS, - ENTRIES, RANDOMIZE, ROUTER, ID, AUTHENTICATION, NONE, MAC, PERMISSIVE) + ENTRIES, RANDOMIZE, ROUTER, ID, AUTHENTICATION, NONE, MAC, PERMISSIVE, + VIA) CF_GRAMMAR @@ -67,6 +68,7 @@ babel_iface_start: BABEL_IFACE->tx_tos = IP_PREC_INTERNET_CONTROL; BABEL_IFACE->tx_priority = sk_priority_control; BABEL_IFACE->check_link = 1; + BABEL_IFACE->ip4_via_ip6 = 1; }; @@ -143,6 +145,7 @@ babel_iface_item: | CHECK LINK bool { BABEL_IFACE->check_link = $3; } | NEXT HOP IPV4 ipa { BABEL_IFACE->next_hop_ip4 = $4; if (!ipa_is_ip4($4)) cf_error("Must be an IPv4 address"); } | NEXT HOP IPV6 ipa { BABEL_IFACE->next_hop_ip6 = $4; if (!ipa_is_ip6($4)) cf_error("Must be an IPv6 address"); } + | IPV4 VIA IPV6 bool { BABEL_IFACE->ip4_via_ip6 = $4; } | AUTHENTICATION NONE { BABEL_IFACE->auth_type = BABEL_AUTH_NONE; } | AUTHENTICATION MAC { BABEL_IFACE->auth_type = BABEL_AUTH_MAC; BABEL_IFACE->auth_permissive = 0; } | AUTHENTICATION MAC PERMISSIVE { BABEL_IFACE->auth_type = BABEL_AUTH_MAC; BABEL_IFACE->auth_permissive = 1; } diff --git a/proto/babel/packets.c b/proto/babel/packets.c index d4acc170..9a71673a 100644 --- a/proto/babel/packets.c +++ b/proto/babel/packets.c @@ -167,9 +167,11 @@ struct babel_parse_state { u64 router_id; /* Router ID used in subsequent updates */ u8 def_ip6_prefix[16]; /* Implicit IPv6 prefix in network order */ u8 def_ip4_prefix[4]; /* Implicit IPv4 prefix in network order */ + u8 def_ip4viaip6_prefix[4]; /* Implicit IPv4 prefix in network order */ u8 router_id_seen; /* router_id field is valid */ u8 def_ip6_prefix_seen; /* def_ip6_prefix is valid */ u8 def_ip4_prefix_seen; /* def_ip4_prefix is valid */ + u8 def_ip4viaip6_prefix_seen; /* def_ip4viaip6_prefix is valid*/ u8 current_tlv_endpos; /* End of self-terminating TLVs (offset from start) */ u8 sadr_enabled; u8 is_unicast; @@ -515,9 +517,6 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, msg->addr = IPA_NONE; msg->sender = state->saddr; - if (msg->ae >= BABEL_AE_MAX) - return PARSE_IGNORE; - /* * We only actually read link-local IPs. In every other case, the addr field * will be 0 but validation will succeed. The handler takes care of these @@ -530,13 +529,13 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, if (TLV_OPT_LENGTH(tlv) < 4) return PARSE_ERROR; state->current_tlv_endpos += 4; - break; + return PARSE_SUCCESS; case BABEL_AE_IP6: if (TLV_OPT_LENGTH(tlv) < 16) return PARSE_ERROR; state->current_tlv_endpos += 16; - break; + return PARSE_SUCCESS; case BABEL_AE_IP6_LL: if (TLV_OPT_LENGTH(tlv) < 8) @@ -544,10 +543,10 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, msg->addr = ipa_from_ip6(get_ip6_ll(&tlv->addr)); state->current_tlv_endpos += 8; - break; + return PARSE_SUCCESS; } - return PARSE_SUCCESS; + return PARSE_IGNORE; } static uint @@ -647,6 +646,47 @@ babel_read_next_hop(struct babel_tlv *hdr, union babel_msg *m UNUSED, return PARSE_IGNORE; } + +/* This is called directly from babel_read_next_hop to handle the ip4 address + compression state */ +static int +babel_get_ip4_prefix(struct babel_tlv_update *tlv, + struct babel_msg_update *msg, + u8 *def_prefix_seen, + u8 (*def_prefix)[], + ip_addr next_hop, + int len) +{ + if (tlv->plen > IP4_MAX_PREFIX_LENGTH) + return PARSE_ERROR; + + /* Cannot omit data if there is no saved prefix */ + if (tlv->omitted && !*def_prefix_seen) + return PARSE_ERROR; + + /* Update must have next hop, unless it is retraction */ + if (ipa_zero(next_hop) && msg->metric != BABEL_INFINITY) + return PARSE_IGNORE; + + /* Merge saved prefix and received prefix parts */ + u8 buf[4] = {}; + memcpy(buf, *def_prefix, tlv->omitted); + memcpy(buf + tlv->omitted, tlv->addr, len); + + ip4_addr prefix4 = get_ip4(buf); + net_fill_ip4(&msg->net, prefix4, tlv->plen); + + if (tlv->flags & BABEL_UF_DEF_PREFIX) + { + put_ip4(*def_prefix, prefix4); + *def_prefix_seen = 1; + } + + msg->next_hop = next_hop; + + return PARSE_SUCCESS; +} + /* This is called directly from babel_write_update() and returns -1 if a next hop should be written but there is not enough space. */ static int @@ -696,6 +736,7 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *m, struct babel_parse_state *state) { + int rc; struct babel_tlv_update *tlv = (void *) hdr; struct babel_msg_update *msg = &m->update; @@ -706,7 +747,6 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, /* Length of received prefix data without omitted part */ int len = BYTES(tlv->plen) - (int) tlv->omitted; - u8 buf[16] = {}; if ((len < 0) || ((uint) len > TLV_OPT_LENGTH(tlv))) return PARSE_ERROR; @@ -724,31 +764,20 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, break; case BABEL_AE_IP4: - if (tlv->plen > IP4_MAX_PREFIX_LENGTH) - return PARSE_ERROR; + rc = babel_get_ip4_prefix(tlv, msg, &state->def_ip4_prefix_seen, + &state->def_ip4_prefix, state->next_hop_ip4, + len); + if (rc != PARSE_SUCCESS) + return rc; - /* Cannot omit data if there is no saved prefix */ - if (tlv->omitted && !state->def_ip4_prefix_seen) - return PARSE_ERROR; - - /* Update must have next hop, unless it is retraction */ - if (ipa_zero(state->next_hop_ip4) && (msg->metric != BABEL_INFINITY)) - return PARSE_IGNORE; - - /* Merge saved prefix and received prefix parts */ - memcpy(buf, state->def_ip4_prefix, tlv->omitted); - memcpy(buf + tlv->omitted, tlv->addr, len); - - ip4_addr prefix4 = get_ip4(buf); - net_fill_ip4(&msg->net, prefix4, tlv->plen); - - if (tlv->flags & BABEL_UF_DEF_PREFIX) - { - put_ip4(state->def_ip4_prefix, prefix4); - state->def_ip4_prefix_seen = 1; - } + break; - msg->next_hop = state->next_hop_ip4; + case BABEL_AE_IPV4_VIA_IPV6: + rc = babel_get_ip4_prefix(tlv, msg, &state->def_ip4viaip6_prefix_seen, + &state->def_ip4viaip6_prefix, state->next_hop_ip6, + len); + if (rc != PARSE_SUCCESS) + return rc; break; @@ -761,6 +790,7 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, return PARSE_ERROR; /* Merge saved prefix and received prefix parts */ + u8 buf[16] = {}; memcpy(buf, state->def_ip6_prefix, tlv->omitted); memcpy(buf + tlv->omitted, tlv->addr, len); @@ -863,7 +893,11 @@ babel_write_update(struct babel_tlv *hdr, union babel_msg *m, } else if (msg->net.type == NET_IP4) { - tlv->ae = BABEL_AE_IP4; + if (!ipa_zero(msg->next_hop) && ipa_is_ip6(msg->next_hop)) + tlv->ae = BABEL_AE_IPV4_VIA_IPV6; + else + tlv->ae = BABEL_AE_IP4; + tlv->plen = net4_pxlen(&msg->net); put_ip4_px(tlv->addr, &msg->net); } @@ -931,7 +965,11 @@ babel_read_route_request(struct babel_tlv *hdr, union babel_msg *m, msg->full = 1; return PARSE_SUCCESS; + /* RFC9229 section 2.3: When receiving requests, AEs 1 (IPv4) and 4 + * (v4-via-v6) MUST be treated in the same manner. + */ case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR; @@ -1032,7 +1070,11 @@ babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *m, case BABEL_AE_WILDCARD: return PARSE_ERROR; + /* RFC9229 section 2.3: When receiving requests, AEs 1 (IPv4) and 4 + * (v4-via-v6) MUST be treated in the same manner. + */ case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR; -- 2.38.1
On Sun, Dec 18, 2022 at 06:57:48PM +0100, andreas--- via Bird-users wrote:
From: Andreas Rammhold <andreas@rammhold.de>
This implements [RFC9229] an IPv4 via IPv6 extension to the Babel routing protocol that allows annoncing routes to an IPv4 prefix with an IPv6 next-hop, which makes it possible for IPv4 traffic to flow through interfaces that have not been assigned an IPv4 address.
I'd like to drive this to the finish goal soon. I've been using it ever since and know of other users as well.
Hi Thanks for the patch, will check that during this week. Just few random notes: The RFC says "ordinary IPv4 announcements are preferred to v4-via-v6 announcements when the outgoing interface has an assigned IPv4 address", but there is also the third option - using an IPv4 address from another interface as a next hop. What happens if route is announced with IPv6 next hop and the interface later? What do you think about changing the option name to 'extended next hop', like the equivalent BGP option? On the one hand, it improves consistency within BIRD, on the other hand, 'v4-via-v6' is used by the RFC. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> writes:
Just few random notes:
The RFC says "ordinary IPv4 announcements are preferred to v4-via-v6 announcements when the outgoing interface has an assigned IPv4 address", but there is also the third option - using an IPv4 address from another interface as a next hop.
Are you proposing letting the administrator selecting an IPv4 address for that case? I don't know how this could work reliable in an automated way. Imagine a situation where you have three routers connected like so: Router 1 <-- IPv4 --> Router 2 <-- IPv6 --> Router 3 Router 1 and Router 2 have an IPv4 link, Router 2 and Router 3 don't. In your scenario would Router 2 send packets to Router 3 with the IPv4 next hop from the Router1<>Router2 link?
What happens if route is announced with IPv6 next hop and the interface later?
You mean what happens if an IPv4 (or IPv4+IPv6) connection transitions to being just an IPv6 link? Let's look at this example with two machines that are connected via both IPv4 and IPv6 and have v4-via-v6 enabled: On the first machine called "kappa":
kappa# birdc show babel interfaces "wg-bertha" wg_backbone: Interface State Auth IPv4 via Ip6 RX cost Nbrs Timer Next hop (v4) Next hop (v6) wg-bertha Up No Yes 96 1 1.399 192.168.0.2 fe80::12
On the second machine called "bertha":
bertha# birdc show babel interfaces "wg-kappa" wg_backbone: Interface State Auth IPv4 via Ip6 RX cost Nbrs Timer Next hop (v4) Next hop (v6) wg-kappa Up No Yes 96 1 2.603 192.0.2.1 fe80::1
"kappa" has the address 172.20.25.12/32 on a loopback interfaces and is announcing that to all its babel neighbors. Since the link between the two machines has both IPv4 and IPv6 available the code prefers IPv4 (as it should do). Bertha learns the IPv4 next hop:
bertha# birdc show babel routes | grep 172.20.25.12/32 | grep kappa Prefix Nexthop Interface Metric F Seqno Expires 172.20.25.12/32 192.168.0.2 wg-kappa 96 * 1 50.130
In the event that the IPv4 address goes missing from the link. The route will fail over to an ipv4-via-ipv6 route after a couple of seconds:
kappa# ip addr remove 192.0.2.2/24 dev wg-bertha
bertha# birdc show babel routes | grep 172.20.25.12/32 | grep kappa 172.20.25.12/32 fe80::12 wg-kappa 96 * 1 53.360
And in my case that is also properly reflected in the kernel routing table:
bertha# ip route get 172.20.25.12 172.20.25.12 via inet6 fe80::12 dev wg-kappa src 172.20.25.4 uid 0
The IPv4 source address selected here is configured via `krt_prefsrc` in my bird configuration. It would work without, but traceroute wouldn't be as readable. If the IPv4 address reappears on the interface between the two the next hop will be updated to the IPv4 address again. Does this answer your question?
What do you think about changing the option name to 'extended next hop', like the equivalent BGP option? On the one hand, it improves consistency within BIRD, on the other hand, 'v4-via-v6' is used by the RFC.
My personal preference would be to stick with the babel terminology. Other babel implementations (e.g. babled) are using the same wording. I imagine that someone interfacing with other implementation is going to be happier about the protocol-level flags being consistent within the protocol (across devices) rather than having consistency with BGP (which they might not have knowledge in any way). Correct me if I am wrong but only Juniper and BIRD have picked up `extended-nexthop` as an option whereas e.g. Cisco went with something different. I recall these inconsistencies across devices (while staying within the same protocol) to be annoying, but I haven't been involved in any commercial networking for some years now. I'll *not* die on this hill if you think that consistency within BIRD is more important from your perspective. :) Andi
The RFC says "ordinary IPv4 announcements are preferred to v4-via-v6 announcements when the outgoing interface has an assigned IPv4 address", but there is also the third option - using an IPv4 address from another interface as a next hop.
The ways in which routing may fail are different. If you announce a v4-via-v6 route and the peer doesn't support v4-via-v6, then the v4-via-v6 announcements will be silently ignored, and the routed packets will either take a longer v4 route, or follow the default route. If, on the other hand, you announce an IPv4 address from a different interface, you might have grabbed an address from a management interface that's firewalled away from your transit network. The packets using that address as a next-hop are going to be silently dropped, which will cause a blackhole. A related point: Linux supports assigning the same IPv4 address to multiple interfaces, so if what you want is sharing a single IPv4 next-hop address between interfaces, you can already do that. -- Juliusz
On Wed, Jan 11, 2023 at 03:57:47PM +0100, Andreas Rammhold wrote:
Hi,
Ondrej Zajicek <santiago@crfreenet.org> writes:
Thanks for the patch, will check that during this week.
Any chance you could give this a look yet?
Hi Sorry, we received some reports about an important bug in 2.0.11, we are now working on 2.0.12 with several bugfixes (hopefully this week), after that i will check your patch. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
On 11.01.23 17:16, Ondrej Zajicek via Bird-users wrote:
Hi
Sorry, we received some reports about an important bug in 2.0.11, we are now working on 2.0.12 with several bugfixes (hopefully this week), after that i will check your patch.
Hi, thanks for the patch. I'm running it on a few machines and it simplifies address management significantly, since it allows us to deallocate lots of address pairs from all tunnel interfaces. Want to repeat the question by the author: Is there a chance this can be reviewed? A lot is happening on the list, but this patch seems to have dropped by the wayside, which is a shame honestly. Thanks for providing Bird. Martin
On Tue, Jan 31, 2023 at 05:59:25PM +0100, Martin Weinelt wrote:
Hi,
thanks for the patch. I'm running it on a few machines and it simplifies address management significantly, since it allows us to deallocate lots of address pairs from all tunnel interfaces.
Want to repeat the question by the author:
Is there a chance this can be reviewed?
A lot is happening on the list, but this patch seems to have dropped by the wayside, which is a shame honestly.
Hi Sorry, we released 2.0.12 just last week, (as usual) it took more time than expected, as there were several complex bugs. On the list i try to immediately handle small patches or do cursory reviews, while final handling of more complex patches put on (weighted) queue. But you are still the first on the queue :-) -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
On Wed, Jan 11, 2023 at 03:57:47PM +0100, Andreas Rammhold wrote:
Any chance you could give this a look yet?
Hi Finally i got to process the patch and prepare it for merging. I did some changes, mainly: 1) Changed the name of the option to 'extended next hop', for consistency with BGP (and in the future also with other protocols). As the option is enabled by default, the name likely does not matter that much. 2) Removed its output from 'babel show interfaces'. In these brief tabular outputs horizontal space is at a premium, so i would rather limit it to the most important information. There were also minor changes/fixes, mainly 1) Add Explicit PARSE_ERROR for BABEL_AE_IP4_VIA_IP6 in IHU and Next Hop TLVs 2) Keep PARSE_SUCCESS for BABEL_AE_WILDCARD in IHU, your patch implicitly (and probably unintentionally) changed that to PARSE_IGNORE. 3) In babel_handle_update(), move condition 'Reject IPv4 via IPv6 routes if disabled' above 'Retraction'. In your patch it was below, so retractions in AE 4 were accepted even on interfaces with this feature disabled. 4) Conditions for 'Missing IPv4 next hop address' warning were fixed in babel_iface_update_addr4(), but not in babel_add_iface() / babel_reconfigure_iface(). and some cosmetic changes. The modified patch is here: https://gitlab.nic.cz/labs/bird/-/commit/eecc3f02e41bcb91d463c4c1189fd56bc44... Please check it, if it is acceptable for you, i will merge it to the master branch. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> writes: Hi,
On Wed, Jan 11, 2023 at 03:57:47PM +0100, Andreas Rammhold wrote:
Any chance you could give this a look yet?
Hi
Finally i got to process the patch and prepare it for merging. I did some changes, mainly:
1) Changed the name of the option to 'extended next hop', for consistency with BGP (and in the future also with other protocols). As the option is enabled by default, the name likely does not matter that much.
2) Removed its output from 'babel show interfaces'. In these brief tabular outputs horizontal space is at a premium, so i would rather limit it to the most important information.
There were also minor changes/fixes, mainly
1) Add Explicit PARSE_ERROR for BABEL_AE_IP4_VIA_IP6 in IHU and Next Hop TLVs
2) Keep PARSE_SUCCESS for BABEL_AE_WILDCARD in IHU, your patch implicitly (and probably unintentionally) changed that to PARSE_IGNORE.
3) In babel_handle_update(), move condition 'Reject IPv4 via IPv6 routes if disabled' above 'Retraction'. In your patch it was below, so retractions in AE 4 were accepted even on interfaces with this feature disabled.
4) Conditions for 'Missing IPv4 next hop address' warning were fixed in babel_iface_update_addr4(), but not in babel_add_iface() / babel_reconfigure_iface().
and some cosmetic changes.
The modified patch is here:
https://gitlab.nic.cz/labs/bird/-/commit/eecc3f02e41bcb91d463c4c1189fd56bc44...
Please check it, if it is acceptable for you, i will merge it to the master branch.
These all sound fine. I've run it through my bird<->bird<->babeld<->babeld test network and it passed. The changes look fine if this it what it takes to get this applied :-) Regards, Andi
On Tue, Feb 14, 2023 at 06:04:33PM +0100, Andreas Rammhold wrote:
Please check it, if it is acceptable for you, i will merge it to the master branch.
These all sound fine. I've run it through my bird<->bird<->babeld<->babeld test network and it passed.
The changes look fine if this it what it takes to get this applied :-)
OK, merged. I just though that the default value for the option is enabled, but perhaps it should be enabled only if such routes are supported by platform code (i.e. enabled on Linux, but disabled on BSD, as we do not support such routes on BSD). -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
I just though that the default value for the option is enabled, but perhaps it should be enabled only if such routes are supported by platform code (i.e. enabled on Linux, but disabled on BSD, as we do not support such routes on BSD).
IMHO, it should not be possible to enable v4-via-v6 unless the platform supports it. Otherwise, you risk creating persistent blackholes. (There's also the PMTUD problem described in RFC 9229 Section 3.) -- Juliusz
On 14.02.23 22:08, Juliusz Chroboczek wrote:
(There's also the PMTUD problem described in RFC 9229 Section 3.)
Hey, Juliusz, do you, or any one else, have info on: How does ${vendor} behave when reverse path filters are enabled? I did some "research" aka an afternoon of web-search on that topic a while ago, and I only found: * Linux does just treat the whole 192.0.0.0/24 as "special" is is not aware of the meaning of 192.0.0.8/32 * Cisco is not "aware" of it as well, as far as I could tell * IIRC Juniper had no docu on that topic... I see the point, that on the Internet it would do more harm then anything else to allow/transport packets with this source address, but on internal networks it could be useful. However, it clashes as soon as you enable "strict" or even any RPF... How do people treat this address in operational networks? Thanks for any input. Bernd
(There's also the PMTUD problem described in RFC 9229 Section 3.)
Juliusz, do you, or any one else, have info on: How does ${vendor} behave when reverse path filters are enabled?
I was under the impression that some kinds of ICMP pakets are not subject to RPF. See RFC 4890 Section 4.3.1. I cannot find a similar recommendation for ICMPv4, though, so perhaps I'm wrong.
* Linux does just treat the whole 192.0.0.0/24 as "special" is is not aware of the meaning of 192.0.0.8/32
That's new to me. Could you please describe how 192/24 is special? 192.0.0.8 requires special treatment, we need to cajole/bribe/bully Toke into helping.
However, it clashes as soon as you enable "strict" or even any RPF... How do people treat this address in operational networks?
Personal opinion here, please let me know if you think I'm wrong. Router-originated ICMP is inherently unreliable: just because there is a route from A to B that goes through C, there is no reason to assume that there is a route from C to A: A --------> C -------> B ^ . ......... That's why source quench (which relies on router-originated ICMP) has been replaced with ECN (which only does end-to-end). That's why ICMP-based PMTUd doesn't work. So in practice I'd expect people to use various hacks to work around the lack of PMTUd in the Real Interenet. Note that this problem is in no way specific to 192.0.0.8: strict RPF is almost as likely to break ICMP from a global address as it is to break ICMP from 192.168.0.8. -- Juliusz
1) Changed the name of the option to 'extended next hop', for consistency with BGP (and in the future also with other protocols). As the option is enabled by default, the name likely does not matter that much.
I rather like v4-via-v6, which succintly and clearly states what it is about. "Extended next hop" means nothing unless you already know what it means.
3) In babel_handle_update(), move condition 'Reject IPv4 via IPv6 routes if disabled' above 'Retraction'. In your patch it was below, so retractions in AE 4 were accepted even on interfaces with this feature disabled.
That's reasonable enough, it shouldn't matter either way. Perhaps the former behaviour was more robust in the case where a node loses its IPv4 address and then immediately retracts the routes that it previously announced. Just a minor inconsistency, v4-via-v6 requests are accepted even if the option is disabled (line 1084). -- Juliusz
On Tue, Feb 14, 2023 at 06:25:33PM +0100, Juliusz Chroboczek wrote:
1) Changed the name of the option to 'extended next hop', for consistency with BGP (and in the future also with other protocols). As the option is enabled by default, the name likely does not matter that much.
I rather like v4-via-v6, which succintly and clearly states what it is about.
I am not sure about that. Without knowing the context and that it is about next hops, i would guess that 'v4-via-v6' is some kind of automatic tunneling (like 4over6) provisioned by Babel. In general, we prefer to use the same name for equivalent features between protocols, that is not first such case. There are cases where it was not done, and it annoys me everytime when i mistake one option name in place where the other name should be used. Also, there is a possibility of reverse usage (IPv6 routes with IPv4 next hops, perhaps encoded as IPv4-mapped IPv6 addresses in regular AE 2 ?). But that is not useful now, as we do not (AFAIR) support third-party next hops anyways.
3) In babel_handle_update(), move condition 'Reject IPv4 via IPv6 routes if disabled' above 'Retraction'. In your patch it was below, so retractions in AE 4 were accepted even on interfaces with this feature disabled.
That's reasonable enough, it shouldn't matter either way. Perhaps the former behaviour was more robust in the case where a node loses its IPv4 address and then immediately retracts the routes that it previously announced.
I thought that next hops are sent in retractions, but now i see they are not used here, in that case it would make more sense to ignore it, perhaps on the parser level. btw, there is one question that i noticed. If an Update is ignored for semantic reasons (e.g. update with valid metric, but missing next hop or router id), should it update last prefix with P-flag? RFC says that for sub-TLV errors but not for other errors: "Update TLVs with a matching address encoding within the same packet, even if this TLV is otherwise ignored due to an unknown mandatory sub-TLV" BIRD currently does not remember prefix of ignored update for missing next hop, but does remember prefix of ignored update for missing router id.
Just a minor inconsistency, v4-via-v6 requests are accepted even if the option is disabled (line 1084).
Good point. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
btw, there is one question that i noticed. If an Update is ignored for semantic reasons (e.g. update with valid metric, but missing next hop or router id), should it update last prefix with P-flag?
Such a packet would be incorrect. What to do in presence of an incorrect packet is left to the implementation. Babeld will log an error and stop parsing the packet at this point and discard all remaining TLVs. Other solutions would include dropping the whole packet, dropping the whole adjacency, or even adding the neighbour to a blacklist of buggy neighbours.
RFC says that for sub-TLV errors but not for other errors:
A packet with an unknown sub-TLV is correct, it's just that the implementation doesn't know how to interpret this particular sub-TLV. The behaviour is well-defined in this case (drop the sub-TLV if the mandatory bit is unset, drop the TLV otherwise). This is unlike the case described above, which indicates a buggy neighbour. -- Juliusz
participants (6)
-
Andreas Rammhold -
andreas@rammhold.de -
Bernd Naumann -
Juliusz Chroboczek -
Martin Weinelt -
Ondrej Zajicek