This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as IPv4 via IPv6 extension to the Babel routing protocol that allows announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it possible for IPv4 traffic to flow through interfaces that have not been assigned an IPv4 address. I've tried my best to be compliant with the current RFC version. One point that is not entirely clear is how to implement in section 2.2. regarding fallback when the device doesn't support v4-via-v6 routes. (See notes below) The implementation is compatible with the current Babeld PR (https://github.com/jech/babeld/pull/56). I've verified this with a few VMs in the following setup: Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld Each routing daemon was running on their own VM and had L2 connectivity to only its direct neighbors. At the nodes at the edges v4 networks have been announced and full end-to-end communication was possible over the mixed AF network. The v6 only link between Babel and Bird (at the "center" of the above setup) did transport the v4 packets via the v6 link-local next hop addresses just as expected. Thanks to Toke Høiland-Jørgensen for early review on this work. -----< notes >------ (My current notes on the current implementation regarding compliance with the current draft) Bird doesn't implement compression for outgoing (send) IPv4 prefixes and thus this also hasn't been implemented for this AE. Best guess is that 4 bytes are cheap enough to not bother with the added complexity? In the RX path compression is supported for IPv4 prefixes (for the IPv4 AE as well as for IPv4-via-IPv6 AE). Especially the part last paragraph of 2.2. remains to be discussed/solved: * If a node cannot install v4-over-v6 routes, eg., due to hardware or software limitations, then routes to an IPv4 prefix with an IPv6 next-hop MUST NOT be selected, as described in Section 3.5.3 of [RFC6126bis]. * What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup? Section 5: I am using the proposed value 4 to test interoperability with the Babled PR (https://github.com/jech/babeld/pull/56/commits). --- doc/bird.sgml | 7 +++ proto/babel/babel.c | 26 +++++++--- proto/babel/babel.h | 3 ++ proto/babel/config.Y | 4 +- proto/babel/packets.c | 116 ++++++++++++++++++++++++++++++------------ 5 files changed, 117 insertions(+), 39 deletions(-) diff --git a/doc/bird.sgml b/doc/bird.sgml index 5408cb2a..ee895a47 100644 --- a/doc/bird.sgml +++ b/doc/bird.sgml @@ -1770,6 +1770,7 @@ protocol babel [<name>] { check link <switch>; next hop ipv4 <address>; next hop ipv6 <address>; + ipv4 via ipv6 <switch>; }; } </code> @@ -1860,6 +1861,12 @@ protocol babel [<name>] { interface. If not set, the same link-local address that is used as the source for Babel packets will be used. In normal operation, it should not be necessary to set this option. + + <tag><label id="babel-ipv4-via-ipv6">ipv4 via ipv6 <m/switch/</tag> + If enabled, Bird will accept and emit IPv4-via-IPv6 routes when IPv4 + addresses are absent from the interface as described in draft-ietf-babel-v4viav6. + Default: yes. + </descrip> <sect1>Attributes diff --git a/proto/babel/babel.c b/proto/babel/babel.c index 4b6b9d7f..64a88601 100644 --- a/proto/babel/babel.c +++ b/proto/babel/babel.c @@ -954,8 +954,15 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable) msg.update.router_id = e->router_id; net_copy(&msg.update.net, e->n.addr); - msg.update.next_hop = ((e->n.addr->type == NET_IP4) ? - ifa->next_hop_ip4 : ifa->next_hop_ip6); + if (e->n.addr->type == NET_IP4 && !ipa_zero(ifa->next_hop_ip4)) { + msg.update.next_hop = ifa->next_hop_ip4; + } else if (e->n.addr->type == NET_IP6 && !ipa_zero(ifa->next_hop_ip6)) { + msg.update.next_hop = ifa->next_hop_ip6; + /* If there is no IPv4 address on the interface (and we are supporting v4viav6) + * send the v6 next hop instead */ + } else if (ifa->cf->ip4_via_ip6 && e->n.addr->type == NET_IP4 && ipa_zero(ifa->next_hop_ip4) && !ipa_zero(ifa->next_hop_ip6)) { + msg.update.next_hop = ifa->next_hop_ip6; + } /* Do not send route if next hop is unknown, e.g. no configured IPv4 address */ if (ipa_zero(msg.update.next_hop)) @@ -1247,6 +1254,12 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa) return; } + /* Reject IPv4 via IPv6 routes if disabled */ + if (!ifa->cf->ip4_via_ip6 && msg->net.type == NET_IP4 && ipa_is_ip6(msg->next_hop)) { + DBG("Babel: Ignoring disabled IPv4 via IPv6 route.\n"); + return; + } + /* Regular update */ e = babel_get_entry(p, &msg->net); r = babel_get_route(p, e, nbr); /* the route entry indexed by neighbour */ @@ -1529,7 +1542,7 @@ babel_iface_update_addr4(struct babel_iface *ifa) ifa->next_hop_ip4 = ipa_nonzero(ifa->cf->next_hop_ip4) ? ifa->cf->next_hop_ip4 : addr4; if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel) - log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); + log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); // FIXME: make sure we do not log this when it is expected if (ifa->up) babel_iface_kick_timer(ifa); @@ -1894,8 +1907,8 @@ babel_show_interfaces(struct proto *P, const char *iff) } cli_msg(-1023, "%s:", p->p.name); - cli_msg(-1023, "%-10s %-6s %7s %6s %7s %-15s %s", - "Interface", "State", "RX cost", "Nbrs", "Timer", + cli_msg(-1023, "%-10s %-6s %-12s %7s %6s %7s %-15s %s", + "Interface", "State", "IPv4 via Ip6", "RX cost", "Nbrs", "Timer", "Next hop (v4)", "Next hop (v6)"); WALK_LIST(ifa, p->interfaces) @@ -1908,8 +1921,9 @@ babel_show_interfaces(struct proto *P, const char *iff) nbrs++; btime timer = MIN(ifa->next_regular, ifa->next_hello) - current_time(); - cli_msg(-1023, "%-10s %-6s %7u %6u %7t %-15I %I", + cli_msg(-1023, "%-10s %-6s %-12s %7u %6u %7t %-15I %I", ifa->iface->name, (ifa->up ? "Up" : "Down"), + (ifa->cf->ip4_via_ip6 ? "Yes" : "No"), ifa->cf->rxcost, nbrs, MAX(timer, 0), ifa->next_hop_ip4, ifa->next_hop_ip6); } diff --git a/proto/babel/babel.h b/proto/babel/babel.h index e075024c..146d28f5 100644 --- a/proto/babel/babel.h +++ b/proto/babel/babel.h @@ -104,6 +104,7 @@ enum babel_ae_type { BABEL_AE_IP4 = 1, BABEL_AE_IP6 = 2, BABEL_AE_IP6_LL = 3, + BABEL_AE_IPV4_VIA_IPV6 = 4, BABEL_AE_MAX }; @@ -137,6 +138,8 @@ struct babel_iface_config { ip_addr next_hop_ip4; ip_addr next_hop_ip6; + + u8 ip4_via_ip6; /* Enabe IPv4 via IPv6 */ }; struct babel_proto { diff --git a/proto/babel/config.Y b/proto/babel/config.Y index 2f3b637b..f17884eb 100644 --- a/proto/babel/config.Y +++ b/proto/babel/config.Y @@ -25,7 +25,7 @@ CF_DECLS CF_KEYWORDS(BABEL, INTERFACE, METRIC, RXCOST, HELLO, UPDATE, INTERVAL, PORT, TYPE, WIRED, WIRELESS, RX, TX, BUFFER, PRIORITY, LENGTH, CHECK, LINK, NEXT, HOP, IPV4, IPV6, BABEL_METRIC, SHOW, INTERFACES, NEIGHBORS, - ENTRIES, RANDOMIZE, ROUTER, ID) + ENTRIES, RANDOMIZE, ROUTER, ID, VIA) CF_GRAMMAR @@ -65,6 +65,7 @@ babel_iface_start: BABEL_IFACE->tx_tos = IP_PREC_INTERNET_CONTROL; BABEL_IFACE->tx_priority = sk_priority_control; BABEL_IFACE->check_link = 1; + BABEL_IFACE->ip4_via_ip6 = 1; }; @@ -109,6 +110,7 @@ babel_iface_item: | CHECK LINK bool { BABEL_IFACE->check_link = $3; } | NEXT HOP IPV4 ipa { BABEL_IFACE->next_hop_ip4 = $4; if (!ipa_is_ip4($4)) cf_error("Must be an IPv4 address"); } | NEXT HOP IPV6 ipa { BABEL_IFACE->next_hop_ip6 = $4; if (!ipa_is_ip6($4)) cf_error("Must be an IPv6 address"); } + | IPV4 VIA IPV6 bool { BABEL_IFACE->ip4_via_ip6 = $4; } ; babel_iface_opts: diff --git a/proto/babel/packets.c b/proto/babel/packets.c index 415ac3f9..d4c66a53 100644 --- a/proto/babel/packets.c +++ b/proto/babel/packets.c @@ -130,9 +130,11 @@ struct babel_parse_state { u64 router_id; /* Router ID used in subsequent updates */ u8 def_ip6_prefix[16]; /* Implicit IPv6 prefix in network order */ u8 def_ip4_prefix[4]; /* Implicit IPv4 prefix in network order */ + u8 def_ip4viaip6_prefix[4]; /* Implicit IPv4 prefix in network order */ u8 router_id_seen; /* router_id field is valid */ u8 def_ip6_prefix_seen; /* def_ip6_prefix is valid */ u8 def_ip4_prefix_seen; /* def_ip4_prefix is valid */ + u8 def_ip4viaip6_prefix_seen; /* def_ip4viaip6_prefix is valid*/ u8 current_tlv_endpos; /* End of self-terminating TLVs (offset from start) */ u8 sadr_enabled; }; @@ -246,6 +248,8 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *msg, struct static int babel_read_route_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); static int babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); static int babel_read_source_prefix(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); +static int babel_read_update_ip4_prefix(struct babel_tlv_update *tlv, struct babel_msg_update *msg, u8 *def_prefix_seen, + u8 (*def_prefix)[4], ip_addr * next_hop, int len); static uint babel_write_ack(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len); static uint babel_write_hello(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len); @@ -398,9 +402,6 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, msg->addr = IPA_NONE; msg->sender = state->saddr; - if (msg->ae >= BABEL_AE_MAX) - return PARSE_IGNORE; - /* * We only actually read link-local IPs. In every other case, the addr field * will be 0 but validation will succeed. The handler takes care of these @@ -413,13 +414,13 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, if (TLV_OPT_LENGTH(tlv) < 4) return PARSE_ERROR; state->current_tlv_endpos += 4; - break; + return PARSE_SUCCESS; case BABEL_AE_IP6: if (TLV_OPT_LENGTH(tlv) < 16) return PARSE_ERROR; state->current_tlv_endpos += 16; - break; + return PARSE_SUCCESS; case BABEL_AE_IP6_LL: if (TLV_OPT_LENGTH(tlv) < 8) @@ -427,10 +428,10 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, msg->addr = ipa_from_ip6(get_ip6_ll(&tlv->addr)); state->current_tlv_endpos += 8; - break; + return PARSE_SUCCESS; } - return PARSE_SUCCESS; + return PARSE_IGNORE; } static uint @@ -530,6 +531,47 @@ babel_read_next_hop(struct babel_tlv *hdr, union babel_msg *m UNUSED, return PARSE_IGNORE; } + +/* This is called directrly from babel_read_next_hop to handle the ip4 address + compression state */ +static int +babel_read_update_ip4_prefix(struct babel_tlv_update *tlv, + struct babel_msg_update *msg, + u8 *def_prefix_seen, + u8 (*def_prefix)[4], + ip_addr *next_hop, + int len) +{ + if (tlv->plen > IP4_MAX_PREFIX_LENGTH) + return PARSE_ERROR; + + /* Cannot omit data if there is no saved prefix */ + if (tlv->omitted && !*def_prefix_seen) + return PARSE_ERROR; + + /* Update must have next hop, unless it is retraction */ + if (ipa_zero(*next_hop) && msg->metric != BABEL_INFINITY) + return PARSE_IGNORE; + + /* Merge saved prefix and received prefix parts */ + u8 buf[4] = {}; + memcpy(buf, *def_prefix, tlv->omitted); + memcpy(buf + tlv->omitted, tlv->addr, len); + + ip4_addr prefix4 = get_ip4(buf); + net_fill_ip4(&msg->net, prefix4, tlv->plen); + + if (tlv->flags & BABEL_UF_DEF_PREFIX) + { + put_ip4(*def_prefix, prefix4); + *def_prefix_seen = 1; + } + + msg->next_hop = *next_hop; + + return PARSE_SUCCESS; +} + /* This is called directly from babel_write_update() and returns -1 if a next hop should be written but there is not enough space. */ static int @@ -579,6 +621,7 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *m, struct babel_parse_state *state) { + int rc; struct babel_tlv_update *tlv = (void *) hdr; struct babel_msg_update *msg = &m->update; @@ -589,7 +632,6 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, /* Length of received prefix data without omitted part */ int len = BYTES(tlv->plen) - (int) tlv->omitted; - u8 buf[16] = {}; if ((len < 0) || ((uint) len > TLV_OPT_LENGTH(tlv))) return PARSE_ERROR; @@ -607,31 +649,20 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, break; case BABEL_AE_IP4: - if (tlv->plen > IP4_MAX_PREFIX_LENGTH) - return PARSE_ERROR; - - /* Cannot omit data if there is no saved prefix */ - if (tlv->omitted && !state->def_ip4_prefix_seen) - return PARSE_ERROR; - - /* Update must have next hop, unless it is retraction */ - if (ipa_zero(state->next_hop_ip4) && (msg->metric != BABEL_INFINITY)) - return PARSE_IGNORE; - - /* Merge saved prefix and received prefix parts */ - memcpy(buf, state->def_ip4_prefix, tlv->omitted); - memcpy(buf + tlv->omitted, tlv->addr, len); - - ip4_addr prefix4 = get_ip4(buf); - net_fill_ip4(&msg->net, prefix4, tlv->plen); + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4_prefix_seen, + &state->def_ip4_prefix, &state->next_hop_ip4, + len); + if (rc != PARSE_SUCCESS) + return rc; - if (tlv->flags & BABEL_UF_DEF_PREFIX) - { - put_ip4(state->def_ip4_prefix, prefix4); - state->def_ip4_prefix_seen = 1; - } + break; - msg->next_hop = state->next_hop_ip4; + case BABEL_AE_IPV4_VIA_IPV6: + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4viaip6_prefix_seen, + &state->def_ip4viaip6_prefix, &state->next_hop_ip6, + len); + if (rc != PARSE_SUCCESS) + return rc; break; @@ -644,6 +675,7 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, return PARSE_ERROR; /* Merge saved prefix and received prefix parts */ + u8 buf[16] = {}; memcpy(buf, state->def_ip6_prefix, tlv->omitted); memcpy(buf + tlv->omitted, tlv->addr, len); @@ -746,7 +778,11 @@ babel_write_update(struct babel_tlv *hdr, union babel_msg *m, } else if (msg->net.type == NET_IP4) { - tlv->ae = BABEL_AE_IP4; + if (!ipa_zero(msg->next_hop) && ipa_is_ip6(msg->next_hop)) + tlv->ae = BABEL_AE_IPV4_VIA_IPV6; + else + tlv->ae = BABEL_AE_IP4; + tlv->plen = net4_pxlen(&msg->net); put_ip4_px(tlv->addr, &msg->net); } @@ -814,7 +850,15 @@ babel_read_route_request(struct babel_tlv *hdr, union babel_msg *m, msg->full = 1; return PARSE_SUCCESS; + /* + * When receiving requests, AEs 1 (IPv4) and 4 (v4-via-v6) MUST be + * treated in the same manner: the receiver processes the request as + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above, + * irrespective of which AE was used in the request. + */ case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR; @@ -915,7 +959,15 @@ babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *m, case BABEL_AE_WILDCARD: return PARSE_ERROR; + /* + * When receiving requests, AEs 1 (IPv4) and TBD (v4-via-v6) MUST be + * treated in the same manner: the receiver processes the request as + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above, + * irrespective of which AE was used in the request. + */ case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR; -- 2.29.2
Andreas Rammhold <andreas@rammhold.de> writes:
This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as IPv4 via IPv6 extension to the Babel routing protocol that allows announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it possible for IPv4 traffic to flow through interfaces that have not been assigned an IPv4 address.
I've tried my best to be compliant with the current RFC version. One point that is not entirely clear is how to implement in section 2.2. regarding fallback when the device doesn't support v4-via-v6 routes. (See notes below)
The implementation is compatible with the current Babeld PR (https://github.com/jech/babeld/pull/56). I've verified this with a few VMs in the following setup:
Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld
Each routing daemon was running on their own VM and had L2 connectivity to only its direct neighbors. At the nodes at the edges v4 networks have been announced and full end-to-end communication was possible over the mixed AF network. The v6 only link between Babel and Bird (at the "center" of the above setup) did transport the v4 packets via the v6 link-local next hop addresses just as expected.
Thanks to Toke Høiland-Jørgensen for early review on this work.
-----< notes >------
(My current notes on the current implementation regarding compliance with the current draft)
Bird doesn't implement compression for outgoing (send) IPv4 prefixes and thus this also hasn't been implemented for this AE. Best guess is that 4 bytes are cheap enough to not bother with the added complexity? In the RX path compression is supported for IPv4 prefixes (for the IPv4 AE as well as for IPv4-via-IPv6 AE).
Especially the part last paragraph of 2.2. remains to be discussed/solved:
* If a node cannot install v4-over-v6 routes, eg., due to hardware or software limitations, then routes to an IPv4 prefix with an IPv6 next-hop MUST NOT be selected, as described in Section 3.5.3 of [RFC6126bis].
* What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup?
Can't we just have Bird do the equivalent of: ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink on startup, and set a system flag depending on whether the operation fails or not? Other than this, I think the patch generally looks good (with a few nits below); but we should probably wait until the draft progresses to the point where it actually has an AE assigned in the text before we merge this :)
Section 5: I am using the proposed value 4 to test interoperability with the Babled PR (https://github.com/jech/babeld/pull/56/commits). --- doc/bird.sgml | 7 +++ proto/babel/babel.c | 26 +++++++--- proto/babel/babel.h | 3 ++ proto/babel/config.Y | 4 +- proto/babel/packets.c | 116 ++++++++++++++++++++++++++++++------------ 5 files changed, 117 insertions(+), 39 deletions(-)
diff --git a/doc/bird.sgml b/doc/bird.sgml index 5408cb2a..ee895a47 100644 --- a/doc/bird.sgml +++ b/doc/bird.sgml @@ -1770,6 +1770,7 @@ protocol babel [<name>] { check link <switch>; next hop ipv4 <address>; next hop ipv6 <address>; + ipv4 via ipv6 <switch>; }; } </code> @@ -1860,6 +1861,12 @@ protocol babel [<name>] { interface. If not set, the same link-local address that is used as the source for Babel packets will be used. In normal operation, it should not be necessary to set this option. + + <tag><label id="babel-ipv4-via-ipv6">ipv4 via ipv6 <m/switch/</tag> + If enabled, Bird will accept and emit IPv4-via-IPv6 routes when IPv4 + addresses are absent from the interface as described in draft-ietf-babel-v4viav6. + Default: yes. + </descrip>
<sect1>Attributes diff --git a/proto/babel/babel.c b/proto/babel/babel.c index 4b6b9d7f..64a88601 100644 --- a/proto/babel/babel.c +++ b/proto/babel/babel.c @@ -954,8 +954,15 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable) msg.update.router_id = e->router_id; net_copy(&msg.update.net, e->n.addr);
- msg.update.next_hop = ((e->n.addr->type == NET_IP4) ? - ifa->next_hop_ip4 : ifa->next_hop_ip6); + if (e->n.addr->type == NET_IP4 && !ipa_zero(ifa->next_hop_ip4)) { + msg.update.next_hop = ifa->next_hop_ip4; + } else if (e->n.addr->type == NET_IP6 && !ipa_zero(ifa->next_hop_ip6)) { + msg.update.next_hop = ifa->next_hop_ip6; + /* If there is no IPv4 address on the interface (and we are supporting v4viav6) + * send the v6 next hop instead */ + } else if (ifa->cf->ip4_via_ip6 && e->n.addr->type == NET_IP4 && ipa_zero(ifa->next_hop_ip4) && !ipa_zero(ifa->next_hop_ip6)) { + msg.update.next_hop = ifa->next_hop_ip6; + }
You're doing a bit of redundant checking here, since there's a check for ipa_zero() just below. How about: if (e->n.addr->type == NET_IP4) { /* Always prefer v4 nexthop if set */ if(!ipa_zero(ifa->next_hop_ip4)) msg.update.next_hop = ifa->next_hop_ip4; /* Only send v4-via-v6 if enabled */ else if (ifa->cf->ip4_via_ip6) msg.update.next_hop = ifa->next_hop_ip6; } else { msg.update.next_hop = ifa->next_hop_ip6; }
/* Do not send route if next hop is unknown, e.g. no configured IPv4 address */ if (ipa_zero(msg.update.next_hop)) @@ -1247,6 +1254,12 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa) return; }
+ /* Reject IPv4 via IPv6 routes if disabled */ + if (!ifa->cf->ip4_via_ip6 && msg->net.type == NET_IP4 && ipa_is_ip6(msg->next_hop)) { + DBG("Babel: Ignoring disabled IPv4 via IPv6 route.\n"); + return; + } +
/* Regular update */ e = babel_get_entry(p, &msg->net); r = babel_get_route(p, e, nbr); /* the route entry indexed by neighbour */ @@ -1529,7 +1542,7 @@ babel_iface_update_addr4(struct babel_iface *ifa) ifa->next_hop_ip4 = ipa_nonzero(ifa->cf->next_hop_ip4) ? ifa->cf->next_hop_ip4 : addr4;
if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel) - log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); + log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); // FIXME: make sure we do not log this when it is expected
Just check ifa->cf->ip4_via_ip6 here as well, and make sure it gets set to false if the feature is not supported on the system?
if (ifa->up) babel_iface_kick_timer(ifa); @@ -1894,8 +1907,8 @@ babel_show_interfaces(struct proto *P, const char *iff) }
cli_msg(-1023, "%s:", p->p.name); - cli_msg(-1023, "%-10s %-6s %7s %6s %7s %-15s %s", - "Interface", "State", "RX cost", "Nbrs", "Timer", + cli_msg(-1023, "%-10s %-6s %-12s %7s %6s %7s %-15s %s", + "Interface", "State", "IPv4 via Ip6", "RX cost", "Nbrs", "Timer", "Next hop (v4)", "Next hop (v6)");
WALK_LIST(ifa, p->interfaces) @@ -1908,8 +1921,9 @@ babel_show_interfaces(struct proto *P, const char *iff) nbrs++;
btime timer = MIN(ifa->next_regular, ifa->next_hello) - current_time(); - cli_msg(-1023, "%-10s %-6s %7u %6u %7t %-15I %I", + cli_msg(-1023, "%-10s %-6s %-12s %7u %6u %7t %-15I %I", ifa->iface->name, (ifa->up ? "Up" : "Down"), + (ifa->cf->ip4_via_ip6 ? "Yes" : "No"), ifa->cf->rxcost, nbrs, MAX(timer, 0), ifa->next_hop_ip4, ifa->next_hop_ip6); } diff --git a/proto/babel/babel.h b/proto/babel/babel.h index e075024c..146d28f5 100644 --- a/proto/babel/babel.h +++ b/proto/babel/babel.h @@ -104,6 +104,7 @@ enum babel_ae_type { BABEL_AE_IP4 = 1, BABEL_AE_IP6 = 2, BABEL_AE_IP6_LL = 3, + BABEL_AE_IPV4_VIA_IPV6 = 4, BABEL_AE_MAX };
@@ -137,6 +138,8 @@ struct babel_iface_config {
ip_addr next_hop_ip4; ip_addr next_hop_ip6; + + u8 ip4_via_ip6; /* Enabe IPv4 via IPv6 */
Typo in comment
};
struct babel_proto { diff --git a/proto/babel/config.Y b/proto/babel/config.Y index 2f3b637b..f17884eb 100644 --- a/proto/babel/config.Y +++ b/proto/babel/config.Y @@ -25,7 +25,7 @@ CF_DECLS CF_KEYWORDS(BABEL, INTERFACE, METRIC, RXCOST, HELLO, UPDATE, INTERVAL, PORT, TYPE, WIRED, WIRELESS, RX, TX, BUFFER, PRIORITY, LENGTH, CHECK, LINK, NEXT, HOP, IPV4, IPV6, BABEL_METRIC, SHOW, INTERFACES, NEIGHBORS, - ENTRIES, RANDOMIZE, ROUTER, ID) + ENTRIES, RANDOMIZE, ROUTER, ID, VIA)
CF_GRAMMAR
@@ -65,6 +65,7 @@ babel_iface_start: BABEL_IFACE->tx_tos = IP_PREC_INTERNET_CONTROL; BABEL_IFACE->tx_priority = sk_priority_control; BABEL_IFACE->check_link = 1; + BABEL_IFACE->ip4_via_ip6 = 1; };
@@ -109,6 +110,7 @@ babel_iface_item: | CHECK LINK bool { BABEL_IFACE->check_link = $3; } | NEXT HOP IPV4 ipa { BABEL_IFACE->next_hop_ip4 = $4; if (!ipa_is_ip4($4)) cf_error("Must be an IPv4 address"); } | NEXT HOP IPV6 ipa { BABEL_IFACE->next_hop_ip6 = $4; if (!ipa_is_ip6($4)) cf_error("Must be an IPv6 address"); } + | IPV4 VIA IPV6 bool { BABEL_IFACE->ip4_via_ip6 = $4; } ;
babel_iface_opts: diff --git a/proto/babel/packets.c b/proto/babel/packets.c index 415ac3f9..d4c66a53 100644 --- a/proto/babel/packets.c +++ b/proto/babel/packets.c @@ -130,9 +130,11 @@ struct babel_parse_state { u64 router_id; /* Router ID used in subsequent updates */ u8 def_ip6_prefix[16]; /* Implicit IPv6 prefix in network order */ u8 def_ip4_prefix[4]; /* Implicit IPv4 prefix in network order */ + u8 def_ip4viaip6_prefix[4]; /* Implicit IPv4 prefix in network order */ u8 router_id_seen; /* router_id field is valid */ u8 def_ip6_prefix_seen; /* def_ip6_prefix is valid */ u8 def_ip4_prefix_seen; /* def_ip4_prefix is valid */ + u8 def_ip4viaip6_prefix_seen; /* def_ip4viaip6_prefix is valid*/ u8 current_tlv_endpos; /* End of self-terminating TLVs (offset from start) */ u8 sadr_enabled; }; @@ -246,6 +248,8 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *msg, struct static int babel_read_route_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); static int babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); static int babel_read_source_prefix(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); +static int babel_read_update_ip4_prefix(struct babel_tlv_update *tlv, struct babel_msg_update *msg, u8 *def_prefix_seen, + u8 (*def_prefix)[4], ip_addr * next_hop, int len);
No need to forward-declare this function; you're only using it after it was defined.
static uint babel_write_ack(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len); static uint babel_write_hello(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len); @@ -398,9 +402,6 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, msg->addr = IPA_NONE; msg->sender = state->saddr;
- if (msg->ae >= BABEL_AE_MAX) - return PARSE_IGNORE; - /* * We only actually read link-local IPs. In every other case, the addr field * will be 0 but validation will succeed. The handler takes care of these @@ -413,13 +414,13 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, if (TLV_OPT_LENGTH(tlv) < 4) return PARSE_ERROR; state->current_tlv_endpos += 4; - break; + return PARSE_SUCCESS;
case BABEL_AE_IP6: if (TLV_OPT_LENGTH(tlv) < 16) return PARSE_ERROR; state->current_tlv_endpos += 16; - break; + return PARSE_SUCCESS;
case BABEL_AE_IP6_LL: if (TLV_OPT_LENGTH(tlv) < 8) @@ -427,10 +428,10 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
msg->addr = ipa_from_ip6(get_ip6_ll(&tlv->addr)); state->current_tlv_endpos += 8; - break; + return PARSE_SUCCESS; }
- return PARSE_SUCCESS; + return PARSE_IGNORE; }
static uint @@ -530,6 +531,47 @@ babel_read_next_hop(struct babel_tlv *hdr, union babel_msg *m UNUSED, return PARSE_IGNORE; }
+ +/* This is called directrly from babel_read_next_hop to handle the ip4 address + compression state */
Typo in comment
+static int +babel_read_update_ip4_prefix(struct babel_tlv_update *tlv,
The babel_read_* functions generally read a particular type of TLV, so maybe change the name to make it obvious this is a helper function. babel_get_ip4_prefix() ?
+ struct babel_msg_update *msg, + u8 *def_prefix_seen, + u8 (*def_prefix)[4], + ip_addr *next_hop, + int len)
I don't think the "pointer with array size" argument type is generally used in the Bird code; is it even supported by all compilers? Also, I don't think it's necessary to pass a pointer to next_hop; it's a fairly small value, so copying it is fine, and if it's not a pointer it's obvious that it's not modified.
+{ + if (tlv->plen > IP4_MAX_PREFIX_LENGTH) + return PARSE_ERROR; + + /* Cannot omit data if there is no saved prefix */ + if (tlv->omitted && !*def_prefix_seen) + return PARSE_ERROR; + + /* Update must have next hop, unless it is retraction */ + if (ipa_zero(*next_hop) && msg->metric != BABEL_INFINITY) + return PARSE_IGNORE; + + /* Merge saved prefix and received prefix parts */ + u8 buf[4] = {}; + memcpy(buf, *def_prefix, tlv->omitted); + memcpy(buf + tlv->omitted, tlv->addr, len); + + ip4_addr prefix4 = get_ip4(buf); + net_fill_ip4(&msg->net, prefix4, tlv->plen); + + if (tlv->flags & BABEL_UF_DEF_PREFIX) + { + put_ip4(*def_prefix, prefix4); + *def_prefix_seen = 1; + } + + msg->next_hop = *next_hop; + + return PARSE_SUCCESS; +} + /* This is called directly from babel_write_update() and returns -1 if a next hop should be written but there is not enough space. */ static int @@ -579,6 +621,7 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *m, struct babel_parse_state *state) { + int rc; struct babel_tlv_update *tlv = (void *) hdr; struct babel_msg_update *msg = &m->update;
@@ -589,7 +632,6 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
/* Length of received prefix data without omitted part */ int len = BYTES(tlv->plen) - (int) tlv->omitted; - u8 buf[16] = {};
if ((len < 0) || ((uint) len > TLV_OPT_LENGTH(tlv))) return PARSE_ERROR; @@ -607,31 +649,20 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, break;
case BABEL_AE_IP4: - if (tlv->plen > IP4_MAX_PREFIX_LENGTH) - return PARSE_ERROR; - - /* Cannot omit data if there is no saved prefix */ - if (tlv->omitted && !state->def_ip4_prefix_seen) - return PARSE_ERROR; - - /* Update must have next hop, unless it is retraction */ - if (ipa_zero(state->next_hop_ip4) && (msg->metric != BABEL_INFINITY)) - return PARSE_IGNORE; - - /* Merge saved prefix and received prefix parts */ - memcpy(buf, state->def_ip4_prefix, tlv->omitted); - memcpy(buf + tlv->omitted, tlv->addr, len); - - ip4_addr prefix4 = get_ip4(buf); - net_fill_ip4(&msg->net, prefix4, tlv->plen); + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4_prefix_seen, + &state->def_ip4_prefix, &state->next_hop_ip4, + len); + if (rc != PARSE_SUCCESS) + return rc;
- if (tlv->flags & BABEL_UF_DEF_PREFIX) - { - put_ip4(state->def_ip4_prefix, prefix4); - state->def_ip4_prefix_seen = 1; - } + break;
- msg->next_hop = state->next_hop_ip4; + case BABEL_AE_IPV4_VIA_IPV6: + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4viaip6_prefix_seen, + &state->def_ip4viaip6_prefix, &state->next_hop_ip6, + len); + if (rc != PARSE_SUCCESS) + return rc;
break;
@@ -644,6 +675,7 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, return PARSE_ERROR;
/* Merge saved prefix and received prefix parts */ + u8 buf[16] = {}; memcpy(buf, state->def_ip6_prefix, tlv->omitted); memcpy(buf + tlv->omitted, tlv->addr, len);
@@ -746,7 +778,11 @@ babel_write_update(struct babel_tlv *hdr, union babel_msg *m, } else if (msg->net.type == NET_IP4) { - tlv->ae = BABEL_AE_IP4; + if (!ipa_zero(msg->next_hop) && ipa_is_ip6(msg->next_hop)) + tlv->ae = BABEL_AE_IPV4_VIA_IPV6; + else + tlv->ae = BABEL_AE_IP4; + tlv->plen = net4_pxlen(&msg->net); put_ip4_px(tlv->addr, &msg->net); } @@ -814,7 +850,15 @@ babel_read_route_request(struct babel_tlv *hdr, union babel_msg *m, msg->full = 1; return PARSE_SUCCESS;
+ /* + * When receiving requests, AEs 1 (IPv4) and 4 (v4-via-v6) MUST be + * treated in the same manner: the receiver processes the request as + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above, + * irrespective of which AE was used in the request. + */
You write 4 in the beginning and TBD later here. Should obviously be changed to the right number when it's assigned.
case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR;
@@ -915,7 +959,15 @@ babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *m, case BABEL_AE_WILDCARD: return PARSE_ERROR;
+ /* + * When receiving requests, AEs 1 (IPv4) and TBD (v4-via-v6) MUST be + * treated in the same manner: the receiver processes the request as + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above, + * irrespective of which AE was used in the request. + */
Here it's TBD everywhere, but see above.
case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR;
-- 2.29.2
On 13:05 15.12.20, Toke Høiland-Jørgensen wrote:
Andreas Rammhold <andreas@rammhold.de> writes:
This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as IPv4 via IPv6 extension to the Babel routing protocol that allows announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it possible for IPv4 traffic to flow through interfaces that have not been assigned an IPv4 address.
I've tried my best to be compliant with the current RFC version. One point that is not entirely clear is how to implement in section 2.2. regarding fallback when the device doesn't support v4-via-v6 routes. (See notes below)
The implementation is compatible with the current Babeld PR (https://github.com/jech/babeld/pull/56). I've verified this with a few VMs in the following setup:
Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld
Each routing daemon was running on their own VM and had L2 connectivity to only its direct neighbors. At the nodes at the edges v4 networks have been announced and full end-to-end communication was possible over the mixed AF network. The v6 only link between Babel and Bird (at the "center" of the above setup) did transport the v4 packets via the v6 link-local next hop addresses just as expected.
Thanks to Toke Høiland-Jørgensen for early review on this work.
-----< notes >------
(My current notes on the current implementation regarding compliance with the current draft)
Bird doesn't implement compression for outgoing (send) IPv4 prefixes and thus this also hasn't been implemented for this AE. Best guess is that 4 bytes are cheap enough to not bother with the added complexity? In the RX path compression is supported for IPv4 prefixes (for the IPv4 AE as well as for IPv4-via-IPv6 AE).
Especially the part last paragraph of 2.2. remains to be discussed/solved:
* If a node cannot install v4-over-v6 routes, eg., due to hardware or software limitations, then routes to an IPv4 prefix with an IPv6 next-hop MUST NOT be selected, as described in Section 3.5.3 of [RFC6126bis].
* What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup?
Can't we just have Bird do the equivalent of:
ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink
on startup, and set a system flag depending on whether the operation fails or not?
That could work. I am not happy to use the documentation prefix for this but I am also not aware of any other subnet that might be better suited for this task. If someone is using the documentation prefix they will probably not like this tho (but that is hardly our fault…). Still very keen on the opinion of the Bird maintainers on this as I've not seen any other code like this in the repository.
Other than this, I think the patch generally looks good (with a few nits below); but we should probably wait until the draft progresses to the point where it actually has an AE assigned in the text before we merge this :)
Agreed. Just felt like pushing this out for comments and to avoid me sitting on the working patch while I might be occupied with other stuff in the future.
Section 5: I am using the proposed value 4 to test interoperability with the Babled PR (https://github.com/jech/babeld/pull/56/commits). --- doc/bird.sgml | 7 +++ proto/babel/babel.c | 26 +++++++--- proto/babel/babel.h | 3 ++ proto/babel/config.Y | 4 +- proto/babel/packets.c | 116 ++++++++++++++++++++++++++++++------------ 5 files changed, 117 insertions(+), 39 deletions(-)
diff --git a/doc/bird.sgml b/doc/bird.sgml index 5408cb2a..ee895a47 100644 --- a/doc/bird.sgml +++ b/doc/bird.sgml @@ -1770,6 +1770,7 @@ protocol babel [<name>] { check link <switch>; next hop ipv4 <address>; next hop ipv6 <address>; + ipv4 via ipv6 <switch>; }; } </code> @@ -1860,6 +1861,12 @@ protocol babel [<name>] { interface. If not set, the same link-local address that is used as the source for Babel packets will be used. In normal operation, it should not be necessary to set this option. + + <tag><label id="babel-ipv4-via-ipv6">ipv4 via ipv6 <m/switch/</tag> + If enabled, Bird will accept and emit IPv4-via-IPv6 routes when IPv4 + addresses are absent from the interface as described in draft-ietf-babel-v4viav6. + Default: yes. + </descrip>
<sect1>Attributes diff --git a/proto/babel/babel.c b/proto/babel/babel.c index 4b6b9d7f..64a88601 100644 --- a/proto/babel/babel.c +++ b/proto/babel/babel.c @@ -954,8 +954,15 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable) msg.update.router_id = e->router_id; net_copy(&msg.update.net, e->n.addr);
- msg.update.next_hop = ((e->n.addr->type == NET_IP4) ? - ifa->next_hop_ip4 : ifa->next_hop_ip6); + if (e->n.addr->type == NET_IP4 && !ipa_zero(ifa->next_hop_ip4)) { + msg.update.next_hop = ifa->next_hop_ip4; + } else if (e->n.addr->type == NET_IP6 && !ipa_zero(ifa->next_hop_ip6)) { + msg.update.next_hop = ifa->next_hop_ip6; + /* If there is no IPv4 address on the interface (and we are supporting v4viav6) + * send the v6 next hop instead */ + } else if (ifa->cf->ip4_via_ip6 && e->n.addr->type == NET_IP4 && ipa_zero(ifa->next_hop_ip4) && !ipa_zero(ifa->next_hop_ip6)) { + msg.update.next_hop = ifa->next_hop_ip6; + }
You're doing a bit of redundant checking here, since there's a check for ipa_zero() just below. How about:
if (e->n.addr->type == NET_IP4) { /* Always prefer v4 nexthop if set */ if(!ipa_zero(ifa->next_hop_ip4)) msg.update.next_hop = ifa->next_hop_ip4;
/* Only send v4-via-v6 if enabled */ else if (ifa->cf->ip4_via_ip6) msg.update.next_hop = ifa->next_hop_ip6; } else { msg.update.next_hop = ifa->next_hop_ip6; }
/* Do not send route if next hop is unknown, e.g. no configured IPv4 address */ if (ipa_zero(msg.update.next_hop)) @@ -1247,6 +1254,12 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa) return; }
+ /* Reject IPv4 via IPv6 routes if disabled */ + if (!ifa->cf->ip4_via_ip6 && msg->net.type == NET_IP4 && ipa_is_ip6(msg->next_hop)) { + DBG("Babel: Ignoring disabled IPv4 via IPv6 route.\n"); + return; + } +
/* Regular update */ e = babel_get_entry(p, &msg->net); r = babel_get_route(p, e, nbr); /* the route entry indexed by neighbour */ @@ -1529,7 +1542,7 @@ babel_iface_update_addr4(struct babel_iface *ifa) ifa->next_hop_ip4 = ipa_nonzero(ifa->cf->next_hop_ip4) ? ifa->cf->next_hop_ip4 : addr4;
if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel) - log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); + log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); // FIXME: make sure we do not log this when it is expected
Just check ifa->cf->ip4_via_ip6 here as well, and make sure it gets set to false if the feature is not supported on the system?
if (ifa->up) babel_iface_kick_timer(ifa); @@ -1894,8 +1907,8 @@ babel_show_interfaces(struct proto *P, const char *iff) }
cli_msg(-1023, "%s:", p->p.name); - cli_msg(-1023, "%-10s %-6s %7s %6s %7s %-15s %s", - "Interface", "State", "RX cost", "Nbrs", "Timer", + cli_msg(-1023, "%-10s %-6s %-12s %7s %6s %7s %-15s %s", + "Interface", "State", "IPv4 via Ip6", "RX cost", "Nbrs", "Timer", "Next hop (v4)", "Next hop (v6)");
WALK_LIST(ifa, p->interfaces) @@ -1908,8 +1921,9 @@ babel_show_interfaces(struct proto *P, const char *iff) nbrs++;
btime timer = MIN(ifa->next_regular, ifa->next_hello) - current_time(); - cli_msg(-1023, "%-10s %-6s %7u %6u %7t %-15I %I", + cli_msg(-1023, "%-10s %-6s %-12s %7u %6u %7t %-15I %I", ifa->iface->name, (ifa->up ? "Up" : "Down"), + (ifa->cf->ip4_via_ip6 ? "Yes" : "No"), ifa->cf->rxcost, nbrs, MAX(timer, 0), ifa->next_hop_ip4, ifa->next_hop_ip6); } diff --git a/proto/babel/babel.h b/proto/babel/babel.h index e075024c..146d28f5 100644 --- a/proto/babel/babel.h +++ b/proto/babel/babel.h @@ -104,6 +104,7 @@ enum babel_ae_type { BABEL_AE_IP4 = 1, BABEL_AE_IP6 = 2, BABEL_AE_IP6_LL = 3, + BABEL_AE_IPV4_VIA_IPV6 = 4, BABEL_AE_MAX };
@@ -137,6 +138,8 @@ struct babel_iface_config {
ip_addr next_hop_ip4; ip_addr next_hop_ip6; + + u8 ip4_via_ip6; /* Enabe IPv4 via IPv6 */
Typo in comment
};
struct babel_proto { diff --git a/proto/babel/config.Y b/proto/babel/config.Y index 2f3b637b..f17884eb 100644 --- a/proto/babel/config.Y +++ b/proto/babel/config.Y @@ -25,7 +25,7 @@ CF_DECLS CF_KEYWORDS(BABEL, INTERFACE, METRIC, RXCOST, HELLO, UPDATE, INTERVAL, PORT, TYPE, WIRED, WIRELESS, RX, TX, BUFFER, PRIORITY, LENGTH, CHECK, LINK, NEXT, HOP, IPV4, IPV6, BABEL_METRIC, SHOW, INTERFACES, NEIGHBORS, - ENTRIES, RANDOMIZE, ROUTER, ID) + ENTRIES, RANDOMIZE, ROUTER, ID, VIA)
CF_GRAMMAR
@@ -65,6 +65,7 @@ babel_iface_start: BABEL_IFACE->tx_tos = IP_PREC_INTERNET_CONTROL; BABEL_IFACE->tx_priority = sk_priority_control; BABEL_IFACE->check_link = 1; + BABEL_IFACE->ip4_via_ip6 = 1; };
@@ -109,6 +110,7 @@ babel_iface_item: | CHECK LINK bool { BABEL_IFACE->check_link = $3; } | NEXT HOP IPV4 ipa { BABEL_IFACE->next_hop_ip4 = $4; if (!ipa_is_ip4($4)) cf_error("Must be an IPv4 address"); } | NEXT HOP IPV6 ipa { BABEL_IFACE->next_hop_ip6 = $4; if (!ipa_is_ip6($4)) cf_error("Must be an IPv6 address"); } + | IPV4 VIA IPV6 bool { BABEL_IFACE->ip4_via_ip6 = $4; } ;
babel_iface_opts: diff --git a/proto/babel/packets.c b/proto/babel/packets.c index 415ac3f9..d4c66a53 100644 --- a/proto/babel/packets.c +++ b/proto/babel/packets.c @@ -130,9 +130,11 @@ struct babel_parse_state { u64 router_id; /* Router ID used in subsequent updates */ u8 def_ip6_prefix[16]; /* Implicit IPv6 prefix in network order */ u8 def_ip4_prefix[4]; /* Implicit IPv4 prefix in network order */ + u8 def_ip4viaip6_prefix[4]; /* Implicit IPv4 prefix in network order */ u8 router_id_seen; /* router_id field is valid */ u8 def_ip6_prefix_seen; /* def_ip6_prefix is valid */ u8 def_ip4_prefix_seen; /* def_ip4_prefix is valid */ + u8 def_ip4viaip6_prefix_seen; /* def_ip4viaip6_prefix is valid*/ u8 current_tlv_endpos; /* End of self-terminating TLVs (offset from start) */ u8 sadr_enabled; }; @@ -246,6 +248,8 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *msg, struct static int babel_read_route_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); static int babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); static int babel_read_source_prefix(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state); +static int babel_read_update_ip4_prefix(struct babel_tlv_update *tlv, struct babel_msg_update *msg, u8 *def_prefix_seen, + u8 (*def_prefix)[4], ip_addr * next_hop, int len);
No need to forward-declare this function; you're only using it after it was defined.
static uint babel_write_ack(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len); static uint babel_write_hello(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len); @@ -398,9 +402,6 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, msg->addr = IPA_NONE; msg->sender = state->saddr;
- if (msg->ae >= BABEL_AE_MAX) - return PARSE_IGNORE; - /* * We only actually read link-local IPs. In every other case, the addr field * will be 0 but validation will succeed. The handler takes care of these @@ -413,13 +414,13 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m, if (TLV_OPT_LENGTH(tlv) < 4) return PARSE_ERROR; state->current_tlv_endpos += 4; - break; + return PARSE_SUCCESS;
case BABEL_AE_IP6: if (TLV_OPT_LENGTH(tlv) < 16) return PARSE_ERROR; state->current_tlv_endpos += 16; - break; + return PARSE_SUCCESS;
case BABEL_AE_IP6_LL: if (TLV_OPT_LENGTH(tlv) < 8) @@ -427,10 +428,10 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
msg->addr = ipa_from_ip6(get_ip6_ll(&tlv->addr)); state->current_tlv_endpos += 8; - break; + return PARSE_SUCCESS; }
- return PARSE_SUCCESS; + return PARSE_IGNORE; }
static uint @@ -530,6 +531,47 @@ babel_read_next_hop(struct babel_tlv *hdr, union babel_msg *m UNUSED, return PARSE_IGNORE; }
+ +/* This is called directrly from babel_read_next_hop to handle the ip4 address + compression state */
Typo in comment
+static int +babel_read_update_ip4_prefix(struct babel_tlv_update *tlv,
The babel_read_* functions generally read a particular type of TLV, so maybe change the name to make it obvious this is a helper function. babel_get_ip4_prefix() ?
+ struct babel_msg_update *msg, + u8 *def_prefix_seen, + u8 (*def_prefix)[4], + ip_addr *next_hop, + int len)
I don't think the "pointer with array size" argument type is generally used in the Bird code; is it even supported by all compilers?
I had the understanding that most compilers would not barf on this one but only a few would use the extra information for (very simple) static analysis but I could be wrong. Would you rather see a pointer to just u8 (the first element) or just the size of the array being removed? Either is fine with me just want to avoid adding a cast there. :-)
Also, I don't think it's necessary to pass a pointer to next_hop; it's a fairly small value, so copying it is fine, and if it's not a pointer it's obvious that it's not modified.
Good point.
+{ + if (tlv->plen > IP4_MAX_PREFIX_LENGTH) + return PARSE_ERROR; + + /* Cannot omit data if there is no saved prefix */ + if (tlv->omitted && !*def_prefix_seen) + return PARSE_ERROR; + + /* Update must have next hop, unless it is retraction */ + if (ipa_zero(*next_hop) && msg->metric != BABEL_INFINITY) + return PARSE_IGNORE; + + /* Merge saved prefix and received prefix parts */ + u8 buf[4] = {}; + memcpy(buf, *def_prefix, tlv->omitted); + memcpy(buf + tlv->omitted, tlv->addr, len); + + ip4_addr prefix4 = get_ip4(buf); + net_fill_ip4(&msg->net, prefix4, tlv->plen); + + if (tlv->flags & BABEL_UF_DEF_PREFIX) + { + put_ip4(*def_prefix, prefix4); + *def_prefix_seen = 1; + } + + msg->next_hop = *next_hop; + + return PARSE_SUCCESS; +} + /* This is called directly from babel_write_update() and returns -1 if a next hop should be written but there is not enough space. */ static int @@ -579,6 +621,7 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *m, struct babel_parse_state *state) { + int rc; struct babel_tlv_update *tlv = (void *) hdr; struct babel_msg_update *msg = &m->update;
@@ -589,7 +632,6 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
/* Length of received prefix data without omitted part */ int len = BYTES(tlv->plen) - (int) tlv->omitted; - u8 buf[16] = {};
if ((len < 0) || ((uint) len > TLV_OPT_LENGTH(tlv))) return PARSE_ERROR; @@ -607,31 +649,20 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, break;
case BABEL_AE_IP4: - if (tlv->plen > IP4_MAX_PREFIX_LENGTH) - return PARSE_ERROR; - - /* Cannot omit data if there is no saved prefix */ - if (tlv->omitted && !state->def_ip4_prefix_seen) - return PARSE_ERROR; - - /* Update must have next hop, unless it is retraction */ - if (ipa_zero(state->next_hop_ip4) && (msg->metric != BABEL_INFINITY)) - return PARSE_IGNORE; - - /* Merge saved prefix and received prefix parts */ - memcpy(buf, state->def_ip4_prefix, tlv->omitted); - memcpy(buf + tlv->omitted, tlv->addr, len); - - ip4_addr prefix4 = get_ip4(buf); - net_fill_ip4(&msg->net, prefix4, tlv->plen); + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4_prefix_seen, + &state->def_ip4_prefix, &state->next_hop_ip4, + len); + if (rc != PARSE_SUCCESS) + return rc;
- if (tlv->flags & BABEL_UF_DEF_PREFIX) - { - put_ip4(state->def_ip4_prefix, prefix4); - state->def_ip4_prefix_seen = 1; - } + break;
- msg->next_hop = state->next_hop_ip4; + case BABEL_AE_IPV4_VIA_IPV6: + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4viaip6_prefix_seen, + &state->def_ip4viaip6_prefix, &state->next_hop_ip6, + len); + if (rc != PARSE_SUCCESS) + return rc;
break;
@@ -644,6 +675,7 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m, return PARSE_ERROR;
/* Merge saved prefix and received prefix parts */ + u8 buf[16] = {}; memcpy(buf, state->def_ip6_prefix, tlv->omitted); memcpy(buf + tlv->omitted, tlv->addr, len);
@@ -746,7 +778,11 @@ babel_write_update(struct babel_tlv *hdr, union babel_msg *m, } else if (msg->net.type == NET_IP4) { - tlv->ae = BABEL_AE_IP4; + if (!ipa_zero(msg->next_hop) && ipa_is_ip6(msg->next_hop)) + tlv->ae = BABEL_AE_IPV4_VIA_IPV6; + else + tlv->ae = BABEL_AE_IP4; + tlv->plen = net4_pxlen(&msg->net); put_ip4_px(tlv->addr, &msg->net); } @@ -814,7 +850,15 @@ babel_read_route_request(struct babel_tlv *hdr, union babel_msg *m, msg->full = 1; return PARSE_SUCCESS;
+ /* + * When receiving requests, AEs 1 (IPv4) and 4 (v4-via-v6) MUST be + * treated in the same manner: the receiver processes the request as + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above, + * irrespective of which AE was used in the request. + */
You write 4 in the beginning and TBD later here. Should obviously be changed to the right number when it's assigned.
Yeah. I did intend to update that when the actual draft moves forward. The text I copied from the draft itself might also change so those might also need updating besides the TBD value. I've changed all of the TBDs to 4 now. Hopefully that will be the final value.
case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR;
@@ -915,7 +959,15 @@ babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *m, case BABEL_AE_WILDCARD: return PARSE_ERROR;
+ /* + * When receiving requests, AEs 1 (IPv4) and TBD (v4-via-v6) MUST be + * treated in the same manner: the receiver processes the request as + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above, + * irrespective of which AE was used in the request. + */
Here it's TBD everywhere, but see above.
case BABEL_AE_IP4: + case BABEL_AE_IPV4_VIA_IPV6: if (tlv->plen > IP4_MAX_PREFIX_LENGTH) return PARSE_ERROR;
-- 2.29.2
I'll wait for more feedback before publishing a v2. My current development branch is at https://git.sr.ht/~andir/bird/log/babel-v4-via-v6 (including a commit for the VM based test that I've been running).
Andreas Rammhold <andreas@rammhold.de> writes:
On 13:05 15.12.20, Toke Høiland-Jørgensen wrote:
Andreas Rammhold <andreas@rammhold.de> writes:
This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as IPv4 via IPv6 extension to the Babel routing protocol that allows announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it possible for IPv4 traffic to flow through interfaces that have not been assigned an IPv4 address.
I've tried my best to be compliant with the current RFC version. One point that is not entirely clear is how to implement in section 2.2. regarding fallback when the device doesn't support v4-via-v6 routes. (See notes below)
The implementation is compatible with the current Babeld PR (https://github.com/jech/babeld/pull/56). I've verified this with a few VMs in the following setup:
Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld
Each routing daemon was running on their own VM and had L2 connectivity to only its direct neighbors. At the nodes at the edges v4 networks have been announced and full end-to-end communication was possible over the mixed AF network. The v6 only link between Babel and Bird (at the "center" of the above setup) did transport the v4 packets via the v6 link-local next hop addresses just as expected.
Thanks to Toke Høiland-Jørgensen for early review on this work.
-----< notes >------
(My current notes on the current implementation regarding compliance with the current draft)
Bird doesn't implement compression for outgoing (send) IPv4 prefixes and thus this also hasn't been implemented for this AE. Best guess is that 4 bytes are cheap enough to not bother with the added complexity? In the RX path compression is supported for IPv4 prefixes (for the IPv4 AE as well as for IPv4-via-IPv6 AE).
Especially the part last paragraph of 2.2. remains to be discussed/solved:
* If a node cannot install v4-over-v6 routes, eg., due to hardware or software limitations, then routes to an IPv4 prefix with an IPv6 next-hop MUST NOT be selected, as described in Section 3.5.3 of [RFC6126bis].
* What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup?
Can't we just have Bird do the equivalent of:
ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink
on startup, and set a system flag depending on whether the operation fails or not?
That could work. I am not happy to use the documentation prefix for this but I am also not aware of any other subnet that might be better suited for this task. If someone is using the documentation prefix they will probably not like this tho (but that is hardly our fault…).
Well, just used it as an example of something that was not likely to be in use; could be anything, really. Maybe you can even use 127.0.0.1/32 on lo without any ill effects? -Toke
On 17:51 15.12.20, Toke Høiland-Jørgensen wrote:
Andreas Rammhold <andreas@rammhold.de> writes:
On 13:05 15.12.20, Toke Høiland-Jørgensen wrote:
Andreas Rammhold <andreas@rammhold.de> writes:
This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as IPv4 via IPv6 extension to the Babel routing protocol that allows announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it possible for IPv4 traffic to flow through interfaces that have not been assigned an IPv4 address.
I've tried my best to be compliant with the current RFC version. One point that is not entirely clear is how to implement in section 2.2. regarding fallback when the device doesn't support v4-via-v6 routes. (See notes below)
The implementation is compatible with the current Babeld PR (https://github.com/jech/babeld/pull/56). I've verified this with a few VMs in the following setup:
Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld
Each routing daemon was running on their own VM and had L2 connectivity to only its direct neighbors. At the nodes at the edges v4 networks have been announced and full end-to-end communication was possible over the mixed AF network. The v6 only link between Babel and Bird (at the "center" of the above setup) did transport the v4 packets via the v6 link-local next hop addresses just as expected.
Thanks to Toke Høiland-Jørgensen for early review on this work.
-----< notes >------
(My current notes on the current implementation regarding compliance with the current draft)
Bird doesn't implement compression for outgoing (send) IPv4 prefixes and thus this also hasn't been implemented for this AE. Best guess is that 4 bytes are cheap enough to not bother with the added complexity? In the RX path compression is supported for IPv4 prefixes (for the IPv4 AE as well as for IPv4-via-IPv6 AE).
Especially the part last paragraph of 2.2. remains to be discussed/solved:
* If a node cannot install v4-over-v6 routes, eg., due to hardware or software limitations, then routes to an IPv4 prefix with an IPv6 next-hop MUST NOT be selected, as described in Section 3.5.3 of [RFC6126bis].
* What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup?
Can't we just have Bird do the equivalent of:
ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink
on startup, and set a system flag depending on whether the operation fails or not?
That could work. I am not happy to use the documentation prefix for this but I am also not aware of any other subnet that might be better suited for this task. If someone is using the documentation prefix they will probably not like this tho (but that is hardly our fault…).
Well, just used it as an example of something that was not likely to be in use; could be anything, really. Maybe you can even use 127.0.0.1/32 on lo without any ill effects?
That sounds good and doesn't appear to have any noticeable side effects while the route is installed. I'll have to see where in Bird such a check could be added.
On Tue, Dec 15, 2020 at 01:05:22PM +0100, Toke Høiland-Jørgensen wrote:
Andreas Rammhold <andreas@rammhold.de> writes:
* What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup?
Can't we just have Bird do the equivalent of:
ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink
on startup, and set a system flag depending on whether the operation fails or not?
Well, i would prefer to just add Babel protocol option to accept these extended next hops, make it disabled by default, and left it to the user. It is similar to ECMP, which may not be supported by kernel (if its support is disabled). We also had it disabled by default in BIRD 1.x, but changed the default in BIRD 2, as it is now widely supported. We could perhaps implement later some generic mechanism later to detect global system properties during start, so they can be accessed during configuration processing to have different defaults based on system abilities. Note that even if Kernel supports this feature, there is no surety that such route is used for forwarding. For example, Kernel protocol may not be even configured, or export filter on it may filter out routes with such next hops. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> writes:
On Tue, Dec 15, 2020 at 01:05:22PM +0100, Toke Høiland-Jørgensen wrote:
Andreas Rammhold <andreas@rammhold.de> writes:
* What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup?
Can't we just have Bird do the equivalent of:
ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink
on startup, and set a system flag depending on whether the operation fails or not?
Well, i would prefer to just add Babel protocol option to accept these extended next hops, make it disabled by default, and left it to the user.
It is similar to ECMP, which may not be supported by kernel (if its support is disabled). We also had it disabled by default in BIRD 1.x, but changed the default in BIRD 2, as it is now widely supported.
Heh, well that's a nice way to sidestep the whole problem, I guess ;)
We could perhaps implement later some generic mechanism later to detect global system properties during start, so they can be accessed during configuration processing to have different defaults based on system abilities.
Yeah, I think I was envisioning something like this. The v4-via-v6 thing is nice in that it Just Works without any other system config as long as the kernel supports it. So it would be neat to just turn it on by default when supported and save the user some configuring.
Note that even if Kernel supports this feature, there is no surety that such route is used for forwarding. For example, Kernel protocol may not be even configured, or export filter on it may filter out routes with such next hops.
Sure, users can always configure things in a way that doesn't work. But at least they have to actively do something to achieve this, it won't just be broken by default... -Toke
On 18:14 15.12.20, Ondrej Zajicek wrote:
On Tue, Dec 15, 2020 at 01:05:22PM +0100, Toke Høiland-Jørgensen wrote:
Andreas Rammhold <andreas@rammhold.de> writes:
* What if the kernel doesn't accept the RTA_VIA value we gave it? Does BIRD generally handle this already? One example is hitting: "ipv4: use IS_ENABLED instead of ifdef" (id:20201115224509.2020651-1-flokli@flokli.de @ linux-netdev) Where the kernel does support it but due to a bug fails to add those routes when CONFIG_IPV6=m. Kernel version comparison is probably a bad idea. Which kind of route can we attempt to install to test this functionality without breaking any setup?
Can't we just have Bird do the equivalent of:
ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink
on startup, and set a system flag depending on whether the operation fails or not?
Well, i would prefer to just add Babel protocol option to accept these extended next hops, make it disabled by default, and left it to the user.
It is similar to ECMP, which may not be supported by kernel (if its support is disabled). We also had it disabled by default in BIRD 1.x, but changed the default in BIRD 2, as it is now widely supported.
In the ECMP case was there some breakage when the system didn't support it but Bird tried to install those routes?
We could perhaps implement later some generic mechanism later to detect global system properties during start, so they can be accessed during configuration processing to have different defaults based on system abilities.
Note that even if Kernel supports this feature, there is no surety that such route is used for forwarding. For example, Kernel protocol may not be even configured, or export filter on it may filter out routes with such next hops.
Arguably Bird2 always supports it internally and the user can always (re)export them via another protocol (BGP) without having to install them locally. In that case the kernel version doesn't actually matter and enabling this by default would probably not break any setup. Even if the kernel supports v4-via-v6 there is no new breakage when this feature is enabled as a "surprise" as v4-via-v4 is still preferred. v4-via-v6 only "breaks" your setup if you relied on the absence of v4 addresses on some interfaces to stop propagation of some route/path (e.g. a script adding/removing the address on the interface).
participants (4)
-
Andreas Rammhold -
andreas@rammhold.de -
Ondrej Zajicek -
Toke Høiland-Jørgensen