[RFC] Babel: add v4viav6 support
Toke Høiland-Jørgensen
toke at toke.dk
Tue Dec 15 13:05:22 CET 2020
Andreas Rammhold <andreas at rammhold.de> writes:
> This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as
> IPv4 via IPv6 extension to the Babel routing protocol that allows
> announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it
> possible for IPv4 traffic to flow through interfaces that have not been
> assigned an IPv4 address.
>
> I've tried my best to be compliant with the current RFC version. One
> point that is not entirely clear is how to implement in section 2.2.
> regarding fallback when the device doesn't support v4-via-v6 routes.
> (See notes below)
>
> The implementation is compatible with the current Babeld PR
> (https://github.com/jech/babeld/pull/56). I've verified this with a few
> VMs in the following setup:
>
> Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld
>
> Each routing daemon was running on their own VM and had L2 connectivity
> to only its direct neighbors. At the nodes at the edges v4 networks have
> been announced and full end-to-end communication was possible over the
> mixed AF network. The v6 only link between Babel and Bird (at the
> "center" of the above setup) did transport the v4 packets via the v6
> link-local next hop addresses just as expected.
>
> Thanks to Toke Høiland-Jørgensen for early review on this work.
>
> -----< notes >------
>
> (My current notes on the current implementation regarding
> compliance with the current draft)
>
> Bird doesn't implement compression for outgoing (send) IPv4 prefixes and
> thus this also hasn't been implemented for this AE. Best guess is that 4
> bytes are cheap enough to not bother with the added complexity? In the
> RX path compression is supported for IPv4 prefixes (for the IPv4 AE
> as well as for IPv4-via-IPv6 AE).
>
> Especially the part last paragraph of 2.2. remains to be discussed/solved:
>
> * If a node cannot install v4-over-v6 routes, eg., due to hardware or
> software limitations, then routes to an IPv4 prefix with an IPv6
> next-hop MUST NOT be selected, as described in Section 3.5.3 of
> [RFC6126bis].
>
> * What if the kernel doesn't accept the RTA_VIA value we gave it?
> Does BIRD generally handle this already?
> One example is hitting: "ipv4: use IS_ENABLED instead of ifdef"
> (id:20201115224509.2020651-1-flokli at flokli.de @ linux-netdev) Where
> the kernel does support it but due to a bug fails to add those routes
> when CONFIG_IPV6=m.
> Kernel version comparison is probably a bad idea. Which kind of route
> can we attempt to install to test this functionality without breaking
> any setup?
Can't we just have Bird do the equivalent of:
ip r add 192.0.2.1/32 via inet6 ::2 dev lo onlink
ip r del 192.0.2.1/32 via inet6 ::2 dev lo onlink
on startup, and set a system flag depending on whether the operation
fails or not?
Other than this, I think the patch generally looks good (with a few nits
below); but we should probably wait until the draft progresses to the
point where it actually has an AE assigned in the text before we merge
this :)
> Section 5: I am using the proposed value 4 to test interoperability with
> the Babled PR (https://github.com/jech/babeld/pull/56/commits).
> ---
> doc/bird.sgml | 7 +++
> proto/babel/babel.c | 26 +++++++---
> proto/babel/babel.h | 3 ++
> proto/babel/config.Y | 4 +-
> proto/babel/packets.c | 116 ++++++++++++++++++++++++++++++------------
> 5 files changed, 117 insertions(+), 39 deletions(-)
>
> diff --git a/doc/bird.sgml b/doc/bird.sgml
> index 5408cb2a..ee895a47 100644
> --- a/doc/bird.sgml
> +++ b/doc/bird.sgml
> @@ -1770,6 +1770,7 @@ protocol babel [<name>] {
> check link <switch>;
> next hop ipv4 <address>;
> next hop ipv6 <address>;
> + ipv4 via ipv6 <switch>;
> };
> }
> </code>
> @@ -1860,6 +1861,12 @@ protocol babel [<name>] {
> interface. If not set, the same link-local address that is used as the
> source for Babel packets will be used. In normal operation, it should not
> be necessary to set this option.
> +
> + <tag><label id="babel-ipv4-via-ipv6">ipv4 via ipv6 <m/switch/</tag>
> + If enabled, Bird will accept and emit IPv4-via-IPv6 routes when IPv4
> + addresses are absent from the interface as described in draft-ietf-babel-v4viav6.
> + Default: yes.
> +
> </descrip>
>
> <sect1>Attributes
> diff --git a/proto/babel/babel.c b/proto/babel/babel.c
> index 4b6b9d7f..64a88601 100644
> --- a/proto/babel/babel.c
> +++ b/proto/babel/babel.c
> @@ -954,8 +954,15 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable)
> msg.update.router_id = e->router_id;
> net_copy(&msg.update.net, e->n.addr);
>
> - msg.update.next_hop = ((e->n.addr->type == NET_IP4) ?
> - ifa->next_hop_ip4 : ifa->next_hop_ip6);
> + if (e->n.addr->type == NET_IP4 && !ipa_zero(ifa->next_hop_ip4)) {
> + msg.update.next_hop = ifa->next_hop_ip4;
> + } else if (e->n.addr->type == NET_IP6 && !ipa_zero(ifa->next_hop_ip6)) {
> + msg.update.next_hop = ifa->next_hop_ip6;
> + /* If there is no IPv4 address on the interface (and we are supporting v4viav6)
> + * send the v6 next hop instead */
> + } else if (ifa->cf->ip4_via_ip6 && e->n.addr->type == NET_IP4 && ipa_zero(ifa->next_hop_ip4) && !ipa_zero(ifa->next_hop_ip6)) {
> + msg.update.next_hop = ifa->next_hop_ip6;
> + }
You're doing a bit of redundant checking here, since there's a check for
ipa_zero() just below. How about:
if (e->n.addr->type == NET_IP4) {
/* Always prefer v4 nexthop if set */
if(!ipa_zero(ifa->next_hop_ip4))
msg.update.next_hop = ifa->next_hop_ip4;
/* Only send v4-via-v6 if enabled */
else if (ifa->cf->ip4_via_ip6)
msg.update.next_hop = ifa->next_hop_ip6;
} else {
msg.update.next_hop = ifa->next_hop_ip6;
}
>
> /* Do not send route if next hop is unknown, e.g. no configured IPv4 address */
> if (ipa_zero(msg.update.next_hop))
> @@ -1247,6 +1254,12 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa)
> return;
> }
>
> + /* Reject IPv4 via IPv6 routes if disabled */
> + if (!ifa->cf->ip4_via_ip6 && msg->net.type == NET_IP4 && ipa_is_ip6(msg->next_hop)) {
> + DBG("Babel: Ignoring disabled IPv4 via IPv6 route.\n");
> + return;
> + }
> +
> /* Regular update */
> e = babel_get_entry(p, &msg->net);
> r = babel_get_route(p, e, nbr); /* the route entry indexed by neighbour */
> @@ -1529,7 +1542,7 @@ babel_iface_update_addr4(struct babel_iface *ifa)
> ifa->next_hop_ip4 = ipa_nonzero(ifa->cf->next_hop_ip4) ? ifa->cf->next_hop_ip4 : addr4;
>
> if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel)
> - log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname);
> + log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); // FIXME: make sure we do not log this when it is expected
Just check ifa->cf->ip4_via_ip6 here as well, and make sure it gets set
to false if the feature is not supported on the system?
> if (ifa->up)
> babel_iface_kick_timer(ifa);
> @@ -1894,8 +1907,8 @@ babel_show_interfaces(struct proto *P, const char *iff)
> }
>
> cli_msg(-1023, "%s:", p->p.name);
> - cli_msg(-1023, "%-10s %-6s %7s %6s %7s %-15s %s",
> - "Interface", "State", "RX cost", "Nbrs", "Timer",
> + cli_msg(-1023, "%-10s %-6s %-12s %7s %6s %7s %-15s %s",
> + "Interface", "State", "IPv4 via Ip6", "RX cost", "Nbrs", "Timer",
> "Next hop (v4)", "Next hop (v6)");
>
> WALK_LIST(ifa, p->interfaces)
> @@ -1908,8 +1921,9 @@ babel_show_interfaces(struct proto *P, const char *iff)
> nbrs++;
>
> btime timer = MIN(ifa->next_regular, ifa->next_hello) - current_time();
> - cli_msg(-1023, "%-10s %-6s %7u %6u %7t %-15I %I",
> + cli_msg(-1023, "%-10s %-6s %-12s %7u %6u %7t %-15I %I",
> ifa->iface->name, (ifa->up ? "Up" : "Down"),
> + (ifa->cf->ip4_via_ip6 ? "Yes" : "No"),
> ifa->cf->rxcost, nbrs, MAX(timer, 0),
> ifa->next_hop_ip4, ifa->next_hop_ip6);
> }
> diff --git a/proto/babel/babel.h b/proto/babel/babel.h
> index e075024c..146d28f5 100644
> --- a/proto/babel/babel.h
> +++ b/proto/babel/babel.h
> @@ -104,6 +104,7 @@ enum babel_ae_type {
> BABEL_AE_IP4 = 1,
> BABEL_AE_IP6 = 2,
> BABEL_AE_IP6_LL = 3,
> + BABEL_AE_IPV4_VIA_IPV6 = 4,
> BABEL_AE_MAX
> };
>
> @@ -137,6 +138,8 @@ struct babel_iface_config {
>
> ip_addr next_hop_ip4;
> ip_addr next_hop_ip6;
> +
> + u8 ip4_via_ip6; /* Enabe IPv4 via IPv6 */
Typo in comment
> };
>
> struct babel_proto {
> diff --git a/proto/babel/config.Y b/proto/babel/config.Y
> index 2f3b637b..f17884eb 100644
> --- a/proto/babel/config.Y
> +++ b/proto/babel/config.Y
> @@ -25,7 +25,7 @@ CF_DECLS
> CF_KEYWORDS(BABEL, INTERFACE, METRIC, RXCOST, HELLO, UPDATE, INTERVAL, PORT,
> TYPE, WIRED, WIRELESS, RX, TX, BUFFER, PRIORITY, LENGTH, CHECK, LINK,
> NEXT, HOP, IPV4, IPV6, BABEL_METRIC, SHOW, INTERFACES, NEIGHBORS,
> - ENTRIES, RANDOMIZE, ROUTER, ID)
> + ENTRIES, RANDOMIZE, ROUTER, ID, VIA)
>
> CF_GRAMMAR
>
> @@ -65,6 +65,7 @@ babel_iface_start:
> BABEL_IFACE->tx_tos = IP_PREC_INTERNET_CONTROL;
> BABEL_IFACE->tx_priority = sk_priority_control;
> BABEL_IFACE->check_link = 1;
> + BABEL_IFACE->ip4_via_ip6 = 1;
> };
>
>
> @@ -109,6 +110,7 @@ babel_iface_item:
> | CHECK LINK bool { BABEL_IFACE->check_link = $3; }
> | NEXT HOP IPV4 ipa { BABEL_IFACE->next_hop_ip4 = $4; if (!ipa_is_ip4($4)) cf_error("Must be an IPv4 address"); }
> | NEXT HOP IPV6 ipa { BABEL_IFACE->next_hop_ip6 = $4; if (!ipa_is_ip6($4)) cf_error("Must be an IPv6 address"); }
> + | IPV4 VIA IPV6 bool { BABEL_IFACE->ip4_via_ip6 = $4; }
> ;
>
> babel_iface_opts:
> diff --git a/proto/babel/packets.c b/proto/babel/packets.c
> index 415ac3f9..d4c66a53 100644
> --- a/proto/babel/packets.c
> +++ b/proto/babel/packets.c
> @@ -130,9 +130,11 @@ struct babel_parse_state {
> u64 router_id; /* Router ID used in subsequent updates */
> u8 def_ip6_prefix[16]; /* Implicit IPv6 prefix in network order */
> u8 def_ip4_prefix[4]; /* Implicit IPv4 prefix in network order */
> + u8 def_ip4viaip6_prefix[4]; /* Implicit IPv4 prefix in network order */
> u8 router_id_seen; /* router_id field is valid */
> u8 def_ip6_prefix_seen; /* def_ip6_prefix is valid */
> u8 def_ip4_prefix_seen; /* def_ip4_prefix is valid */
> + u8 def_ip4viaip6_prefix_seen; /* def_ip4viaip6_prefix is valid*/
> u8 current_tlv_endpos; /* End of self-terminating TLVs (offset from start) */
> u8 sadr_enabled;
> };
> @@ -246,6 +248,8 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *msg, struct
> static int babel_read_route_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
> static int babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
> static int babel_read_source_prefix(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
> +static int babel_read_update_ip4_prefix(struct babel_tlv_update *tlv, struct babel_msg_update *msg, u8 *def_prefix_seen,
> + u8 (*def_prefix)[4], ip_addr * next_hop, int len);
No need to forward-declare this function; you're only using it after it
was defined.
> static uint babel_write_ack(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len);
> static uint babel_write_hello(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len);
> @@ -398,9 +402,6 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
> msg->addr = IPA_NONE;
> msg->sender = state->saddr;
>
> - if (msg->ae >= BABEL_AE_MAX)
> - return PARSE_IGNORE;
> -
> /*
> * We only actually read link-local IPs. In every other case, the addr field
> * will be 0 but validation will succeed. The handler takes care of these
> @@ -413,13 +414,13 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
> if (TLV_OPT_LENGTH(tlv) < 4)
> return PARSE_ERROR;
> state->current_tlv_endpos += 4;
> - break;
> + return PARSE_SUCCESS;
>
> case BABEL_AE_IP6:
> if (TLV_OPT_LENGTH(tlv) < 16)
> return PARSE_ERROR;
> state->current_tlv_endpos += 16;
> - break;
> + return PARSE_SUCCESS;
>
> case BABEL_AE_IP6_LL:
> if (TLV_OPT_LENGTH(tlv) < 8)
> @@ -427,10 +428,10 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
>
> msg->addr = ipa_from_ip6(get_ip6_ll(&tlv->addr));
> state->current_tlv_endpos += 8;
> - break;
> + return PARSE_SUCCESS;
> }
>
> - return PARSE_SUCCESS;
> + return PARSE_IGNORE;
> }
>
> static uint
> @@ -530,6 +531,47 @@ babel_read_next_hop(struct babel_tlv *hdr, union babel_msg *m UNUSED,
> return PARSE_IGNORE;
> }
>
> +
> +/* This is called directrly from babel_read_next_hop to handle the ip4 address
> + compression state */
Typo in comment
> +static int
> +babel_read_update_ip4_prefix(struct babel_tlv_update *tlv,
The babel_read_* functions generally read a particular type of TLV, so
maybe change the name to make it obvious this is a helper function.
babel_get_ip4_prefix() ?
> + struct babel_msg_update *msg,
> + u8 *def_prefix_seen,
> + u8 (*def_prefix)[4],
> + ip_addr *next_hop,
> + int len)
I don't think the "pointer with array size" argument type is generally
used in the Bird code; is it even supported by all compilers?
Also, I don't think it's necessary to pass a pointer to next_hop; it's a
fairly small value, so copying it is fine, and if it's not a pointer
it's obvious that it's not modified.
> +{
> + if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
> + return PARSE_ERROR;
> +
> + /* Cannot omit data if there is no saved prefix */
> + if (tlv->omitted && !*def_prefix_seen)
> + return PARSE_ERROR;
> +
> + /* Update must have next hop, unless it is retraction */
> + if (ipa_zero(*next_hop) && msg->metric != BABEL_INFINITY)
> + return PARSE_IGNORE;
> +
> + /* Merge saved prefix and received prefix parts */
> + u8 buf[4] = {};
> + memcpy(buf, *def_prefix, tlv->omitted);
> + memcpy(buf + tlv->omitted, tlv->addr, len);
> +
> + ip4_addr prefix4 = get_ip4(buf);
> + net_fill_ip4(&msg->net, prefix4, tlv->plen);
> +
> + if (tlv->flags & BABEL_UF_DEF_PREFIX)
> + {
> + put_ip4(*def_prefix, prefix4);
> + *def_prefix_seen = 1;
> + }
> +
> + msg->next_hop = *next_hop;
> +
> + return PARSE_SUCCESS;
> +}
> +
> /* This is called directly from babel_write_update() and returns -1 if a next
> hop should be written but there is not enough space. */
> static int
> @@ -579,6 +621,7 @@ static int
> babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
> struct babel_parse_state *state)
> {
> + int rc;
> struct babel_tlv_update *tlv = (void *) hdr;
> struct babel_msg_update *msg = &m->update;
>
> @@ -589,7 +632,6 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
>
> /* Length of received prefix data without omitted part */
> int len = BYTES(tlv->plen) - (int) tlv->omitted;
> - u8 buf[16] = {};
>
> if ((len < 0) || ((uint) len > TLV_OPT_LENGTH(tlv)))
> return PARSE_ERROR;
> @@ -607,31 +649,20 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
> break;
>
> case BABEL_AE_IP4:
> - if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
> - return PARSE_ERROR;
> -
> - /* Cannot omit data if there is no saved prefix */
> - if (tlv->omitted && !state->def_ip4_prefix_seen)
> - return PARSE_ERROR;
> -
> - /* Update must have next hop, unless it is retraction */
> - if (ipa_zero(state->next_hop_ip4) && (msg->metric != BABEL_INFINITY))
> - return PARSE_IGNORE;
> -
> - /* Merge saved prefix and received prefix parts */
> - memcpy(buf, state->def_ip4_prefix, tlv->omitted);
> - memcpy(buf + tlv->omitted, tlv->addr, len);
> -
> - ip4_addr prefix4 = get_ip4(buf);
> - net_fill_ip4(&msg->net, prefix4, tlv->plen);
> + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4_prefix_seen,
> + &state->def_ip4_prefix, &state->next_hop_ip4,
> + len);
> + if (rc != PARSE_SUCCESS)
> + return rc;
>
> - if (tlv->flags & BABEL_UF_DEF_PREFIX)
> - {
> - put_ip4(state->def_ip4_prefix, prefix4);
> - state->def_ip4_prefix_seen = 1;
> - }
> + break;
>
> - msg->next_hop = state->next_hop_ip4;
> + case BABEL_AE_IPV4_VIA_IPV6:
> + rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4viaip6_prefix_seen,
> + &state->def_ip4viaip6_prefix, &state->next_hop_ip6,
> + len);
> + if (rc != PARSE_SUCCESS)
> + return rc;
>
> break;
>
> @@ -644,6 +675,7 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
> return PARSE_ERROR;
>
> /* Merge saved prefix and received prefix parts */
> + u8 buf[16] = {};
> memcpy(buf, state->def_ip6_prefix, tlv->omitted);
> memcpy(buf + tlv->omitted, tlv->addr, len);
>
> @@ -746,7 +778,11 @@ babel_write_update(struct babel_tlv *hdr, union babel_msg *m,
> }
> else if (msg->net.type == NET_IP4)
> {
> - tlv->ae = BABEL_AE_IP4;
> + if (!ipa_zero(msg->next_hop) && ipa_is_ip6(msg->next_hop))
> + tlv->ae = BABEL_AE_IPV4_VIA_IPV6;
> + else
> + tlv->ae = BABEL_AE_IP4;
> +
> tlv->plen = net4_pxlen(&msg->net);
> put_ip4_px(tlv->addr, &msg->net);
> }
> @@ -814,7 +850,15 @@ babel_read_route_request(struct babel_tlv *hdr, union babel_msg *m,
> msg->full = 1;
> return PARSE_SUCCESS;
>
> + /*
> + * When receiving requests, AEs 1 (IPv4) and 4 (v4-via-v6) MUST be
> + * treated in the same manner: the receiver processes the request as
> + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then
> + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above,
> + * irrespective of which AE was used in the request.
> + */
You write 4 in the beginning and TBD later here. Should obviously be
changed to the right number when it's assigned.
> case BABEL_AE_IP4:
> + case BABEL_AE_IPV4_VIA_IPV6:
> if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
> return PARSE_ERROR;
>
> @@ -915,7 +959,15 @@ babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *m,
> case BABEL_AE_WILDCARD:
> return PARSE_ERROR;
>
> + /*
> + * When receiving requests, AEs 1 (IPv4) and TBD (v4-via-v6) MUST be
> + * treated in the same manner: the receiver processes the request as
> + * described in Section 3.8 of [RFC6126bis]. If an Update is sent, then
> + * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above,
> + * irrespective of which AE was used in the request.
> + */
Here it's TBD everywhere, but see above.
> case BABEL_AE_IP4:
> + case BABEL_AE_IPV4_VIA_IPV6:
> if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
> return PARSE_ERROR;
>
> --
> 2.29.2
More information about the Bird-users
mailing list