[RFC] Babel: add v4viav6 support

Andreas Rammhold andreas at rammhold.de
Tue Dec 15 03:06:29 CET 2020


This is a first attempt at implementing draft-ietf-babel-v4viav6-00 as
IPv4 via IPv6 extension to the Babel routing protocol that allows
announcing routes to an IPv4 prefix with an IPv6 next-hop, which makes it
possible for IPv4 traffic to flow through interfaces that have not been
assigned an IPv4 address.

I've tried my best to be compliant with the current RFC version. One
point that is not entirely clear is how to implement in section 2.2.
regarding fallback when the device doesn't support v4-via-v6 routes.
(See notes below)

The implementation is compatible with the current Babeld PR
(https://github.com/jech/babeld/pull/56). I've verified this with a few
VMs in the following setup:

Bird <- v4 only -> Bird <- v6 only -> Babeld <- v4 only -> Babeld

Each routing daemon was running on their own VM and had L2 connectivity
to only its direct neighbors. At the nodes at the edges v4 networks have
been announced and full end-to-end communication was possible over the
mixed AF network. The v6 only link between Babel and Bird (at the
"center" of the above setup) did transport the v4 packets via the v6
link-local next hop addresses just as expected.

Thanks to Toke Høiland-Jørgensen for early review on this work.

-----< notes >------

(My current notes on the current implementation regarding
compliance with the current draft)

Bird doesn't implement compression for outgoing (send) IPv4 prefixes and
thus this also hasn't been implemented for this AE. Best guess is that 4
bytes are cheap enough to not bother with the added complexity? In the
RX path compression is supported for IPv4 prefixes (for the IPv4 AE
as well as for IPv4-via-IPv6 AE).

Especially the part last paragraph of 2.2. remains to be discussed/solved:

 * If a node cannot install v4-over-v6 routes, eg., due to hardware or
   software limitations, then routes to an IPv4 prefix with an IPv6
   next-hop MUST NOT be selected, as described in Section 3.5.3 of
   [RFC6126bis].

 * What if the kernel doesn't accept the RTA_VIA value we gave it?
   Does BIRD generally handle this already?
   One example is hitting: "ipv4: use IS_ENABLED instead of ifdef"
   (id:20201115224509.2020651-1-flokli at flokli.de @ linux-netdev) Where
   the kernel does support it but due to a bug fails to add those routes
   when CONFIG_IPV6=m.
   Kernel version comparison is probably a bad idea. Which kind of route
   can we attempt to install to test this functionality without breaking
   any setup?

Section 5: I am using the proposed value 4 to test interoperability with
the Babled PR (https://github.com/jech/babeld/pull/56/commits).
---
 doc/bird.sgml         |   7 +++
 proto/babel/babel.c   |  26 +++++++---
 proto/babel/babel.h   |   3 ++
 proto/babel/config.Y  |   4 +-
 proto/babel/packets.c | 116 ++++++++++++++++++++++++++++++------------
 5 files changed, 117 insertions(+), 39 deletions(-)

diff --git a/doc/bird.sgml b/doc/bird.sgml
index 5408cb2a..ee895a47 100644
--- a/doc/bird.sgml
+++ b/doc/bird.sgml
@@ -1770,6 +1770,7 @@ protocol babel [<name>] {
 		check link <switch>;
 		next hop ipv4 <address>;
 		next hop ipv6 <address>;
+		ipv4 via ipv6 <switch>;
 	};
 }
 </code>
@@ -1860,6 +1861,12 @@ protocol babel [<name>] {
       interface. If not set, the same link-local address that is used as the
       source for Babel packets will be used. In normal operation, it should not
       be necessary to set this option.
+
+      <tag><label id="babel-ipv4-via-ipv6">ipv4 via ipv6 <m/switch/</tag>
+      If enabled, Bird will accept and emit IPv4-via-IPv6 routes when IPv4
+      addresses are absent from the interface as described in draft-ietf-babel-v4viav6.
+      Default: yes.
+
 </descrip>
 
 <sect1>Attributes
diff --git a/proto/babel/babel.c b/proto/babel/babel.c
index 4b6b9d7f..64a88601 100644
--- a/proto/babel/babel.c
+++ b/proto/babel/babel.c
@@ -954,8 +954,15 @@ babel_send_update_(struct babel_iface *ifa, btime changed, struct fib *rtable)
     msg.update.router_id = e->router_id;
     net_copy(&msg.update.net, e->n.addr);
 
-    msg.update.next_hop = ((e->n.addr->type == NET_IP4) ?
-			   ifa->next_hop_ip4 : ifa->next_hop_ip6);
+    if (e->n.addr->type == NET_IP4 && !ipa_zero(ifa->next_hop_ip4)) {
+	    msg.update.next_hop = ifa->next_hop_ip4;
+    } else if (e->n.addr->type == NET_IP6 && !ipa_zero(ifa->next_hop_ip6)) {
+	    msg.update.next_hop = ifa->next_hop_ip6;
+    /* If there is no IPv4 address on the interface (and we are supporting v4viav6)
+     * send the v6 next hop instead */
+    } else if (ifa->cf->ip4_via_ip6 && e->n.addr->type == NET_IP4 && ipa_zero(ifa->next_hop_ip4) && !ipa_zero(ifa->next_hop_ip6)) {
+	    msg.update.next_hop = ifa->next_hop_ip6;
+    }
 
     /* Do not send route if next hop is unknown, e.g. no configured IPv4 address */
     if (ipa_zero(msg.update.next_hop))
@@ -1247,6 +1254,12 @@ babel_handle_update(union babel_msg *m, struct babel_iface *ifa)
     return;
   }
 
+  /* Reject IPv4 via IPv6 routes if disabled */
+  if (!ifa->cf->ip4_via_ip6 && msg->net.type == NET_IP4 && ipa_is_ip6(msg->next_hop)) {
+    DBG("Babel: Ignoring disabled IPv4 via IPv6 route.\n");
+    return;
+  }
+
   /* Regular update */
   e = babel_get_entry(p, &msg->net);
   r = babel_get_route(p, e, nbr); /* the route entry indexed by neighbour */
@@ -1529,7 +1542,7 @@ babel_iface_update_addr4(struct babel_iface *ifa)
   ifa->next_hop_ip4 = ipa_nonzero(ifa->cf->next_hop_ip4) ? ifa->cf->next_hop_ip4 : addr4;
 
   if (ipa_zero(ifa->next_hop_ip4) && p->ip4_channel)
-    log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname);
+    log(L_WARN "%s: Missing IPv4 next hop address for %s", p->p.name, ifa->ifname); // FIXME: make sure we do not log this when it is expected
 
   if (ifa->up)
     babel_iface_kick_timer(ifa);
@@ -1894,8 +1907,8 @@ babel_show_interfaces(struct proto *P, const char *iff)
   }
 
   cli_msg(-1023, "%s:", p->p.name);
-  cli_msg(-1023, "%-10s %-6s %7s %6s %7s %-15s %s",
-	  "Interface", "State", "RX cost", "Nbrs", "Timer",
+  cli_msg(-1023, "%-10s %-6s %-12s %7s %6s %7s %-15s %s",
+	  "Interface", "State", "IPv4 via Ip6", "RX cost", "Nbrs", "Timer",
 	  "Next hop (v4)", "Next hop (v6)");
 
   WALK_LIST(ifa, p->interfaces)
@@ -1908,8 +1921,9 @@ babel_show_interfaces(struct proto *P, const char *iff)
 	nbrs++;
 
     btime timer = MIN(ifa->next_regular, ifa->next_hello) - current_time();
-    cli_msg(-1023, "%-10s %-6s %7u %6u %7t %-15I %I",
+    cli_msg(-1023, "%-10s %-6s %-12s %7u %6u %7t %-15I %I",
 	    ifa->iface->name, (ifa->up ? "Up" : "Down"),
+	    (ifa->cf->ip4_via_ip6 ? "Yes" : "No"),
 	    ifa->cf->rxcost, nbrs, MAX(timer, 0),
 	    ifa->next_hop_ip4, ifa->next_hop_ip6);
   }
diff --git a/proto/babel/babel.h b/proto/babel/babel.h
index e075024c..146d28f5 100644
--- a/proto/babel/babel.h
+++ b/proto/babel/babel.h
@@ -104,6 +104,7 @@ enum babel_ae_type {
   BABEL_AE_IP4			= 1,
   BABEL_AE_IP6			= 2,
   BABEL_AE_IP6_LL		= 3,
+  BABEL_AE_IPV4_VIA_IPV6	= 4,
   BABEL_AE_MAX
 };
 
@@ -137,6 +138,8 @@ struct babel_iface_config {
 
   ip_addr next_hop_ip4;
   ip_addr next_hop_ip6;
+
+  u8 ip4_via_ip6;			/* Enabe IPv4 via IPv6 */
 };
 
 struct babel_proto {
diff --git a/proto/babel/config.Y b/proto/babel/config.Y
index 2f3b637b..f17884eb 100644
--- a/proto/babel/config.Y
+++ b/proto/babel/config.Y
@@ -25,7 +25,7 @@ CF_DECLS
 CF_KEYWORDS(BABEL, INTERFACE, METRIC, RXCOST, HELLO, UPDATE, INTERVAL, PORT,
 	TYPE, WIRED, WIRELESS, RX, TX, BUFFER, PRIORITY, LENGTH, CHECK, LINK,
 	NEXT, HOP, IPV4, IPV6, BABEL_METRIC, SHOW, INTERFACES, NEIGHBORS,
-	ENTRIES, RANDOMIZE, ROUTER, ID)
+	ENTRIES, RANDOMIZE, ROUTER, ID, VIA)
 
 CF_GRAMMAR
 
@@ -65,6 +65,7 @@ babel_iface_start:
   BABEL_IFACE->tx_tos = IP_PREC_INTERNET_CONTROL;
   BABEL_IFACE->tx_priority = sk_priority_control;
   BABEL_IFACE->check_link = 1;
+  BABEL_IFACE->ip4_via_ip6 = 1;
 };
 
 
@@ -109,6 +110,7 @@ babel_iface_item:
  | CHECK LINK bool { BABEL_IFACE->check_link = $3; }
  | NEXT HOP IPV4 ipa { BABEL_IFACE->next_hop_ip4 = $4; if (!ipa_is_ip4($4)) cf_error("Must be an IPv4 address"); }
  | NEXT HOP IPV6 ipa { BABEL_IFACE->next_hop_ip6 = $4; if (!ipa_is_ip6($4)) cf_error("Must be an IPv6 address"); }
+ | IPV4 VIA IPV6 bool { BABEL_IFACE->ip4_via_ip6 = $4; }
  ;
 
 babel_iface_opts:
diff --git a/proto/babel/packets.c b/proto/babel/packets.c
index 415ac3f9..d4c66a53 100644
--- a/proto/babel/packets.c
+++ b/proto/babel/packets.c
@@ -130,9 +130,11 @@ struct babel_parse_state {
   u64 router_id;		/* Router ID used in subsequent updates */
   u8 def_ip6_prefix[16];	/* Implicit IPv6 prefix in network order */
   u8 def_ip4_prefix[4];		/* Implicit IPv4 prefix in network order */
+  u8 def_ip4viaip6_prefix[4];	/* Implicit IPv4 prefix in network order */
   u8 router_id_seen;		/* router_id field is valid */
   u8 def_ip6_prefix_seen;	/* def_ip6_prefix is valid */
   u8 def_ip4_prefix_seen;	/* def_ip4_prefix is valid */
+  u8 def_ip4viaip6_prefix_seen; /* def_ip4viaip6_prefix is valid*/
   u8 current_tlv_endpos;	/* End of self-terminating TLVs (offset from start) */
   u8 sadr_enabled;
 };
@@ -246,6 +248,8 @@ static int babel_read_update(struct babel_tlv *hdr, union babel_msg *msg, struct
 static int babel_read_route_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
 static int babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
 static int babel_read_source_prefix(struct babel_tlv *hdr, union babel_msg *msg, struct babel_parse_state *state);
+static int babel_read_update_ip4_prefix(struct babel_tlv_update *tlv, struct babel_msg_update *msg, u8 *def_prefix_seen,
+		                        u8 (*def_prefix)[4], ip_addr * next_hop, int len);
 
 static uint babel_write_ack(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len);
 static uint babel_write_hello(struct babel_tlv *hdr, union babel_msg *msg, struct babel_write_state *state, uint max_len);
@@ -398,9 +402,6 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
   msg->addr = IPA_NONE;
   msg->sender = state->saddr;
 
-  if (msg->ae >= BABEL_AE_MAX)
-    return PARSE_IGNORE;
-
   /*
    * We only actually read link-local IPs. In every other case, the addr field
    * will be 0 but validation will succeed. The handler takes care of these
@@ -413,13 +414,13 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
     if (TLV_OPT_LENGTH(tlv) < 4)
       return PARSE_ERROR;
     state->current_tlv_endpos += 4;
-    break;
+    return PARSE_SUCCESS;
 
   case BABEL_AE_IP6:
     if (TLV_OPT_LENGTH(tlv) < 16)
       return PARSE_ERROR;
     state->current_tlv_endpos += 16;
-    break;
+    return PARSE_SUCCESS;
 
   case BABEL_AE_IP6_LL:
     if (TLV_OPT_LENGTH(tlv) < 8)
@@ -427,10 +428,10 @@ babel_read_ihu(struct babel_tlv *hdr, union babel_msg *m,
 
     msg->addr = ipa_from_ip6(get_ip6_ll(&tlv->addr));
     state->current_tlv_endpos += 8;
-    break;
+    return PARSE_SUCCESS;
   }
 
-  return PARSE_SUCCESS;
+  return PARSE_IGNORE;
 }
 
 static uint
@@ -530,6 +531,47 @@ babel_read_next_hop(struct babel_tlv *hdr, union babel_msg *m UNUSED,
   return PARSE_IGNORE;
 }
 
+
+/* This is called directrly from babel_read_next_hop to handle the ip4 address
+   compression state */
+static int
+babel_read_update_ip4_prefix(struct babel_tlv_update *tlv,
+                             struct babel_msg_update *msg,
+                             u8 *def_prefix_seen,
+                             u8 (*def_prefix)[4],
+                             ip_addr *next_hop,
+                             int len)
+{
+    if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
+      return PARSE_ERROR;
+
+    /* Cannot omit data if there is no saved prefix */
+    if (tlv->omitted && !*def_prefix_seen)
+      return PARSE_ERROR;
+
+    /* Update must have next hop, unless it is retraction */
+    if (ipa_zero(*next_hop) && msg->metric != BABEL_INFINITY)
+      return PARSE_IGNORE;
+
+    /* Merge saved prefix and received prefix parts */
+    u8 buf[4] = {};
+    memcpy(buf, *def_prefix, tlv->omitted);
+    memcpy(buf + tlv->omitted, tlv->addr, len);
+
+    ip4_addr prefix4 = get_ip4(buf);
+    net_fill_ip4(&msg->net, prefix4, tlv->plen);
+
+    if (tlv->flags & BABEL_UF_DEF_PREFIX)
+    {
+      put_ip4(*def_prefix, prefix4);
+      *def_prefix_seen = 1;
+    }
+
+    msg->next_hop = *next_hop;
+
+    return PARSE_SUCCESS;
+}
+
 /* This is called directly from babel_write_update() and returns -1 if a next
    hop should be written but there is not enough space. */
 static int
@@ -579,6 +621,7 @@ static int
 babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
                   struct babel_parse_state *state)
 {
+  int rc;
   struct babel_tlv_update *tlv = (void *) hdr;
   struct babel_msg_update *msg = &m->update;
 
@@ -589,7 +632,6 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
 
   /* Length of received prefix data without omitted part */
   int len = BYTES(tlv->plen) - (int) tlv->omitted;
-  u8 buf[16] = {};
 
   if ((len < 0) || ((uint) len > TLV_OPT_LENGTH(tlv)))
     return PARSE_ERROR;
@@ -607,31 +649,20 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
     break;
 
   case BABEL_AE_IP4:
-    if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
-      return PARSE_ERROR;
-
-    /* Cannot omit data if there is no saved prefix */
-    if (tlv->omitted && !state->def_ip4_prefix_seen)
-      return PARSE_ERROR;
-
-    /* Update must have next hop, unless it is retraction */
-    if (ipa_zero(state->next_hop_ip4) && (msg->metric != BABEL_INFINITY))
-      return PARSE_IGNORE;
-
-    /* Merge saved prefix and received prefix parts */
-    memcpy(buf, state->def_ip4_prefix, tlv->omitted);
-    memcpy(buf + tlv->omitted, tlv->addr, len);
-
-    ip4_addr prefix4 = get_ip4(buf);
-    net_fill_ip4(&msg->net, prefix4, tlv->plen);
+    rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4_prefix_seen,
+		                      &state->def_ip4_prefix, &state->next_hop_ip4,
+				      len);
+    if (rc != PARSE_SUCCESS)
+      return rc;
 
-    if (tlv->flags & BABEL_UF_DEF_PREFIX)
-    {
-      put_ip4(state->def_ip4_prefix, prefix4);
-      state->def_ip4_prefix_seen = 1;
-    }
+    break;
 
-    msg->next_hop = state->next_hop_ip4;
+  case BABEL_AE_IPV4_VIA_IPV6:
+    rc = babel_read_update_ip4_prefix(tlv, msg, &state->def_ip4viaip6_prefix_seen,
+    				      &state->def_ip4viaip6_prefix, &state->next_hop_ip6,
+				      len);
+    if (rc != PARSE_SUCCESS)
+      return rc;
 
     break;
 
@@ -644,6 +675,7 @@ babel_read_update(struct babel_tlv *hdr, union babel_msg *m,
       return PARSE_ERROR;
 
     /* Merge saved prefix and received prefix parts */
+    u8 buf[16] = {};
     memcpy(buf, state->def_ip6_prefix, tlv->omitted);
     memcpy(buf + tlv->omitted, tlv->addr, len);
 
@@ -746,7 +778,11 @@ babel_write_update(struct babel_tlv *hdr, union babel_msg *m,
   }
   else if (msg->net.type == NET_IP4)
   {
-    tlv->ae = BABEL_AE_IP4;
+    if (!ipa_zero(msg->next_hop) && ipa_is_ip6(msg->next_hop))
+      tlv->ae = BABEL_AE_IPV4_VIA_IPV6;
+    else
+      tlv->ae = BABEL_AE_IP4;
+
     tlv->plen = net4_pxlen(&msg->net);
     put_ip4_px(tlv->addr, &msg->net);
   }
@@ -814,7 +850,15 @@ babel_read_route_request(struct babel_tlv *hdr, union babel_msg *m,
     msg->full = 1;
     return PARSE_SUCCESS;
 
+  /*
+   * When receiving requests, AEs 1 (IPv4) and 4 (v4-via-v6) MUST be
+   * treated in the same manner: the receiver processes the request as
+   * described in Section 3.8 of [RFC6126bis].  If an Update is sent, then
+   * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above,
+   * irrespective of which AE was used in the request.
+   */
   case BABEL_AE_IP4:
+  case BABEL_AE_IPV4_VIA_IPV6:
     if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
       return PARSE_ERROR;
 
@@ -915,7 +959,15 @@ babel_read_seqno_request(struct babel_tlv *hdr, union babel_msg *m,
   case BABEL_AE_WILDCARD:
     return PARSE_ERROR;
 
+  /*
+   * When receiving requests, AEs 1 (IPv4) and TBD (v4-via-v6) MUST be
+   * treated in the same manner: the receiver processes the request as
+   * described in Section 3.8 of [RFC6126bis].  If an Update is sent, then
+   * it MAY be sent with AE 1 or TBD, as described in Section 2.1 above,
+   * irrespective of which AE was used in the request.
+   */
   case BABEL_AE_IP4:
+  case BABEL_AE_IPV4_VIA_IPV6:
     if (tlv->plen > IP4_MAX_PREFIX_LENGTH)
       return PARSE_ERROR;
 
-- 
2.29.2



More information about the Bird-users mailing list