[PATCH RFC 0/1] Adding the Babel routing protocol to Bird
Hi everyone Over the past couple of weeks, I have amused myself with adding support for the Babel routing protocol (RFC 6126) to Bird. Quoting the RFC: Babel is a loop-avoiding distance-vector routing protocol that is robust and efficient both in ordinary wired networks and in wireless mesh networks. The implementation is a clean-slate implementation from the RFC, and it is now at the state where it is reasonably complete (I think), and communicates with the official babeld implementation. At this stage I'm not proposing this be included in Bird proper: I'm pretty sure I got at least parts of the interaction with the core wrong, and there is probably some cleanup needed; and as you can see from the list of limitations below, some things are still needed for this to be a production-quality implementation of the protocol. However, I thought that having some feedback to guide me at this stage would be useful. I would thus like to solicit your opinion on whether or not this is something you could see eventually making it into Bird proper (given that I perform the necessary rework of the code)? And if so, to comment on the implementation and how it might be improved. In particular, I'm interested in your opinion on what to do about IPv4 support (see below). Limitations of the implementation at this stage are (at least): - The implementation is IPv6-only. The Babel protocol supports carrying IPv4 and IPv6 routes in the same protocol instance, and the official implementation only speaks IPv6. The RFC does specify an IPv4 multicast group, v4 support could conceivably be added in the same manner as the other protocols. - Several of the SHOULDs in the RFC are not implemented. Most notably, this implementation is somewhat more chatty than the official implementation, and the route selection procedure is quite rudimentary. - None of the Babel protocol extensions are currently implemented. - I have only performed rudimentary testing: I have confirmed that the implementation can communicate and exchange routes with the official implementation, and have even had it running on my own Babel-enabled VPN. However, I have not tested it in a position in the network where it is likely to suffer from starvation or even have multiple routes for the same prefixes. The protocol addition is included in the next email as a single big patch. Alternatively, I have a repo on github with the whole thing (including the commit history): https://github.com/tohojo/bird. Looking forward to your comments; and thanks in advance! :) Cheers, -Toke Toke Høiland-Jørgensen (1): Add the Babel routing protocol (RFC 6129) to Bird. configure.in | 3 + lib/ip.h | 1 + nest/proto.c | 4 + nest/protocol.h | 2 +- nest/route.h | 11 +- proto/Doc | 1 + proto/babel/Doc | 2 + proto/babel/Makefile | 5 + proto/babel/babel.c | 1417 ++++++++++++++++++++++++++++++++++++++++++++++++++ proto/babel/babel.h | 427 +++++++++++++++ proto/babel/config.Y | 84 +++ proto/babel/packet.c | 436 ++++++++++++++++ sysdep/autoconf.h.in | 1 + 13 files changed, 2392 insertions(+), 2 deletions(-) create mode 100644 proto/babel/Doc create mode 100644 proto/babel/Makefile create mode 100644 proto/babel/babel.c create mode 100644 proto/babel/babel.h create mode 100644 proto/babel/config.Y create mode 100644 proto/babel/packet.c -- 2.5.0
The implementation is currently IPv6-only (since Bird does not support mixed IPv4/6 protocols in the same instance), but is otherwise complete as far as MUSTs in the RFC is concerned, with one exception: The RFC specifies that jitter must be applied to packet transmission times. This implementation currently does not buffer TLVs before sending them, but it does introduce some randomness to sending of global updates by timer and event scheduling randomness in the Bird core. Some cleanup and possible reorganisation is needed in the code. Consider this patch an RFC for the general structure of the protocol addition. The implementation started out as a copy of the RIP protocol implementation, so some patterns in how the communication with the core is setup is taken from there (but has diverted quite a bit as the different parts of the protocol implementation fell into place). Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> --- configure.in | 3 + lib/ip.h | 1 + nest/proto.c | 4 + nest/protocol.h | 2 +- nest/route.h | 11 +- proto/Doc | 1 + proto/babel/Doc | 2 + proto/babel/Makefile | 5 + proto/babel/babel.c | 1417 ++++++++++++++++++++++++++++++++++++++++++++++++++ proto/babel/babel.h | 427 +++++++++++++++ proto/babel/config.Y | 84 +++ proto/babel/packet.c | 436 ++++++++++++++++ sysdep/autoconf.h.in | 1 + 13 files changed, 2392 insertions(+), 2 deletions(-) create mode 100644 proto/babel/Doc create mode 100644 proto/babel/Makefile create mode 100644 proto/babel/babel.c create mode 100644 proto/babel/babel.h create mode 100644 proto/babel/config.Y create mode 100644 proto/babel/packet.c diff --git a/configure.in b/configure.in index c81709e..1265565 100644 --- a/configure.in +++ b/configure.in @@ -206,6 +206,9 @@ fi AC_SUBST(iproutedir) all_protocols="$proto_bfd bgp ospf pipe $proto_radv rip static" +if test "$ip" = ipv6 ; then + all_protocols="$all_protocols babel" +fi all_protocols=`echo $all_protocols | sed 's/ /,/g'` if test "$with_protocols" = all ; then diff --git a/lib/ip.h b/lib/ip.h index 90bb7f8..fa18da1 100644 --- a/lib/ip.h +++ b/lib/ip.h @@ -23,6 +23,7 @@ #define IP6_OSPF_ALL_ROUTERS ipa_build6(0xFF020000, 0, 0, 5) #define IP6_OSPF_DES_ROUTERS ipa_build6(0xFF020000, 0, 0, 6) #define IP6_RIP_ROUTERS ipa_build6(0xFF020000, 0, 0, 9) +#define IP6_BABEL_ROUTERS ipa_build6(0xFF020000, 0, 0, 0x00010006) #define IP4_NONE _MI4(0) #define IP6_NONE _MI6(0,0,0,0) diff --git a/nest/proto.c b/nest/proto.c index 6531083..5804b73 100644 --- a/nest/proto.c +++ b/nest/proto.c @@ -919,6 +919,10 @@ protos_build(void) proto_build(&proto_bfd); bfd_init_all(); #endif +#ifdef CONFIG_BABEL + proto_build(&proto_babel); + bfd_init_all(); +#endif proto_pool = rp_new(&root_pool, "Protocols"); proto_flush_event = ev_new(proto_pool); diff --git a/nest/protocol.h b/nest/protocol.h index 8c49154..ec78735 100644 --- a/nest/protocol.h +++ b/nest/protocol.h @@ -76,7 +76,7 @@ void protos_dump_all(void); extern struct protocol proto_device, proto_radv, proto_rip, proto_static, - proto_ospf, proto_pipe, proto_bgp, proto_bfd; + proto_ospf, proto_pipe, proto_bgp, proto_bfd, proto_babel; /* * Routing Protocol Instance diff --git a/nest/route.h b/nest/route.h index 6067526..a5d71a8 100644 --- a/nest/route.h +++ b/nest/route.h @@ -214,6 +214,12 @@ typedef struct rte { u8 suppressed; /* Used for deterministic MED comparison */ } bgp; #endif +#ifdef CONFIG_BABEL + struct { + u16 metric; /* Babel metric */ + u64 router_id; /* Babel router id */ + } babel; +#endif struct { /* Routes generated by krt sync (both temporary and inherited ones) */ s8 src; /* Alleged route source (see krt.h) */ u8 proto; /* Kernel source protocol ID */ @@ -368,6 +374,7 @@ typedef struct rta { #define RTS_OSPF_EXT2 10 /* OSPF external route type 2 */ #define RTS_BGP 11 /* BGP route */ #define RTS_PIPE 12 /* Inter-table wormhole */ +#define RTS_BABEL 13 /* Babel route */ #define RTC_UNICAST 0 #define RTC_BROADCAST 1 @@ -416,7 +423,8 @@ typedef struct eattr { #define EAP_RIP 2 /* RIP */ #define EAP_OSPF 3 /* OSPF */ #define EAP_KRT 4 /* Kernel route attributes */ -#define EAP_MAX 5 +#define EAP_BABEL 5 /* Babel attributes */ +#define EAP_MAX 6 #define EA_CODE(proto,id) (((proto) << 8) | (id)) #define EA_PROTO(ea) ((ea) >> 8) @@ -538,6 +546,7 @@ extern struct protocol *attr_class_to_protocol[EAP_MAX]; #define DEF_PREF_RIP 120 /* RIP */ #define DEF_PREF_BGP 100 /* BGP */ #define DEF_PREF_PIPE 70 /* Routes piped from other tables */ +#define DEF_PREF_BABEL 42 /* Babel */ #define DEF_PREF_INHERITED 10 /* Routes inherited from other routing daemons */ diff --git a/proto/Doc b/proto/Doc index 7863472..04c25bc 100644 --- a/proto/Doc +++ b/proto/Doc @@ -1,4 +1,5 @@ H Protocols +C babel C bfd C bgp C ospf diff --git a/proto/babel/Doc b/proto/babel/Doc new file mode 100644 index 0000000..2480239 --- /dev/null +++ b/proto/babel/Doc @@ -0,0 +1,2 @@ +S babel.c +S packet.c diff --git a/proto/babel/Makefile b/proto/babel/Makefile new file mode 100644 index 0000000..a27b010 --- /dev/null +++ b/proto/babel/Makefile @@ -0,0 +1,5 @@ +source=babel.c packet.c +root-rel=../../ +dir-name=proto/babel + +include ../../Rules diff --git a/proto/babel/babel.c b/proto/babel/babel.c new file mode 100644 index 0000000..52c961f --- /dev/null +++ b/proto/babel/babel.c @@ -0,0 +1,1417 @@ +/* -*- c-file-style: "gnu"; -*- + * + * The Babel protocol + * + * Copyright (c) 2015 Toke Høiland-Jørgensen + * + * Can be freely distributed and used under the terms of the GNU GPL. + * + * This file contains the main routines for handling and sending TLVs, as + * well as timers and interaction with the nest. + */ + +/** + * DOC: The Babel protocol + * + * Babel (RFC6126) is a loop-avoiding distance-vector routing protocol that is + * robust and efficient both in ordinary wired networks and in wireless mesh + * networks. + * + * The Babel protocol keeps state for each neighbour in a &babel_neighbor + * struct, tracking received hellos and I Heard You (IHU) messages. A + * &babel_interface struct keeps hello and update timers for each interface, and + * a separate hello seqno is maintained for each interface. + * + * For each prefix, Babel keeps track of both the possible routes + * (with next hop and router IDs), as well as the feasibility distance for each + * prefix and router id. The prefix itself is tracked in a &babel_entry struct, + * while the possible routes for the prefix are tracked as &babel_route entries + * and the feasibility distance is maintained through &babel_source structures. + * + * The main route selection is done in babel_select_route(). This is called when + * an update for a prefix is received, when a new prefix is received from the + * nest, and when a prefix expiry timer fires. It performs feasibility checks on + * the available routes for the prefix and selects the one with the lowest + * metric. + */ + +#undef LOCAL_DEBUG +#define LOCAL_DEBUG 1 + +#include "babel.h" + +#define P ((struct babel_proto *) p) +#define P_CF ((struct babel_proto_config *)p->cf) + +#undef TRACE +#define TRACE(level, msg, args...) do { if (p->debug & level) { log(L_TRACE "%s: " msg, p->name , ## args); } } while(0) +#define BAD( x ) { log( L_REMOTE "%s: " x, p->name ); return 1; } + +/* computes a-b % 65535 for u16 datatypes */ +static inline u16 diff_mod64k(u16 a, u16 b) +{ + return a >= b ? a-b : 0xffff-b+a; +} +/* Is one number larger than another mod 65535? Since diff_mod64k is always >= + 0, just use a simple cutoff value to determine if the difference is small + enough that one is really larger. Since these comparisons are only made for + values that should not differ by more than a few numbers, this should be + safe.*/ +static inline u16 ge_mod64k(u16 a, u16 b) +{ + return diff_mod64k(a,b) < 0xfff0; +} + +static struct babel_interface *new_iface(struct proto *p, struct iface *new, + unsigned long flags, struct iface_patt *patt); +static void babel_send_ihus(void *bif); +static void expire_hello(timer *t); +static void expire_ihu(timer *t); +static void expire_source(timer *t); +static void expire_route(timer *t); +static void babel_dump_entry(struct babel_entry *e); +static void babel_select_route(struct babel_entry *e); + + +static void babel_init_entry(struct fib_node *n) +{ + struct babel_entry *e = (struct babel_entry *)n; + e->proto = NULL; + e->selected = NULL; + init_list(&e->sources); + init_list(&e->routes); +} + + +static inline struct babel_entry * babel_find_entry(struct proto *p, ip_addr prefix, u8 plen) +{ + return fib_find(&P->rtable, &prefix, plen); +} +static struct babel_entry * babel_get_entry(struct proto *p, ip_addr prefix, u8 plen) +{ + struct babel_entry *e = babel_find_entry(p, prefix, plen); + if(e) return e; + e = fib_get(&P->rtable, &prefix, plen); + e->proto = p; + e->pool = rp_new(p->pool, "Babel entry"); + e->source_expiry_timer = tm_new_set(e->pool, expire_source, e, 0, BABEL_SOURCE_EXPIRY); + tm_start(e->source_expiry_timer, BABEL_SOURCE_EXPIRY); + return e; +} + +void babel_flush_entry(struct babel_entry *e) +{ + struct proto *p = e->proto; + tm_stop(e->source_expiry_timer); + rfree(e->pool); + if(p) fib_delete(&P->rtable, e); +} + +static struct babel_source * babel_find_source(struct babel_entry *e, u64 router_id) +{ + struct babel_source *s; + WALK_LIST(s, e->sources) + if(s->router_id == router_id) + return s; + return NULL; +} + +static struct babel_source * babel_get_source(struct babel_entry *e, u64 router_id) +{ + struct babel_source *s = babel_find_source(e, router_id); + if(s) return s; + s = mb_allocz(e->pool, sizeof(struct babel_source)); + s->router_id = router_id; + s->updated = now; + s->e = e; + add_tail(&e->sources, NODE s); + return s; +} + +static void expire_source(timer *t) +{ + struct babel_entry *e = t->data; + struct proto *p = e->proto; + TRACE(D_EVENTS, "Source expiry timer for %I/%d fired", e->n.prefix, e->n.pxlen); + struct babel_source *n, *nx; + WALK_LIST_DELSAFE(n, nx, e->sources) { + if(n->updated < now-BABEL_SOURCE_EXPIRY) { + rem_node(NODE n); + mb_free(n); + } + } + if(EMPTY_LIST(e->sources) && EMPTY_LIST(e->routes)) + babel_flush_entry(e); +} + +static struct babel_route * babel_find_route(struct babel_entry *e, struct babel_neighbor *n) +{ + struct babel_route *r; + WALK_LIST(r, e->routes) + if(r->neigh == n) + return r; + return NULL; +} +static struct babel_route * babel_get_route(struct babel_entry *e, struct babel_neighbor *n) +{ + struct babel_route *r = babel_find_route(e,n); + if(r) return r; + r = mb_allocz(e->pool, sizeof(struct babel_route)); + r->neigh = n; + r->e = e; + r->neigh_route.r = r; + r->expiry_timer = tm_new_set(e->pool, expire_route, r, 0, 0); + add_tail(&e->routes, NODE r); + if(n) add_tail(&n->routes, NODE &r->neigh_route); + return r; +} + +static void babel_flush_route(struct babel_route *r) +{ + DBG("Flush route %I/%d router_id %0lx\n", + r->e->n.prefix, r->e->n.pxlen, r->router_id); + tm_stop(r->expiry_timer); + rem_node(NODE r); + if(r->neigh) rem_node(NODE &r->neigh_route); + if(r->e->selected == r) r->e->selected = NULL; + mb_free(r); +} +static void expire_route(timer *t) +{ + struct babel_route *r = t->data; + struct proto *p = r->e->proto; + TRACE(D_EVENTS, "Route expiry timer for %I/%d router_id %0lx fired", + r->e->n.prefix, r->e->n.pxlen, r->router_id); + if(r->metric < BABEL_INFINITY) { + r->metric = BABEL_INFINITY; + tm_start(r->expiry_timer, r->expiry_interval); + } else { + babel_flush_route(r); + } + + babel_select_route(r->e); +} + + +static struct babel_neighbor * babel_find_neighbor(struct babel_interface *bif, ip_addr addr) +{ + struct proto *p = bif->proto; + neighbor *n = neigh_find2(p, &addr, bif->iface, NEF_STICKY); + return (n->data) ? n->data : NULL; +} + +static struct babel_neighbor * babel_get_neighbor(struct babel_interface *bif, ip_addr addr) +{ + struct proto *p = bif->proto; + neighbor *n = neigh_find2(p, &addr, bif->iface, NEF_STICKY); + if (n->data) return n->data; + pool *pool = rp_new(bif->pool, "Babel neighbor"); + struct babel_neighbor *bn = mb_allocz(pool, sizeof(struct babel_neighbor)); + bn->bif = bif; + bn->pool = pool; + bn->neigh = n; + bn->addr = n->addr; + bn->txcost = BABEL_INFINITY; + bn->hello_timer = tm_new_set(bn->pool, expire_hello, bn, 0, 0); + bn->ihu_timer = tm_new_set(bn->pool, expire_ihu, bn, 0, 0); + n->data = bn; + init_list(&bn->routes); + add_tail(&bif->neigh_list, NODE bn); + ev_schedule(bif->ihu_event); + return bn; +} + + +/** + From the RFC (section 3.5.1): + + a route advertisement carrying the quintuple (prefix, plen, router-id, seqno, + metric) is feasible if one of the following conditions holds: + + - metric is infinite; or + + - no entry exists in the source table indexed by (id, prefix, plen); or + + - an entry (prefix, plen, router-id, seqno', metric') exists in the source + table, and either + + - seqno' < seqno or + - seqno = seqno' and metric < metric'. +*/ +static inline int is_feasible(struct babel_source *s, u16 seqno, u16 metric) +{ + if(!s || metric == BABEL_INFINITY) return 1; + return (seqno > s->seqno + || (seqno == s->seqno && metric < s->metric)); +} + +static u16 babel_compute_rxcost(struct babel_neighbor *bn) +{ + struct babel_interface *bif = bn->bif; + struct proto *p = bif->proto; + u8 n, missed; + u16 map=bn->hello_map; + + if(!map) return BABEL_INFINITY; + n = __builtin_popcount(map); // number of bits set + missed = bn->hello_n-n; + + if(bif->type == BABEL_IFACE_TYPE_WIRED) { + /* k-out-of-j selection - Appendix 2.1 in the RFC. */ + DBG("Missed %d hellos from %I\n", missed, bn->addr); + /* Link is bad if more than half the expected hellos were lost */ + return (missed > 0 && n/missed < 2) ? BABEL_INFINITY : bif->rxcost; + } else if(bif->type == BABEL_IFACE_TYPE_WIRELESS) { + /* ETX - Appendix 2.2 in the RFC */ + double beta; /* FIXME is this right? */ + if(!missed) return BABEL_RXCOST_WIRELESS; + beta = 1-missed/bn->hello_n; + return (beta > 0) ? BABEL_RXCOST_WIRELESS/beta : BABEL_RXCOST_WIRELESS; + } else { + BAD("Unknown interface type!"); + } +} + + +static u16 compute_cost(struct babel_neighbor *bn) +{ + struct babel_interface *bif = bn->bif; + struct proto *p = bif->proto; + u16 rxcost = babel_compute_rxcost(bn); + if(rxcost == BABEL_INFINITY) return rxcost; + else if(bif->type == BABEL_IFACE_TYPE_WIRED) { + /* k-out-of-j selection - Appendix 2.1 in the RFC. */ + return bn->txcost; + } else if(bif->type == BABEL_IFACE_TYPE_WIRELESS) { + /* ETX - Appendix 2.2 in the RFC */ + return (MAX(bn->txcost, 256) * rxcost)/256; + } else { + BAD("Unknown interface type!"); + } +} + +/* Simple additive metric - Appendix 3.1 in the RFC */ +static u16 compute_metric(struct babel_neighbor *bn, u16 metric) +{ + u16 cost = compute_cost(bn); + return (cost == BABEL_INFINITY) ? cost : cost+metric; +} + +static int +babel_rte_same(struct rte *new, struct rte *old) +{ + return new->u.babel.router_id == old->u.babel.router_id; +} + + +static int babel_rte_better(struct rte *new, struct rte *old) +{ + return new->u.babel.metric < old->u.babel.metric; +} + +static rte * babel_build_rte(struct proto *p, net *n, struct babel_route *r) +{ + rta *a, A; + rte *rte; + memset(&A, 0, sizeof(A)); + A.src = p->main_source; + A.source = RTS_BABEL; + A.scope = SCOPE_UNIVERSE; + A.cast = RTC_UNICAST; + A.dest = r->metric == BABEL_INFINITY ? RTD_UNREACHABLE : RTD_ROUTER; + A.flags = 0; + A.gw = r->next_hop; + A.from = r->neigh->addr; + A.iface = r->neigh->bif->iface; + a = rta_lookup(&A); + rte = rte_get_temp(a); + rte->u.babel.metric = r->metric; + rte->u.babel.router_id = r->router_id; + rte->net = n; + rte->pflags = 0; + return rte; +} + +static void babel_send_seqno_request(struct babel_entry *e) +{ + struct proto *p = e->proto; + struct babel_route *r = e->selected; + struct babel_source *s = babel_find_source(e, r->router_id); + struct babel_interface *bif; + struct babel_tlv_seqno_request *tlv; + + TRACE(D_PACKETS, "Sending seqno request for %I/%d router_id %0lx", + e->n.prefix, e->n.pxlen, r->router_id); + + if(s) { + WALK_LIST(bif, P->interfaces) { + babel_new_packet(bif); + tlv = babel_add_tlv_seqno_request(bif); + tlv->plen = e->n.pxlen; + tlv->seqno = s->seqno + 1; + tlv->hop_count = BABEL_INITIAL_HOP_COUNT; + tlv->router_id = r->router_id; + babel_put_addr(&tlv->header, e->n.prefix); + babel_send(bif); + } + } +} + +/** + * babel_select_route: + * @e: Babel entry to select the best route for. + * + * Select the best feasible route for a given prefix. This just selects the + * feasible route with the lowest metric. If this results in switching upstream + * router (identified by router id), the nest is notified of the new route. + * + * If no feasible route is available for a prefix that previously had a route + * selected, a seqno request is sent to try to get a valid route. In the + * meantime, the route is marked as infeasible in the nest (to blackhole packets + * going to it, as per the RFC). + * + * If no feasible route is available, and no previous route is selected, the + * route is removed from the nest entirely. + */ +static void babel_select_route(struct babel_entry *e) +{ + struct proto *p = e->proto; + net *n = net_get(p->table, e->n.prefix, e->n.pxlen); + rte *old = rte_find(n, p->main_source); + struct babel_route *r, *cur = e->selected; + + /* try to find the best feasible route */ + WALK_LIST(r, e->routes) + if((!cur || r->metric < cur->metric) + && is_feasible(babel_find_source(e, r->router_id), + r->seqno, r->advert_metric)) + cur = r; + + if(cur && cur->neigh && ((!old && cur->metric < BABEL_INFINITY) + || (old && (old->u.babel.metric == BABEL_INFINITY + || old->u.babel.router_id != cur->router_id)))) { + TRACE(D_EVENTS, "Picked new route for prefix %I/%d: router id %0lx metric %d", + e->n.prefix, e->n.pxlen, cur->router_id, cur->metric); + /* Notify the nest of the update. If we change router ID, we also trigger + a global update. */ + e->selected = cur; + rte_update(p, n, babel_build_rte(p, n, cur)); + if(!old || old->u.babel.router_id != cur->router_id) + ev_schedule(P->update_event); + } else if(!cur || cur->metric == BABEL_INFINITY) { + /* Couldn't find a feasible route. If we have a selected route, that means + it just became infeasible; so set it's metric to infinite and install it + (as unreachable), then send a seqno request. */ + if(e->selected) { + e->selected->metric = BABEL_INFINITY; + rte_update(p, n, babel_build_rte(p, n, e->selected)); + babel_send_seqno_request(e); + } else { + /* No route currently selected, and no new one selected; this means we + don't have a route to this destination anymore (and were probably + called from an expiry timer). Remove the route from the nest. */ + e->selected = NULL; + rte_update(p, n, NULL); + } + } +} + +static void babel_send_ack(struct babel_interface *bif, ip_addr dest, u16 nonce) +{ + struct proto *p = bif->proto; + struct babel_tlv_ack *tlv; + TRACE(D_PACKETS, "Babel: Sending ACK to %I with nonce %d\n", dest, nonce); + babel_new_packet(bif); + tlv = babel_add_tlv_ack(bif); + tlv->nonce = nonce; + babel_send_to(bif, dest); +} + +static void babel_add_ihu(struct babel_interface *bif, struct babel_neighbor *bn) +{ + struct babel_tlv_ihu *tlv; + BABEL_ADD_TLV_SEND(tlv, bif, babel_add_tlv_ihu, IPA_NONE); + babel_put_addr_ihu(&tlv->header, bn->addr); + tlv->rxcost = babel_compute_rxcost(bn); + tlv->interval = bif->ihu_interval*100; +} + +static void babel_add_ihus(struct babel_interface *bif) +{ + struct babel_neighbor *bn; + WALK_LIST(bn, bif->neigh_list) + babel_add_ihu(bif,bn); +} + +static void babel_send_ihus(void *arg) +{ + struct babel_interface *bif = arg; + struct proto *p = bif->proto; + TRACE(D_PACKETS, "Babel: Sending IHUs"); + babel_new_packet(bif); + babel_add_ihus(bif); + babel_send(bif); +} + +static void babel_send_hello(struct babel_interface *bif, u8 send_ihu) +{ + struct proto *p = bif->proto; + struct babel_tlv_hello *tlv; + TRACE(D_PACKETS, "Babel: Sending hello on interface %s", bif->ifname); + babel_new_packet(bif); + tlv = babel_add_tlv_hello(bif); + tlv->seqno = bif->hello_seqno++; + tlv->interval = bif->hello_interval*100; + + if(send_ihu) babel_add_ihus(bif); + + babel_send(bif); +} + +static void babel_hello_timer(timer *t) +{ + struct babel_interface *bif = t->data; + babel_send_hello(bif, (bif->type == BABEL_IFACE_TYPE_WIRELESS + || bif->hello_seqno % BABEL_IHU_INTERVAL_FACTOR == 0)); +} + +static int babel_add_router_id(struct babel_interface *bif, u64 router_id) +{ + struct babel_tlv_router_id *rid = babel_add_tlv_router_id(bif); + if(!rid) return 1; + rid->router_id = router_id; + return 0; +} + +void babel_send_update(struct babel_interface *bif) +{ + struct proto *p = bif->proto; + struct babel_tlv_update *upd; + struct babel_entry *e; + struct babel_route *r; + struct babel_source *s; + u64 router_id = 0; + int res = 0, i = 0; + TRACE(D_PACKETS, "Sending update on %s", bif->ifname); + babel_new_packet(bif); + FIB_WALK(&P->rtable, n) { + e = (struct babel_entry *)n; + r = e->selected; + if(!r) continue; + i++; + + if(r->router_id != router_id) { + res = babel_add_router_id(bif, r->router_id); + if(res == 0) upd = babel_add_tlv_update(bif); + router_id = r->router_id; + } else { + upd = babel_add_tlv_update(bif); + } + if(res > 0 || !upd) { + babel_send(bif); + babel_add_router_id(bif, router_id); + upd = babel_add_tlv_update(bif); + i = 1; + } + + /* Our own seqno might have changed, in which case we update the routes we + originate. */ + if(r->router_id == P->router_id && r->seqno < P->update_seqno) + r->seqno = P->update_seqno; + upd->plen = e->n.pxlen; + upd->interval = bif->update_interval*100; + upd->seqno = r->seqno; + upd->metric = r->metric; + babel_put_addr(&upd->header, e->n.prefix); + + /* Update feasibility distance. */ + s = babel_get_source(e, r->router_id); + s->updated = now; + if(upd->seqno > s->seqno + || (upd->seqno == s->seqno && upd->metric < s->metric)) { + s->seqno = upd->seqno; + s->metric = upd->metric; + } + } FIB_WALK_END; + if(i > 0) babel_send(bif); +} + +/* Sends and update on all interfaces. */ +static void babel_global_update(void *arg) +{ + struct proto *p = arg; + struct babel_interface *bif; + TRACE(D_EVENTS, "Sending global update. Seqno %d", P->update_seqno); + WALK_LIST(bif, P->interfaces) + babel_send_update(bif); +} + +static void babel_update_timer(timer *t) +{ + struct babel_interface *bif = t->data; + struct proto *p = bif->proto; + TRACE(D_EVENTS, "Update timer firing"); + babel_send_update(bif); +} + + +int babel_handle_ack_req(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_ack_req *tlv = (struct babel_tlv_ack_req *)hdr; + struct proto *p = state->proto; + TRACE(D_PACKETS, "Received ACK req nonce %d interval %d", tlv->nonce, tlv->interval); + if(tlv->interval) { + babel_send_ack(state->bif, state->saddr, tlv->nonce); + } + return 1; +} + +int babel_handle_ack(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_ack *tlv = (struct babel_tlv_ack *)hdr; + struct proto *p = state->proto; + TRACE(D_PACKETS, "Received ACK nonce %d", tlv->nonce); + /* We don't send any ACK requests, so no need to do anything with ACKs. */ + return 1; +} + +static void babel_flush_neighbor(struct babel_neighbor *bn) +{ + struct proto *p = bn->bif->proto; + struct neighbor_route *r; + TRACE(D_EVENTS, "Flushing neighbor %I", bn->addr); + rem_node(NODE bn); + bn->neigh->data = NULL; + WALK_LIST_FIRST(r, bn->routes) + babel_flush_route(r->r); + rfree(bn->pool); // contains the neighbor itself +} + +static void expire_hello(timer *t) +{ + struct babel_neighbor *bn = t->data; + bn->hello_map <<= 1; + if(bn->hello_n < 16) bn->hello_n++; + if(!bn->hello_map) { + babel_flush_neighbor(bn); + } +} + +static void expire_ihu(timer *t) +{ + struct babel_neighbor *bn = t->data; + bn->txcost = BABEL_INFINITY; +} + + +/* update hello history according to Appendix A1 of the RFC */ +static void update_hello_history(struct babel_neighbor *bn, u16 seqno, u16 interval) +{ + u8 diff; + if(seqno == bn->next_hello_seqno) {/* do nothing */} + /* if the expected and seen seqnos are within 16 of each other (mod 65535), + the modular difference is going to be less than 16 for one of the + directions. Otherwise, the values differ too much, so just reset. */ + else if(diff_mod64k(seqno, bn->next_hello_seqno) > 16 && + diff_mod64k(bn->next_hello_seqno,seqno) > 16) { + /* note state reset - flush entries */ + bn->hello_map = bn->hello_n = 0; + } else if((diff = diff_mod64k(bn->next_hello_seqno,seqno)) <= 16) { + /* sending node increased interval; reverse history */ + bn->hello_map >>= diff; + bn->hello_n -= MAX(bn->hello_n-diff, 0); + } else if((diff = diff_mod64k(seqno,bn->next_hello_seqno)) <= 16) { + /* sending node decreased interval; fast-forward */ + bn->hello_map <<= diff; + bn->hello_n = MIN(bn->hello_n+diff, 16); + } + /* current entry */ + bn->hello_map = (bn->hello_map << 1) | 1; + bn->next_hello_seqno = seqno+1; + if(bn->hello_n < 16) bn->hello_n++; + tm_start(bn->hello_timer, (BABEL_HELLO_EXPIRY_FACTOR*interval)/100); +} + + +int babel_handle_hello(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_hello *tlv = (struct babel_tlv_hello *)hdr; + struct proto *p = state->proto; + struct babel_interface *bif = state->bif; + struct babel_neighbor *bn = babel_get_neighbor(bif, state->saddr); + TRACE(D_PACKETS, "Handling hello seqno %d interval %d", tlv->seqno, + tlv->interval, state->saddr); + update_hello_history(bn, tlv->seqno, tlv->interval); + return 0; +} + +int babel_handle_ihu(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_ihu *tlv = (struct babel_tlv_ihu *)hdr; + struct proto *p = state->proto; + struct babel_interface *bif = state->bif; + ip_addr addr = babel_get_addr(hdr, state); + + if(!ipa_equal(addr, bif->addr)) return 1; // not for us + TRACE(D_PACKETS, "Handling IHU rxcost %d interval %d", tlv->rxcost, + tlv->interval); + struct babel_neighbor *bn = babel_get_neighbor(bif, state->saddr); + bn->txcost = tlv->rxcost; + tm_start(bn->ihu_timer, 1.5*(tlv->interval/100)); + return 0; +} + +int babel_handle_router_id(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_router_id *tlv = (struct babel_tlv_router_id *)hdr; + struct proto *p = state->proto; + state->router_id = tlv->router_id; + TRACE(D_PACKETS, "Handling router ID %016lx", state->router_id); + return 0; +} + +int babel_handle_next_hop(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + state->next_hop = babel_get_addr(hdr, state); + return 0; +} + +int babel_handle_update(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_update *tlv = (struct babel_tlv_update *)hdr; + struct babel_interface *bif = state->bif; + struct proto *p = state->proto; + struct babel_neighbor *n; + struct babel_entry *e; + struct babel_source *s; + struct babel_route *r; + ip_addr prefix = babel_get_addr(hdr, state); + TRACE(D_PACKETS, "Handling update for %I/%d with seqno %d metric %d", + prefix, tlv->plen, tlv->seqno, tlv->metric); + if(tlv->flags & BABEL_FLAG_DEF_PREFIX) { + state->prefix = prefix; + } + if(tlv->flags & BABEL_FLAG_ROUTER_ID) { + u64 *buf = (u64*)&prefix; + memcpy(&state->router_id, buf+1, sizeof(u64)); + } + if(!state->router_id) + log(L_WARN "%s: Received update on %s with no preceding router id", p->name, bif->ifname); + + n = babel_find_neighbor(bif, state->saddr); + if(!n) { + DBG("Haven't heard from neighbor %I; ignoring update.\n", state->saddr); + return 1; + } + + /* RFC section 3.5.4: + + When a Babel node receives an update (id, prefix, seqno, metric) from a + neighbour neigh with a link cost value equal to cost, it checks whether it + already has a routing table entry indexed by (neigh, id, prefix). + + If no such entry exists: + + o if the update is unfeasible, it is ignored; + + o if the metric is infinite (the update is a retraction), the update is + ignored; + + o otherwise, a new route table entry is created, indexed by (neigh, id, + prefix), with seqno equal to seqno and an advertised metric equal to the + metric carried by the update. + + If such an entry exists: + + o if the entry is currently installed and the update is unfeasible, then + the behaviour depends on whether the router-ids of the two entries match. + If the router-ids are different, the update is treated as though it were + a retraction (i.e., as though the metric were FFFF hexadecimal). If the + router-ids are equal, the update is ignored; + + o otherwise (i.e., if either the update is feasible or the entry is not + currently installed), then the entry's sequence number, advertised + metric, metric, and router-id are updated and, unless the advertised + metric is infinite, the route's expiry timer is reset to a small multiple + of the Interval value included in the update. + +*/ + e = babel_get_entry(p, prefix, tlv->plen); + + s = babel_find_source(e, state->router_id); /* for feasibility */ + r = babel_find_route(e, n); /* the route entry indexed by neighbour */ + + if(!r) { + + if(!is_feasible(s, tlv->seqno, tlv->metric) || tlv->metric == BABEL_INFINITY) + return 1; + + r = babel_get_route(e, n); + r->advert_metric = tlv->metric; + r->router_id = state->router_id; + r->metric = compute_metric(n, tlv->metric); + r->next_hop = state->next_hop; + r->seqno = tlv->seqno; + } else if(r == r->e->selected + && !is_feasible(s, tlv->seqno, tlv->metric)) { + + /* route is installed and update is infeasible - check router id */ + + if(state->router_id == s->router_id) return 1; + r->metric = BABEL_INFINITY; /* retraction */ + } else { + /* last point above - update entry */ + r->seqno = tlv->seqno; + r->advert_metric = tlv->metric; + r->metric = compute_metric(n, tlv->metric); + r->router_id = state->router_id; + r->next_hop = state->next_hop; + r->seqno = tlv->seqno; + if(tlv->metric != BABEL_INFINITY) { + r->expiry_interval = (BABEL_ROUTE_EXPIRY_FACTOR*tlv->interval)/100; + tm_start(r->expiry_timer, r->expiry_interval); + } + } + babel_select_route(e); + return 0; +} + +/* A retraction is an update with an infinite metric. */ +static void babel_send_retraction(struct babel_interface *bif, ip_addr prefix, int plen) +{ + struct proto *p = bif->proto; + struct babel_tlv_update *upd; + babel_new_packet(bif); + babel_add_router_id(bif, P->router_id); + upd = babel_add_tlv_update(bif); + upd->plen = plen; + upd->interval = bif->update_interval*100; + upd->seqno = P->update_seqno; + upd->metric = BABEL_INFINITY; + babel_put_addr(&upd->header, prefix); + babel_send(bif); +} + +int babel_handle_route_request(struct babel_tlv_header *hdr, + struct babel_parse_state *state) +{ + struct babel_tlv_route_request *tlv = (struct babel_tlv_route_request *)hdr; + struct babel_interface *bif = state->bif; + struct proto *p = state->proto; + ip_addr prefix = babel_get_addr(hdr, state); + struct babel_entry *e; + + TRACE(D_PACKETS, "Handling route request for %I/%d on interface %s", + prefix, tlv->plen, bif->ifname); + + /* Wildcard request - full update on the interface */ + if(ipa_equal(prefix,IPA_NONE)) { + state->needs_update = 1; + return 0; + } + /* Non-wildcard request - see if we have an entry for the route. If not, send + a retraction, otherwise send an update. */ + e = babel_find_entry(p, prefix, tlv->plen); + if(!e) { + babel_send_retraction(bif, prefix, tlv->plen); + } else { + state->needs_update = 1; + } + return 0; +} + +static void expire_seqno_requests(timer *t) { + struct babel_seqno_request_cache *c = t->data; + struct babel_seqno_request *n, *nx; + WALK_LIST_DELSAFE(n, nx, c->entries) { + if(n->updated < now-BABEL_SEQNO_REQUEST_EXPIRY) { + rem_node(NODE n); + mb_free(n); + } + } +} + +/* Checks the seqno request cache for a matching request and returns failure if + found. Otherwise, a new entry is stored in the cache. */ +static int cache_seqno_request(struct proto *p, ip_addr prefix, u8 plen, + u64 router_id, u16 seqno) +{ + struct babel_seqno_request_cache *c = P->seqno_cache; + struct babel_seqno_request *r; + WALK_LIST(r, c->entries) { + if(ipa_equal(r->prefix, prefix) && r->plen == plen && + r->router_id == router_id && r->seqno == seqno) + return 0; + } + + /* no entries found */ + r = mb_allocz(c->pool, sizeof(struct babel_seqno_request)); + r->prefix = prefix; + r->plen = plen; + r->router_id = router_id; + r->seqno = seqno; + r->updated = now; + add_tail(&c->entries, NODE r); + return 1; +} + +void babel_forward_seqno_request(struct babel_entry *e, + struct babel_tlv_seqno_request *in, + ip_addr sender) +{ + struct proto *p = e->proto; + struct babel_route *r; + struct babel_interface *bif; + struct babel_tlv_seqno_request *out; + TRACE(D_PACKETS, "Forwarding seqno request for %I/%d router_id %0lx", + e->n.prefix, e->n.pxlen, in->router_id); + WALK_LIST(r, e->routes) { + if(r->router_id == in->router_id && r->neigh + && !ipa_equal(r->neigh->addr,sender)) { + if(!cache_seqno_request(p, e->n.prefix, e->n.pxlen, in->router_id, in->seqno)) + return; + bif = r->neigh->bif; + babel_new_packet(bif); + out = babel_add_tlv_seqno_request(bif); + out->plen = in->plen; + out->seqno = in->seqno; + out->hop_count = in->hop_count-1; + out->router_id = in->router_id; + babel_put_addr(&out->header, e->n.prefix); + babel_send_to(bif, r->neigh->addr); + return; + } + } +} + +/* The RFC section 3.8.1.2 on seqno requests: + + When a node receives a seqno request for a given router-id and sequence + number, it checks whether its routing table contains a selected entry for + that prefix; if no such entry exists, or the entry has infinite metric, it + ignores the request. + + If a selected route for the given prefix exists, and either the router-ids + are different or the router-ids are equal and the entry's sequence number is + no smaller than the requested sequence number, it MUST send an update for the + given prefix. + + If the router-ids match but the requested seqno is larger than the route + entry's, the node compares the router-id against its own router-id. If the + router-id is its own, then it increases its sequence number by 1 and sends an + update. A node MUST NOT increase its sequence number by more than 1 in + response to a route request. + + If the requested router-id is not its own, the received request's hop count + is 2 or more, and the node has a route (not necessarily a feasible one) for + the requested prefix that does not use the requestor as a next hop, the node + SHOULD forward the request. It does so by decreasing the hop count and + sending the request in a unicast packet destined to a neighbour that + advertises the given prefix (not necessarily the selected neighbour) and that + is distinct from the neighbour from which the request was received. + + A node SHOULD maintain a list of recently forwarded requests and forward the + reply in a timely manner. A node SHOULD compare every incoming request + against its list of recently forwarded requests and avoid forwarding it if it + is redundant. +*/ +int babel_handle_seqno_request(struct babel_tlv_header *hdr, + struct babel_parse_state *state) +{ + struct babel_tlv_seqno_request *tlv = (struct babel_tlv_seqno_request *)hdr; + struct proto *p = state->proto; + ip_addr prefix = babel_get_addr(hdr, state); + struct babel_entry *e; + struct babel_route *r; + + TRACE(D_PACKETS, "Handling seqno request for %I/%d router_id %0lx seqno %d hop count %d", + prefix, tlv->plen, tlv->router_id, tlv->seqno, tlv->hop_count); + + e = babel_find_entry(p, prefix, tlv->plen); + if(!e || !e->selected || e->selected->metric == BABEL_INFINITY) return 1; + + r = e->selected; + if(r->router_id != tlv->router_id || ge_mod64k(r->seqno, tlv->seqno) >= 0) { + state->needs_update = 1; + return 0; + } + + /* seqno is larger; check if we own the router id */ + if(tlv->router_id == P->router_id) { + P->update_seqno++; + ev_schedule(P->update_event); + return 0; + } + + if(tlv->hop_count > 1) { + babel_forward_seqno_request(e, tlv, state->saddr); + } + + return 1; + +} + + + +/* + * Interface to BIRD core + */ + +/* + * babel_start - initialize instance of babel + */ + +static void babel_dump_source(struct babel_source *s) +{ + debug("Source router_id %0lx seqno %d metric %d\n", + s->router_id, s->seqno, s->metric); +} + +static void babel_dump_route(struct babel_route *r) +{ + debug("Route neigh %I seqno %d metric %d/%d router_id %0lx\n", + r->neigh ? r->neigh->addr : IPA_NONE, r->seqno, r->advert_metric, + r->metric, r->router_id); +} + +static void babel_dump_entry(struct babel_entry *e) +{ + debug("Babel: Entry %I/%d:\n", e->n.prefix, e->n.pxlen); + struct babel_source *s; struct babel_route *r; + WALK_LIST(s,e->sources) { debug(" "); babel_dump_source(s); } + WALK_LIST(r,e->routes) { debug(r==e->selected?" * " : " "); babel_dump_route(r); } +} +static void babel_dump_neighbor(struct babel_neighbor *bn) +{ + debug("Neighbor %I txcost %d hello_map %x next seqno %d\n", + bn->addr, bn->txcost, bn->hello_map, bn->next_hello_seqno); +} +static void babel_dump_interface(struct babel_interface *bif) +{ + struct babel_neighbor *bn; + debug("Babel: Interface %s addr %I rxcost %d type %d hello seqno %d intervals %d %d %d\n", + bif->ifname, bif->addr, bif->rxcost, bif->type, bif->hello_seqno, + bif->hello_interval, bif->ihu_interval, bif->update_interval); + WALK_LIST(bn,bif->neigh_list) { debug(" "); babel_dump_neighbor(bn); } + +} + +static void babel_dump(struct proto *p) +{ + struct babel_entry *e; + struct babel_interface *bif; + debug("Babel: router id %0xl update seqno %d\n", P->router_id, P->update_seqno); + WALK_LIST(bif, P->interfaces) {babel_dump_interface(bif);} + FIB_WALK(&P->rtable, n) { + e = (struct babel_entry *)n; + babel_dump_entry(e); + } FIB_WALK_END; +} + +static void babel_tx_err( sock *s, int err ) +{ + log( L_ERR ": Unexpected error at Babel transmit: %M", err ); +} + + +static int +babel_rx(sock *s, int size) +{ + struct babel_interface *bif = s->data; + struct proto *p = bif->proto; + if (! bif->iface || s->lifindex != bif->iface->index) + return 1; + + DBG( "Babel: incoming packet: %d bytes from %I via %s\n", size, s->faddr, bif->iface ? bif->iface->name : "(dummy)" ); + if (size < sizeof(struct babel_header)) BAD( "Too small packet" ); + + if (ipa_equal(bif->iface->addr->ip, s->faddr)) { + DBG("My own packet\n"); + return 1; + } + + if (!ipa_is_link_local(s->faddr)) { BAD("Non-link local sender"); } + + babel_process_packet((struct babel_header *) s->rbuf, size, s->faddr, s->fport, bif ); + return 1; +} + +static struct babel_interface* +find_interface(struct proto *p, struct iface *what) +{ + struct babel_interface *bif; + + WALK_LIST (bif, P->interfaces) + if (bif->iface == what) + return bif; + return NULL; +} + +static void +kill_iface(struct babel_interface *bif) +{ + DBG( "Babel: Interface %s disappeared\n", bif->iface->name); + struct babel_neighbor *bn; + WALK_LIST_FIRST(bn, bif->neigh_list) + babel_flush_neighbor(bn); + rfree(bif->pool); +} + + + +static void +babel_add_if(struct object_lock *lock) +{ + struct iface *iface = lock->iface; + struct proto *p = lock->data; + struct babel_interface *bif; + struct iface_patt *k = iface_patt_find(&P_CF->iface_list, iface, iface->addr); + DBG("adding interface %s\n", iface->name ); + bif = new_iface(p, iface, iface->flags, k); + if (bif) { + add_head( &P->interfaces, NODE bif ); + DBG("Adding object lock of %p for %p\n", lock, bif); + bif->lock = lock; + babel_send_hello(bif,0); + } else { rfree(lock); } +} + + + +static void +babel_if_notify(struct proto *p, unsigned c, struct iface *iface) +{ + DBG("Babel: if notify: %s\n", iface->name); + if (iface->flags & IF_IGNORE) + return; + if (c & IF_CHANGE_DOWN) { + struct babel_interface *bif; + bif = find_interface(p, iface); + if (bif) { + rem_node(NODE bif); + rfree(bif->lock); + kill_iface(bif); + } + } + if (c & IF_CHANGE_UP) { + struct iface_patt *k = iface_patt_find(&P_CF->iface_list, iface, iface->addr); + struct object_lock *lock; + + /* we only speak multicast */ + if(!(iface->flags & IF_MULTICAST)) return; + + if (!k) return; /* We are not interested in this interface */ + + lock = olock_new( p->pool ); + lock->addr = IP6_BABEL_ROUTERS; + lock->port = P_CF->port; + lock->iface = iface; + lock->hook = babel_add_if; + lock->data = p; + lock->type = OBJLOCK_UDP; + olock_acquire(lock); + } + +} + +static struct babel_interface *new_iface(struct proto *p, struct iface *new, + unsigned long flags, struct iface_patt *patt) +{ + struct babel_interface * bif; + struct babel_patt *PATT = (struct babel_patt *) patt; + pool *pool; + + if(!new) return NULL; + + pool = rp_new(p->pool, new->name); + bif = mb_allocz(pool, sizeof( struct babel_interface )); + bif->pool = pool; + bif->iface = new; + bif->ifname = new->name; + bif->proto = p; + struct ifa* ifa; + WALK_LIST(ifa, new->addrs) + if(ipa_is_link_local(ifa->ip)) + bif->addr = ifa->ip; + if (PATT) { + bif->rxcost = PATT->rxcost; + bif->type = PATT->type; + + if(bif->type == BABEL_IFACE_TYPE_WIRED) { + bif->hello_interval = BABEL_HELLO_INTERVAL_WIRED; + bif->rxcost = BABEL_RXCOST_WIRED; + } else if(bif->type == BABEL_IFACE_TYPE_WIRELESS) { + bif->hello_interval = BABEL_HELLO_INTERVAL_WIRELESS; + bif->rxcost = BABEL_RXCOST_WIRELESS; + } + if(PATT->hello_interval < BABEL_INFINITY) { + bif->hello_interval = PATT->hello_interval; + } + if(PATT->rxcost < BABEL_INFINITY) { + bif->rxcost = PATT->rxcost; + } + if(PATT->update_interval < BABEL_INFINITY) { + bif->update_interval = PATT->update_interval; + } else { + bif->update_interval = bif->hello_interval*BABEL_UPDATE_INTERVAL_FACTOR; + } + bif->ihu_interval = bif->hello_interval*BABEL_IHU_INTERVAL_FACTOR; + } + init_list(&bif->neigh_list); + bif->hello_seqno = 1; + bif->max_pkt_len = new->mtu - BABEL_OVERHEAD; + + bif->ihu_event = ev_new(bif->pool); + bif->ihu_event->hook = babel_send_ihus; + bif->ihu_event->data = bif; + + + bif->hello_timer = tm_new_set(bif->pool, babel_hello_timer, bif, 0, bif->hello_interval); + bif->update_timer = tm_new_set(bif->pool, babel_update_timer, bif, 0, bif->update_interval); + + bif->sock = sk_new( bif->pool ); + bif->sock->type = SK_UDP; + bif->sock->sport = P_CF->port; + bif->sock->rx_hook = babel_rx; + bif->sock->data = bif; + bif->sock->rbsize = 10240; + bif->sock->iface = new; + bif->sock->tbuf = mb_alloc( bif->pool, new->mtu); + bif->sock->err_hook = babel_tx_err; + bif->sock->dport = P_CF->port; + bif->sock->daddr = IP6_BABEL_ROUTERS; + + bif->sock->tos = PATT->tx_tos; + bif->sock->priority = PATT->tx_priority; + bif->sock->flags = SKF_LADDR_RX; + if (sk_open( bif->sock) < 0) + goto err; + if (sk_setup_multicast( bif->sock) < 0) + goto err; + if (sk_join_group( bif->sock, bif->sock->daddr) < 0) + goto err; + TRACE(D_EVENTS, "Listening on %s, port %d, mode multicast (%I)", bif->iface ? bif->iface->name : "(dummy)", P_CF->port, bif->sock->daddr ); + + tm_start(bif->hello_timer, bif->hello_interval); + tm_start(bif->update_timer, bif->update_interval); + + return bif; + err: + sk_log_error(bif->sock, p->name); + log(L_ERR "%s: Cannot open socket for %s", p->name, bif->iface ? bif->iface->name : "(dummy)" ); + if (bif->iface) { + rfree(bif->pool); + mb_free(bif); + return NULL; + } + + return bif; +} + + +static struct ea_list * +babel_gen_attrs(struct linpool *pool, int metric, u64 router_id) +{ + struct ea_list *l = lp_alloc(pool, sizeof(struct ea_list) + 2*sizeof(eattr)); + struct adata *rid = lp_alloc(pool, sizeof(struct adata) + sizeof(u64)); + rid->length = sizeof(u64); + memcpy(&rid->data, &router_id, sizeof(u64)); + + l->next = NULL; + l->flags = EALF_SORTED; + l->count = 2; + l->attrs[0].id = EA_BABEL_METRIC; + l->attrs[0].flags = 0; + l->attrs[0].type = EAF_TYPE_INT | EAF_TEMP; + l->attrs[0].u.data = metric; + l->attrs[1].id = EA_BABEL_ROUTER_ID; + l->attrs[1].flags = 0; + l->attrs[1].type = EAF_TYPE_OPAQUE | EAF_TEMP; + l->attrs[1].u.ptr = rid; + return l; +} + +static void +babel_timer(timer *t) +{ +} + + +static int +babel_import_control(struct proto *p, struct rte **rt, struct ea_list **attrs, struct linpool *pool) +{ + if ((*rt)->attrs->src->proto == p) /* My own must not be touched */ + return 1; + + if ((*rt)->attrs->source != RTS_BABEL) { + struct ea_list *new = babel_gen_attrs(pool, 1, P->router_id); + new->next = *attrs; + *attrs = new; + } + return 0; +} + +static struct ea_list * +babel_make_tmp_attrs(struct rte *rt, struct linpool *pool) +{ + return babel_gen_attrs(pool, rt->u.babel.metric, rt->u.babel.router_id); +} + +static void +babel_store_tmp_attrs(struct rte *rt, struct ea_list *attrs) +{ + eattr *rid = ea_find(attrs, EA_BABEL_ROUTER_ID); + rt->u.babel.router_id = rid ? *((u64*) rid->u.ptr->data) : 0; + rt->u.babel.metric = ea_get_int(attrs, EA_BABEL_METRIC, 0); +} + +/* + * babel_rt_notify - core tells us about new route (possibly our + * own), so store it into our data structures. + */ +static void +babel_rt_notify(struct proto *p, struct rtable *table UNUSED, struct network *net, + struct rte *new, struct rte *old, struct ea_list *attrs) +{ + struct babel_entry *e; + struct babel_route *r; + + TRACE(D_EVENTS, "Got route from nest: %I/%d", net->n.prefix, net->n.pxlen); + if(new) { + e = babel_get_entry(p, net->n.prefix, net->n.pxlen); + r = (e->selected) ? e->selected : babel_get_route(e, NULL); + + if(!r->neigh) { + r->seqno = P->update_seqno; + r->router_id = P->router_id; + r->metric = 0; + e->selected = r; + } + } else if(old) { + /* route has gone away; send retraction */ + e = babel_find_entry(p, net->n.prefix, net->n.pxlen); + if(e && e->selected && !e->selected->neigh) { + /* no neighbour, so our route */ + e->selected->metric = BABEL_INFINITY; + tm_start(e->selected->expiry_timer, BABEL_HOLD_TIME); + babel_select_route(e); + } + } else { + return; + } + ev_schedule(P->update_event); +} + +static void babel_neigh_notify(neighbor *n) +{ + struct proto *p = n->proto; + struct babel_neighbor *bn = n->data; + DBG("Neighbor: bn %d scope %d flags %d\n", bn, n->scope, n->flags); + if(n->scope <= 0) { + TRACE(D_EVENTS, "Babel: Neighbor lost"); + } else { + TRACE(D_EVENTS, "Babel: Neighbor ready"); + } +} + + +static struct proto * +babel_init(struct proto_config *cfg) +{ + struct proto *p = proto_new(cfg, sizeof(struct babel_proto)); + + p->accept_ra_types = RA_OPTIMAL; + p->if_notify = babel_if_notify; + p->rt_notify = babel_rt_notify; + p->neigh_notify = babel_neigh_notify; + p->import_control = babel_import_control; + p->make_tmp_attrs = babel_make_tmp_attrs; + p->store_tmp_attrs = babel_store_tmp_attrs; + p->rte_better = babel_rte_better; + p->rte_same = babel_rte_same; + + return p; +} + +void +babel_init_config(struct babel_proto_config *c) +{ + init_list(&c->iface_list); + c->port = BABEL_PORT; +} + +static void +babel_get_route_info(rte *rte, byte *buf, ea_list *attrs) +{ + buf += bsprintf(buf, " (%d)", rte->u.babel.metric); +} + + +static int +babel_get_attr(eattr *a, byte *buf, int buflen UNUSED) +{ + switch (a->id) { + case EA_BABEL_METRIC: bsprintf( buf, "metric: %d", a->u.data ); return GA_FULL; + default: return GA_UNKNOWN; + } +} + +static int +babel_reconfigure(struct proto *p, struct proto_config *c) +{ + return 0; +} + +static void +babel_copy_config(struct proto_config *dest, struct proto_config *src) +{ + /* Shallow copy of everything */ + proto_copy_rest(dest, src, sizeof(struct babel_proto_config)); + + /* We clean up iface_list, ifaces are non-sharable */ + init_list(&((struct babel_proto_config *) dest)->iface_list); + +} + +static int +babel_start(struct proto *p) +{ + pool *pool; + DBG( "Babel: starting instance...\n" ); + fib_init( &P->rtable, p->pool, sizeof( struct babel_entry ), 0, babel_init_entry ); + init_list( &P->interfaces ); + P->timer = tm_new_set(p->pool, babel_timer, p, 0, 1); + tm_start( P->timer, 2 ); + P->update_seqno = 1; + P->router_id = proto_get_router_id(&P_CF->c); + P->update_event = ev_new(p->pool); + P->update_event->hook = babel_global_update; + P->update_event->data = p; + + pool = rp_new(p->pool, "Seqno request cache"); + P->seqno_cache = mb_allocz(pool, sizeof(struct babel_seqno_request_cache)); + P->seqno_cache->pool = pool; + init_list(&P->seqno_cache->entries); + P->seqno_cache->timer = tm_new_set(pool, expire_seqno_requests, + P->seqno_cache, 0, BABEL_SEQNO_REQUEST_EXPIRY); + tm_start(P->seqno_cache->timer, BABEL_SEQNO_REQUEST_EXPIRY); + DBG( "Babel: ...done\n"); + return PS_UP; +} + + + +struct protocol proto_babel = { + .name = "Babel", + .template = "babel%d", + .attr_class = EAP_BABEL, + .preference = DEF_PREF_BABEL, + .config_size = sizeof(struct babel_proto_config), + .init = babel_init, + .dump = babel_dump, + .start = babel_start, + .reconfigure = babel_reconfigure, + .copy_config = babel_copy_config, + .get_route_info = babel_get_route_info, + .get_attr = babel_get_attr +}; diff --git a/proto/babel/babel.h b/proto/babel/babel.h new file mode 100644 index 0000000..849c9bd --- /dev/null +++ b/proto/babel/babel.h @@ -0,0 +1,427 @@ +/* -*- c-file-style: "gnu"; -*- + * + * The Babel protocol + * + * Copyright (c) 2015 Toke Høiland-Jørgensen + * + * Can be freely distributed and used under the terms of the GNU GPL. + * + * This file contains the data structures used by Babel. + */ + +#include "nest/bird.h" +#include "nest/iface.h" +#include "nest/route.h" +#include "nest/protocol.h" +#include "nest/locks.h" +#include "lib/resource.h" +#include "lib/lists.h" +#include "lib/socket.h" +#include "lib/string.h" +#include "lib/timer.h" + +#ifndef IPV6 +#error "The Babel protocol only speaks IPv6" +#endif + +#define EA_BABEL_METRIC EA_CODE(EAP_BABEL, 0) +#define EA_BABEL_ROUTER_ID EA_CODE(EAP_BABEL, 1) + +#define BABEL_MAGIC 42 +#define BABEL_VERSION 2 +#define BABEL_PORT 6696 +#define BABEL_INFINITY 0xFFFF + + /* default hello intervals in seconds */ +#define BABEL_HELLO_INTERVAL_WIRED 20 +#define BABEL_HELLO_INTERVAL_WIRELESS 4 +#define BABEL_UPDATE_INTERVAL_FACTOR 4 +#define BABEL_IHU_INTERVAL_FACTOR 3 +#define BABEL_HELLO_EXPIRY_FACTOR 1.5 +#define BABEL_ROUTE_EXPIRY_FACTOR 3.5 +#define BABEL_HOLD_TIME 10 /* expiry time for our own routes */ +#define BABEL_RXCOST_WIRED 96 +#define BABEL_RXCOST_WIRELESS 256 +#define BABEL_INITIAL_HOP_COUNT 255 + +#define BABEL_SEQNO_REQUEST_EXPIRY 60 +#define BABEL_SOURCE_EXPIRY 300 + +/* ip header + udp header + babel header */ +#define BABEL_OVERHEAD (SIZE_OF_IP_HEADER+8+sizeof(struct babel_header)) + +struct babel_header { + u8 magic; + u8 version; + u16 length; +}; + +enum babel_tlv_type_t { + BABEL_TYPE_PAD0 = 0, + BABEL_TYPE_PADN = 1, + BABEL_TYPE_ACK_REQ = 2, + BABEL_TYPE_ACK = 3, + BABEL_TYPE_HELLO = 4, + BABEL_TYPE_IHU = 5, + BABEL_TYPE_ROUTER_ID = 6, + BABEL_TYPE_NEXT_HOP = 7, + BABEL_TYPE_UPDATE = 8, + BABEL_TYPE_ROUTE_REQUEST = 9, + BABEL_TYPE_SEQNO_REQUEST = 10, + /* extensions - not implemented + BABEL_TYPE_TS_PC = 11, + BABEL_TYPE_HMAC = 12, + BABEL_TYPE_SS_UPDATE = 13, + BABEL_TYPE_SS_REQUEST = 14, + BABEL_TYPE_SS_SEQNO_REQUEST = 15, + */ + BABEL_TYPE_MAX +}; + +enum babel_iface_type_t { + BABEL_IFACE_TYPE_WIRED, + BABEL_IFACE_TYPE_WIRELESS, + BABEL_IFACE_TYPE_MAX +}; + +enum babel_ae_type_t { + BABEL_AE_WILDCARD = 0, + BABEL_AE_IP4 = 1, + BABEL_AE_IP6 = 2, + BABEL_AE_IP6_LL = 3, + BABEL_AE_MAX +}; + + +struct babel_parse_state { + struct proto *proto; + struct babel_interface *bif; + ip_addr saddr; + u64 router_id; + ip_addr prefix; + ip_addr next_hop; + u8 needs_update; +}; + + + + +struct babel_tlv_header { + u8 type; + u8 length; +}; + +struct babel_tlv_ack_req { + struct babel_tlv_header header; + u16 reserved; + u16 nonce; + u16 interval; +}; + +void babel_hton_ack_req(struct babel_tlv_header *tlv); +void babel_ntoh_ack_req(struct babel_tlv_header *tlv); + +struct babel_tlv_ack { + struct babel_tlv_header header; + u16 nonce; +}; + +struct babel_tlv_hello { + struct babel_tlv_header header; + u16 reserved; + u16 seqno; + u16 interval; +}; + +void babel_hton_hello(struct babel_tlv_header *tlv); +void babel_ntoh_hello(struct babel_tlv_header *tlv); + + +struct babel_tlv_ihu { + struct babel_tlv_header header; + u8 ae; + u8 reserved; + u16 rxcost; + u16 interval; + u32 addr[2]; +}; +void babel_hton_ihu(struct babel_tlv_header *tlv); +void babel_ntoh_ihu(struct babel_tlv_header *tlv); +ip_addr babel_get_addr_ihu(struct babel_tlv_header *tlv, struct babel_parse_state *state); +void babel_put_addr_ihu(struct babel_tlv_header *tlv, ip_addr addr); + + +struct babel_tlv_router_id { + struct babel_tlv_header header; + u16 reserved; + u64 router_id __attribute__((packed)); +}; +void babel_hton_router_id(struct babel_tlv_header *tlv); +void babel_ntoh_router_id(struct babel_tlv_header *tlv); + +struct babel_tlv_next_hop { + struct babel_tlv_header header; + u8 ae; + u8 reserved; + u32 addr[2]; +}; +ip_addr babel_get_addr_next_hop(struct babel_tlv_header *tlv, struct babel_parse_state *state); +void babel_put_addr_next_hop(struct babel_tlv_header *tlv, ip_addr addr); + +struct babel_tlv_update { + struct babel_tlv_header header; + u8 ae; +#define BABEL_FLAG_DEF_PREFIX 0x80 +#define BABEL_FLAG_ROUTER_ID 0x40 + u8 flags; + u8 plen; + u8 omitted; + u16 interval; + u16 seqno; + u16 metric; + u32 addr[4]; +}; +void babel_hton_update(struct babel_tlv_header *tlv); +void babel_ntoh_update(struct babel_tlv_header *tlv); +ip_addr babel_get_addr_update(struct babel_tlv_header *tlv, struct babel_parse_state *state); +void babel_put_addr_update(struct babel_tlv_header *tlv, ip_addr addr); + +struct babel_tlv_route_request { + struct babel_tlv_header header; + u8 ae; + u8 plen; + u32 addr[4]; +}; +/* works for both types of request */ +ip_addr babel_get_addr_request(struct babel_tlv_header *tlv, + struct babel_parse_state *state); +void babel_put_addr_request(struct babel_tlv_header *tlv, ip_addr addr); + +struct babel_tlv_seqno_request { + struct babel_tlv_header header; + u8 ae; + u8 plen; + u16 seqno; + u8 hop_count; + u8 reserved; + u64 router_id __attribute__((packed)); + u32 addr[4]; +}; +void babel_hton_seqno_request(struct babel_tlv_header *tlv); +void babel_ntoh_seqno_request(struct babel_tlv_header *tlv); + + +/* Handlers */ + +int babel_validate_length(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_handle_ack_req(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_handle_ack(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_handle_hello(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_handle_ihu(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_validate_ihu(struct babel_tlv_header *hdr, struct babel_parse_state *state); +int babel_handle_router_id(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_handle_next_hop(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_validate_next_hop(struct babel_tlv_header *hdr, struct babel_parse_state *state); +int babel_handle_update(struct babel_tlv_header *tlv, struct babel_parse_state *state); +int babel_validate_update(struct babel_tlv_header *hdr, struct babel_parse_state *state); +int babel_handle_route_request(struct babel_tlv_header *tlv, + struct babel_parse_state *state); +int babel_validate_request(struct babel_tlv_header *hdr, struct babel_parse_state *state); +int babel_handle_seqno_request(struct babel_tlv_header *tlv, + struct babel_parse_state *state); + +/* Stores forwarded seqno requests for duplicate suppression. */ +struct babel_seqno_request { + node n; + ip_addr prefix; + u8 plen; + u64 router_id; + u16 seqno; + bird_clock_t updated; +}; + +struct babel_seqno_request_cache { + pool *pool; + list entries; + timer *timer; +}; + + +struct babel_interface { + node n; + + struct proto *proto; + struct iface *iface; + struct object_lock *lock; + + pool *pool; + char *ifname; + sock *sock; + ip_addr addr; + int max_pkt_len; + int rxcost; + int type; + list neigh_list; + + u16 hello_seqno; /* To be increased on each hello */ + u16 hello_interval; + u16 ihu_interval; + u16 update_interval; + + timer *hello_timer; + timer *update_timer; + event *ihu_event; +}; + +struct babel_patt { + struct iface_patt i; + + int rxcost; + int type; + int tx_tos; + int tx_priority; + int hello_interval; + int update_interval; +}; + +struct babel_neighbor { + node n; + struct babel_interface *bif; + + neighbor *neigh; + ip_addr addr; + pool *pool; + u16 txcost; + u8 hello_n; + u16 hello_map; + u16 next_hello_seqno; + /* expiry timers */ + timer *hello_timer; + timer *ihu_timer; + + list routes; +}; + +struct babel_entry; + +struct babel_source { + node n; + struct babel_entry *e; + + u64 router_id; + u16 seqno; + u16 metric; + bird_clock_t updated; +}; + +struct neighbor_route { + node n; + struct babel_route *r; +}; + +struct babel_route { + node n; + struct neighbor_route neigh_route; + struct babel_entry *e; + struct babel_neighbor *neigh; + + u16 seqno; + u16 advert_metric; + u16 metric; + u64 router_id; + ip_addr next_hop; + timer * expiry_timer; + u16 expiry_interval; +}; + + +struct babel_entry { + struct fib_node n; + struct proto *proto; + struct babel_route *selected; + + pool *pool; + list sources; + list routes; + timer *source_expiry_timer; +}; + + + +struct babel_proto_config { + struct proto_config c; + + list iface_list; /* Patterns configured -- keep it first; see babel_reconfigure why */ + int port; +}; + +struct babel_proto { + struct proto inherited; + timer *timer; + struct fib rtable; + list interfaces; /* Interfaces we really know about */ + u16 update_seqno; /* To be increased on request */ + u64 router_id; + event *update_event; /* For triggering global updates */ + + struct babel_seqno_request_cache *seqno_cache; +}; + + + +void babel_init_config(struct babel_proto_config *c); + +/* Packet mangling code - packet.c */ +void babel_send( struct babel_interface *bif ); +void babel_send_to( struct babel_interface *bif, ip_addr dest ); +void babel_send_update(struct babel_interface *bif); +int babel_process_packet(struct babel_header *pkt, int size, + ip_addr saddr, int port, struct babel_interface *bif); +ip_addr babel_get_addr(struct babel_tlv_header *hdr, struct babel_parse_state *state); +void babel_put_addr(struct babel_tlv_header *hdr, ip_addr addr); +void babel_new_packet(struct babel_interface *bif); +struct babel_tlv_header * babel_add_tlv(struct babel_interface *bif, u16 len); +#define BABEL_ADD_TLV_SEND(tlv,bif,func,addr) do { \ + tlv=func(bif); \ + if(!tlv) { \ + babel_send_to(bif,addr); \ + babel_new_packet(bif); \ + tlv=func(bif); \ + }} while (0); + +inline struct babel_tlv_ack_req * babel_add_tlv_ack_req(struct babel_interface *bif) +{ + return (struct babel_tlv_ack_req *) babel_add_tlv(bif, BABEL_TYPE_ACK_REQ); +} +inline struct babel_tlv_ack * babel_add_tlv_ack(struct babel_interface *bif) +{ + return (struct babel_tlv_ack *) babel_add_tlv(bif, BABEL_TYPE_ACK); +} +inline struct babel_tlv_hello * babel_add_tlv_hello(struct babel_interface *bif) +{ + return (struct babel_tlv_hello *) babel_add_tlv(bif, BABEL_TYPE_HELLO); +} +inline struct babel_tlv_ihu * babel_add_tlv_ihu(struct babel_interface *bif) +{ + return (struct babel_tlv_ihu *) babel_add_tlv(bif, BABEL_TYPE_IHU); +} +inline struct babel_tlv_router_id * babel_add_tlv_router_id(struct babel_interface *bif) +{ + return (struct babel_tlv_router_id *) babel_add_tlv(bif, BABEL_TYPE_ROUTER_ID); +} +inline struct babel_tlv_next_hop * babel_add_tlv_next_hop(struct babel_interface *bif) +{ + return (struct babel_tlv_next_hop *) babel_add_tlv(bif, BABEL_TYPE_NEXT_HOP); +} +inline struct babel_tlv_update * babel_add_tlv_update(struct babel_interface *bif) +{ + return (struct babel_tlv_update *) babel_add_tlv(bif, BABEL_TYPE_UPDATE); +} +inline struct babel_tlv_route_request * babel_add_tlv_route_request(struct babel_interface *bif) +{ + return (struct babel_tlv_route_request *) babel_add_tlv(bif, BABEL_TYPE_ROUTE_REQUEST); +} +inline struct babel_tlv_seqno_request * babel_add_tlv_seqno_request(struct babel_interface *bif) +{ + return (struct babel_tlv_seqno_request *) babel_add_tlv(bif, BABEL_TYPE_SEQNO_REQUEST); +} diff --git a/proto/babel/config.Y b/proto/babel/config.Y new file mode 100644 index 0000000..34fa6cf --- /dev/null +++ b/proto/babel/config.Y @@ -0,0 +1,84 @@ +/* + * BIRD -- Babel Configuration + * + * Can be freely distributed and used under the terms of the GNU GPL. + */ + + + +CF_HDR + +#include "proto/babel/babel.h" +#include "nest/iface.h" + +CF_DEFINES + +#define BABEL_CFG ((struct babel_proto_config *) this_proto) +#define BABEL_IPATT ((struct babel_patt *) this_ipatt) + +CF_DECLS + +CF_KEYWORDS(BABEL, METRIC, RXCOST, HELLO_INTERVAL, UPDATE_INTERVAL, PORT, WIRED, +WIRELESS, BABEL_METRIC) + +CF_GRAMMAR + +CF_ADDTO(proto, babel_cfg '}' { } ) + +babel_cfg_start: proto_start BABEL { + this_proto = proto_config_new(&proto_babel, $1); + babel_init_config(BABEL_CFG); + } + ; + +babel_cfg: + babel_cfg_start proto_name '{' + | babel_cfg proto_item ';' + | babel_cfg PORT expr ';' { BABEL_CFG->port = $3; } + | babel_cfg INTERFACE babel_iface ';' + ; + + +babel_iface_item: + | RXCOST expr { BABEL_IPATT->rxcost = $2; } + | TX tos { BABEL_IPATT->tx_tos = $2; } + | TX PRIORITY expr { BABEL_IPATT->tx_priority = $3; } + | TYPE WIRED { BABEL_IPATT->type = BABEL_IFACE_TYPE_WIRED; } + | TYPE WIRELESS { BABEL_IPATT->type = BABEL_IFACE_TYPE_WIRELESS; } + | HELLO_INTERVAL expr { BABEL_IPATT->hello_interval = $2; } + | UPDATE_INTERVAL expr { BABEL_IPATT->update_interval = $2; } + ; + +babel_iface_opts: + /* empty */ + | babel_iface_opts babel_iface_item ';' + ; + +babel_iface_opt_list: + /* empty */ + | '{' babel_iface_opts '}' + ; + +babel_iface_init: + /* EMPTY */ { + this_ipatt = cfg_allocz(sizeof(struct babel_patt)); + add_tail(&BABEL_CFG->iface_list, NODE this_ipatt); + init_list(&this_ipatt->ipn_list); + BABEL_IPATT->rxcost = BABEL_INFINITY; + BABEL_IPATT->type = BABEL_IFACE_TYPE_WIRED; + BABEL_IPATT->tx_tos = IP_PREC_INTERNET_CONTROL; + BABEL_IPATT->tx_priority = sk_priority_control; + BABEL_IPATT->hello_interval = BABEL_INFINITY; + BABEL_IPATT->update_interval = BABEL_INFINITY; + } + ; + +babel_iface: /* TODO: switch to iface_patt_list_nopx */ + babel_iface_init iface_patt_list babel_iface_opt_list + ; + +CF_ADDTO(dynamic_attr, BABEL_METRIC { $$ = f_new_dynamic_attr(EAF_TYPE_INT | EAF_TEMP, T_INT, EA_BABEL_METRIC); }) + +CF_CODE + +CF_END diff --git a/proto/babel/packet.c b/proto/babel/packet.c new file mode 100644 index 0000000..d1a3c16 --- /dev/null +++ b/proto/babel/packet.c @@ -0,0 +1,436 @@ +/** -*- c-file-style: "gnu"; -*- + * + * The Babel protocol + * + * Copyright (c) 2015 Toke Høiland-Jørgensen + * + * Can be freely distributed and used under the terms of the GNU GPL. + * + * This files contains the packet and TLV handling routines for the protocol. + */ + +#undef LOCAL_DEBUG +#define LOCAL_DEBUG 1 + +#include "babel.h" + + +#define FIRST_TLV(p) ((struct babel_tlv_header *)(((struct babel_header *) p) + 1)) +#define NEXT_TLV(t) (t = (void *)((char *)t) + TLV_SIZE(t)) +#define TLV_SIZE(t) (t->type == BABEL_TYPE_PAD0 ? 1 : t->length + sizeof(struct babel_tlv_header)) +#define TLV_LENGTH(t) (tlv_data[t].struct_length-sizeof(struct babel_tlv_header)) + + + +static ip_addr get_ip6_ll(u32 *addr) +{ + return ip6_or(ipa_build6(0xfe800000,0,0,0), + ipa_build6(0,0,ntohl(addr[0]),ntohl(addr[1]))); +} + + +struct babel_tlv_data { + int struct_length; + int (*handle)(struct babel_tlv_header *tlv, + struct babel_parse_state *state); + int (*validate)(struct babel_tlv_header *tlv, + struct babel_parse_state *state); + void (*hton)(struct babel_tlv_header *tlv); + void (*ntoh)(struct babel_tlv_header *tlv); + ip_addr (*get_addr)(struct babel_tlv_header *tlv, + struct babel_parse_state *state); + void (*put_addr)(struct babel_tlv_header *tlv, ip_addr addr); +}; + +static struct babel_tlv_data tlv_data[BABEL_TYPE_MAX] = { + {1, NULL,NULL,NULL,NULL,NULL}, + {3, NULL,NULL,NULL,NULL,NULL}, + {sizeof(struct babel_tlv_ack_req), + babel_handle_ack_req, babel_validate_length, + babel_hton_ack_req, babel_ntoh_ack_req, + NULL,NULL}, + {sizeof(struct babel_tlv_ack), + babel_handle_ack, babel_validate_length, + NULL, NULL, + NULL, NULL}, + {sizeof(struct babel_tlv_hello), + babel_handle_hello, babel_validate_length, + babel_hton_hello, babel_ntoh_hello, + NULL, NULL}, + {sizeof(struct babel_tlv_ihu), + babel_handle_ihu, babel_validate_ihu, + babel_hton_ihu, babel_ntoh_ihu, + babel_get_addr_ihu, babel_put_addr_ihu}, + {sizeof(struct babel_tlv_router_id), + babel_handle_router_id, babel_validate_length, + babel_hton_router_id, babel_ntoh_router_id, + NULL, NULL}, + {sizeof(struct babel_tlv_next_hop), + babel_handle_next_hop, babel_validate_next_hop, + NULL, NULL, + babel_get_addr_next_hop, babel_put_addr_next_hop}, + {sizeof(struct babel_tlv_update), + babel_handle_update, babel_validate_update, + babel_hton_update, babel_ntoh_update, + babel_get_addr_update, babel_put_addr_update}, + {sizeof(struct babel_tlv_route_request), + babel_handle_route_request, babel_validate_request, + NULL, NULL, + babel_get_addr_request, babel_put_addr_request}, + {sizeof(struct babel_tlv_seqno_request), + babel_handle_seqno_request, babel_validate_request, + babel_hton_seqno_request, babel_ntoh_seqno_request, + babel_get_addr_request, babel_put_addr_request}, +}; + +static inline int validate_tlv(struct babel_tlv_header *tlv, struct babel_parse_state *state) +{ + return (tlv_data[tlv->type].validate != NULL && tlv_data[tlv->type].validate(tlv, state)); +} + +void babel_hton_ack_req(struct babel_tlv_header *hdr) +{ + struct babel_tlv_ack_req *tlv = (struct babel_tlv_ack_req *)hdr; + tlv->interval = htons(tlv->interval); +} +void babel_ntoh_ack_req(struct babel_tlv_header *hdr) +{ + struct babel_tlv_ack_req *tlv = (struct babel_tlv_ack_req *)hdr; + tlv->interval = ntohs(tlv->interval); +} +void babel_hton_hello(struct babel_tlv_header *hdr) +{ + struct babel_tlv_hello *tlv = (struct babel_tlv_hello *)hdr; + tlv->seqno = htons(tlv->seqno); + tlv->interval = htons(tlv->interval); +} +void babel_ntoh_hello(struct babel_tlv_header *hdr) +{ + struct babel_tlv_hello *tlv = (struct babel_tlv_hello *)hdr; + tlv->seqno = ntohs(tlv->seqno); + tlv->interval = ntohs(tlv->interval); +} +int babel_validate_ihu(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_ihu *tlv = (struct babel_tlv_ihu *)hdr; + if(hdr->length < TLV_LENGTH(BABEL_TYPE_IHU)-sizeof(tlv->addr)) return 0; + return (tlv->ae == BABEL_AE_WILDCARD + || (tlv->ae == BABEL_AE_IP6_LL && hdr->length >= TLV_LENGTH(BABEL_TYPE_IHU))); +} +void babel_hton_ihu(struct babel_tlv_header *hdr) +{ + struct babel_tlv_ihu *tlv = (struct babel_tlv_ihu *)hdr; + tlv->rxcost = htons(tlv->rxcost); + tlv->interval = htons(tlv->interval); +} +void babel_ntoh_ihu(struct babel_tlv_header *hdr) +{ + struct babel_tlv_ihu *tlv = (struct babel_tlv_ihu *)hdr; + tlv->rxcost = ntohs(tlv->rxcost); + tlv->interval = ntohs(tlv->interval); +} +ip_addr babel_get_addr_ihu(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_ihu *tlv = (struct babel_tlv_ihu *)hdr; + struct babel_interface *bif = state->bif; + if(tlv->ae == BABEL_AE_WILDCARD) { + return bif->iface->addr->ip; /* FIXME: Correct? */ + } else if(tlv->ae == BABEL_AE_IP6_LL) { + return get_ip6_ll(tlv->addr); + } + return IPA_NONE; +} +void babel_put_addr_ihu(struct babel_tlv_header *hdr, ip_addr addr) +{ + char buf[16]; + struct babel_tlv_ihu *tlv = (struct babel_tlv_ihu *)hdr; + if(!ipa_is_link_local(addr)) { + tlv->ae = BABEL_AE_WILDCARD; + return; + } + put_ip6(buf,addr); + memcpy(tlv->addr, buf+8, 8); + tlv->ae = BABEL_AE_IP6_LL; +} +void babel_hton_router_id(struct babel_tlv_header *hdr) +{ + struct babel_tlv_router_id *tlv = (struct babel_tlv_router_id *)hdr; + tlv->router_id = htobe64(tlv->router_id); +} +void babel_ntoh_router_id(struct babel_tlv_header *hdr) +{ + struct babel_tlv_router_id *tlv = (struct babel_tlv_router_id *)hdr; + tlv->router_id = be64toh(tlv->router_id); +} +int babel_validate_next_hop(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_next_hop *tlv = (struct babel_tlv_next_hop *)hdr; + /* We don't speak IPv4, so only recognise IP6 LL next hops */ + if(tlv->ae != BABEL_AE_IP6_LL) return 0; + return babel_validate_length(hdr, state); +} +ip_addr babel_get_addr_next_hop(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_next_hop *tlv = (struct babel_tlv_next_hop *)hdr; + return get_ip6_ll(tlv->addr); +} +void babel_put_addr_next_hop(struct babel_tlv_header *hdr, ip_addr addr) +{ +} +int babel_validate_update(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_update *tlv = (struct babel_tlv_update *)hdr; + int min_length = TLV_LENGTH(BABEL_TYPE_UPDATE)-sizeof(tlv->addr); + u8 len = tlv->plen/8; + if(tlv->plen % 8) len++; + + if(tlv->plen > MAX_PREFIX_LENGTH) + return 0; + + if(hdr->length < min_length) return 0; + if(tlv->ae == BABEL_AE_IP4 /* we don't speak IPv4 */ + || tlv->ae >= BABEL_AE_MAX) /* invalid */ + return 0; + /* Can only omit bits if a previous update defined a prefix to take them from */ + if(tlv->omitted && ipa_equal(state->prefix, IPA_NONE)) + return 0; + + /* TLV should be large enough to old the entire prefix */ + if(hdr->length < min_length + len-tlv->omitted) + return 0; + + return 1; +} +void babel_hton_update(struct babel_tlv_header *hdr) +{ + struct babel_tlv_update *tlv = (struct babel_tlv_update *)hdr; + tlv->interval = htons(tlv->interval); + tlv->seqno = htons(tlv->seqno); + tlv->metric = htons(tlv->metric); +} +void babel_ntoh_update(struct babel_tlv_header *hdr) +{ + struct babel_tlv_update *tlv = (struct babel_tlv_update *)hdr; + tlv->interval = ntohs(tlv->interval); + tlv->seqno = ntohs(tlv->seqno); + tlv->metric = ntohs(tlv->metric); +} +ip_addr babel_get_addr_update(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + struct babel_tlv_update *tlv = (struct babel_tlv_update *)hdr; + char buf[16] = {0}; + u8 len = tlv->plen/8; + if(tlv->plen % 8) len++; + + /* fixed encodings */ + if(tlv->ae == BABEL_AE_WILDCARD) return IPA_NONE; + if(tlv->ae == BABEL_AE_IP6_LL) return get_ip6_ll(tlv->addr); + + /* if we have omitted bytes, get them from previous prefix */ + if(tlv->omitted) put_ipa(buf, state->prefix); + /* if the prefix is longer than the omitted octets, copy the rest */ + if(tlv->omitted < len) memcpy(buf+tlv->omitted, tlv->addr, len-tlv->omitted); + /* make sure the tail is zeroed */ + if(len < 16) memset(buf+len, 0, 16-len); + return get_ipa(buf); +} +void babel_put_addr_update(struct babel_tlv_header *hdr, ip_addr addr) +{ + struct babel_tlv_update *tlv = (struct babel_tlv_update *)hdr; + tlv->ae = BABEL_AE_IP6; + put_ipa(&tlv->addr, addr); +} +int babel_validate_request(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + /* Validates both seqno and route_request. Works because ae and plen fields + are in the same place. */ + struct babel_tlv_route_request *tlv = (struct babel_tlv_route_request *)hdr; + u8 len = tlv->plen/8; + if(tlv->plen % 8) len++; + + if(tlv->plen > MAX_PREFIX_LENGTH) + return 0; + + /* enough space to hold the prefix */ + if(hdr->length < TLV_LENGTH(hdr->type) - sizeof(tlv->addr) + len) + return 0; + /* wildcard requests must have plen 0 */ + if(tlv->ae == BABEL_AE_WILDCARD && tlv->plen > 0) + return 0; + + /* We don't speak IPv4, and prefixes cannot be link-local addresses. */ + if(tlv->ae != BABEL_AE_IP6 && tlv->ae != BABEL_AE_WILDCARD) + return 0; + + return 1; +} +ip_addr babel_get_addr_request(struct babel_tlv_header *hdr, + struct babel_parse_state *state) +{ + struct babel_tlv_route_request *tlv = (struct babel_tlv_route_request *)hdr; + char buf[16] = {0}; + u8 len = tlv->plen/8; + if(tlv->plen % 8) len++; + + /* fixed encoding */ + if(tlv->ae == BABEL_AE_WILDCARD) return IPA_NONE; + if(hdr->type == BABEL_TYPE_SEQNO_REQUEST) + memcpy(buf, ((struct babel_tlv_seqno_request *)tlv)->addr, len); + else + memcpy(buf, tlv->addr, len); + return get_ipa(buf); +} +void babel_put_addr_request(struct babel_tlv_header *hdr, ip_addr addr) +{ + struct babel_tlv_route_request *tlv = (struct babel_tlv_route_request *)hdr; + char buf[16] = {0}; + u8 len = tlv->plen/8; + if(tlv->plen % 8) len++; + put_ipa(buf, addr); + memcpy(tlv->addr, buf, len); +} +void babel_hton_seqno_request(struct babel_tlv_header *hdr) +{ + struct babel_tlv_seqno_request *tlv = (struct babel_tlv_seqno_request *)hdr; + tlv->seqno = htons(tlv->seqno); + tlv->router_id = htobe64(tlv->router_id); +} +void babel_ntoh_seqno_request(struct babel_tlv_header *hdr) +{ + struct babel_tlv_seqno_request *tlv = (struct babel_tlv_seqno_request *)hdr; + tlv->seqno = ntohs(tlv->seqno); + tlv->router_id = be64toh(tlv->router_id); +} +static void babel_tlv_hton(struct babel_tlv_header *hdr) +{ + if(tlv_data[hdr->type].hton) { + tlv_data[hdr->type].hton(hdr); + } +} + +static void babel_tlv_ntoh(struct babel_tlv_header *hdr) +{ + if(tlv_data[hdr->type].ntoh) { + tlv_data[hdr->type].ntoh(hdr); + } +} + +static void babel_packet_hton(struct babel_header *hdr) +{ + struct babel_tlv_header *tlv = FIRST_TLV(hdr); + int len = hdr->length+sizeof(struct babel_header); + char *p = (char *)hdr; + while((char *)tlv < p+len) { + babel_tlv_hton(tlv); + NEXT_TLV(tlv); + } + hdr->length = htons(hdr->length); +} + +ip_addr babel_get_addr(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + if(tlv_data[hdr->type].get_addr) { + return tlv_data[hdr->type].get_addr(hdr, state); + } + return IPA_NONE; +} +void babel_put_addr(struct babel_tlv_header *hdr, ip_addr addr) +{ + if(tlv_data[hdr->type].put_addr) { + tlv_data[hdr->type].put_addr(hdr, addr); + } +} + +void babel_new_packet(struct babel_interface *bif) +{ + sock *s = bif->sock; + struct babel_header *hdr = (void *) s->tbuf; + memset(hdr, 0, sizeof(struct babel_header)); + hdr->magic = BABEL_MAGIC; + hdr->version = BABEL_VERSION; +} + +struct babel_tlv_header * babel_add_tlv(struct babel_interface *bif, u16 type) +{ + sock *s = bif->sock; + struct babel_header *hdr = (void *) s->tbuf; + struct babel_tlv_header *tlv; + int len = tlv_data[type].struct_length; + int pktlen = sizeof(struct babel_header)+hdr->length; + if(pktlen+len > bif->max_pkt_len) { + return NULL; + } + hdr->length+=len; + tlv = (struct babel_tlv_header *)((char*)hdr+pktlen); + memset(tlv, 0, len); + tlv->type = type; + tlv->length = TLV_LENGTH(type); + return tlv; +} + +void babel_send_to(struct babel_interface *bif, ip_addr dest) +{ + sock *s = bif->sock; + struct babel_header *hdr = (void *) s->tbuf; + int len = hdr->length+sizeof(struct babel_header); + int done; + + babel_packet_hton(hdr); + + DBG( "Sending %d bytes to %I\n", len, dest); + done = sk_send_to(s, len, dest, 0); + if(!done) + log(L_WARN "Babel: TX queue full on %s", bif->ifname); +} + +void babel_send( struct babel_interface *bif ) +{ + babel_send_to(bif, IP6_BABEL_ROUTERS); +} + +int babel_process_packet(struct babel_header *pkt, int size, + ip_addr saddr, int port, struct babel_interface *bif) +{ + struct babel_tlv_header *tlv = FIRST_TLV(pkt); + struct proto *proto = bif->proto; + struct babel_parse_state state = { + .proto = proto, + .bif = bif, + .saddr = saddr, + .prefix = IPA_NONE, + .next_hop = saddr, + }; + char *p = (char *)pkt; + int res = 0; + + pkt->length = ntohs(pkt->length); + if(pkt->magic != BABEL_MAGIC + || pkt->version != BABEL_VERSION + || pkt->length > size - sizeof(struct babel_header)) { + DBG("Invalid packet: magic %d version %d length %d size %d\n", + pkt->magic, pkt->version, pkt->length, size); + return 1; + } + + while((char *)tlv < p+size) { + if(tlv->type > BABEL_TYPE_PADN + && tlv->type < BABEL_TYPE_MAX + && validate_tlv(tlv, &state)) { + babel_tlv_ntoh(tlv); + res &= tlv_data[tlv->type].handle(tlv, &state); + } else { + DBG("Unknown or invalid TLV of type %d\n",tlv->type); + } + NEXT_TLV(tlv); + } + if(state.needs_update) + babel_send_update(bif); + return res; +} + +int babel_validate_length(struct babel_tlv_header *hdr, struct babel_parse_state *state) +{ + /*DBG("Validate type: %d length: %d needed: %d\n", hdr->type, hdr->length, + tlv_data[hdr->type].struct_length - sizeof(struct babel_tlv_header));*/ + return (hdr->length >= tlv_data[hdr->type].struct_length - sizeof(struct babel_tlv_header)); +} diff --git a/sysdep/autoconf.h.in b/sysdep/autoconf.h.in index a9e46e2..c73270c 100644 --- a/sysdep/autoconf.h.in +++ b/sysdep/autoconf.h.in @@ -43,6 +43,7 @@ #undef CONFIG_BGP #undef CONFIG_OSPF #undef CONFIG_PIPE +#undef CONFIG_BABEL /* We use multithreading */ #undef USE_PTHREADS -- 2.5.0
On Tue, Aug 18, 2015 at 11:06:03PM +0100, Toke Høiland-Jørgensen wrote:
The implementation is currently IPv6-only (since Bird does not support mixed IPv4/6 protocols in the same instance), but is otherwise complete as far as MUSTs in the RFC is concerned, with one exception: The RFC specifies that jitter must be applied to packet transmission times. This implementation currently does not buffer TLVs before sending them, but it does introduce some randomness to sending of global updates by timer and event scheduling randomness in the Bird core.
Hi, i am finally sending the first batch of comments. This time it is mostly general comments.
Some cleanup and possible reorganisation is needed in the code. Consider this patch an RFC for the general structure of the protocol addition.
The implementation started out as a copy of the RIP protocol implementation, so some patterns in how the communication with the core is setup is taken from there (but has diverted quite a bit as the different parts of the protocol implementation fell into place).
Using the old RIP protocol implementation as a basis is a problem - that code has several design problems, some of these seems to be shared by the Babel code. Not to mention its idiosyncratic coding conventions are far from ones used in the rest of BIRD. We are currently finishing rewrite of BIRD RIP code to eliminate these problems from RIP. I will send you the new RIP code for inspiration. Route propagation between protocol and core: Old RIP uses core tables to keep some of its route state and therefore has nontrivial interaction with core. This should be avoided - ideally core tables should be handled as a black box with rte_update() in one direction and rt_notify() (and some other auxiliary hooks) in the other. Another important question is how to handle and compare external routes relative to internal routes. For vector-distance (route-propagating) protocols it seems natural to handle external routes received from core like if they are from another neighbor. But this approach is problematic because internal tables and core tables will fight for the right to choose the best route. The proper approach is that the protocol keeps information about incoming routes, choose best of incoming routes, propagate that to core, then core chooses the the global best route, and propagate that back to protocols. The protocol learn that route (which may be either its or external), keep it as 'outgoing' and propagate it to neighbors. Also note that unreachable routing entries should not be propagated to core. Interface and neighbor events: You should create babel_interface structure during babel_if_notify() hook, not as a result from acquiring lock. The used approach is potentially racy (you could end with multiple locks if you get multiple up/down events before lock events). Note that the lock could be allocated from the iface resource pool. See ospf_iface_new()/ospf_iface_add() or radv_iface_new()/radv_iface_add(). I am bit worried about liveness of neighbors and ordering of hooks. Say an iface disappears. First you probably get babel_neigh_notify(), which does nothing, then you get babel_if_notify(), which will do kill_iface() and babel_flush_neighbor(), which unregisters data from the neighbor (bn->neigh->data = NULL), but such neighbor is likely to be already disposed. Some minor notes: Please add comments to each 'list' field in structures to keep which structures are members of that lists (see e.g. radv.h). Having timers for each route or entry is not necessary a good idea, perhaps this could be handled by keeping just timestamps in routes and having common 'timeout' hook, which would walk the fib table and process nodes - see rip_timer() or ospf_disp()/ospf_update_lsadb(). Also having resource pools for entries seems excessive. babel_route structures could be just allocated from a common slab. Not sure what is a point of struct neighbor_route. If you just want to keep babel_route in two linked list, just add two 'node' fields in it and use SKIP_BACK macro. See e.g. struct proto. Avoid unnecessary floating points, just use fixed point math instead. It is a good idea to have a ptr from babel_iface to babel_patt instead of duplicating all the values. Although you would have to change it during reconfiguration, that is the simplest part of reconf. And if you do not have reconfiguration implemented, then the whole protocol is restarted anyways. Do you have any kind of multihop unicast communication in Babel, or everything is single-hop to neighbors? -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> writes:
Hi, i am finally sending the first batch of comments. This time it is mostly general comments.
Thank you very much for the comments. I have a few follow-up questions (below), but will otherwise revise the implementation accordingly and resubmit :)
The proper approach is that the protocol keeps information about incoming routes, choose best of incoming routes, propagate that to core, then core chooses the the global best route, and propagate that back to protocols. The protocol learn that route (which may be either its or external), keep it as 'outgoing' and propagate it to neighbors.
Yes, this was what I was (attempting to) do as well: Anything from the core is treated as locally exported with metric 0.
Also note that unreachable routing entries should not be propagated to core.
This is actually done to satisfy a requirement of the Babel protocol: Temporary blackholing is used to avoid routing loops. Quoting section 3.5.5 of RFC6126: To avoid this issue, whenever a prefix is retracted, a routing table entry with infinite metric is maintained as described in Section 3.5.4 above, and packets destined for that prefix MUST NOT be forwarded by following a route for a shorter prefix. The infinite metric entry is maintained until it is superseded by a feasible update; if no such update arrives within the route hold time, the entry is flushed. And see also section 2.8 of the RFC for a description of the logic behind this. Now, if the protocol can't propagate an infeasible route to the core, how do I satisfy this requirement?
I am bit worried about liveness of neighbors and ordering of hooks. Say an iface disappears. First you probably get babel_neigh_notify(), which does nothing, then you get babel_if_notify(), which will do kill_iface() and babel_flush_neighbor(), which unregisters data from the neighbor (bn->neigh->data = NULL), but such neighbor is likely to be already disposed.
Noted. Is there a description somewhere of what hooks are called on what events (and in what order)? Section 2.6 of the programmer's manual only has fairly short descriptions (in particular, what events trigger neighbour state changes?), and there are no ordering information.
Do you have any kind of multihop unicast communication in Babel, or everything is single-hop to neighbors?
Babel only sends packets to link-local addresses of neighbours (and almost all packets can be either unicast or multicast). However, a node can forward seqno request packets to a different neighbour; but this is again only done by link-local unicast. -Toke
On Tue, Aug 25, 2015 at 03:01:05PM +0200, Toke Høiland-Jørgensen wrote:
Also note that unreachable routing entries should not be propagated to core.
This is actually done to satisfy a requirement of the Babel protocol: Temporary blackholing is used to avoid routing loops. Quoting section 3.5.5 of RFC6126: ... Now, if the protocol can't propagate an infeasible route to the core, how do I satisfy this requirement?
Didn't know that. You could propagate the unreachable route in the sense that it will work. But usually it is not a good idea - e.g. some other protocol could have regular routes for the same network, or there could be some covering route. BTW, the argumentation in 3.5.5 of RFC 6126 seems a bit strange to me. It essentially says that unreachable routes are added to avoid transient routing loops before Babel converges. But transient routing loops until convergence is a common behavior for IGPs (both RIP, OSPF and IS-IS do that), while blackholing may be far less expected behavior, esp. if it is for several minutes, which is far longer time than usually necessary for protocol convergence. It seems more like local policy setting than something which should be part of protocol specification.
I am bit worried about liveness of neighbors and ordering of hooks. Say an iface disappears. First you probably get babel_neigh_notify(), which does nothing, then you get babel_if_notify(), which will do kill_iface() and babel_flush_neighbor(), which unregisters data from the neighbor (bn->neigh->data = NULL), but such neighbor is likely to be already disposed.
Noted. Is there a description somewhere of what hooks are called on what events (and in what order)? Section 2.6 of the programmer's manual only has fairly short descriptions (in particular, what events trigger neighbour state changes?), and there are no ordering information.
It is true that this part is probably not described anywhere in depth. There are some notes in nest/protocol.h that are not propagated to the documentation. For the rest, you can use the source, but sometimes even the source is 'wrong'.
Do you have any kind of multihop unicast communication in Babel, or everything is single-hop to neighbors?
Babel only sends packets to link-local addresses of neighbours (and almost all packets can be either unicast or multicast). However, a node can forward seqno request packets to a different neighbour; but this is again only done by link-local unicast.
In that case you have always non-NULL babel_iface.iface and you could remove some vestigial code from RIP, like: bif->iface ? bif->iface->name : "(dummy)" -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> writes:
Also note that unreachable routing entries should not be propagated to core.
This is actually done to satisfy a requirement of the Babel protocol: Temporary blackholing is used to avoid routing loops. Quoting section 3.5.5 of RFC6126: ... Now, if the protocol can't propagate an infeasible route to the core, how do I satisfy this requirement?
Didn't know that. You could propagate the unreachable route in the sense that it will work. But usually it is not a good idea - e.g. some other protocol could have regular routes for the same network, or there could be some covering route.
BTW, the argumentation in 3.5.5 of RFC 6126 seems a bit strange to me. It essentially says that unreachable routes are added to avoid transient routing loops before Babel converges. But transient routing loops until convergence is a common behavior for IGPs (both RIP, OSPF and IS-IS do that), while blackholing may be far less expected behavior, esp. if it is for several minutes, which is far longer time than usually necessary for protocol convergence. It seems more like local policy setting than something which should be part of protocol specification.
Hmm, I think Juliusz is probably better at answering that than I am; added to CC... Juliusz, care to weigh in?
It is true that this part is probably not described anywhere in depth. There are some notes in nest/protocol.h that are not propagated to the documentation. For the rest, you can use the source, but sometimes even the source is 'wrong'.
Will poke around, thanks.
In that case you have always non-NULL babel_iface.iface and you could remove some vestigial code from RIP, like: bif->iface ? bif->iface->name : "(dummy)"
Yeah, have already removed that in some places, but some house cleaning is probably needed :) -Toke
BTW, the argumentation in 3.5.5 of RFC 6126 seems a bit strange to me. It essentially says that unreachable routes are added to avoid transient routing loops before Babel converges. But transient routing loops until convergence is a common behavior for IGPs (both RIP, OSPF and IS-IS do that), while blackholing may be far less expected behavior, esp. if it is for several minutes, which is far longer time than usually necessary for protocol convergence.
Yes. Babel is designed to be robust not only on wired networks (where OSPF and IS-IS work just fine), but also on wireless mesh networks, where a routing loop, even a transient one, causes cross-link interference and may prevent the routing protocol from reconverging. For that reason, Babel prefers to blackhole rather than risking to create a routing loop.
It seems more like local policy setting than something which should be part of protocol specification.
I agree, it could perhaps be disabled on some kinds of links. -- Juliusz
On 25 Aug 2015, at 19:34, Juliusz Chroboczek <jch@pps.univ-paris-diderot.fr> wrote:
Yes. Babel is designed to be robust not only on wired networks (where OSPF and IS-IS work just fine), but also on wireless mesh networks, where a routing loop, even a transient one, causes cross-link interference and may prevent the routing protocol from reconverging. For that reason, Babel prefers to blackhole rather than risking to create a routing loop.
Interesting. For my own education, why blackhole the route rather than simply not propagate it at all to the FIB (i.e. ignore it)? Is the aim here to avoid the consequent ICMP unreachable packets closing TCP sessions during reconvergence? How should the blackhole route interact with other routing protocols? -- Alex Bligh
For my own education, why blackhole the route rather than simply not propagate it at all to the FIB (i.e. ignore it)?
Please see RFC 6126 Section 2.8: https://tools.ietf.org/html/rfc6126#section-2.8 -- Juliusz
Toke Høiland-Jørgensen <toke@toke.dk> writes:
It is true that this part is probably not described anywhere in depth. There are some notes in nest/protocol.h that are not propagated to the documentation. For the rest, you can use the source, but sometimes even the source is 'wrong'.
Will poke around, thanks.
One thing that I have not been able to find is the neighbour stuff: Since Babel has its own neighbour discovery sub-protocol (Hellos / I Heard You's), I don't strictly need to use the core mechanism, I guess, but I started out using it anyway since I thought it might gain me something. But I haven't seen any neighbour events in any of the testing I have done yet. So what events cause a neighbour to go away / come back as far as the bird core is concerned? Is it just the obvious case of the interface going up/down and gaining or losing addresses? Or is there something smarter? -Toke
On Tue, Aug 25, 2015 at 11:18:34PM +0200, Toke Høiland-Jørgensen wrote:
So what events cause a neighbour to go away / come back as far as the bird core is concerned? Is it just the obvious case of the interface going up/down and gaining or losing addresses? Or is there something smarter?
Yes, just interface or prefix up/down events and link change events. There are two common usages for neighbors: Persistent (NEF_STICKY) neighbors are used by protocols that do not generally process iface/prefix events because they just care about some explicit addresses (e.g. BGP for the configured address of the peer). Regular neighbors are used by protocols just for validation that IP address is valid and associated with given iface (or finding the associated iface). In that case a neighbor is requested but not kept (ptr not stored in structures) and neighbor hooks are not used. See rt_sync() in OSPF or advertise_entry() in the old RIP code. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Hi Ondrej,
Using the old RIP protocol implementation as a basis is a problem - that code has several design problems, some of these seems to be shared by the Babel code. Not to mention its idiosyncratic coding conventions are far from ones used in the rest of BIRD. We are currently finishing rewrite of BIRD RIP code to eliminate these problems from RIP. I will send you the new RIP code for inspiration.
since yours *very welcome* RIP re-design is it late now to ask you to implement on it the IPIP (IP version 4) protocol? Our worldwide hamradio community (about 500 gateways around the world) named 44-net mesh based on 44.x.x.x IP numbers could benefit of that feature and directly manage (I hope) that RIP broadcast (coming from the central router at ucsd.edu (University of San Diego - California) through a dedicated 'tunl0' interface in our linux systems. Actually we use two types of 44ripd servers to manage/decode that RIP protocol and everything works very OK... but having that feature implemented in native mode on BIRD may simplify the thing and revamps the old good RIP protocol too :) many thanks gus
On Tue, Aug 18, 2015 at 11:06:02PM +0100, Toke Høiland-Jørgensen wrote:
Hi everyone
Over the past couple of weeks, I have amused myself with adding support for the Babel routing protocol (RFC 6126) to Bird. Quoting the RFC:
Babel is a loop-avoiding distance-vector routing protocol that is robust and efficient both in ordinary wired networks and in wireless mesh networks.
The implementation is a clean-slate implementation from the RFC, and it is now at the state where it is reasonably complete (I think), and communicates with the official babeld implementation.
At this stage I'm not proposing this be included in Bird proper: I'm pretty sure I got at least parts of the interaction with the core wrong, and there is probably some cleanup needed; and as you can see from the list of limitations below, some things are still needed for this to be a production-quality implementation of the protocol. However, I thought that having some feedback to guide me at this stage would be useful.
I would thus like to solicit your opinion on whether or not this is something you could see eventually making it into Bird proper (given that I perform the necessary rework of the code)? And if so, to comment on the implementation and how it might be improved. In particular, I'm interested in your opinion on what to do about IPv4 support (see below).
Hi I am glad to see your contribution, we definitely want to have Babel implementation in BIRD. The main issue here is that another developer addressed me privately some time ago that he is developing Babel implementation and he is planning to have alpha-quality implementation in 2015-09. I will ask him about the state of his work and i will let you know. Anyways, i will look at your code and send some comments ASAP. -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> writes:
I am glad to see your contribution, we definitely want to have Babel implementation in BIRD. The main issue here is that another developer addressed me privately some time ago that he is developing Babel implementation and he is planning to have alpha-quality implementation in 2015-09. I will ask him about the state of his work and i will let you know.
Ah, right. Wasn't aware of that, obviously. I'll be happy to collaborate with him to get to the point of an implementation worth merging, though :)
Anyways, i will look at your code and send some comments ASAP.
Cool, will look forward to your comments! -Toke
participants (5)
-
Alex Bligh -
Gustavo Ponza -
Juliusz Chroboczek -
Ondrej Zajicek -
Toke Høiland-Jørgensen