BGP: Add support for peerings with link-local v6 addresses
Hi, I already posted a first patch to add support for link-local addresses for BGP peerings one year ago, and now I've finally found time to update the patch for the current bird version and fix the problems mentioned in reply to the first patch. This patch series allows using NEF_STICKY neighbors for link-local addresses by saving the interface name in the neighbor structure, thus working even when the interface is removed and re-added causing the interface index to change. The BPG patch itself adds an interface attribute to the BGP protocol definition allowing to specify an interface to use for the BGP session. An additional patch that is needed to make this work under FreeBSD is also included; the current behaviour doesn't handle numbered v6 addresses on peer-to-peer interfaces correctly, which link-local addresses are generally. I didn't get the chance to test this under other BSDs though. The patches have been tested under Linux and FreeBSD, with Bird and Quagga peers, and a few different configurations. Best regards, Matthias Schiffer
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> --- nest/iface.h | 2 ++ nest/neighbor.c | 34 +++++++++++++++++++++++++++------- 2 files changed, 29 insertions(+), 7 deletions(-) diff --git a/nest/iface.h b/nest/iface.h index 7307844..8636c0a 100644 --- a/nest/iface.h +++ b/nest/iface.h @@ -106,6 +106,7 @@ typedef struct neighbor { node if_n; /* Node in per-interface neighbor list */ ip_addr addr; /* Address of the neighbor */ struct iface *iface; /* Interface it's connected to */ + char bound_iface[16]; /* Name of the interface it's bound to */ struct proto *proto; /* Protocol this belongs to */ void *data; /* Protocol-specific data */ unsigned aux; /* Protocol-specific data */ @@ -118,6 +119,7 @@ typedef struct neighbor { neighbor *neigh_find(struct proto *, ip_addr *, unsigned flags); neighbor *neigh_find2(struct proto *p, ip_addr *a, struct iface *ifa, unsigned flags); +neighbor *neigh_find_ifname(struct proto *, ip_addr *, char *, unsigned flags); static inline int neigh_connected_to(struct proto *p, ip_addr *a, struct iface *i) { diff --git a/nest/neighbor.c b/nest/neighbor.c index 67bd32b..458d9fb 100644 --- a/nest/neighbor.c +++ b/nest/neighbor.c @@ -49,6 +49,9 @@ static slab *neigh_slab; static list sticky_neigh_list, neigh_hash_table[NEIGH_HASH_SIZE]; +static neighbor * +neigh_find_internal(struct proto *p, ip_addr *a, struct iface *ifa, char *bound_iface, unsigned flags); + static inline unsigned int neigh_hash(struct proto *p, ip_addr *a) { @@ -103,18 +106,32 @@ if_connected(ip_addr *a, struct iface *i) /* -1=error, 1=match, 0=no match */ * If the node is not connected directly or *@a is not a valid unicast * IP address, neigh_find() returns %NULL. */ + + neighbor * neigh_find(struct proto *p, ip_addr *a, unsigned flags) { - return neigh_find2(p, a, NULL, flags); + return neigh_find_internal(p, a, NULL, NULL, flags); } neighbor * neigh_find2(struct proto *p, ip_addr *a, struct iface *ifa, unsigned flags) { + return neigh_find_internal(p, a, ifa, ifa ? ifa->name : NULL, flags); +} + +neighbor * +neigh_find_ifname(struct proto *p, ip_addr *a, char *ifname, unsigned flags) +{ + return neigh_find_internal(p, a, ifname ? if_find_by_name(ifname) : NULL, ifname, flags); +} + +static neighbor * +neigh_find_internal(struct proto *p, ip_addr *a, struct iface *ifa, char *ifname, unsigned flags) +{ neighbor *n; - int class, scope = -1; ; + int class, scope = -1; unsigned int h = neigh_hash(p, a); struct iface *i; @@ -126,7 +143,7 @@ neigh_find2(struct proto *p, ip_addr *a, struct iface *ifa, unsigned flags) if (class < 0) /* Invalid address */ return NULL; if (((class & IADDR_SCOPE_MASK) == SCOPE_HOST) || - (((class & IADDR_SCOPE_MASK) == SCOPE_LINK) && (ifa == NULL)) || + (((class & IADDR_SCOPE_MASK) == SCOPE_LINK) && (ifa == NULL) && (ifname == NULL)) || !(class & IADDR_HOST)) return NULL; /* Bad scope or a somecast */ @@ -137,7 +154,7 @@ neigh_find2(struct proto *p, ip_addr *a, struct iface *ifa, unsigned flags) if ((scope < 0) && (flags & NEF_ONLINK)) scope = class & IADDR_SCOPE_MASK; } - else + else if (ifname == NULL) WALK_LIST(i, iface_list) if ((scope = if_connected(a, i)) >= 0) { @@ -160,8 +177,6 @@ neigh_find2(struct proto *p, ip_addr *a, struct iface *ifa, unsigned flags) } else { - /* sticky flag does not work for link-local neighbors; - fortunately, we don't use this combination */ add_tail(&sticky_neigh_list, &n->n); ifa = NULL; scope = -1; @@ -172,6 +187,8 @@ neigh_find2(struct proto *p, ip_addr *a, struct iface *ifa, unsigned flags) n->aux = 0; n->flags = flags; n->scope = scope; + strncpy(n->bound_iface, ifname ?: "", sizeof(n->bound_iface)-1); + return n; } @@ -193,6 +210,8 @@ neigh_dump(neighbor *n) debug("%s %p %08x scope %s", n->proto->name, n->data, n->aux, ip_scope_text(n->scope)); if (n->flags & NEF_STICKY) debug(" STICKY"); + if (*n->bound_iface) + debug(" bound to %s", n->bound_iface); debug("\n"); } @@ -262,7 +281,8 @@ neigh_if_up(struct iface *i) int scope; WALK_LIST_DELSAFE(n, next, sticky_neigh_list) - if ((scope = if_connected(&n->addr, i)) >= 0) + if ((!(*n->bound_iface) || strcmp(n->bound_iface, i->name) == 0) + && (scope = if_connected(&n->addr, i)) >= 0) neigh_up(n, i, scope); } -- 1.7.8.1
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> --- proto/bgp/bgp.c | 10 ++++++++-- proto/bgp/bgp.h | 1 + proto/bgp/config.Y | 4 +++- proto/bgp/packets.c | 3 +++ 4 files changed, 15 insertions(+), 3 deletions(-) diff --git a/proto/bgp/bgp.c b/proto/bgp/bgp.c index 28396a5..67594c7 100644 --- a/proto/bgp/bgp.c +++ b/proto/bgp/bgp.c @@ -585,6 +585,11 @@ bgp_connect(struct bgp_proto *p) /* Enter Connect state and start establishing c s->tos = IP_PREC_INTERNET_CONTROL; s->password = p->cf->password; s->tx_hook = bgp_connected; + + /* Can only be NULL for multihop instances */ + if (p->neigh && *p->cf->interface) + s->iface = p->neigh->iface; + BGP_TRACE(D_EVENTS, "Connecting to %I from local address %I", s->daddr, s->saddr); bgp_setup_conn(p, conn); bgp_setup_sk(conn, s); @@ -793,7 +798,7 @@ bgp_start_locked(struct object_lock *lock) return; } - p->neigh = neigh_find(&p->p, &cf->remote_ip, NEF_STICKY); + p->neigh = neigh_find_ifname(&p->p, &cf->remote_ip, (*cf->interface) ? cf->interface : NULL, NEF_STICKY); if (!p->neigh || (p->neigh->scope == SCOPE_HOST)) { log(L_ERR "%s: Invalid remote address %I", p->p.name, cf->remote_ip); @@ -807,7 +812,7 @@ bgp_start_locked(struct object_lock *lock) if (p->neigh->iface) bgp_start_neighbor(p); else - BGP_TRACE(D_EVENTS, "Waiting for %I to become my neighbor", cf->remote_ip); + BGP_TRACE(D_EVENTS, "Waiting for %I to become my neighbor%s%s", cf->remote_ip, (*cf->interface) ? " on " : "", cf->interface); } static int @@ -852,6 +857,7 @@ bgp_start(struct proto *P) lock->iface = NULL; lock->hook = bgp_start_locked; lock->data = p; + olock_acquire(lock); return PS_START; diff --git a/proto/bgp/bgp.h b/proto/bgp/bgp.h index 0c50583..76f0400 100644 --- a/proto/bgp/bgp.h +++ b/proto/bgp/bgp.h @@ -50,6 +50,7 @@ struct bgp_config { unsigned error_delay_time_min; /* Time to wait after an error is detected */ unsigned error_delay_time_max; unsigned disable_after_error; /* Disable the protocol when error is detected */ + char interface[16]; /* Interface to use */ char *password; /* Password used for MD5 authentication */ struct rtable_config *igp_table; /* Table used for recursive next hop lookups */ }; diff --git a/proto/bgp/config.Y b/proto/bgp/config.Y index 3ef9b29..6ec3623 100644 --- a/proto/bgp/config.Y +++ b/proto/bgp/config.Y @@ -25,7 +25,8 @@ CF_KEYWORDS(BGP, LOCAL, NEIGHBOR, AS, HOLD, TIME, CONNECT, RETRY, CLUSTER, ID, AS4, ADVERTISE, IPV4, CAPABILITIES, LIMIT, PASSIVE, PREFER, OLDER, MISSING, LLADDR, DROP, IGNORE, ROUTE, REFRESH, INTERPRET, COMMUNITIES, BGP_ORIGINATOR_ID, BGP_CLUSTER_LIST, IGP, - TABLE, GATEWAY, DIRECT, RECURSIVE, MED, TTL, SECURITY, DETERMINISTIC) + TABLE, GATEWAY, DIRECT, RECURSIVE, MED, TTL, SECURITY, DETERMINISTIC, + INTERFACE) CF_GRAMMAR @@ -100,6 +101,7 @@ bgp_proto: | bgp_proto INTERPRET COMMUNITIES bool ';' { BGP_CFG->interpret_communities = $4; } | bgp_proto IGP TABLE rtable ';' { BGP_CFG->igp_table = $4; } | bgp_proto TTL SECURITY bool ';' { BGP_CFG->ttl_security = $4; } + | bgp_proto INTERFACE TEXT ';' { strncpy(BGP_CFG->interface, $3, sizeof(BGP_CFG->interface)-1); } ; CF_ADDTO(dynamic_attr, BGP_ORIGIN diff --git a/proto/bgp/packets.c b/proto/bgp/packets.c index c3a8673..7d78d9f 100644 --- a/proto/bgp/packets.c +++ b/proto/bgp/packets.c @@ -434,6 +434,9 @@ bgp_create_update(struct bgp_conn *conn, byte *buf) *tmp++ = BGP_AF_IPV6; *tmp++ = 1; + if (ipa_has_link_scope(ip)) + ip = IPA_NONE; + if (ipa_nonzero(ip_ll)) { *tmp++ = 32; -- 1.7.8.1
Hello!
diff --git a/proto/bgp/bgp.h b/proto/bgp/bgp.h index 0c50583..76f0400 100644 --- a/proto/bgp/bgp.h +++ b/proto/bgp/bgp.h @@ -50,6 +50,7 @@ struct bgp_config { unsigned error_delay_time_min; /* Time to wait after an error is detected */ unsigned error_delay_time_max; unsigned disable_after_error; /* Disable the protocol when error is detected */ + char interface[16]; /* Interface to use */
Please do not place arbitrary limits on interface names. Instead of copying the interface name, you can store a pointer to the string -- it is allocated from the configuration mempool, which is kept until reconfiguration. However, you have to extend bgp_reconfigure() to compare the old and new interface name. Have a nice fortnight -- Martin `MJ' Mares <mj@ucw.cz> http://mj.ucw.cz/ Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth int random(void) { return 4; /* Random number chosen by a fair dice roll. */ }
Hi, the limit to 15 characters for interface names is used all over bird (look at 'struct iface' in nest/iface.h), as this is the limit imposed by most Unix APIs, so I didn't see a reason not to use this limit. Now that you mention bgp_reconfigure(), I see though that in my current patch the comparision can behave strangely when a user has more than one interface line in his BGP config, as the unused bytes of the interface string are compared as well. I guess this isn't a big problem as such a config doesn't make sense, and the comparison is still deterministic as the same bytes will be set in the interface field for the same config. Still, it would be cleaner to zero out bgp_config.interface before setting it. Comments? Thanks for looking at my patches, Matthias Schiffer On 12/31/2011 11:38 AM, Martin Mares wrote:
Hello!
diff --git a/proto/bgp/bgp.h b/proto/bgp/bgp.h index 0c50583..76f0400 100644 --- a/proto/bgp/bgp.h +++ b/proto/bgp/bgp.h @@ -50,6 +50,7 @@ struct bgp_config { unsigned error_delay_time_min; /* Time to wait after an error is detected */ unsigned error_delay_time_max; unsigned disable_after_error; /* Disable the protocol when error is detected */ + char interface[16]; /* Interface to use */
Please do not place arbitrary limits on interface names.
Instead of copying the interface name, you can store a pointer to the string -- it is allocated from the configuration mempool, which is kept until reconfiguration.
However, you have to extend bgp_reconfigure() to compare the old and new interface name.
Have a nice fortnight
Hello!
the limit to 15 characters for interface names is used all over bird (look at 'struct iface' in nest/iface.h), as this is the limit imposed by most Unix APIs, so I didn't see a reason not to use this limit.
Actually, it is not all over bird, but only in nest/iface.h (try grepping for '\[16]'). Using a hard-coded number (not even a named constant) for size of anything global is a mistake -- indeed, my own mistake somewhere in the distant past :) -- and I think we should get rid of it, not follow it in new code.
Now that you mention bgp_reconfigure(), I see though that in my current patch the comparision can behave strangely when a user has more than one interface line in his BGP config, as the unused bytes of the interface string are compared as well.
I guess this isn't a big problem as such a config doesn't make sense, and the comparison is still deterministic as the same bytes will be set in the interface field for the same config. Still, it would be cleaner to zero out bgp_config.interface before setting it. Comments?
I think that storing a pointer to the string gets rid of this problem nicely :) Have a nice fortnight -- Martin `MJ' Mares <mj@ucw.cz> http://mj.ucw.cz/ Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth "Object orientation is in the mind, not in the compiler." -- Alan Cox
Okay, I'll fix that. What about the neighbor.bound_iface field I introduced in my patch series? Should I just allocate the interface name from the same slab the neighbor node itself is allocated from, and free it before the neighbor is freed? Matthias Schiffer On 12/31/2011 02:13 PM, Martin Mares wrote:
Hello!
the limit to 15 characters for interface names is used all over bird (look at 'struct iface' in nest/iface.h), as this is the limit imposed by most Unix APIs, so I didn't see a reason not to use this limit.
Actually, it is not all over bird, but only in nest/iface.h (try grepping for '\[16]').
Using a hard-coded number (not even a named constant) for size of anything global is a mistake -- indeed, my own mistake somewhere in the distant past :) -- and I think we should get rid of it, not follow it in new code.
Now that you mention bgp_reconfigure(), I see though that in my current patch the comparision can behave strangely when a user has more than one interface line in his BGP config, as the unused bytes of the interface string are compared as well.
I guess this isn't a big problem as such a config doesn't make sense, and the comparison is still deterministic as the same bytes will be set in the interface field for the same config. Still, it would be cleaner to zero out bgp_config.interface before setting it. Comments?
I think that storing a pointer to the string gets rid of this problem nicely :)
Have a nice fortnight
On Sat, Dec 31, 2011 at 02:13:48PM +0100, Martin Mares wrote:
Hello!
the limit to 15 characters for interface names is used all over bird (look at 'struct iface' in nest/iface.h), as this is the limit imposed by most Unix APIs, so I didn't see a reason not to use this limit.
Actually, it is not all over bird, but only in nest/iface.h (try grepping for '\[16]').
Using a hard-coded number (not even a named constant) for size of anything global is a mistake -- indeed, my own mistake somewhere in the distant past :) -- and I think we should get rid of it, not follow it in new code.
What is a point if both Linux and BSD also has such limit? (But it is true that we should probably change name[16] to name[IFNAMSIZ] to be safe, IFNAMSIZ constant seems to be a part of standard socket API) -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Hello!
What is a point if both Linux and BSD also has such limit?
Well, not all the world is Linux or BSD. Since beginning, most parts of BIRD avoid any such system dependencies and I still think it's a good idea. Also, it's generally a good practice to avoid arbitrary limits on size of anything, at least as long as it does not make the code unnecessarily complex. Have a nice fortnight -- Martin `MJ' Mares <mj@ucw.cz> http://mj.ucw.cz/ Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth Hex dump: Where witches put used curses...
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> --- proto/bgp/bgp.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/proto/bgp/bgp.c b/proto/bgp/bgp.c index 67594c7..59468c3 100644 --- a/proto/bgp/bgp.c +++ b/proto/bgp/bgp.c @@ -1151,6 +1151,9 @@ bgp_show_proto_info(struct proto *P) p->rs_client ? " route-server" : "", p->as4_session ? " AS4" : ""); cli_msg(-1006, " Source address: %I", p->source_addr); + if (p->neigh && p->neigh->iface) + cli_msg(-1006, " Interface: %s%s", + p->neigh->iface->name, (*p->cf->interface) ? " (fixed)" : ""); if (p->cf->route_limit) cli_msg(-1006, " Route limit: %d/%d", p->p.stats.imp_routes, p->cf->route_limit); -- 1.7.8.1
On Thu, Dec 29, 2011 at 10:56:43PM +0100, Matthias Schiffer wrote:
Hi, I already posted a first patch to add support for link-local addresses for BGP peerings one year ago, and now I've finally found time to update the patch for the current bird version and fix the problems mentioned in reply to the first patch.
This patch series allows using NEF_STICKY neighbors for link-local addresses by saving the interface name in the neighbor structure, thus working even when the interface is removed and re-added causing the interface index to change.
Hello Thanks for the patch. I noticed that struct iface could exist even if the iface is missing (this is done when iface is removed), so sticky link-local neighbors could be implemented and used in more uniform way (using neigh_find2() with NEF_STICKY, like it is done for neighbors with global addresses). I implemented it in that way and adapted the static protocol for that (primarily for testing). See attached patch. Instead of separate option, this also uses standard syntax (like fe80::1%eth0) for link-local addresses. I will adapt the rest of your BGP patch to that and merge it.
An additional patch that is needed to make this work under FreeBSD is also included; the current behaviour doesn't handle numbered v6 addresses on peer-to-peer interfaces correctly, which link-local addresses are generally. I didn't get the chance to test this under other BSDs though.
Thanks. The patch is: +#ifdef IPV6 + if ((iface->flags & IF_MULTIACCESS) || (masklen != BITS_PER_IP_ADDRESS)) +#else if (iface->flags & IF_MULTIACCESS) +#endif Does anybody know, whether there should ever be check for (iface->flags & IF_MULTIACCESS), and not just the (masklen != BITS_PER_IP_ADDRESS), like it is in the Linux code? -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Hi I need to change preffered source in imported routes (in ospf protocol) I tried import filter { krt_prefsrc = some.ip.of.router; accept; }; but it doesnt work for me. Is there any way to change this? Thank you
On Mon, Jan 02, 2012 at 12:30:00AM +0100, Jaroslav Jirásek wrote:
Hi
I need to change preffered source in imported routes (in ospf protocol)
I tried import filter { krt_prefsrc = some.ip.of.router; accept; };
but it doesnt work for me. Is there any way to change this?
Try export filter to kernel protocol. -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
On Mon, Jan 02, 2012 at 12:30:00AM +0100, Jaroslav Jirásek wrote:
Hi
I need to change preffered source in imported routes (in ospf protocol)
I tried import filter { krt_prefsrc = some.ip.of.router; accept; };
but it doesnt work for me. Is there any way to change this?
BTW, using this in import from OSPF works for me: bird> show route all ... 192.168.42.192/29 via 81.92.145.129 on eth0 [ospf1 01:44] * I (150/15) [192.168.43.1] Type: OSPF unicast univ OSPF.metric1: 15 OSPF.metric2: 16777215 OSPF.tag: 0x00000000 OSPF.router_id: 192.168.43.1 Kernel.prefsrc: 1.2.3.4 ... You don't have Kernel.prefsrc here (if you use that import filter) ? -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
participants (4)
-
Jaroslav Jirásek -
Martin Mares -
Matthias Schiffer -
Ondrej Zajicek