<div dir="ltr">
<p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal">Add support for setting the TCP congestion control algorithm per destination by modifying route attributes in BIRD. The kernel route attributes previously supported in BIRD were all integer values, so this patch also adds string handling for kernel metrics.</p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal"><br></p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal">Usage: set krt_cc_algo to an algorithm name (at most 15 characters) in a BIRD filter.<br></p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal">Allowed values are listed in the net.ipv4.tcp_allowed_congestion_control sysctl.</p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal"><br></p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal">---</p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal">diff --git a/doc/bird.sgml b/doc/bird.sgml<br>index 1d5ae056..310aef37 100644<br>--- 
a/doc/bird.sgml<br>+++ b/doc/bird.sgml<br>@@ -3377,7 +3377,7 @@ Supported attributes are:<br> <cf/krt_sstresh/, <cf/krt_lock_sstresh/, <cf/krt_cwnd/, <cf/krt_lock_cwnd/,<br> <cf/krt_advmss/, <cf/krt_lock_advmss/, <cf/krt_reordering/, <cf/krt_lock_reordering/,<br> <cf/krt_hoplimit/, <cf/krt_lock_hoplimit/, <cf/krt_rto_min/, <cf/krt_lock_rto_min/,<br>-<cf/krt_initcwnd/, <cf/krt_initrwnd/, <cf/krt_quickack/,<br>+<cf/krt_initcwnd/, <cf/krt_initrwnd/, <cf/krt_quickack/, <cf/krt_cc_algo/,<br> <cf/krt_feature_ecn/, <cf/krt_feature_allfrag/<br> <br> <sect1>Example<br>diff --git a/filter/config.Y b/filter/config.Y<br>index 8916ea97..8916d08d 100644<br>--- a/filter/config.Y<br>+++ b/filter/config.Y<br>@@ -178,6 +178,9 @@ f_generate_empty(struct f_dynamic_attr dyn)<br> case EAF_TYPE_LC_SET:<br> empty = f_const_empty_lclist;<br> break;<br>+ case EAF_TYPE_CC_ALGO:<br>+ empty = f_const_empty_cc_algo;<br>+ break;<br> default:<br> cf_error("Can't empty that attribute");<br> }<br>@@ -816,6 +819,7 @@ term:<br> | '-' EMPTY '-' { $$ = f_new_inst(FI_CONSTANT, f_const_empty_clist); }<br> | '-' '-' EMPTY '-' '-' { $$ = f_new_inst(FI_CONSTANT, f_const_empty_eclist); }<br> | '-' '-' '-' EMPTY '-' '-' '-' { $$ = f_new_inst(FI_CONSTANT, f_const_empty_lclist); }<br>+ | '-' '-' '-' '-' EMPTY '-' '-' '-' '-' { $$ = f_new_inst(FI_CONSTANT, f_const_empty_cc_algo); }<br> | PREPEND '(' term ',' term ')' { $$ = f_new_inst(FI_PATH_PREPEND, $3, $5); }<br> | ADD '(' term ',' term ')' { $$ = f_new_inst(FI_CLIST_ADD, $3, $5); }<br> | DELETE '(' term ',' term ')' { $$ = f_new_inst(FI_CLIST_DEL, $3, $5); }<br>diff --git a/filter/data.c b/filter/data.c<br>index 56c1fb17..84e91d59 100644<br>--- a/filter/data.c<br>+++ b/filter/data.c<br>@@ -91,6 +91,9 @@ const struct f_val f_const_empty_path = {<br> }, f_const_empty_lclist = {<br> .type = T_LCLIST,<br> .val.ad = &null_adata,<br>+}, f_const_empty_cc_algo = {<br>+ .type = T_STRING,<br>+ .val.ad = 
&null_adata,<br> };<br> <br> static struct adata *<br>diff --git a/filter/data.h b/filter/data.h<br>index 4cb6b7a8..46216e92 100644<br>--- a/filter/data.h<br>+++ b/filter/data.h<br>@@ -262,6 +262,7 @@ trie_match_next_longest_ip6(net_addr_ip6 *n, ip6_addr *found)<br> <br> <br> #define F_CMP_ERROR 999<br>+#define TCP_CA_NAME_MAX 16<br> <br> const char *f_type_name(enum f_type t);<br> <br>@@ -297,7 +298,7 @@ undef_value(struct f_val v)<br> (v.val.ad == &null_adata);<br> }<br> <br>-extern const struct f_val f_const_empty_path, f_const_empty_clist, f_const_empty_eclist, f_const_empty_lclist;<br>+extern const struct f_val f_const_empty_path, f_const_empty_clist, f_const_empty_eclist, f_const_empty_lclist, f_const_empty_cc_algo;<br> <br> enum filter_return f_eval(const struct f_line *expr, struct linpool *tmp_pool, struct f_val *pres);<br> <br>diff --git a/filter/f-inst.c b/filter/f-inst.c<br>index 901d2939..735c643a 100644<br>--- a/filter/f-inst.c<br>+++ b/filter/f-inst.c<br>@@ -709,6 +709,9 @@<br> case EAF_TYPE_LC_SET:<br> RESULT_(T_LCLIST, ad, e->u.ptr);<br> break;<br>+ case EAF_TYPE_CC_ALGO:<br>+ RESULT_(T_STRING, s, (const char *) e->u.ptr->data);<br>+ break;<br> case EAF_TYPE_UNDEF:<br> RESULT_VOID;<br> break;<br>@@ -758,7 +761,16 @@<br> case EAF_TYPE_LC_SET:<br> l->attrs[0].u.ptr = v1.val.ad;<br> break;<br>-<br>+ case EAF_TYPE_CC_ALGO:<br>+ if (v1.type != T_STRING)<br>+ runtime( "Setting cc_algo attribute to non-string value" );<br>+ else if (strlen(v1.val.s) >= TCP_CA_NAME_MAX)<br>+ runtime( "Setting cc_algo attribute out of bounds (> 15 chars)" );<br>+ /* Store cc_algo string in byte[], making sure to copy the null terminator */<br>+ struct adata *d = lp_alloc_adata(fs->pool, strlen(v1.val.s) + 1);<br>+ memcpy(d->data, v1.val.s, d->length);<br>+ l->attrs[0].u.ptr = d;<br>+ break;<br> case EAF_TYPE_BITFIELD:<br> {<br> /* First, we have to find the old value */<br>diff --git a/filter/f-util.c 
b/filter/f-util.c<br>index 410999a6..3cf68270 100644<br>--- a/filter/f-util.c<br>+++ b/filter/f-util.c<br>@@ -121,6 +121,9 @@ ca_lookup(pool *p, const char *name, int f_type)<br> break;<br> case T_LCLIST:<br> ea_type = EAF_TYPE_LC_SET;<br>+ break;<br>+ case T_STRING:<br>+ ea_type = EAF_TYPE_CC_ALGO;<br> break;<br> default:<br> cf_error("Custom route attribute of unsupported type");<br>diff --git a/nest/route.h b/nest/route.h<br>index 7930058a..053d2f1e 100644<br>--- a/nest/route.h<br>+++ b/nest/route.h<br>@@ -587,6 +587,7 @@ const char *ea_custom_name(uint ea);<br> #define EAF_TYPE_INT_SET 0x0a /* Set of u32's (e.g., a community list) */<br> #define EAF_TYPE_EC_SET 0x0e /* Set of pairs of u32's - ext. community list */<br> #define EAF_TYPE_LC_SET 0x12 /* Set of triplets of u32's - large community list */<br>+#define EAF_TYPE_CC_ALGO 0x18 /* String to specify congestion control algorithm */<br> #define EAF_TYPE_UNDEF 0x1f /* `force undefined' entry */<br> #define EAF_EMBEDDED 0x01 /* Data stored in eattr.u.data (part of type spec) */<br> #define EAF_VAR_LENGTH 0x02 /* Attribute length is variable (part of type spec) */<br>diff --git a/nest/rt-attr.c b/nest/rt-attr.c<br>index c630aa95..b4fdfd85 100644<br>--- a/nest/rt-attr.c<br>+++ b/nest/rt-attr.c<br>@@ -969,6 +969,9 @@ ea_show(struct cli *c, const eattr *e)<br> case EAF_TYPE_LC_SET:<br> ea_show_lc_set(c, ad, pos, buf, end);<br> return;<br>+ case EAF_TYPE_CC_ALGO:<br>+ bsnprintf(pos, ad->length, "%s", (char *)ad->data);<br>+ return;<br> case EAF_TYPE_UNDEF:<br> default:<br> bsprintf(pos, "<type %02x>", e->type);<br>diff --git a/sysdep/linux/krt-sys.h b/sysdep/linux/krt-sys.h<br>index 8897f889..f265ceb7 100644<br>--- a/sysdep/linux/krt-sys.h<br>+++ b/sysdep/linux/krt-sys.h<br>@@ -39,7 +39,7 @@ static inline struct ifa * kif_get_primary_ip(struct iface *i UNUSED) { return N<br> #define EA_KRT_SCOPE EA_CODE(PROTOCOL_KERNEL, 0x12)<br> <br> <br>-#define KRT_METRICS_MAX 0x10 /* RTAX_QUICKACK+1 */<br>+#define KRT_METRICS_MAX 0x11 /* 
RTAX_CC_ALGO+1 */<br> #define KRT_METRICS_OFFSET 0x20 /* Offset of EA_KRT_* vs RTAX_* */<br> <br> #define KRT_FEATURES_MAX 4<br>@@ -64,7 +64,7 @@ static inline struct ifa * kif_get_primary_ip(struct iface *i UNUSED) { return N<br> #define EA_KRT_RTO_MIN EA_CODE(PROTOCOL_KERNEL, 0x2d)<br> #define EA_KRT_INITRWND EA_CODE(PROTOCOL_KERNEL, 0x2e)<br> #define EA_KRT_QUICKACK EA_CODE(PROTOCOL_KERNEL, 0x2f)<br>-<br>+#define EA_KRT_CC_ALGO EA_CODE(PROTOCOL_KERNEL, 0x30)<br> <br> struct krt_params {<br> u32 table_id; /* Kernel table ID we sync with */<br>diff --git a/sysdep/linux/netlink.Y b/sysdep/linux/netlink.Y<br>index 487ad1d8..b07fa842 100644<br>--- a/sysdep/linux/netlink.Y<br>+++ b/sysdep/linux/netlink.Y<br>@@ -14,7 +14,7 @@ CF_KEYWORDS(KERNEL, TABLE, METRIC, NETLINK, RX, BUFFER,<br> KRT_PREFSRC, KRT_REALM, KRT_SCOPE, KRT_MTU, KRT_WINDOW,<br> KRT_RTT, KRT_RTTVAR, KRT_SSTRESH, KRT_CWND, KRT_ADVMSS, KRT_REORDERING,<br> KRT_HOPLIMIT, KRT_INITCWND, KRT_RTO_MIN, KRT_INITRWND, KRT_QUICKACK,<br>- KRT_LOCK_MTU, KRT_LOCK_WINDOW, KRT_LOCK_RTT, KRT_LOCK_RTTVAR,<br>+ KRT_CC_ALGO, KRT_LOCK_MTU, KRT_LOCK_WINDOW, KRT_LOCK_RTT, KRT_LOCK_RTTVAR,<br> KRT_LOCK_SSTRESH, KRT_LOCK_CWND, KRT_LOCK_ADVMSS, KRT_LOCK_REORDERING,<br> KRT_LOCK_HOPLIMIT, KRT_LOCK_RTO_MIN, KRT_FEATURE_ECN, KRT_FEATURE_ALLFRAG)<br> <br>@@ -45,6 +45,7 @@ dynamic_attr: KRT_INITCWND { $$ = f_new_dynamic_attr(EAF_TYPE_INT, T_INT, EA_KRT<br> dynamic_attr: KRT_RTO_MIN { $$ = f_new_dynamic_attr(EAF_TYPE_INT, T_INT, EA_KRT_RTO_MIN); } ;<br> dynamic_attr: KRT_INITRWND { $$ = f_new_dynamic_attr(EAF_TYPE_INT, T_INT, EA_KRT_INITRWND); } ;<br> dynamic_attr: KRT_QUICKACK { $$ = f_new_dynamic_attr(EAF_TYPE_INT, T_INT, EA_KRT_QUICKACK); } ;<br>+dynamic_attr: KRT_CC_ALGO { $$ = f_new_dynamic_attr(EAF_TYPE_CC_ALGO, T_STRING, EA_KRT_CC_ALGO); } ;<br> <br> /* Bits of EA_KRT_LOCK, based on RTAX_* constants */<br> <br>diff --git a/sysdep/linux/netlink.c b/sysdep/linux/netlink.c<br>index 29b744cb..4d4d704b 100644<br>--- 
a/sysdep/linux/netlink.c<br>+++ b/sysdep/linux/netlink.c<br>@@ -73,6 +73,10 @@<br> #define NETLINK_GET_STRICT_CHK 12<br> #endif<br> <br>+#ifndef TCP_CA_NAME_MAX<br>+#define TCP_CA_NAME_MAX 16<br>+#endif<br>+<br> #define krt_ipv4(p) ((p)->af == AF_INET)<br> #define krt_ecmp6(p) ((p)->af == AF_INET6)<br> <br>@@ -534,6 +538,9 @@ static inline u16 rta_get_u16(struct rtattr *a)<br> static inline u32 rta_get_u32(struct rtattr *a)<br> { return *(u32 *) RTA_DATA(a); }<br> <br>+static inline char *rta_get_str(struct rtattr *a)<br>+{ return (char *) RTA_DATA(a); }<br>+<br> static inline ip4_addr rta_get_ip4(struct rtattr *a)<br> { return ip4_ntoh(*(ip4_addr *) RTA_DATA(a)); }<br> <br>@@ -624,6 +631,12 @@ nl_add_attr_u32(struct nlmsghdr *h, uint bufsize, int code, u32 data)<br> nl_add_attr(h, bufsize, code, &data, 4);<br> }<br> <br>+static inline void<br>+nl_add_attr_str(struct nlmsghdr *h, unsigned bufsize, int code, char *str)<br>+{<br>+ nl_add_attr(h, bufsize, code, str, strlen(str) + 1);<br>+}<br>+<br> static inline void<br> nl_add_attr_ip4(struct nlmsghdr *h, uint bufsize, int code, ip4_addr ip4)<br> {<br>@@ -880,20 +893,25 @@ err:<br> }<br> <br> static void<br>-nl_add_metrics(struct nlmsghdr *h, uint bufsize, u32 *metrics, int max)<br>+nl_add_metrics(struct nlmsghdr *h, uint bufsize, u32 *metrics, char *cc_algo, int max)<br> {<br> struct rtattr *a = nl_open_attr(h, bufsize, RTA_METRICS);<br> int t;<br> <br>- for (t = 1; t < max; t++)<br>- if (metrics[0] & (1 << t))<br>- nl_add_attr_u32(h, bufsize, t, metrics[t]);<br>+ for (t = 1; t < max; t++) {<br>+ if (metrics[0] & (1 << t)) {<br>+ if (EA_CODE(PROTOCOL_KERNEL, KRT_METRICS_OFFSET + t) == EA_KRT_CC_ALGO)<br>+ nl_add_attr_str(h, bufsize, t, cc_algo);<br>+ else<br>+ nl_add_attr_u32(h, bufsize, t, metrics[t]);<br>+ }<br>+ }<br> <br> nl_close_attr(h, a);<br> }<br> <br> static int<br>-nl_parse_metrics(struct rtattr *hdr, u32 *metrics, int max)<br>+nl_parse_metrics(struct rtattr *hdr, u32 *metrics, char *cc_algo, int max)<br> 
{<br> struct rtattr *a = RTA_DATA(hdr);<br> int len = RTA_PAYLOAD(hdr);<br>@@ -911,7 +929,19 @@ nl_parse_metrics(struct rtattr *hdr, u32 *metrics, int max)<br> return -1;<br> <br> metrics[0] |= 1 << a->rta_type;<br>- metrics[a->rta_type] = rta_get_u32(a);<br>+<br>+ if (EA_CODE(PROTOCOL_KERNEL, KRT_METRICS_OFFSET + a->rta_type) == EA_KRT_CC_ALGO) {<br>+ char *str = rta_get_str(a);<br>+ if (strlen(str) < TCP_CA_NAME_MAX) {<br>+ memcpy(cc_algo, str, strlen(str) + 1);<br>+ } else {<br>+ log(L_ERR "KRT: Received route with cc_algo attribute out of bounds (> 15 chars)");<br>+ return -1;<br>+ }<br>+ metrics[a->rta_type] = 0;<br>+ } else {<br>+ metrics[a->rta_type] = rta_get_u32(a);<br>+ }<br> }<br> <br> if (len > 0)<br>@@ -1427,6 +1457,7 @@ nl_send_route(struct krt_proto *p, rte *e, int op, int dest, struct nexthop *nh)<br> <br> <br> u32 metrics[KRT_METRICS_MAX];<br>+ char cc_algo[TCP_CA_NAME_MAX];<br> metrics[0] = 0;<br> <br> struct ea_walk_state ews = { .eattrs = eattrs };<br>@@ -1434,11 +1465,16 @@ nl_send_route(struct krt_proto *p, rte *e, int op, int dest, struct nexthop *nh)<br> {<br> int id = ea->id - EA_KRT_METRICS;<br> metrics[0] |= 1 << id;<br>- metrics[id] = ea->u.data;<br>+ if(ea->id == EA_KRT_CC_ALGO) {<br>+ metrics[id] = 0;<br>+ memcpy(cc_algo, ea->u.ptr->data, ea->u.ptr->length);<br>+ } else {<br>+ metrics[id] = ea->u.data;<br>+ }<br> }<br> <br> if (metrics[0])<br>- nl_add_metrics(&r->h, rsize, metrics, KRT_METRICS_MAX);<br>+ nl_add_metrics(&r->h, rsize, metrics, cc_algo, KRT_METRICS_MAX);<br> <br> <br> dest:<br>@@ -1907,10 +1943,12 @@ nl_parse_route(struct nl_parse_state *s, struct nlmsghdr *h)<br> if (a[RTA_METRICS])<br> {<br> u32 metrics[KRT_METRICS_MAX];<br>+ char *cc_algo = lp_alloc(s->pool, TCP_CA_NAME_MAX);<br> ea_list *ea = lp_alloc(s->pool, sizeof(ea_list) + KRT_METRICS_MAX * sizeof(eattr));<br>+ struct adata *d = lp_alloc(s->pool, sizeof(struct adata) + TCP_CA_NAME_MAX);<br> int t, n = 0;<br> <br>- if (nl_parse_metrics(a[RTA_METRICS], metrics, 
ARRAY_SIZE(metrics)) < 0)<br>+ if (nl_parse_metrics(a[RTA_METRICS], metrics, cc_algo, ARRAY_SIZE(metrics)) < 0)<br> {<br> log(L_ERR "KRT: Received route %N with strange RTA_METRICS attribute", net->n.addr);<br> return;<br>@@ -1921,8 +1959,15 @@ nl_parse_route(struct nl_parse_state *s, struct nlmsghdr *h)<br> {<br> ea->attrs[n].id = EA_CODE(PROTOCOL_KERNEL, KRT_METRICS_OFFSET + t);<br> ea->attrs[n].flags = 0;<br>- ea->attrs[n].type = EAF_TYPE_INT; /* FIXME: Some are EAF_TYPE_BITFIELD */<br>- ea->attrs[n].u.data = metrics[t];<br>+ if (ea->attrs[n].id == EA_KRT_CC_ALGO) {<br>+ ea->attrs[n].type = EAF_TYPE_CC_ALGO;<br>+ d->length = strlen(cc_algo) + 1;<br>+ memcpy(d->data, cc_algo, d->length);<br>+ ea->attrs[n].u.ptr = d;<br>+ } else {<br>+ ea->attrs[n].type = EAF_TYPE_INT; /* FIXME: Some are EAF_TYPE_BITFIELD */<br>+ ea->attrs[n].u.data = metrics[t];<br>+ }<br> n++;<br> }<br> <br>@@ -2225,7 +2270,8 @@ krt_sys_copy_config(struct krt_config *d, struct krt_config *s)<br> <br> static const char *krt_metrics_names[KRT_METRICS_MAX] = {<br> NULL, "lock", "mtu", "window", "rtt", "rttvar", "sstresh", "cwnd", "advmss",<br>- "reordering", "hoplimit", "initcwnd", "features", "rto_min", "initrwnd", "quickack"<br>+ "reordering", "hoplimit", "initcwnd", "features", "rto_min", "initrwnd", "quickack",<br>+ "cc_algo"<br> };<br> <br> static const char *krt_features_names[KRT_FEATURES_MAX] = {<br></p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal">---</p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal"><font face="arial, sans-serif" style=""><br></font></p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal"><font face="arial, sans-serif" style="">Please also find the patch attached.</font></p><p class="gmail-p1" 
style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal"><font face="arial, sans-serif" style=""><br></font></p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal"><font face="arial, sans-serif" style="">Thank you,</font></p><p class="gmail-p1" style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal"><font face="arial, sans-serif" style="">Trisha</font></p><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><span style="color:rgb(136,136,136)">--</span><br style="color:rgb(136,136,136)"><div dir="ltr" style="color:rgb(136,136,136)"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div style="margin:0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px"><b>Trisha Biswas</b> | Sr. Software Engineer, Network Systems</div></div></div></div></div></div></div></div></div>
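<p>The patch applies the same length rule in both directions (filter to kernel and kernel to BIRD): a congestion control name must fit in TCP_CA_NAME_MAX bytes including the terminator. The following is a minimal standalone sketch of that rule, not code from the patch. The helper name cc_algo_copy is hypothetical; only the TCP_CA_NAME_MAX value of 16 is taken from the diff.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TCP_CA_NAME_MAX 16  /* same bound as in the patch: 15 chars + NUL */

/*
 * Hypothetical helper (not part of the patch): copy a congestion control
 * algorithm name into a buffer of TCP_CA_NAME_MAX bytes.
 * Returns 0 on success, -1 if the name does not fit.
 */
int
cc_algo_copy(char *dst, const char *name)
{
  assert(dst != NULL && name != NULL);

  size_t len = strlen(name);
  if (len >= TCP_CA_NAME_MAX)
    return -1;                 /* 16 or more chars: reject, as the patch does */

  memcpy(dst, name, len + 1);  /* len + 1 so the NUL terminator is copied too */
  return 0;
}
```

<p>Checking the length before copying keeps the fixed-size buffer safe, which is presumably why the patch rejects over-long names with an error rather than truncating them.</p>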