31 Jul
2011
8:38 a.m.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Alexander V. Chernikov wrote:
> On 22.07.2011 14:52, Ondrej Zajicek wrote:
>> On Fri, Jul 22, 2011 at 01:47:14AM +0400, Alexander V. Chernikov wrote:
>>>> Therefore there would be two types of routing tables - IP and MPLS. I
>>>> don't think it is a good idea to mix these. This may look inconsistent
>>>> with idea of embedding IPv4 to IPv6, but IP protocols are much more
>>>> similar, have a natural way to embed one in the other, have similar
>>>> roles and protocol structure. MPLS routing table could be used for LDP -
>>>> kernel interaction (routes imported from LDP and exported to kernel).
>>>> This solves your Case 2 without any hacks.
>>> So, from user point of view, I define
>>> table xxx; for both ipv4 and IPv6 routes and
>>> mpls table yyy; for MPLS routing table?
>>
>> Yes.
Patch permitting fibs to be used for any address family attached.
It should be considered as PoC patch for review. It works for my setup,
but I haven't tested it in production. netlink is not tested at all.
Some notes:
* fib has to have address type field (due to fib_get and other functions
using pointer to fib, not rtable)
* Due to address variable length we store it inside fib node this way:
|--------------------|
| struct fib_node    |
|   *addr -----------+---\
|--------------------|   |
| some user data     |   |
|                    |   |
|--------------------|   |
| address data  <----+---/
|                    |
|--------------------|
* Since we've got a pointer to the address data instead of the data (ip_addr)
itself, all 9000 places with "%I/%d" need to be changed, so more
general fib_print and fib2_print functions are implemented
* Several net_* calls were converted to fib_*
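
A minimal sketch of the node layout drawn above, assuming one allocation that holds the header, the user data and the address bytes. All demo_* names are hypothetical and not taken from the patch; the real struct fib_node has more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced node header (not the real struct fib_node). */
struct demo_node {
  struct demo_node *next;  /* hash chain link */
  void *addr;              /* points at the address bytes behind the user data */
  unsigned char pxlen;     /* prefix length in bits */
};

/* Round n up to a multiple of the pointer size so the address area is aligned. */
static size_t demo_align(size_t n)
{
  size_t a = sizeof(void *);
  return (n + a - 1) / a * a;
}

/*
 * Allocate header + user data + address bytes as one block.
 * node_size: sizeof() of the embedding structure (header included).
 * addr_size: key length in bytes for this table's address family.
 * Returns NULL if the allocation fails.
 */
static struct demo_node *
demo_node_new(size_t node_size, size_t addr_size, const void *addr,
              unsigned char pxlen)
{
  assert(node_size >= sizeof(struct demo_node));

  size_t off = demo_align(node_size);
  struct demo_node *n = calloc(1, off + addr_size);
  if (!n)
    return NULL;

  n->addr = (unsigned char *) n + off;  /* address data lives at the tail */
  memcpy(n->addr, addr, addr_size);
  n->pxlen = pxlen;
  return n;
}
```

A caller would pass sizeof() of its own structure that embeds the node as the first member, plus 4 or 16 bytes of key for IPv4 or IPv6, and release the whole block with a single free().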
Btw, some IPv4/IPv6 merging questions/thoughts:
* show route will show complete mess for table with both v4 and v6
routes. Some sorting or 'afi ipv4|ipv6' has to be implemented.
* fill_in_sockaddr|get_sockaddr from io.c are somewhat inconsistent:
fill_* uses the OS-dependent set_inaddr to fill in the actual address data, but
get_* uses direct calls to memcpy and ipa_ntoh instead of the existing
OS-dependent get_inaddr. Moreover, the set_ and get_ implementations are the
same for Linux and BSD (and they should be the same for other UNIX-like
systems AFAIR, at least for IPv4/IPv6)
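
To illustrate the symmetry meant here, a rough IPv4-only sketch in which both directions go through one helper pair. The demo_* functions are made-up stand-ins, not the set_inaddr/get_inaddr from the BIRD sysdep code, and addresses are plain host-order integers for simplicity.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper pair; only these two touch the OS address layout. */
static void demo_set_inaddr(struct in_addr *dst, uint32_t host_order)
{
  dst->s_addr = htonl(host_order);
}

static uint32_t demo_get_inaddr(const struct in_addr *src)
{
  return ntohl(src->s_addr);
}

/* Fill a sockaddr from a host-order address and port. */
static void demo_fill_sockaddr(struct sockaddr_in *sa, uint32_t addr,
                               uint16_t port)
{
  assert(sa != NULL);
  memset(sa, 0, sizeof(*sa));
  sa->sin_family = AF_INET;
  sa->sin_port = htons(port);
  demo_set_inaddr(&sa->sin_addr, addr);
}

/* Inverse of demo_fill_sockaddr; returns -1 for a non-IPv4 sockaddr. */
static int demo_get_sockaddr(const struct sockaddr_in *sa, uint32_t *addr,
                             uint16_t *port)
{
  assert(sa != NULL);
  if (sa->sin_family != AF_INET)
    return -1;
  *addr = demo_get_inaddr(&sa->sin_addr);
  *port = ntohs(sa->sin_port);
  return 0;
}
```

With this shape the byte-order handling sits in exactly one place per direction, which is the property the current get_* path lacks.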
>>
>>> There should be base MPLS rtable (mpls_default, for example) as in IP.
>>> We can also add a hack to automatically subscribe protocols to the MPLS
>>> routing table by type and other attributes. For example, every LDP
>>> instance gets connected to an MPLS table (default or defined in config).
>>> Kernel protocol instance gets connected to MPLS table only if its IP
>>> table is the default one (GRT) or 'mpls table' keyword is supplied
>>> explicitly. What about VPNv4/VPNv6? The same approach?
>>
>> Perhaps even default MPLS table should be explicitly configured [*]
>> (as i guess
>> not many BIRD users would use MPLS). Protocols requiring MPLS table would
>> fail if it is not configured, protocol with optional MPLS support
>> (kernel,
>> static?) just do not connect to MPLS in that case. The same approach
>> for VPNvX table.
>>
>> [*] probably like: mpls table XXX default;
> Maybe it's better to turn on "general" mpls support?
> e.g. 'mpls support;' or just 'mpls;' instead of propagating some table
> to be default?
>>
>>> Btw, how we will distinguish inet/inet6 rtes? (I'm talking about
>>> MP-BGP
>>> / IPv4-mapped cases)
>>
>> I planned to use IPv4-mapped prefix (::ffff:0:0/96), which is used for
>> similar purposes in IP stack. But this should not be checked directly
>> in protocols, there should be some macros in lib/ipv6.h for that.
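
A possible shape for such helpers, as a sketch only. The demo_* names and the byte-array address type are assumptions for illustration and are not what lib/ipv6.h actually contains.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit address in network byte order. */
typedef struct { uint8_t b[16]; } demo_ip6;

/* First 96 bits of ::ffff:0:0/96. */
static const uint8_t demo_v4mapped_prefix[12] =
  { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

/* True if the address lies inside ::ffff:0:0/96. */
static int demo_is_v4mapped(const demo_ip6 *a)
{
  return memcmp(a->b, demo_v4mapped_prefix, sizeof(demo_v4mapped_prefix)) == 0;
}

/* Embed an IPv4 address (host byte order) as ::ffff:a.b.c.d. */
static demo_ip6 demo_from_v4(uint32_t v4)
{
  demo_ip6 a;
  memcpy(a.b, demo_v4mapped_prefix, sizeof(demo_v4mapped_prefix));
  a.b[12] = (uint8_t) (v4 >> 24);
  a.b[13] = (uint8_t) (v4 >> 16);
  a.b[14] = (uint8_t) (v4 >> 8);
  a.b[15] = (uint8_t) v4;
  return a;
}

/* Extract the embedded IPv4 address; only valid for mapped addresses. */
static uint32_t demo_to_v4(const demo_ip6 *a)
{
  assert(demo_is_v4mapped(a));
  return ((uint32_t) a->b[12] << 24) | ((uint32_t) a->b[13] << 16) |
         ((uint32_t) a->b[14] << 8) | (uint32_t) a->b[15];
}

/* An IPv4 /len prefix becomes a /(96 + len) prefix once embedded. */
static unsigned demo_pxlen_from_v4(unsigned len)
{
  assert(len <= 32);
  return 96 + len;
}
```

Protocol code would then call only these helpers and never compare against the prefix bytes directly.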
>>
>>>> [*] when i wrote that i thought that labels are distributed just by LDP
>>>> and the purpose of label request is to propagate the label through LDP
>>>> area. i didn't notice that BGP/MPLS also distributes labels so they
>>>> need to know assigned labels. So the idea would need some
>>>> modifications.
>>> Not sure this will work. Since t1 is an IP table cases when we need to
>>> request specific label for:
>>> * AToM
>>> * RSVP-TE tunnels
>>> will not work since there are no prefixes that can be mapped to such
>>> request.
>>
>> You are probably right. I originally thought about some specific
>> 'request table' (where requests coded as routes with specific AF),
>> but perhaps there should be used some other mechanism / other protocol
>> hook. But it should be generic enough (some bus, allows at least more
>> 'producers' and perhaps more 'consumers').
> Okay, i see this as follows:
> New rtable hook, service_hook, with a uint32_t bitmask specifying the request
> classes we are responsible for:
> /* Defined classes */
> #define RCLASS_LABEL 0x01 /* MPLS label request */
>
> Some request function:
> int
> request_data(rtable *t, struct service_request *req, void **buf,
>              size_t *bufsize);
>
> struct service_request {
>   uint32_t request; /* Single request class set */
>   uint32_t subclass; /* Subclass specific for request */
>   proto *p; /* caller protocol */
>   char data[0]; /* request-specific data follows */
> };
>
> function loops thru all registered hooks for given _class_ checking for
> reply until SR_OK or SR_FAIL is returned. It is up to protocol hook to
> check subclass.
> #define SR_OK 0x01 /* Request successful */
> #define SR_FAIL 0x02 /* Request failed */
> #define SR_NEXT 0x03 /* Request skipped */
> #define SR_UNAVAIL 0x04 /* No providers for this request */
>
> As a result, the caller gets SR_UNAVAIL if no provider was able to
> serve the request, or SR_OK|SR_FAIL otherwise.
>
> The caller can set up the buffer itself and pass a pointer to the buffer
> pointer and a pointer to the buffer size to the function, or ask the
> provider to allocate the data by setting *buf to NULL and *bufsize to 0.
>
> struct service_reply { /* is returned in reply buffer */
>   uint32_t request;
>   uint32_t subclass;
>   proto *p; /* protocol, providing data */
>   char data[0]; /* request-specific data */
> };
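
The dispatch loop described in this proposal could look roughly like the sketch below. It reuses the RCLASS_LABEL and SR_* values from the mail, but reduces the request and the reply to a bare minimum (a single uint32_t instead of the buffer handling), and every demo_* name is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RCLASS_LABEL 0x01  /* MPLS label request */

#define SR_OK      0x01    /* Request successful */
#define SR_FAIL    0x02    /* Request failed */
#define SR_NEXT    0x03    /* Request skipped */
#define SR_UNAVAIL 0x04    /* No providers for this request */

/* Simplified request: class and subclass only. */
struct demo_request {
  uint32_t request;
  uint32_t subclass;
};

/* One registered provider. */
struct demo_hook {
  uint32_t classes;  /* bitmask of request classes this hook serves */
  int (*serve)(const struct demo_request *req, uint32_t *reply);
  struct demo_hook *next;
};

/* Ask the hooks registered for the class until one answers OK or FAIL. */
static int demo_request_data(const struct demo_hook *hooks,
                             const struct demo_request *req, uint32_t *reply)
{
  assert(req != NULL && reply != NULL);
  for (const struct demo_hook *h = hooks; h; h = h->next) {
    if (!(h->classes & req->request))
      continue;                      /* not registered for this class */
    int rv = h->serve(req, reply);
    if (rv == SR_OK || rv == SR_FAIL)
      return rv;                     /* SR_NEXT: try the next provider */
  }
  return SR_UNAVAIL;
}

/* Example provider that never handles the request itself. */
static int demo_decline(const struct demo_request *req, uint32_t *reply)
{
  (void) req;
  (void) reply;
  return SR_NEXT;
}

/* Example provider handing out labels from 16 up (0..15 are reserved). */
static int demo_alloc_label(const struct demo_request *req, uint32_t *reply)
{
  static uint32_t next_label = 16;
  if (req->subclass != 0)
    return SR_NEXT;                  /* subclass check is up to the hook */
  if (next_label > 0xFFFFF)
    return SR_FAIL;                  /* 20-bit label space exhausted */
  *reply = next_label++;
  return SR_OK;
}
```

Chaining demo_decline in front of demo_alloc_label shows the SR_NEXT fall-through: the first hook is skipped and the second one answers with a label.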
>
>
>
>>
>>>> Internal LMAP table is examined, tracked IGP table is examined. If both
>>>> are ready (for given prefix), appropriate encapsulating and MPLS routes
>>>> are generated and propagated using rte_update(), otherwise nothing is
>>>> generated and the previously generated route is withdrawn (rte_update()
>>>> with NULL is called) (or perhaps an unreachable route is generated if
>>>> LMAP is here but IGP route is missing). Simple and elegant.
>>> .. and in case of label release we should remove label only and keep
>>> original route
>>
>> Yes.
>>
>>>> There are some tricky parts of IGP tracking - it is problematic
>>>> to use standard RA_OPTIMAL update for this purpose, because if
>>>> generated encapsulating routes are imported to the same table,
>>>> these would probably become the optimal ones and IGP routes would be
>>>> shaded. Solution would be to use RA_ANY, and ignore notifications
>>>> containing encapsulating routes, similarly 'examining the tracked
>>>> IGP table' means looking up the fib node and find the best route,
>>>> ignoring encapsulating ones.
>>>>
>>>> For implementation of this behavior, there are two minor changes that
>>>> need to be done to the rt table code: First, currently accept_ra_types
>>>> (RA_OPTIMAL/RA_ANY) is a property of a protocol, it needs to be a
>>>> property of an announce hook (as LDP would have two hooks with
>>>> RA_OPTIMAL and one hook with RA_ANY). Second, rte_announce() for
>>>> both in rte_recalculate should be moved after the route list
>>>> is updated/relinked.
>>
>>> Agreed. Distinguishing RA_OPTIMAL and RA_ANY in current code is not a
>>> trivial task and requires internals understanding. Either announce type
>>> should be passed to announce hook or new hook should be added for RA_ANY
>>> event. The latter is more appropriate IMHO since RA_ANY is used by
>>> pipe
>>> protocol only.
>>
>> I thought about that when i created RA_ANY and have chosen this approach.
>> Probably best way is just to change rt_notify to have appropriate
>> struct announce_hook as a second argument instead of struct rtable.
>> struct announce_hook would contain RA_ANY/RA_OPTIMAL and possibly
>> some protocol-specific data. As (probably) all protocols are in-tree,
>> doing some wide but trivial changes is not a problem.
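
A sketch of what a per-hook announce type might look like, assuming that RA_ANY hooks see every change and are served before the RA_OPTIMAL hooks, which only see changes of the best route. The structures are heavily simplified and the demo_* names are hypothetical; this is not the actual struct announce_hook.

```c
#include <assert.h>
#include <stddef.h>

enum demo_ra_type { DEMO_RA_OPTIMAL = 1, DEMO_RA_ANY = 2 };

/* Records the order in which hooks were notified. */
struct demo_log {
  char buf[32];
  size_t len;
};

/* Announce hook: the RA type is a property of the hook, not the protocol. */
struct demo_hook {
  enum demo_ra_type ra_type;
  char tag;                 /* one-letter protocol name for the log */
  struct demo_log *log;     /* stands in for protocol-specific data */
  void (*notify)(struct demo_hook *h, int best_changed);
  struct demo_hook *next;
};

/* Example notify callback: append the hook's tag to its log. */
static void demo_record(struct demo_hook *h, int best_changed)
{
  (void) best_changed;
  if (h->log->len + 1 < sizeof(h->log->buf)) {
    h->log->buf[h->log->len++] = h->tag;
    h->log->buf[h->log->len] = '\0';
  }
}

/* Announce one route change; returns the number of hooks notified. */
static int demo_announce(struct demo_hook *hooks, int best_changed)
{
  int calls = 0;

  /* Pass 1: RA_ANY hooks get every change, before anyone else. */
  for (struct demo_hook *h = hooks; h; h = h->next)
    if (h->ra_type == DEMO_RA_ANY) {
      assert(h->notify != NULL);
      h->notify(h, best_changed);
      calls++;
    }

  /* Pass 2: RA_OPTIMAL hooks only hear about a new best route. */
  if (best_changed)
    for (struct demo_hook *h = hooks; h; h = h->next)
      if (h->ra_type == DEMO_RA_OPTIMAL) {
        assert(h->notify != NULL);
        h->notify(h, best_changed);
        calls++;
      }

  return calls;
}
```

With an RA_ANY hook for LDP and an RA_OPTIMAL hook for the kernel protocol on the same table, LDP is always notified first, regardless of the order in which the hooks were registered.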
>>
>>> Kernel protocol should track RA_ANY protocol hooks
>>> looking for update source (LDP / RSVP) and re-install appropriate
>>> routes.
>>
>> I think kernel protocol should use RA_OPTIMAL as usual. This kind
>> of RA_ANY usage is for protocols that export routes to the same
>> table they listen (so 'source' routes would be shaded by their
>> routes). These routes (LDP / RSVP) should have just highest
>> priority.
>>
>>> The only downside is situation when LDP signalling starts faster
>>> than IGP. In that case we will get 3 updates instead of one (at least in
>>> RTSOCK):
>>> * RTM_ADD for original prefix
>>> * RTM_DEL for this prefix (as part of krt_set_notify())
>>> * RTM_ADD for modified prefix
>>>
>>> RTM_CHANGE can be used in notify, but still: this gives 2 updates
>>> instead of one.
>>
>> No, because RA_ANY is handled strictly before RA_OPTIMAL and routes
>> are propagated synchronously depth-first:
>>
>> OSPF --RA_ANY--> LDP
>> LDP --RA_OPTIMAL--> kernel
>> OSPF --RA_OPTIMAL--> kernel
>>
> Still I can't understand how exactly I can modify an announced IP route
> (still, from FreeBSD kernel point of view encapsulated route is a usual
> route with an attribute attached. From Linux point of view this should
> be more or less the same since an IP route lookup have to be done for
> incoming packet anyway and doing several different lookups is not a best
> idea). I've got RA_ANY hook called for a new route (and I should know
> that it is actually RA_OPTIMAL without some complex logic!), what
> should I do next?
>
>> But it is true that this is much dependent on internal implementation
>> of route propagation. The first idea i had was to use separate
>> tables for original and labeled routes (when just RA_OPTIMAL hooks),
>> but that looks too cumbersome for users and ability to push a better
>> route to the same (input) table has other possible usages.
>>
>>>> Therefore, it is probably a good idea to extend FIBs in a way you
>>>> suggested, with minor details changed. FIB / rtables would be uniform
>>>> (AF_ bound), but there are just three AFs (IP, MPLS, VPN) - IPv4 and
>>>> IPv6
>>>> could be handled as one AF, embedded, the same for VPNv4 and VPNv6). To
>>>> minimize code changes, struct fib_node would have ip_addr prefix, but
>>>> might be allocated larger.
>>> Okay, so for IPv4+IPv6-enabled daemon we will allocate an ip_addr large
>>> enough for holding IPv6 address? This can bump memory consumption for
>>> setups with several full-views significantly.
>>
>> It increases memory consumption, but not so much in a relative view - for
>> each struct network there is at least one struct rte and in both of them
>> there is just one ip_addr and both structures are nontrivial. So this
>> relative increase would be about 1.15-1.2. Really big users would
>> probably keep the current split setting.
> Okay, it's much easier from developer point of view. If you're not
> afraid of your users :)
>>
>>>> Because each protocol and each its announce_hook have appropriate role,
>>>> it is IMHO unnecessary to have AF in protocol hooks, but there
>>>> should be
>>>> check whether protocol/announce_hook is connected to appropriate
>>>> rtable.
>>>>
>>>
>>> To summarize required changes (please correct me):
>>> 1) Differentiate between RA_ANY and RA_OPTIMAL (new hook, possibly)
>>> 2) Add 3 AFs (AF_IP, AF_MPLS, AF_VPN) to the following structures:
>>> * rtable
>>> * fib
>>> * rte
>>> 3) Add fib2_init with sizeof(AF object) supplied. Add appropriate field
>>> to struct fib to hold this value.
>>> 4) Move to memcmp() in fib_find / fib_get
>>> 5) Set up default rtable for every supported AF. Connect protocol
>>> instances to such default AFs based on protocol types
>>
>> 1a) other changes in rte_recalculate() related to propagation
>> (clean up the table before calling RA_ANY hook).
>>
>> 1) and 1a) i will do myself and send you the patch, and also make
>> some trivial example for exporting to the same table.
>>
>> 2) i am not sure if there is a reason to put explicit AF info
>> to struct fib, AF compatibility could be handled on higher level
>> (struct rtable in general, other direct users probably use just
>> one AF).
> No problem, I misinterpreted "FIB / rtables would be uniform (AF_
> bound)" as "FIB / rtable needs AF info in structure fields"
>>
>> 3) and hashing callback (and perhaps fib_route, but not sure if this is
>> needed).
>>
>> 4) probably encapsulate that to some static inline key_equal() function.
>>
>> 5) see my related note above. Protocol binding to tables should check
>> AFs.
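
For points 3) and 4), a minimal sketch of a fib that remembers its key size and hashing callback, plus a key_equal() helper built on memcmp(). The demo_* names are invented for the example, FNV-1a is only a stand-in hash, and the comparison assumes the bits beyond the prefix length are already zeroed in both keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table descriptor: key size and hash are fixed at init time. */
struct demo_fib {
  size_t addr_size;                                /* key length in bytes */
  uint32_t (*hash)(const void *addr, size_t len);  /* per-AF hashing callback */
};

/* Stand-in hash (32-bit FNV-1a) over the raw key bytes. */
static uint32_t demo_hash_bytes(const void *addr, size_t len)
{
  const unsigned char *p = addr;
  uint32_t h = 2166136261u;
  for (size_t i = 0; i < len; i++) {
    h ^= p[i];
    h *= 16777619u;
  }
  return h;
}

/* Counterpart of the proposed fib2_init: the caller supplies sizeof(AF object). */
static void demo_fib_init(struct demo_fib *f, size_t addr_size)
{
  assert(addr_size > 0);
  f->addr_size = addr_size;
  f->hash = demo_hash_bytes;
}

/* Two keys match if prefix lengths match and the address bytes are identical. */
static inline int demo_key_equal(const struct demo_fib *f,
                                 const void *a, unsigned alen,
                                 const void *b, unsigned blen)
{
  return alen == blen && memcmp(a, b, f->addr_size) == 0;
}
```

Lookup code would hash with f->hash(addr, f->addr_size) to pick a bucket and then walk the chain with demo_key_equal(), so nothing outside these two helpers needs to know the address family.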
>>
>> more:
>>
>> 6) RTD_MPLS in dest field, struct rta_mpls, as i wrote in the previous
>> mail:
>>
>>>> i think encapsulation
>>>> routes should be represented by routes with new destination type
>>>> (RTD_MPLS in dest field of struct rta) and whole NHLFE should be stored
>>>> in new struct rta_mpls (or rta_nhlfe), which would be extension of
>>>> struct rta (containing struct rta in the first field and NHLFE after
>>>> that). Such structure could be easily passed as struct rta and
>>>> functions
>>>> from rt-attr.c can work with that, with some minor modifications
>>>> (allocating, freeing and printing) dispatched based on dest field.
>>
>>>> This rta could be used without changes also for MPLS routes.
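
The "struct rta in the first field" idea can be sketched as follows. The field set is reduced to the bare minimum and all demo_* names are hypothetical; the point is only that generic code takes the base pointer and dispatches on dest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum demo_dest { DEMO_RTD_ROUTER, DEMO_RTD_MPLS };
enum demo_op { DEMO_PUSH, DEMO_SWAP, DEMO_POP };

/* Reduced stand-in for struct rta. */
struct demo_rta {
  enum demo_dest dest;
};

/* Extension: the base struct comes first, the NHLFE data follows. */
struct demo_rta_mpls {
  struct demo_rta rta;  /* must stay the first member */
  enum demo_op op;
  uint32_t label;
};

/* Size to allocate or copy, chosen by destination type. */
static size_t demo_rta_size(const struct demo_rta *a)
{
  assert(a != NULL);
  return a->dest == DEMO_RTD_MPLS ? sizeof(struct demo_rta_mpls)
                                  : sizeof(struct demo_rta);
}

/* Printing dispatched on dest; returns the snprintf() result. */
static int demo_rta_format(const struct demo_rta *a, char *buf, size_t len)
{
  static const char *const op_name[] = { "push", "swap", "pop" };

  assert(a != NULL);
  if (a->dest == DEMO_RTD_MPLS) {
    /* Safe because every DEMO_RTD_MPLS rta is allocated as demo_rta_mpls. */
    const struct demo_rta_mpls *m = (const struct demo_rta_mpls *) a;
    return snprintf(buf, len, "mpls %s label %u", op_name[m->op],
                    (unsigned) m->label);
  }
  return snprintf(buf, len, "router");
}
```

Passing &m.rta to code that only knows struct demo_rta keeps that code unchanged, while allocation, freeing and printing consult dest to pick the right size and format.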
>
> I'll try to send you patches for all these as I see it in several days.
>>
>>
>>> Most of these are more or less trivial changes not MPLS-bound (VPNv4/6
>>> can be used in case of bird used as RR in MPLS network, for example).
>>> Should I supply patches for these? What are your plans about commit
>>> routemap ?
>>
>> I create GIT branch 'mpls' and would merge these patches to that branch
>> soon. When we will have some major release, we could merge 'mpls' branch
>> to master if there is some sufficient usage (i think that even just
>> static and kernel protocol support for MPLS would be a good example
>> usage). Other protocols (LDP, ...) probably should be merged when they
>> are reasonably ready.
> Will this branch be available from the official git repo? It is not accessible
> (from its web interface at least).
>
>
> Btw, some bird/LDP "status" report:
>
> bird> show ldp neighbour
> Peer LDP Ident: 10.2.33.4:0; Local LDP Ident 10.0.0.88:0
> TCP connection: 10.2.33.4.11212 - 0.0.0.0.0
> State: Operational; Msgs sent/rcvd: 21/61; Downstream
> Up time: 00:02:27
> LDP discovery sources:
> em1, Src IP addr: 10.1.5.4
> Peer LDP Ident: 10.2.33.3:0; Local LDP Ident 10.0.0.88:0
> TCP connection: 10.2.33.3.11009 - 0.0.0.0.0
> State: Operational; Msgs sent/rcvd: 29/60; Downstream
> Up time: 00:02:20
> LDP discovery sources:
> em2, Src IP addr: 10.1.6.3
> bird> show ldp bindings
> lib entry: 10.2.0.0/30
> local binding: label: 25
> remote binding: lsr: 10.2.33.4:0, label: ImpNULL
> remote binding: lsr: 10.2.33.3:0, label: 23
> lib entry: 10.1.6.0/24
> remote binding: lsr: 10.2.33.3:0, label: ImpNULL
> remote binding: lsr: 10.2.33.4:0, label: 25
> lib entry: 10.0.0.0/24
> remote binding: lsr: 10.2.33.3:0, label: 19
> remote binding: lsr: 10.2.33.4:0, label: 23
> lib entry: 10.2.0.2/32
> local binding: label: 26
> remote binding: lsr: 10.2.33.4:0, label: 16
> remote binding: lsr: 10.2.33.3:0, label: 24
> lib entry: 10.1.4.0/24
> local binding: label: 29
> remote binding: lsr: 10.2.33.4:0, label: ImpNULL
> remote binding: lsr: 10.2.33.3:0, label: ImpNULL
> lib entry: 10.1.5.0/24
> remote binding: lsr: 10.2.33.4:0, label: ImpNULL
> remote binding: lsr: 10.2.33.3:0, label: ImpNULL
> lib entry: 1.2.3.5/32
> remote binding: lsr: 10.2.33.3:0, label: 20
> remote binding: lsr: 10.2.33.4:0, label: 21
> lib entry: 10.1.33.0/24
> local binding: label: 28
> remote binding: lsr: 10.2.33.4:0, label: ImpNULL
> remote binding: lsr: 10.2.33.3:0, label: ImpNULL
> lib entry: 10.2.33.3/32
> local binding: label: 31
> remote binding: lsr: 10.2.33.3:0, label: ImpNULL
> lib entry: 10.2.33.4/32
> local binding: label: 27
> remote binding: lsr: 10.2.33.4:0, label: ImpNULL
> remote binding: lsr: 10.2.33.3:0, label: 25
> lib entry: 10.1.6.88/32
> remote binding: lsr: 10.2.33.3:0, label: 18
> remote binding: lsr: 10.2.33.4:0, label: 19
> lib entry: 10.0.0.88/32
> remote binding: lsr: 10.2.33.4:0, label: 17
> remote binding: lsr: 10.2.33.3:0, label: 16
> lib entry: 10.1.5.88/32
> remote binding: lsr: 10.2.33.3:0, label: 21
> remote binding: lsr: 10.2.33.4:0, label: 18
> bird> show ldp forwardingtable
> Local Outgoing Prefix Bytes Label Outgoing Next Hop
> Label Label or VC or Tunnel Id Switched interface
> 20 SWAP 10.2.0.0/30 0 ? 10.1.5.4
> 21 SWAP 10.2.0.2/32 0 ? 10.1.5.4
> 22 SWAP 10.2.33.4/32 0 ? 10.1.5.4
> 23 SWAP 10.1.33.0/24 0 ? 10.1.5.4
> 24 SWAP 10.1.4.0/24 0 ? 10.1.5.4
> 25 SWAP 10.2.0.0/30 0 ? 10.1.5.4
> 26 SWAP 10.2.0.2/32 0 ? 10.1.5.4
> 27 SWAP 10.2.33.4/32 0 ? 10.1.5.4
> 28 SWAP 10.1.33.0/24 0 ? 10.1.5.4
> 29 SWAP 10.1.4.0/24 0 ? 10.1.5.4
> 30 SWAP 10.2.33.3/32 0 ? 10.1.6.3
> 31 SWAP 10.2.33.3/32 0 ? 10.1.6.3
>
>
>>
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk41FJEACgkQwcJ4iSZ1q2kZNwCfZHk19PuXn2esNZ/KrvXOir5v
zTMAoKe78CsexI0pPJ4li50e8teBCcpa
=yqPo
-----END PGP SIGNATURE-----
Index: filter/filter.c
===================================================================
--- filter/filter.c (revision 4962)
+++ filter/filter.c (working copy)
@@ -679,9 +679,9 @@ interpret(struct f_inst *what)
case T_STRING: /* Warning: this is a special case for proto attribute */
res.val.s = rta->proto->name;
break;
- case T_PREFIX: /* Warning: this works only for prefix of network */
+ case T_PREFIX: /* Warning: this works only for RT_IP prefix of network */
{
- res.val.px.ip = (*f_rte)->net->n.prefix;
+ res.val.px.ip = *FPREFIX_IP(&(*f_rte)->net->n);
res.val.px.len = (*f_rte)->net->n.pxlen;
break;
}
Index: proto/ospf/ospf.c
===================================================================
--- proto/ospf/ospf.c (revision 4962)
+++ proto/ospf/ospf.c (working copy)
@@ -812,7 +812,7 @@ ospf_sh(struct proto *p)
cli_msg(-1014, "\t\tArea networks:");
firstfib = 0;
}
- cli_msg(-1014, "\t\t\t%1I/%u\t%s\t%s", anet->fn.prefix, anet->fn.pxlen,
+ cli_msg(-1014, "\t\t\t%1I/%u\t%s\t%s", *FPREFIX_IP(&anet->fn), anet->fn.pxlen,
anet->hidden ? "Hidden" : "Advertise", anet->active ? "Active" : "");
}
FIB_WALK_END;
Index: proto/ospf/topology.c
===================================================================
--- proto/ospf/topology.c (revision 4962)
+++ proto/ospf/topology.c (working copy)
@@ -65,7 +65,7 @@ fibnode_to_lsaid(struct proto_ospf *po, struct fib
LSA ID for a network because different network appeared, we
choose a different way. */
- u32 id = _I(fn->prefix);
+ u32 id = _I(*FPREFIX_IP(fn));
if ((po->rfc1583) || (fn->pxlen == 0) || (fn->pxlen == 32))
return id;
@@ -764,8 +764,8 @@ originate_sum_net_lsa(struct ospf_area *oa, struct
struct ospf_lsa_header lsa;
void *body;
- OSPF_TRACE(D_EVENTS, "Originating net-summary-LSA for %I/%d (metric %d)",
- fn->prefix, fn->pxlen, metric);
+ OSPF_TRACE(D_EVENTS, "Originating net-summary-LSA for %s (metric %d)",
+ fib_print(fn), metric);
/* options argument is used in ORT_NET and OSPFv3 only */
lsa.age = 0;
@@ -780,8 +780,7 @@ originate_sum_net_lsa(struct ospf_area *oa, struct
{
if (check_sum_net_lsaid_collision(fn, en))
{
- log(L_ERR, "%s: LSAID collision for %I/%d",
- p->name, fn->prefix, fn->pxlen);
+ log(L_ERR, "%s: LSAID collision for %s", p->name, fib_print(fn));
return;
}
@@ -803,7 +802,7 @@ originate_sum_rt_lsa(struct ospf_area *oa, struct
struct proto *p = &po->proto;
struct top_hash_entry *en;
u32 dom = oa->areaid;
- u32 rid = ipa_to_rid(fn->prefix);
+ u32 rid = ipa_to_rid(*FPREFIX_IP(fn));
struct ospf_lsa_header lsa;
void *body;
@@ -850,7 +849,7 @@ flush_sum_lsa(struct ospf_area *oa, struct fib_nod
else
{
/* In OSPFv3, LSA ID is meaningless, but we still use Router ID of ASBR */
- lsa.id = ipa_to_rid(fn->prefix);
+ lsa.id = ipa_to_rid(*FPREFIX_IP(fn));
lsa.type = LSA_T_SUM_RT;
}
@@ -859,7 +858,7 @@ flush_sum_lsa(struct ospf_area *oa, struct fib_nod
if ((type == ORT_NET) && check_sum_net_lsaid_collision(fn, en))
{
- log(L_ERR, "%s: LSAID collision for %I/%d",
- p->name, fn->prefix, fn->pxlen);
+ log(L_ERR, "%s: LSAID collision for %s",
+ p->name, fib_print(fn));
return;
}
@@ -1011,8 +1010,7 @@ originate_ext_lsa(net * n, rte * e, struct proto_o
void *body;
struct ospf_area *oa;
- OSPF_TRACE(D_EVENTS, "Originating AS-external-LSA for %I/%d",
- fn->prefix, fn->pxlen);
+ OSPF_TRACE(D_EVENTS, "Originating AS-external-LSA for %s", fib_print(fn));
lsa.age = 0;
#ifdef OSPFv2
@@ -1040,8 +1038,7 @@ originate_ext_lsa(net * n, rte * e, struct proto_o
int rv = check_ext_lsa(en, fn, metric, gw, tag);
if (rv < 0)
{
- log(L_ERR, "%s: LSAID collision for %I/%d",
- p->name, fn->prefix, fn->pxlen);
+ log(L_ERR, "%s: LSAID collision for %s", p->name, fib_print(fn));
return;
}
@@ -1073,8 +1070,7 @@ flush_ext_lsa(net *n, struct proto_ospf *po)
struct fib_node *fn = &n->n;
struct top_hash_entry *en;
- OSPF_TRACE(D_EVENTS, "Flushing AS-external-LSA for %I/%d",
- fn->prefix, fn->pxlen);
+ OSPF_TRACE(D_EVENTS, "Flushing AS-external-LSA for %s", fib_print(fn));
u32 lsaid = fibnode_to_lsaid(po, fn);
@@ -1082,8 +1078,7 @@ flush_ext_lsa(net *n, struct proto_ospf *po)
{
if (check_ext_lsa(en, fn, 0, IPA_NONE, 0) < 0)
{
- log(L_ERR, "%s: LSAID collision for %I/%d",
- p->name, fn->prefix, fn->pxlen);
+ log(L_ERR, "%s: LSAID collision for %s", p->name, fib_print(fn));
return;
}
Index: proto/ospf/rt.c
===================================================================
--- proto/ospf/rt.c (revision 4962)
+++ proto/ospf/rt.c (working copy)
@@ -889,7 +889,7 @@ decide_sum_lsa(struct ospf_area *oa, ort *nf, int
return 1;
struct area_net *anet = (struct area_net *)
- fib_route(&nf->n.oa->net_fib, nf->fn.prefix, nf->fn.pxlen);
+ fib_route(&nf->n.oa->net_fib, FPREFIX_IP(&nf->fn), nf->fn.pxlen);
/* Condensed area network found */
if (anet)
@@ -915,7 +915,7 @@ check_sum_net_lsa(struct proto_ospf *po, ort *nf)
/* Find that area network */
WALK_LIST(anet_oa, po->area_list)
{
- anet = (struct area_net *) fib_find(&anet_oa->net_fib, &nf->fn.prefix, nf->fn.pxlen);
+ anet = (struct area_net *) fib_find(&anet_oa->net_fib, FPREFIX(&nf->fn), nf->fn.pxlen);
if (anet)
break;
}
@@ -1017,7 +1017,7 @@ ospf_rt_abr(struct proto_ospf *po)
/* Compute condensed area networks */
if (nf->n.type == RTS_OSPF)
{
- anet = (struct area_net *) fib_route(&nf->n.oa->net_fib, nf->fn.prefix, nf->fn.pxlen);
+ anet = (struct area_net *) fib_route(&nf->n.oa->net_fib, FPREFIX_IP(&nf->fn), nf->fn.pxlen);
if (anet)
{
if (!anet->active)
@@ -1025,7 +1025,7 @@ ospf_rt_abr(struct proto_ospf *po)
anet->active = 1;
/* Get a RT entry and mark it to know that it is an area network */
- ort *nfi = (ort *) fib_get(&po->rtf, &anet->fn.prefix, anet->fn.pxlen);
+ ort *nfi = (ort *) fib_get(&po->rtf, FPREFIX(&anet->fn), anet->fn.pxlen);
nfi->fn.x0 = 1; /* mark and keep persistent, to have stable UID */
/* 16.2. (3) */
@@ -1060,7 +1060,7 @@ ospf_rt_abr(struct proto_ospf *po)
{
nf = (ort *) nftmp;
if (nf->n.options & ORTA_ASBR)
- ri_install_asbr(po, &nf->fn.prefix, &nf->n);
+ ri_install_asbr(po, FPREFIX_IP(&nf->fn), &nf->n);
}
FIB_WALK_END;
}
@@ -1714,7 +1714,7 @@ again1:
if (reload || ort_changed(nf, &a0))
{
- net *ne = net_get(p->table, nf->fn.prefix, nf->fn.pxlen);
+ net *ne = fib_get(&p->table->fib, FPREFIX(&nf->fn), nf->fn.pxlen);
rta *a = rta_lookup(&a0);
rte *e = rte_get_temp(a);
@@ -1739,7 +1739,7 @@ again1:
rta_free(nf->old_rta);
nf->old_rta = NULL;
- net *ne = net_get(p->table, nf->fn.prefix, nf->fn.pxlen);
+ net *ne = fib_get(&p->table->fib, FPREFIX(&nf->fn), nf->fn.pxlen);
rte_update(p->table, ne, p, p, NULL);
}
Index: proto/bgp/packets.c
===================================================================
--- proto/bgp/packets.c (revision 4962)
+++ proto/bgp/packets.c (working copy)
@@ -222,10 +222,10 @@ bgp_encode_prefixes(struct bgp_proto *p, byte *w,
while (!EMPTY_LIST(buck->prefixes) && remains >= (1+sizeof(ip_addr)))
{
struct bgp_prefix *px = SKIP_BACK(struct bgp_prefix, bucket_node, HEAD(buck->prefixes));
- DBG("\tDequeued route %I/%d\n", px->n.prefix, px->n.pxlen);
+ DBG("\tDequeued route %s\n", fib_print(&px->n));
*w++ = px->n.pxlen;
bytes = (px->n.pxlen + 7) / 8;
- a = px->n.prefix;
+ a = *FPREFIX_IP(&px->n);
ipa_hton(a);
memcpy(w, &a, bytes);
w += bytes;
@@ -242,7 +242,7 @@ bgp_flush_prefixes(struct bgp_proto *p, struct bgp
while (!EMPTY_LIST(buck->prefixes))
{
struct bgp_prefix *px = SKIP_BACK(struct bgp_prefix, bucket_node, HEAD(buck->prefixes));
- log(L_ERR "%s: - route %I/%d skipped", p->p.name, px->n.prefix, px->n.pxlen);
+ log(L_ERR "%s: - route %s skipped", p->p.name, fib_print(&px->n));
rem_node(&px->bucket_node);
fib_delete(&p->prefix_fib, px);
}
Index: proto/bgp/attrs.c
===================================================================
--- proto/bgp/attrs.c (revision 4962)
+++ proto/bgp/attrs.c (working copy)
@@ -786,7 +786,8 @@ bgp_get_bucket(struct bgp_proto *p, net *n, ea_lis
for(i=0; i<ARRAY_SIZE(bgp_mandatory_attrs); i++)
if (!(seen & (1 << bgp_mandatory_attrs[i])))
{
- log(L_ERR "%s: Mandatory attribute %s missing in route %I/%d", p->p.name, bgp_attr_table[bgp_mandatory_attrs[i]].name, n->n.prefix, n->n.pxlen);
+ log(L_ERR "%s: Mandatory attribute %s missing in route %s", p->p.name,
+ bgp_attr_table[bgp_mandatory_attrs[i]].name, fib2_print(PROTO_FIB(&p->p), &n->n));
return NULL;
}
@@ -794,7 +795,7 @@ bgp_get_bucket(struct bgp_proto *p, net *n, ea_lis
a = ea_find(new, EA_CODE(EAP_BGP, BA_NEXT_HOP));
if (!a || ipa_equal(p->cf->remote_ip, *(ip_addr *)a->u.ptr->data))
{
- log(L_ERR "%s: Invalid NEXT_HOP attribute in route %I/%d", p->p.name, n->n.prefix, n->n.pxlen);
+ log(L_ERR "%s: Invalid NEXT_HOP attribute in route %s", p->p.name, fib2_print(PROTO_FIB(&p->p), &n->n));
return NULL;
}
@@ -838,7 +839,7 @@ bgp_rt_notify(struct proto *P, rtable *tbl UNUSED,
init_list(&buck->prefixes);
}
}
- px = fib_get(&p->prefix_fib, &n->n.prefix, n->n.pxlen);
+ px = fib_get(&p->prefix_fib, FPREFIX(&n->n), n->n.pxlen);
if (px->bucket_node.next)
{
DBG("\tRemoving old entry.\n");
Index: proto/rip/rip.c
===================================================================
--- proto/rip/rip.c (revision 4962)
+++ proto/rip/rip.c (working copy)
@@ -97,7 +97,7 @@ rip_tx_prepare(struct proto *p, struct rip_block *
int metric;
DBG( "." );
b->tag = htons( e->tag );
- b->network = e->n.prefix;
+ b->network = *FPREFIX_IP(&e->n);
metric = e->metric;
if (neigh_connected_to(p, &e->whotoldme, rif->iface)) {
DBG( "(split horizon)" );
@@ -498,8 +498,8 @@ rip_rx(sock *s, int size)
static void
rip_dump_entry( struct rip_entry *e )
{
- debug( "%I told me %d/%d ago: to %I/%d go via %I, metric %d ",
- e->whotoldme, e->updated-now, e->changed-now, e->n.prefix, e->n.pxlen, e->nexthop, e->metric );
+ debug( "%I told me %d/%d ago: to %s go via %I, metric %d ",
+ e->whotoldme, e->updated-now, e->changed-now, fib_print(&e->n), e->nexthop, e->metric );
debug( "\n" );
}
@@ -535,7 +535,7 @@ rip_timer(timer *t)
#endif
if (now - rte->lastmod > P_CF->timeout_time) {
- TRACE(D_EVENTS, "entry is too old: %I", rte->net->n.prefix );
+ TRACE(D_EVENTS, "entry is too old: %I", *FPREFIX_IP(&rte->net->n) );
if (rte->u.rip.entry) {
rte->u.rip.entry->metric = P_CF->infinity;
rte->u.rip.metric = P_CF->infinity;
@@ -543,7 +543,7 @@ rip_timer(timer *t)
}
if (now - rte->lastmod > P_CF->garbage_time) {
- TRACE(D_EVENTS, "entry is much too old: %I", rte->net->n.prefix );
+ TRACE(D_EVENTS, "entry is much too old: %I", *FPREFIX_IP(&rte->net->n) );
rte_discard(p->table, rte);
}
}
@@ -873,12 +873,12 @@ rip_rt_notify(struct proto *p, struct rtable *tabl
CHK_MAGIC;
struct rip_entry *e;
- e = fib_find( &P->rtable, &net->n.prefix, net->n.pxlen );
+ e = fib_find( &P->rtable, FPREFIX(&net->n), net->n.pxlen );
if (e)
fib_delete( &P->rtable, e );
if (new) {
- e = fib_get( &P->rtable, &net->n.prefix, net->n.pxlen );
+ e = fib_get( &P->rtable, FPREFIX(&net->n), net->n.pxlen );
e->nexthop = new->attrs->gw;
e->metric = 0;
Index: proto/pipe/pipe.c
===================================================================
--- proto/pipe/pipe.c (revision 4962)
+++ proto/pipe/pipe.c (working copy)
@@ -46,11 +46,11 @@ pipe_rt_notify(struct proto *P, rtable *src_table,
if (dest->pipe_busy)
{
- log(L_ERR "Pipe loop detected when sending %I/%d to table %s",
- n->n.prefix, n->n.pxlen, dest->name);
+ log(L_ERR "Pipe loop detected when sending %s to table %s",
+ fib_print(&n->n), dest->name);
return;
}
- nn = net_get(dest, n->n.prefix, n->n.pxlen);
+ nn = fib_get(&dest->fib, FPREFIX(&n->n), n->n.pxlen);
if (new)
{
memcpy(&a, new->attrs, sizeof(rta));
Index: sysdep/linux/krt-scan.c
===================================================================
--- sysdep/linux/krt-scan.c (revision 4962)
+++ sysdep/linux/krt-scan.c (working copy)
@@ -101,7 +101,7 @@ krt_parse_entry(byte *ent, struct krt_proto *p)
a.iface = ng->iface;
else
{
- log(L_WARN "Kernel told us to use non-neighbor %I for %I/%d", gw, net->n.prefix, net->n.pxlen);
+ log(L_WARN "Kernel told us to use non-neighbor %I for %s", gw, fib_print(&net->n));
return;
}
a.dest = RTD_ROUTER;
@@ -120,7 +120,7 @@ krt_parse_entry(byte *ent, struct krt_proto *p)
}
else
{
- log(L_WARN "Kernel reporting unknown route type to %I/%d", net->n.prefix, net->n.pxlen);
+ log(L_WARN "Kernel reporting unknown route type to %s", fib_print(&net->n));
return;
}
Index: sysdep/linux/netlink/netlink.c
===================================================================
--- sysdep/linux/netlink/netlink.c (revision 4962)
+++ sysdep/linux/netlink/netlink.c (working copy)
@@ -628,7 +628,7 @@ nl_send_route(struct krt_proto *p, rte *e, int new
char buf[64 + nh_bufsize(a->nexthops)];
} r;
- DBG("nl_send_route(%I/%d,new=%d)\n", net->n.prefix, net->n.pxlen, new);
+ DBG("nl_send_route(%s,new=%d)\n", fib2_print(e->rtype, &net->n), new);
bzero(&r.h, sizeof(r.h));
bzero(&r.r, sizeof(r.r));
@@ -642,7 +642,7 @@ nl_send_route(struct krt_proto *p, rte *e, int new
r.r.rtm_table = KRT_CF->scan.table_id;
r.r.rtm_protocol = RTPROT_BIRD;
r.r.rtm_scope = RT_SCOPE_UNIVERSE;
- nl_add_attr_ipa(&r.h, sizeof(r), RTA_DST, net->n.prefix);
+ nl_add_attr_ipa(&r.h, sizeof(r), RTA_DST, *FPREFIX_IP(&net->n));
if (ea = ea_find(a->eattrs, EA_KRT_PREFSRC))
nl_add_attr_ipa(&r.h, sizeof(r), RTA_PREFSRC, *(ip_addr *)ea->u.ptr->data);
@@ -807,8 +807,7 @@ nl_parse_route(struct nlmsghdr *h, int scan)
ra.nexthops = nl_parse_multipath(p, a[RTA_MULTIPATH]);
if (!ra.nexthops)
{
- log(L_ERR "KRT: Received strange multipath route %I/%d",
- net->n.prefix, net->n.pxlen);
+ log(L_ERR "KRT: Received strange multipath route %s", fib_print(&net->n));
return;
}
@@ -818,8 +817,8 @@ nl_parse_route(struct nlmsghdr *h, int scan)
ra.iface = if_find_by_index(oif);
if (!ra.iface)
{
- log(L_ERR "KRT: Received route %I/%d with unknown ifindex %u",
- net->n.prefix, net->n.pxlen, oif);
+ log(L_ERR "KRT: Received route %s with unknown ifindex %u",
+ fib_print(&net->n), oif);
return;
}
@@ -838,8 +837,8 @@ nl_parse_route(struct nlmsghdr *h, int scan)
(i->rtm_flags & RTNH_F_ONLINK) ? NEF_ONLINK : 0);
if (!ng || (ng->scope == SCOPE_HOST))
{
- log(L_ERR "KRT: Received route %I/%d with strange next-hop %I",
- net->n.prefix, net->n.pxlen, ra.gw);
+ log(L_ERR "KRT: Received route %s with strange next-hop %I",
+ fib_print(&net->n), ra.gw);
return;
}
}
Index: sysdep/unix/krt.c
===================================================================
--- sysdep/unix/krt.c (revision 4962)
+++ sysdep/unix/krt.c (working copy)
@@ -234,14 +234,14 @@ static inline void
krt_trace_in(struct krt_proto *p, rte *e, char *msg)
{
if (p->p.debug & D_PACKETS)
- log(L_TRACE "%s: %I/%d: %s", p->p.name, e->net->n.prefix, e->net->n.pxlen, msg);
+ log(L_TRACE "%s: %s: %s", p->p.name, fib2_print(PROTO_FIB(&p->p), &e->net->n), msg);
}
static inline void
krt_trace_in_rl(struct rate_limit *rl, struct krt_proto *p, rte *e, char *msg)
{
if (p->p.debug & D_PACKETS)
- log_rl(rl, L_TRACE "%s: %I/%d: %s", p->p.name, e->net->n.prefix, e->net->n.pxlen, msg);
+ log_rl(rl, L_TRACE "%s: %s: %s", p->p.name, fib2_print(PROTO_FIB(&p->p), &e->net->n), msg);
}
/*
@@ -266,7 +266,7 @@ krt_learn_announce_update(struct krt_proto *p, rte
net *n = e->net;
rta *aa = rta_clone(e->attrs);
rte *ee = rte_get_temp(aa);
- net *nn = net_get(p->p.table, n->n.prefix, n->n.pxlen);
+ net *nn = fib_get(&p->p.table->fib, FPREFIX(&n->n), n->n.pxlen);
ee->net = nn;
ee->pflags = 0;
ee->pref = p->p.preference;
@@ -277,7 +277,7 @@ krt_learn_announce_update(struct krt_proto *p, rte
static void
krt_learn_announce_delete(struct krt_proto *p, net *n)
{
- n = net_find(p->p.table, n->n.prefix, n->n.pxlen);
+ n = fib_find(&p->p.table->fib, FPREFIX(&n->n), n->n.pxlen);
if (n)
rte_update(p->p.table, n, &p->p, &p->p, NULL);
}
@@ -286,7 +286,7 @@ static void
krt_learn_scan(struct krt_proto *p, rte *e)
{
net *n0 = e->net;
- net *n = net_get(&p->krt_table, n0->n.prefix, n0->n.pxlen);
+ net *n = fib_get(&p->krt_table.fib, FPREFIX(&n0->n), n0->n.pxlen);
rte *m, **mm;
e->attrs->source = RTS_INHERIT;
@@ -358,7 +358,7 @@ again:
}
if (!n->routes)
{
- DBG("%I/%d: deleting\n", n->n.prefix, n->n.pxlen);
+ DBG("%s: deleting\n", fib2_print(fib->addr_type, &n->n));
if (old_best)
{
krt_learn_announce_delete(p, n);
@@ -387,8 +387,8 @@ static void
krt_learn_async(struct krt_proto *p, rte *e, int new)
{
net *n0 = e->net;
- net *n = net_get(&p->krt_table, n0->n.prefix, n0->n.pxlen);
rte *g, **gg, *best, **bestp, *old_best;
+ net *n = fib_get(&p->krt_table.fib, FPREFIX(&n0->n), n0->n.pxlen);
e->attrs->source = RTS_INHERIT;
Index: sysdep/bsd/krt-sock.c
===================================================================
--- sysdep/bsd/krt-sock.c (revision 4962)
+++ sysdep/bsd/krt-sock.c (working copy)
@@ -81,7 +81,7 @@ krt_sock_send(int cmd, rte *e)
sockaddr gate, mask, dst;
ip_addr gw;
- DBG("krt-sock: send %I/%d via %I\n", net->n.prefix, net->n.pxlen, a->gw);
+ DBG("krt-sock: send %s via %I\n", fib2_print(e->rtype, &net->n), a->gw);
bzero(&msg,sizeof (struct rt_msghdr));
msg.rtm.rtm_version = RTM_VERSION;
@@ -134,7 +134,8 @@ krt_sock_send(int cmd, rte *e)
_I0(gw) = 0xfe800000 | (i->index & 0x0000ffff);
#endif
- fill_in_sockaddr(&dst, net->n.prefix, 0);
+ /* XXX: more general approach should be used here */
+ fill_in_sockaddr(&dst, *FPREFIX_IP(&net->n), 0);
fill_in_sockaddr(&mask, ipa_mkmask(net->n.pxlen), 0);
fill_in_sockaddr(&gate, gw, 0);
@@ -181,7 +182,7 @@ krt_sock_send(int cmd, rte *e)
msg.rtm.rtm_msglen = l;
if ((l = write(rt_sock, (char *)&msg, l)) < 0) {
- log(L_ERR "KRT: Error sending route %I/%d to kernel", net->n.prefix, net->n.pxlen);
+ log(L_ERR "KRT: Error sending route %s to kernel", fib2_print(e->rtype, &net->n));
}
}
@@ -190,12 +191,12 @@ krt_set_notify(struct krt_proto *p UNUSED, net *ne
{
if (old)
{
- DBG("krt_remove_route(%I/%d)\n", net->n.prefix, net->n.pxlen);
+ DBG("krt_remove_route(%s)\n", fib2_print(PROTO_FIB(&p->p), &net->n));
krt_sock_send(RTM_DELETE, old);
}
if (new)
{
- DBG("krt_add_route(%I/%d)\n", net->n.prefix, net->n.pxlen);
+ DBG("krt_add_route(%s)\n", fib2_print(PROTO_FIB(&p->p), &net->n));
krt_sock_send(RTM_ADD, new);
}
}
@@ -355,8 +356,8 @@ krt_read_rt(struct ks_msg *msg, struct krt_proto *
a.iface = if_find_by_index(msg->rtm.rtm_index);
if (!a.iface)
{
- log(L_ERR "KRT: Received route %I/%d with unknown ifindex %u",
- net->n.prefix, net->n.pxlen, msg->rtm.rtm_index);
+ log(L_ERR "KRT: Received route %s with unknown ifindex %u",
+ fib2_print(PROTO_FIB(&p->p), &net->n), msg->rtm.rtm_index);
return;
}
@@ -380,8 +381,8 @@ krt_read_rt(struct ks_msg *msg, struct krt_proto *
if (ipa_classify(a.gw) == (IADDR_HOST | SCOPE_HOST))
return;
- log(L_ERR "KRT: Received route %I/%d with strange next-hop %I",
- net->n.prefix, net->n.pxlen, a.gw);
+ log(L_ERR "KRT: Received route %s with strange next-hop %I",
+ fib2_print(PROTO_FIB(&p->p), &net->n), a.gw);
return;
}
}
Index: nest/route.h
===================================================================
--- nest/route.h (revision 4962)
+++ nest/route.h (working copy)
@@ -38,7 +38,7 @@ struct fib_node {
byte flags; /* User-defined */
byte x0, x1; /* User-defined */
u32 uid; /* Unique ID based on hash */
- ip_addr prefix; /* In host order */
+ void *addr; /* Pointer to (already allocated) address data. Host order required */
};
struct fib_iterator { /* See lib/slists.h for an explanation */
@@ -50,6 +50,7 @@ struct fib_iterator { /* See lib/slists.h for an
};
typedef void (*fib_init_func)(struct fib_node *);
+typedef int (*fib_hash_func)(void *);
struct fib {
pool *fib_pool; /* Pool holding all our data */
@@ -58,15 +59,24 @@ struct fib {
unsigned int hash_size; /* Number of hash table entries (a power of two) */
unsigned int hash_order; /* Binary logarithm of hash_size */
unsigned int hash_shift; /* 16 - hash_log */
+ unsigned int addr_type; /* Type of addresses stored in fib */
+ unsigned int addr_size; /* size of address specified in entry */
+ unsigned int node_size; /* size of node to allocate */
unsigned int entries; /* Number of entries */
unsigned int entries_min, entries_max;/* Entry count limits (else start rehashing) */
fib_init_func init; /* Constructor */
+ fib_hash_func hash_f; /* Optional hash function */
};
void fib_init(struct fib *, pool *, unsigned node_size, unsigned hash_order, fib_init_func init);
-void *fib_find(struct fib *, ip_addr *, int); /* Find or return NULL if doesn't exist */
-void *fib_get(struct fib *, ip_addr *, int); /* Find or create new if nonexistent */
-void *fib_route(struct fib *, ip_addr, int); /* Longest-match routing lookup */
+//#define fib_init(f, p, node_size, hash_order, init) fib2_init(f, p, node_size, RT_IP, sizeof(ip_addr), hash_order, init, NULL)
+void fib2_init(struct fib *, pool *, unsigned node_size, unsigned int addr_type, unsigned int addr_size, \
+ unsigned hash_order, fib_init_func init, fib_hash_func hash_f);
+void *fib_find(struct fib *, void *, int); /* Find or return NULL if doesn't exist */
+void *fib_get(struct fib *, void *, int); /* Find or create new if nonexistent */
+void *fib_route(struct fib *, ip_addr *, int); /* Longest-match routing lookup */
+char *fib_print(struct fib_node *); /* Prints human-readable fib_node prefix */
+char *fib2_print(int rtype, struct fib_node *); /* Prints human-readable fib_node prefix */
void fib_delete(struct fib *, void *); /* Remove fib entry */
void fib_free(struct fib *); /* Destroy the fib */
void fib_check(struct fib *); /* Consistency check for debugging */
@@ -75,6 +85,10 @@ void fit_init(struct fib_iterator *, struct fib *)
struct fib_node *fit_get(struct fib *, struct fib_iterator *);
void fit_put(struct fib_iterator *, struct fib_node *);
+#define FPREFIX_IP(n) ((ip_addr *)((n))->addr)
+#define FPREFIX(n) ((void *)((n))->addr)
+#define PROTO_FIB(x) ((x)->table->fib.addr_type)
+
#define FIB_WALK(fib, z) do { \
struct fib_node *z, **ff = (fib)->hash_table; \
unsigned int count = (fib)->hash_size; \
@@ -116,6 +130,7 @@ void fit_put(struct fib_iterator *, struct fib_nod
struct rtable_config {
node n;
char *name;
+ int rtype; /* table type (RT_IP, RT_VPN, ...) */
struct rtable *table;
struct proto_config *krt_attached; /* Kernel syncer attached to this table */
int gc_max_ops; /* Maximum number of operations before GC is run */
@@ -126,6 +141,7 @@ typedef struct rtable {
node n; /* Node in list of all tables */
struct fib fib;
char *name; /* Name of this table */
+ int rtype; /* Type of the table (IPv46, VPNv46, MPLS, etc..)*/
list hooks; /* List of announcement hooks */
int pipe_busy; /* Pipe loop detection */
int use_count; /* Number of protocols using this table */
@@ -179,6 +195,7 @@ struct hostentry {
typedef struct rte {
struct rte *next;
net *net; /* Network this RTE belongs to */
+ int rtype; /* RTE type: IP, MPLS, VPN, .. */
struct proto *sender; /* Protocol instance that sent the route to the routing table */
struct rta *attrs; /* Attributes of this route */
byte flags; /* Flags (REF_...) */
@@ -213,6 +230,11 @@ typedef struct rte {
#define REF_COW 1 /* Copy this rte on write */
+/* Types of routing tables/entries */
+#define RT_IP 1
+#define RT_VPN 2
+#define RT_MPLS 3
+
/* Types of route announcement, also used as flags */
#define RA_OPTIMAL 1 /* Announcement of optimal route change */
#define RA_ANY 2 /* Announcement of any route change */
@@ -240,7 +262,7 @@ void rt_dump_all(void);
int rt_feed_baby(struct proto *p);
void rt_feed_baby_abort(struct proto *p);
void rt_prune_all(void);
-struct rtable_config *rt_new_table(struct symbol *s);
+struct rtable_config *rt_new_table(struct symbol *s, int rtype);
struct rt_show_data {
ip_addr prefix;
Index: nest/rt-table.c
===================================================================
--- nest/rt-table.c (revision 4962)
+++ nest/rt-table.c (working copy)
@@ -66,6 +66,9 @@ net_route(rtable *tab, ip_addr a, int len)
ip_addr a0;
net *n;
+ if (tab->fib.addr_type != RT_IP)
+ return NULL;
+
while (len >= 0)
{
a0 = ipa_and(a, ipa_mkmask(len));
@@ -111,7 +114,7 @@ rte_find(net *net, struct proto *p)
*
* Create a temporary &rte and bind it with the attributes @a.
* Also set route preference to the default preference set for
- * the protocol.
+ * the protocol. RT_IP route type is assumed by default
*/
rte *
rte_get_temp(rta *a)
@@ -121,6 +124,7 @@ rte_get_temp(rta *a)
e->attrs = a;
e->flags = 0;
e->pref = a->proto->preference;
+ e->rtype = RT_IP;
return e;
}
@@ -166,7 +170,7 @@ rte_trace(struct proto *p, rte *e, int dir, char *
byte via[STD_ADDRESS_P_LENGTH+32];
rt_format_via(e, via);
- log(L_TRACE "%s %c %s %I/%d %s", p->name, dir, msg, e->net->n.prefix, e->net->n.pxlen, via);
+ log(L_TRACE "%s %c %s %s %s", p->name, dir, msg, fib2_print(e->rtype, &e->net->n), via);
}
static inline void
@@ -367,23 +371,27 @@ rte_announce(rtable *tab, unsigned type, net *net,
static inline int
-rte_validate(rte *e)
+rte_validate(struct fib *f, rte *e)
{
int c;
net *n = e->net;
- if ((n->n.pxlen > BITS_PER_IP_ADDRESS) || !ip_is_prefix(n->n.prefix,n->n.pxlen))
+ /* Do not bother checking non-IP routes at the moment */
+ if (f->addr_type != RT_IP)
+ return 1;
+
+ if ((n->n.pxlen > BITS_PER_IP_ADDRESS) || !ip_is_prefix(*FPREFIX_IP(&n->n),n->n.pxlen))
{
- log(L_WARN "Ignoring bogus prefix %I/%d received via %s",
- n->n.prefix, n->n.pxlen, e->sender->name);
+ log(L_WARN "Ignoring bogus prefix %s received via %s",
+ fib2_print(e->rtype, &n->n), e->sender->name);
return 0;
}
- c = ipa_classify_net(n->n.prefix);
+ c = ipa_classify_net(*FPREFIX_IP(&n->n));
if ((c < 0) || !(c & IADDR_HOST) || ((c & IADDR_SCOPE_MASK) <= SCOPE_LINK))
{
- log(L_WARN "Ignoring bogus route %I/%d received via %s",
- n->n.prefix, n->n.pxlen, e->sender->name);
+ log(L_WARN "Ignoring bogus route %s received via %s",
+ fib2_print(e->rtype, &n->n), e->sender->name);
return 0;
}
@@ -453,8 +461,8 @@ rte_recalculate(rtable *table, net *net, struct pr
{
if (new)
{
- log(L_ERR "Pipe collision detected when sending %I/%d to table %s",
- net->n.prefix, net->n.pxlen, table->name);
+ log(L_ERR "Pipe collision detected when sending %s to table %s",
+ fib2_print(old->rtype, &net->n), table->name);
rte_free_quick(new);
}
return;
@@ -672,7 +680,7 @@ rte_update(rtable *table, net *net, struct proto *
#endif
stats->imp_updates_received++;
- if (!rte_validate(new))
+ if (!rte_validate(&table->fib, new))
{
rte_trace_in(D_FILTERS, p, new, "invalid");
stats->imp_updates_invalid++;
@@ -750,7 +758,7 @@ rte_dump(rte *e)
{
net *n = e->net;
if (n)
- debug("%-1I/%2d ", n->n.prefix, n->n.pxlen);
+ debug("%-1I/%2d ", *FPREFIX_IP(&n->n), n->n.pxlen);
else
debug("??? ");
debug("KF=%02x PF=%02x pref=%d lm=%d ", n->n.flags, e->pflags, e->pref, now-e->lastmod);
@@ -773,7 +781,7 @@ rt_dump(rtable *t)
net *n;
struct announce_hook *a;
- debug("Dump of routing table <%s>\n", t->name);
+ debug("Dump of routing table <%s>:%d\n", t->name, t->fib.addr_type);
#ifdef DEBUGGING
fib_check(&t->fib);
#endif
@@ -848,11 +856,17 @@ rt_event(void *ptr)
rt_prune(tab);
}
+
+/**
+ * rt_setup - initialize RT_IP routing table
+ *
+ * This function is called to set up rtable (hooks, lists, fib, ..)
+ */
void
rt_setup(pool *p, rtable *t, char *name, struct rtable_config *cf)
{
bzero(t, sizeof(*t));
- fib_init(&t->fib, p, sizeof(net), 0, rte_init);
+ fib2_init(&t->fib, p, sizeof(net), RT_IP, sizeof(ip_addr), 0, rte_init, NULL);
t->name = name;
t->config = cf;
init_list(&t->hooks);
@@ -953,7 +967,7 @@ rt_preconfig(struct config *c)
struct symbol *s = cf_find_symbol("master");
init_list(&c->tables);
- c->master_rtc = rt_new_table(s);
+ c->master_rtc = rt_new_table(s, RT_IP);
}
@@ -1098,12 +1112,13 @@ rt_next_hop_update(rtable *tab)
struct rtable_config *
-rt_new_table(struct symbol *s)
+rt_new_table(struct symbol *s, int rtype)
{
struct rtable_config *c = cfg_allocz(sizeof(struct rtable_config));
cf_define_symbol(s, SYM_TABLE, c);
c->name = s->name;
+ c->rtype = rtype;
add_tail(&new_config->tables, &c->n);
c->gc_max_ops = 1000;
c->gc_min_time = 5;
@@ -1461,7 +1476,7 @@ rt_notify_hostcache(rtable *tab, net *net)
if (tab->hcu_scheduled)
return;
- if (trie_match_prefix(hc->trie, net->n.prefix, net->n.pxlen))
+ if (trie_match_prefix(hc->trie, *FPREFIX_IP(&net->n), net->n.pxlen))
rt_schedule_hcu(tab);
}
@@ -1512,6 +1527,8 @@ rt_update_hostentry(rtable *tab, struct hostentry
rta *old_src = he->src;
int pxlen = 0;
+ /* XXX: check for non-IP address families ? */
+
/* Reset the hostentry */
he->src = NULL;
he->gw = IPA_NONE;
@@ -1527,8 +1544,8 @@ rt_update_hostentry(rtable *tab, struct hostentry
if (a->hostentry)
{
/* Recursive route should not depend on another recursive route */
- log(L_WARN "Next hop address %I resolvable through recursive route for %I/%d",
- he->addr, n->n.prefix, pxlen);
+ log(L_WARN "Next hop address %I resolvable through recursive route for %s",
+ he->addr, fib2_print(tab->fib.addr_type, &n->n));
goto done;
}
@@ -1675,13 +1692,11 @@ rt_show_rte(struct cli *c, byte *ia, rte *e, struc
}
static void
-rt_show_net(struct cli *c, net *n, struct rt_show_data *d)
+rt_show_net(struct cli *c, struct fib *f, net *n, struct rt_show_data *d)
{
rte *e, *ee;
- byte ia[STD_ADDRESS_P_LENGTH+8];
int ok;
- bsprintf(ia, "%I/%d", n->n.prefix, n->n.pxlen);
if (n->routes)
d->net_counter++;
for(e=n->routes; e; e=e->next)
@@ -1717,8 +1732,7 @@ static void
{
d->show_counter++;
if (d->stats < 2)
- rt_show_rte(c, ia, e, d, tmpa);
- ia[0] = 0;
+ rt_show_rte(c, fib2_print(f->addr_type, &n->n), e, d, tmpa);
}
if (e != ee)
{
@@ -1763,7 +1777,7 @@ rt_show_cont(struct cli *c)
FIB_ITERATE_PUT(it, f);
return;
}
- rt_show_net(c, n, d);
+ rt_show_net(c, fib, n, d);
}
FIB_ITERATE_END(f);
if (d->stats)
@@ -1803,7 +1817,7 @@ rt_show(struct rt_show_data *d)
n = net_find(d->table, d->prefix, d->pxlen);
if (n)
{
- rt_show_net(this_cli, n, d);
+ rt_show_net(this_cli, &d->table->fib, n, d);
cli_msg(0, "");
}
else
Index: nest/rt-fib.c
===================================================================
--- nest/rt-fib.c (revision 4962)
+++ nest/rt-fib.c (working copy)
@@ -73,9 +73,15 @@ fib_ht_free(struct fib_node **h)
}
static inline unsigned
-fib_hash(struct fib *f, ip_addr *a)
+fib_hash(struct fib *f, void *a)
{
- return ipa_hash(*a) >> f->hash_shift;
+ if (f->hash_f)
+ return f->hash_f(a);
+
+ if (f->addr_type == RT_IP)
+ return ipa_hash(*((ip_addr *)a)) >> f->hash_shift;
+
+ return 0;
}
static void
@@ -98,16 +104,42 @@ fib_dummy_init(struct fib_node *dummy UNUSED)
void
fib_init(struct fib *f, pool *p, unsigned node_size, unsigned hash_order, fib_init_func init)
{
+ fib2_init(f, p, node_size, RT_IP, sizeof(ip_addr), hash_order, init, NULL);
+}
+
+/**
+ * fib2_init - initialize a new FIB
+ * @f: the FIB to be initialized (the structure itself being allocated by the caller)
+ * @p: pool to allocate the nodes in
+ * @node_size: total node size to be used (each node consists of a standard header &fib_node
+ * followed by user data)
+ * @addr_type: type of addresses stored in fib (RT_*)
+ * @addr_size: size of address data
+ * @hash_order: initial hash order (a binary logarithm of hash table size), 0 to use default order
+ * (recommended)
+ * @init: pointer to a function to be called to initialize a newly created node
+ * @hash_f: optional pointer to a function to be called to hash a node address
+ *
+ * This function initializes a newly allocated FIB and prepares it for use.
+ */
+void
+fib2_init(struct fib *f, pool *p, unsigned node_size, unsigned int addr_type, unsigned int addr_size, \
+ unsigned hash_order, fib_init_func init, fib_hash_func hash_f)
+{
if (!hash_order)
hash_order = HASH_DEF_ORDER;
f->fib_pool = p;
- f->fib_slab = sl_new(p, node_size);
+ f->fib_slab = sl_new(p, node_size + addr_size);
f->hash_order = hash_order;
fib_ht_alloc(f);
bzero(f->hash_table, f->hash_size * sizeof(struct fib_node *));
+ f->addr_type = addr_type;
+ f->addr_size = addr_size;
+ f->node_size = node_size;
f->entries = 0;
f->entries_min = 0;
f->init = init ? : fib_dummy_init;
+ f->hash_f = hash_f;
}
static void
@@ -133,7 +165,7 @@ fib_rehash(struct fib *f, int step)
while (e = x)
{
x = e->next;
- nh = fib_hash(f, &e->prefix);
+ nh = fib_hash(f, FPREFIX(e));
while (nh > ni)
{
*t = NULL;
@@ -163,11 +195,11 @@ fib_rehash(struct fib *f, int step)
* a pointer to it or %NULL if no such node exists.
*/
void *
-fib_find(struct fib *f, ip_addr *a, int len)
+fib_find(struct fib *f, void *a, int len)
{
struct fib_node *e = f->hash_table[fib_hash(f, a)];
- while (e && (e->pxlen != len || !ipa_equal(*a, e->prefix)))
+ while (e && (e->pxlen != len || memcmp(a, FPREFIX(e), f->addr_size)))
e = e->next;
return e;
}
@@ -197,26 +229,26 @@ fib_histogram(struct fib *f)
/**
* fib_get - find or create a FIB node
* @f: FIB to work with
- * @a: pointer to IP address of the prefix
- * @len: prefix length
+ * @a: pointer to IP (or other family) address of the prefix
+ * @len: prefix length (if address family requires)
*
* Search for a FIB node corresponding to the given prefix and
* return a pointer to it. If no such node exists, create it.
*/
void *
-fib_get(struct fib *f, ip_addr *a, int len)
+fib_get(struct fib *f, void *a, int len)
{
- unsigned int h = ipa_hash(*a);
- struct fib_node **ee = f->hash_table + (h >> f->hash_shift);
+ unsigned int h = fib_hash(f, a);
+ struct fib_node **ee = f->hash_table + h;
struct fib_node *g, *e = *ee;
- u32 uid = h << 16;
+ u32 uid = h << (16 + f->hash_shift);
- while (e && (e->pxlen != len || !ipa_equal(*a, e->prefix)))
+ while (e && (e->pxlen != len || memcmp(a, FPREFIX(e), f->addr_size)))
e = e->next;
if (e)
return e;
#ifdef DEBUGGING
- if (len < 0 || len > BITS_PER_IP_ADDRESS || !ip_is_prefix(*a,len))
+ if ((f->addr_type == RT_IP) && (len < 0 || len > BITS_PER_IP_ADDRESS || !ip_is_prefix(*((ip_addr *)a),len)))
bug("fib_get() called for invalid address");
#endif
@@ -228,13 +260,14 @@ void *
uid++;
}
- if ((uid >> 16) != h)
+ if ((uid >> (16 + f->hash_shift)) != h)
log(L_ERR "FIB hash table chains are too long");
// log (L_WARN "FIB_GET %I %x %x", *a, h, uid);
e = sl_alloc(f->fib_slab);
- e->prefix = *a;
+ e->addr = (char *)e + f->node_size;
+ memcpy(e->addr, a, f->addr_size);
e->pxlen = len;
e->next = *ee;
e->uid = uid;
@@ -250,22 +283,25 @@ void *
/**
* fib_route - CIDR routing lookup
* @f: FIB to search in
- * @a: pointer to IP address of the prefix
- * @len: prefix length
+ * @a: pointer to IP (or other family) address of the prefix
+ * @len: prefix length (if address family requires)
*
* Search for a FIB node with longest prefix matching the given
* network, that is a node which a CIDR router would use for routing
- * that network.
+ * that network. Function should be called for IPv4/IPv6 routes only
*/
void *
-fib_route(struct fib *f, ip_addr a, int len)
+fib_route(struct fib *f, ip_addr *a, int len)
{
ip_addr a0;
void *t;
+ if (f->addr_type != RT_IP)
+ return NULL;
+
while (len >= 0)
{
- a0 = ipa_and(a, ipa_mkmask(len));
+ a0 = ipa_and(*a, ipa_mkmask(len));
t = fib_find(f, &a0, len);
if (t)
return t;
@@ -321,7 +357,7 @@ void
fib_delete(struct fib *f, void *E)
{
struct fib_node *e = E;
- unsigned int h = fib_hash(f, &e->prefix);
+ unsigned int h = fib_hash(f, FPREFIX(e));
struct fib_node **ee = f->hash_table + h;
struct fib_iterator *it;
@@ -413,7 +449,7 @@ fit_get(struct fib *f, struct fib_iterator *i)
if (k = i->next)
k->prev = j;
j->next = k;
- i->hash = fib_hash(f, &n->prefix);
+ i->hash = fib_hash(f, FPREFIX(n));
return n;
}
@@ -430,6 +466,54 @@ fit_put(struct fib_iterator *i, struct fib_node *n
i->prev = (struct fib_iterator *) n;
}
+/**
+ * fib_print - prints a FIB node
+ * @n: pointer to fib_node structure
+ *
+ * This function prints fib node address to static buffer and
+ * returns it to the caller. Up to PBUFS(4) different buffers are
+ * available. RT_IP address type is assumed
+ */
+char *
+fib_print(struct fib_node *n)
+{
+ return fib2_print(0, n);
+}
+#define PBUFS 4
+#define PSIZE 50
+/**
+ * fib2_print - prints a FIB node
+ * @rtype: address type
+ * @n: pointer to fib_node structure
+ *
+ * This function prints fib node address to static buffer and
+ * returns it to the caller. Up to PBUFS(4) different buffers are
+ * available.
+ */
+char *
+fib2_print(int rtype, struct fib_node *n)
+{
+ static int cntr;
+ static char buf[PBUFS][PSIZE];
+ char *x;
+
+ x = buf[cntr++ % PBUFS];
+ if (rtype == 0)
+ rtype = RT_IP;
+
+ switch (rtype)
+ {
+ case RT_IP:
+ bsnprintf(x, PSIZE, "%I/%d", *FPREFIX_IP(n), n->pxlen);
+ break;
+
+ default:
+ bsnprintf(x, PSIZE, "RT:%d", rtype);
+ }
+
+ return x;
+}
+
#ifdef DEBUGGING
/**
@@ -452,7 +536,7 @@ fib_check(struct fib *f)
for(n=f->hash_table[i]; n; n=n->next)
{
struct fib_iterator *j, *j0;
- unsigned int h0 = ipa_hash(n->prefix);
+ unsigned int h0 = fib_hash(f, FPREFIX(n));
if (h0 < lo)
bug("fib_check: discord in hash chains");
lo = h0;
@@ -491,14 +575,15 @@ void dump(char *m)
{
unsigned int i;
- debug("%s ... order=%d, size=%d, entries=%d\n", m, f.hash_order, f.hash_size, f.hash_size);
+ debug("%s ... type=%d order=%d, size=%d, entries=%d\n", m, f.addr_type, f.hash_order, f.hash_size, f.entries);
for(i=0; i<f.hash_size; i++)
{
struct fib_node *n;
struct fib_iterator *j;
for(n=f.hash_table[i]; n; n=n->next)
{
- debug("%04x %04x %p %I/%2d", i, ipa_hash(n->prefix), n, n->prefix, n->pxlen);
+ debug("%04x %04x %p %s", i, fib_hash(&f, FPREFIX(n)) << f.hash_shift, n, fib2_print(f.addr_type, n));
+
for(j=n->readers; j; j=j->next)
debug(" %p[%p]", j, j->node);
debug("\n");
Index: nest/config.Y
===================================================================
--- nest/config.Y (revision 4962)
+++ nest/config.Y (working copy)
@@ -107,7 +107,7 @@ listen_opt:
CF_ADDTO(conf, newtab)
newtab: TABLE SYM {
- rt_new_table($2);
+ rt_new_table($2, RT_IP);
}
;