On Tue, Jul 12, 2011 at 04:47:06AM +0400, Alexander V. Chernikov wrote:
To show "overall" view we have to describe what we will add and what will be required from BIRD first.
Thanks for the great overview. Sorry for the late answer; it took me a while to get into MPLS and think about it.
*** UNDER THE HOOD ***
*** KERNEL INTERACTION ***
So essentially, there are three kinds of routes:
* Standard IP routes with IP nexthops, which we already support.
* MPLS routes, keyed by MPLS label and with an MPLS action (NHLFE); these form an MPLS routing table (ILM). I will call these MPLS routes.
* IP routes with an MPLS action, used for encapsulation of incoming IP packets (FTN mapping); these share a routing table with standard IP routes (because, depending on which route is chosen, the packet is either forwarded in the standard way or encapsulated to MPLS). I will call these encapsulating routes.
If I understand your mail correctly, you use some EAP_ADDITIONAL external attribute to represent encapsulating routes and use some new hook to attach these routes by a third-party protocol. I think this is not a good idea. To be semantically consistent, encapsulating routes should be represented by routes with a new destination type (RTD_MPLS in the dest field of struct rta), and the whole NHLFE should be stored in a new struct rta_mpls (or rta_nhlfe), which would be an extension of struct rta (containing struct rta in the first field and the NHLFE after that). Such a structure could easily be passed as struct rta, and the functions from rt-attr.c can work with it, with some minor modifications (allocating, freeing and printing) dispatched based on the dest field. Otherwise, encapsulating routes are very similar to standard IP routes and would probably need just some minor tweaks (and obviously kernel protocol support).
Therefore, such an encapsulating route should be generated in the standard way as a new route: by rte_update(), with LDP (or some other protocol) as the true originator (in rta->proto and rta->source). I will comment on that later.
MPLS routes could use the same struct rta_mpls as encapsulating routes, but struct network (their fib_node) contains an MPLS label instead of an IP address. As the MPLS label is small (and the complex action is stored outside), I don't see any problem in reusing the ip_addr prefix. Most things would work without modifications.
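To make the layout idea concrete, here is a minimal sketch of a struct rta extension whose first member is the base struct, with size, copy and printing dispatched on the dest field. All definitions here are simplified, hypothetical stand-ins (including the fields of struct rta, struct nhlfe, and the helpers rta_size(), rta_clone() and rta_format()); they are not BIRD's real declarations.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* Hypothetical destination types; RTD_MPLS is the proposed new one. */
enum { RTD_ROUTER = 0, RTD_MPLS = 100 };

/* Simplified stand-in for the base attribute structure. */
struct rta {
  int dest;        /* destination type, selects the concrete layout */
  uint32_t gw;     /* IP next hop (plain integer for brevity) */
};

/* NHLFE: label operation plus outgoing label (assumed shape). */
enum nhlfe_op { NHLFE_PUSH, NHLFE_SWAP, NHLFE_POP };

struct nhlfe {
  enum nhlfe_op op;
  uint32_t label;  /* outgoing label, unused for POP */
};

/* Extension: base struct first, so a pointer to it is a valid struct rta *. */
struct rta_mpls {
  struct rta a;
  struct nhlfe nh;
};

/* Number of bytes to allocate or copy, dispatched on dest. */
static size_t rta_size(const struct rta *a)
{
  return (a->dest == RTD_MPLS) ? sizeof(struct rta_mpls) : sizeof(struct rta);
}

/* Copy an attribute block of either kind; caller frees with free(). */
static struct rta *rta_clone(const struct rta *a)
{
  size_t n = rta_size(a);
  struct rta *c = malloc(n);
  if (c)
    memcpy(c, a, n);
  return c;
}

/* Print either kind into buf; the MPLS part is shown only for RTD_MPLS. */
static int rta_format(const struct rta *a, char *buf, size_t len)
{
  static const char *ops[] = { "push", "swap", "pop" };

  if (a->dest == RTD_MPLS) {
    const struct rta_mpls *m = (const struct rta_mpls *) a;
    return snprintf(buf, len, "via %u mpls %s %u", (unsigned) a->gw,
                    ops[m->nh.op], (unsigned) m->nh.label);
  }
  return snprintf(buf, len, "via %u", (unsigned) a->gw);
}
```

Code that only looks at the base fields keeps working on a struct rta * unchanged; only the functions that need the full size or the NHLFE have to check dest first.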
There should be an AF field in struct rtable and struct rte to distinguish routes. Therefore there would be two types of routing tables: IP and MPLS. I don't think it is a good idea to mix these. This may look inconsistent with the idea of embedding IPv4 in IPv6, but the IP protocols are much more similar, have a natural way to embed one in the other, and have similar roles and protocol structure. An MPLS routing table could be used for the LDP-kernel interaction (routes imported from LDP and exported to the kernel). This solves your Case 2 without any hacks.
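A minimal sketch of the AF tagging, assuming hypothetical names (enum rt_af, the af members, and rt_accepts() are illustrative and not existing BIRD code): a table only takes routes of its own family, so IP and MPLS routes never mix.

```c
#include <stddef.h>

/* Hypothetical address-family tag for tables and routes. */
enum rt_af { RT_AF_IP, RT_AF_MPLS };

/* Simplified stand-ins for the real structures. */
struct rtable {
  const char *name;
  enum rt_af af;     /* family of all routes stored in this table */
};

struct rte {
  enum rt_af af;     /* family of this route */
  unsigned key;      /* prefix or label, reduced to an integer here */
};

/* Returns 1 if the route may be inserted into the table, 0 otherwise. */
static int rt_accepts(const struct rtable *t, const struct rte *e)
{
  return t->af == e->af;
}
```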
Case 1: Route update can happen differently: we can install updated route IFF
* LDP label exists
* IGP nexthop is one of advertised LDP neighbour nexthops.
I think it is possible to handle all these cases and the protocol interaction in an elegant way. The LDP protocol, instead of just importing and exporting to one table, could be connected to more tables, with different meanings. There are four interactions of the LDP protocol: generating MPLS routes, generating encapsulating routes, importing label requests (which can be handled as routes) [*], and tracking the IGP table (to update nexthops of generated routes). These can all be handled as import or export of routes to the proper tables. The standard table connection (to an IP table) could be used for import (from LDP) of generated encapsulating routes and export (to LDP) of label requests. Another connection, to an MPLS table, would be used for import (from LDP) of generated MPLS routes, and the last one is used for tracking IGP changes:
    protocol ldp {
        export all;         # label requests
        import all;         # encapsulating routes
        mpls import all;    # MPLS routes
        # it is probably pointless to have configurable filters for IGP tracking
        table t1;           # table for import label requests and export encapsulating routes
        mpls table t2;      # table for MPLS routes
        igp table t1;       # table for tracking IGP routes, usually (and by default) the same as main table.
    }
[*] When I wrote that, I thought that labels are distributed just by LDP and that the purpose of a label request is to propagate the label through the LDP area. I didn't notice that BGP/MPLS also distributes labels, so they need to know the assigned labels. So the idea would need some modifications.
(I assume that LDP generates encapsulating routes as a true originator, as I wrote before, not just attaching some attribute to the existing route.)
So my idea of your Case 1 scenario is like this: in both subcases (an LDP LMAP arrives and the internal table with LMAPs changes; rt_notify() on the 'tracked IGP connection' is used to signal that the tracked table changed), the same procedure is executed: the internal LMAP table is examined and the tracked IGP table is examined.
If both are ready (for the given prefix), the appropriate encapsulating and MPLS routes are generated and propagated using rte_update(); otherwise nothing is generated and the previously generated route is withdrawn (rte_update() with NULL is called) (or perhaps an unreachable route is generated if the LMAP is here but the IGP route is missing). Simple and elegant. If the encapsulating routes are saved, Case 2 scenarios are trivial: just standard updates.
There are some tricky parts of IGP tracking. It is problematic to use the standard RA_OPTIMAL update for this purpose, because if the generated encapsulating routes are imported to the same table, they would probably become the optimal ones and the IGP routes would be shadowed. A solution would be to use RA_ANY and ignore notifications containing encapsulating routes; similarly, 'examining the tracked IGP table' means looking up the fib node and finding the best route, ignoring encapsulating ones. To implement this behavior, two minor changes need to be made to the rt table code. First, accept_ra_types (RA_OPTIMAL/RA_ANY) is currently a property of a protocol; it needs to be a property of an announce hook (as LDP would have two hooks with RA_OPTIMAL and one hook with RA_ANY). Second, rte_announce() for both types in rte_recalculate() should be moved after the route list is updated/relinked.
BTW, this whole dependency 'IGP table -> LDP function' is a bit similar to the situation with recursive nexthops in IBGP, where an IGP change also leads to a change of the IBGP route nexthop. In the IBGP case it is handled automatically by the rtable code (see the rta_set_recursive_next_hop() discussion in route.h, hostcache and hostentry). The LDP situation is a bit different, but perhaps the same mechanism could be extended to call a protocol hook instead of just updating the nexthop. This mechanism is useful if a protocol waits for a change in the result of some recursive lookup in the tracked table.
But the LDP situation is much simpler: it just waits for an exact-match change in the tracked table.
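The re-evaluation procedure described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: struct route, struct lmap, struct ldp_result and the functions best_igp_route() and ldp_reevaluate() are hypothetical, the per-prefix route list stands in for the fib node lookup, and a higher pref value is assumed to win. In real code, LDP_ANNOUNCE would map to rte_update() with a freshly generated route and LDP_WITHDRAW to rte_update() with NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical destination types; RTD_MPLS marks an encapsulating route. */
enum { RTD_ROUTER, RTD_MPLS };

/* Simplified route: one element of the per-prefix route list. */
struct route {
  int dest;
  uint32_t gw;           /* next hop */
  int pref;              /* higher value wins (assumption) */
  struct route *next;    /* other routes for the same prefix */
};

/* Label mapping learned from one LDP neighbour for this prefix. */
struct lmap {
  uint32_t neighbor;
  uint32_t label;
};

enum ldp_action { LDP_WITHDRAW, LDP_ANNOUNCE };

struct ldp_result {
  enum ldp_action action;
  uint32_t gw;
  uint32_t out_label;
};

/* Best IGP route for a prefix, skipping encapsulating routes so that
 * LDP's own generated routes never shadow the IGP ones. */
static const struct route *best_igp_route(const struct route *list)
{
  const struct route *best = NULL;

  for (const struct route *r = list; r; r = r->next)
    if (r->dest != RTD_MPLS && (!best || r->pref > best->pref))
      best = r;
  return best;
}

/* Run on both triggers: an LMAP change and a tracked-table notification. */
static struct ldp_result ldp_reevaluate(const struct lmap *lmaps, size_t n,
                                        const struct route *routes)
{
  struct ldp_result res = { LDP_WITHDRAW, 0, 0 };
  const struct route *igp = best_igp_route(routes);

  if (!igp)
    return res;          /* no IGP route: withdraw what was generated */

  for (size_t i = 0; i < n; i++)
    if (lmaps[i].neighbor == igp->gw) {
      /* Label exists and the IGP next hop is an advertising neighbour. */
      res.action = LDP_ANNOUNCE;
      res.gw = igp->gw;
      res.out_label = lmaps[i].label;
      return res;
    }
  return res;            /* label missing for this next hop: withdraw */
}
```

Because both triggers go through the same function, the result depends only on the current LMAP and IGP state, not on the order in which the two events arrive.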
* There is no need to call all other protocols since they should not be interested in such update
Not true. Other protocols may have filters whose result changes if you change route attributes. Ignoring that would lead to inconsistencies in route propagation.
* By upgrading FIB / rtable: If (from the point of user) config tables will be not AF_bound (e.g. IPv4+IPv6) we will have to do enhance FIB api.
My vision is the following:
* make fib AF_ bound, specifying AF and sizeof(object) at fib_init (or fib2_init)
* pass pointers to all fib_* related functions instead of addresses
* do compare by memcpy() for searching (and use AF-dependent hash based on value passed in _init)
* Pass AF in appropriate protocol hooks
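A rough sketch of what such an AF-bound FIB could look like, under several assumptions: the key size (rather than the whole object size) is given at init time, keys are passed by pointer, the comparison is done with memcmp() (presumably what the memcpy() above refers to), and a generic FNV-1a byte hash stands in for an AF-dependent one. The struct layouts and the functions fib2_init(), fib_hash(), fib_find() and fib_get() are hypothetical here, not BIRD's existing FIB API.

```c
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

#define FIB_BUCKETS 64

/* Node with a variable-size key; real nodes would carry more fields. */
struct fib_node {
  struct fib_node *next;
  uint8_t pxlen;
  unsigned char key[];   /* key_size bytes, fully initialized by the caller */
};

struct fib {
  int af;                /* address family this FIB is bound to */
  size_t key_size;       /* size of one key, fixed at init time */
  struct fib_node *buckets[FIB_BUCKETS];
};

/* Bind the FIB to one AF and one key size; all buckets start empty. */
static void fib2_init(struct fib *f, int af, size_t key_size)
{
  *f = (struct fib){ .af = af, .key_size = key_size };
}

/* FNV-1a over the key bytes (a stand-in for an AF-dependent hash). */
static unsigned fib_hash(const struct fib *f, const void *key)
{
  const unsigned char *p = key;
  uint32_t h = 2166136261u;

  for (size_t i = 0; i < f->key_size; i++)
    h = (h ^ p[i]) * 16777619u;
  return h % FIB_BUCKETS;
}

/* Exact-match lookup: same prefix length and byte-identical key. */
static struct fib_node *fib_find(struct fib *f, const void *key, uint8_t pxlen)
{
  for (struct fib_node *n = f->buckets[fib_hash(f, key)]; n; n = n->next)
    if (n->pxlen == pxlen && memcmp(n->key, key, f->key_size) == 0)
      return n;
  return NULL;
}

/* Find the node or create it; returns NULL only if allocation fails. */
static struct fib_node *fib_get(struct fib *f, const void *key, uint8_t pxlen)
{
  struct fib_node *n = fib_find(f, key, pxlen);
  if (n)
    return n;

  n = malloc(sizeof *n + f->key_size);
  if (!n)
    return NULL;

  unsigned h = fib_hash(f, key);
  n->pxlen = pxlen;
  memcpy(n->key, key, f->key_size);
  n->next = f->buckets[h];
  f->buckets[h] = n;
  return n;
}
```

Byte-wise comparison only works if keys contain no uninitialized padding, so struct keys would have to be zeroed before filling them in.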
As I wrote above, if we consider just IP (v4 and v6) and MPLS routes, I think that a fixed-size fib would be enough. But the problems are with the VPNvX AFs. Originally I thought that having a FIB / rtable with VPNvX routes is not a good idea: these AFs are just a wire representation of multiple independent IP spaces, and we already have a better representation of that, namely multiple routing tables. Having both representations seemed unnecessary and would require some conversion between the parts that need the first representation and the parts that need the second one. But not having VPNvX routes is also cumbersome: protocols that use these would have to be bound to multiple routing tables through some multiplexer. So it is probably easier to have tables with VPNvX AFs.
Therefore, it is probably a good idea to extend FIBs in the way you suggested, with minor details changed. FIBs / rtables would be uniform (AF_ bound), but there would be just three AFs (IP, MPLS, VPN): IPv4 and IPv6 could be handled as one AF, embedded, and the same for VPNv4 and VPNv6. To minimize code changes, struct fib_node would have the ip_addr prefix, but might be allocated larger. Because each protocol and each of its announce hooks has a well-defined role, it is IMHO unnecessary to have the AF in protocol hooks, but there should be a check whether the protocol/announce_hook is connected to an appropriate rtable.
-- 
Elen sila lumenn' omentielvo
Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."