Re: Merging bird and bird6

17 Jul 2011

      On Tue, Jul 12, 2011 at 04:47:06AM +0400, Alexander V. Chernikov wrote:
...
To show "overall" view we have to describe what we will add and what
will be required from BIRD first.
Thanks for the great overview. Sorry for the late answer, it took
a while for me to get into MPLS and think about it.
...
* UNDER THE HOOD
*** KERNEL INTERACTION ***
So essentially, there are three kinds of routes:

 * Standard IP routes with IP nexthops, which we already support.

 * MPLS routes, keyed by MPLS label and carrying an MPLS action (NHLFE).
These form an MPLS routing table (ILM). I will call these MPLS routes.

 * IP routes with an MPLS action, used for encapsulation of incoming
IP packets (FTN mapping). These share a routing table with standard IP
routes (because, depending on which route is chosen, a packet is either
forwarded in the standard way or encapsulated into MPLS). I will call
these encapsulating routes.

If I understand your mail correctly, you use some EAP_ADDITIONAL
external attribute to represent encapsulating routes and use some new
hook to let a third-party protocol attach these routes. I think this is
not a good idea. To be semantically consistent, encapsulating routes
should be represented by routes with a new destination type (RTD_MPLS
in the dest field of struct rta), and the whole NHLFE should be stored
in a new struct rta_mpls (or rta_nhlfe), which would be an extension of
struct rta (containing struct rta as its first field and the NHLFE
after it). Such a structure could easily be passed as struct rta, and
the functions from rt-attr.c can work with it, with some minor
modifications (allocating, freeing and printing) dispatched based on
the dest field. Otherwise, these routes are very similar to standard IP
routes and would probably need just some minor tweaks (and obviously
kernel protocol support).
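
A minimal sketch of that layout, using simplified stand-ins rather than
BIRD's real definitions. Only RTD_MPLS and struct rta_mpls are names
from this mail; the NHLFE fields, the gw placeholder, and the helpers
rta_size() and rta_get_nhlfe() are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; the real struct rta has many more fields. */
enum rta_dest { RTD_ROUTER, RTD_DEVICE, RTD_UNREACHABLE, RTD_MPLS /* new */ };

struct rta {
  enum rta_dest dest;   /* destination type, used for dispatch */
  uint32_t gw;          /* placeholder for the IP nexthop */
};

/* Hypothetical NHLFE: one label operation plus the outgoing label. */
enum mpls_op { MPLS_PUSH, MPLS_SWAP, MPLS_POP };

struct nhlfe {
  enum mpls_op op;
  uint32_t label;       /* 20-bit MPLS label */
};

/* Extension of struct rta: the base is the first member, so a pointer
 * to struct rta_mpls can be passed wherever struct rta is expected. */
struct rta_mpls {
  struct rta rta;
  struct nhlfe nhlfe;
};

/* Allocation size, dispatched on the dest field. */
static size_t rta_size(const struct rta *a)
{
  assert(a != NULL);
  return (a->dest == RTD_MPLS) ? sizeof(struct rta_mpls) : sizeof(struct rta);
}

/* Checked downcast: returns NULL for attributes that carry no NHLFE. */
static const struct nhlfe *rta_get_nhlfe(const struct rta *a)
{
  assert(a != NULL);
  if (a->dest != RTD_MPLS)
    return NULL;
  return &((const struct rta_mpls *) a)->nhlfe;
}
```

Code that only knows struct rta keeps working unchanged; only the
places that allocate, free or print need to look at dest first.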

Therefore, such an encapsulating route should be generated in the
standard way as a new route, by rte_update(), with LDP (or some other
protocol) as the true originator (in rta->proto and rta->source). I
will comment on that later.

MPLS routes could use the same struct rta_mpls as encapsulating routes,
but their struct network (their fib_node) would contain an MPLS label
instead of an IP address. As an MPLS label is small (and the complex
action is stored outside), I don't see any problem in reusing the
ip_addr prefix. Most things would work without modifications. There
should be an AF field in struct rtable and struct rte to distinguish
the routes.
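
To illustrate why reusing the prefix field is unproblematic, here is a
small sketch. It assumes an IPv6-sized ip_addr made of four 32-bit
words; the two helper functions and the choice of the last word are
hypothetical, not existing BIRD code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: an IPv6-sized ip_addr made of four 32-bit words. */
typedef struct { uint32_t addr[4]; } ip_addr;

#define MPLS_LABEL_MAX 0xFFFFFu   /* MPLS labels are 20 bits wide */

/* Hypothetical helper: store the label in the last word, rest zeroed. */
static ip_addr mpls_label_to_prefix(uint32_t label)
{
  assert(label <= MPLS_LABEL_MAX);
  ip_addr a = { { 0, 0, 0, label } };
  return a;
}

/* Hypothetical helper: read the label back from the prefix field. */
static uint32_t mpls_prefix_to_label(ip_addr a)
{
  return a.addr[3];
}
```

With such a mapping the existing hashing and comparison of prefixes
would operate on labels without knowing it; the AF field in the table
tells the rest of the code how to interpret the key.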

Therefore there would be two types of routing tables, IP and MPLS. I
don't think it is a good idea to mix these. This may look inconsistent
with the idea of embedding IPv4 in IPv6, but the IP protocols are much
more similar, have a natural way to embed one in the other, and have
similar roles and protocol structure. The MPLS routing table could be
used for LDP - kernel interaction (routes imported from LDP and
exported to the kernel). This solves your Case 2 without any hacks.
...
Case 1:
Route update can happen differently: we can install updated route IFF
* LDP label exists
* IGP nexthop is one of advertised LDP neighbour nexthops.
I think it is possible to handle all these cases and protocol
interactions in an elegant way. The LDP protocol, instead of just
importing from and exporting to one table, could be connected to
several tables with different meanings. There are four interactions of
the LDP protocol: generating MPLS routes, generating encapsulating
routes, importing label requests (which can be handled as routes) [*],
and tracking the IGP table (to update nexthops of generated routes).
All of these can be handled as import or export of routes to the proper
tables. The standard table connection (to an IP table) could be used
for import (from LDP) of generated encapsulating routes and export (to
LDP) of label requests. Another connection, to an MPLS table, would be
used for import (from LDP) of generated MPLS routes, and the last one
is used for tracking IGP changes:

protocol ldp {
	export all; # label requests
	import all; # encapsulating routes
	mpls import all;  # MPLS routes
	# it is probably pointless to have configurable filters for IGP tracking

	table t1; # table for export of label requests and import of encapsulating routes
	mpls table t2; # table for MPLS routes
	igp table t1; # table for tracking IGP routes, usually (and by default) the same as main table.
}

[*] When I wrote that, I thought that labels are distributed just by LDP
and that the purpose of a label request is to propagate the label
through the LDP area. I didn't notice that BGP/MPLS also distributes
labels, so they need to know the assigned labels. So the idea would
need some modifications.

(I assume that LDP generates encapsulating routes as a true originator,
as I wrote before, not just attaching some attribute to the existing
route.)

So my idea of your Case 1 scenario is like this:

In both subcases (an LDP LMAP arrives and the internal table of LMAPs
changes; or rt_notify() on the 'tracked IGP connection' signals that
the tracked table changed), the same procedure is executed:

The internal LMAP table and the tracked IGP table are examined. If both
are ready (for the given prefix), the appropriate encapsulating and MPLS
routes are generated and propagated using rte_update(). Otherwise
nothing is generated and the previously generated route is withdrawn
(rte_update() is called with NULL), or perhaps an unreachable route is
generated if the LMAP is here but the IGP route is missing. Simple and
elegant.
If the encapsulated routes are saved
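
The per-prefix decision described above can be sketched as a small
pure function. Everything here is hypothetical naming (ldp_decide(),
struct ldp_prefix_state, the LDP_* actions); it only models the
decision, not the actual route generation through rte_update().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the protocol should do for one prefix after re-examination. */
enum ldp_action {
  LDP_WITHDRAW,      /* rte_update() with NULL */
  LDP_ANNOUNCE,      /* generate encapsulating and MPLS routes */
  LDP_UNREACHABLE    /* optional variant: LMAP present, IGP route missing */
};

/* Hypothetical per-prefix view of the two inputs. */
struct ldp_prefix_state {
  bool have_lmap;    /* label mapping known in the internal LMAP table */
  bool have_igp;     /* usable route found in the tracked IGP table */
};

/* The same procedure runs whichever of the two inputs changed. */
static enum ldp_action
ldp_decide(const struct ldp_prefix_state *s, bool unreachable_if_no_igp)
{
  assert(s != NULL);
  if (s->have_lmap && s->have_igp)
    return LDP_ANNOUNCE;
  if (s->have_lmap && unreachable_if_no_igp)
    return LDP_UNREACHABLE;
  return LDP_WITHDRAW;
}
```

Because the function depends only on the current state of both tables,
it does not matter which event triggered it, which is what makes the
two subcases collapse into one.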

Case 2 scenarios are trivial - just standard updates.

There are some tricky parts of IGP tracking. It is problematic to use
the standard RA_OPTIMAL update for this purpose, because if the
generated encapsulating routes are imported into the same table, they
would probably become the optimal ones and the IGP routes would be
shadowed. A solution would be to use RA_ANY and ignore notifications
containing encapsulating routes. Similarly, 'examining the tracked IGP
table' means looking up the fib node and finding the best route while
ignoring encapsulating ones.
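
A sketch of that lookup over the route list of one network. The struct
rte here is a stripped-down stand-in, and a plain integer pref replaces
BIRD's real route comparison; both are assumptions for illustration, as
are the function names.

```c
#include <assert.h>
#include <stddef.h>

enum rta_dest { RTD_ROUTER, RTD_MPLS };

/* Stand-in for a route in the per-network list. */
struct rte {
  struct rte *next;
  enum rta_dest dest;   /* RTD_MPLS marks an encapsulating route */
  int pref;             /* stand-in for the real route comparison */
};

/* Best route for IGP tracking: scan the whole list and skip
 * encapsulating routes, even if one of them is the overall optimum. */
static const struct rte *best_igp_route(const struct rte *routes)
{
  const struct rte *best = NULL;
  for (const struct rte *e = routes; e; e = e->next) {
    if (e->dest == RTD_MPLS)
      continue;
    if (!best || e->pref > best->pref)
      best = e;
  }
  return best;
}

/* With RA_ANY the hook sees every change, so it filters its own output. */
static int ldp_ignores_notification(const struct rte *changed)
{
  assert(changed != NULL);
  return changed->dest == RTD_MPLS;
}
```

The same filter is applied in both directions: when a notification
arrives and when the table is examined, so the protocol never reacts to
routes it generated itself.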

To implement this behavior, two minor changes need to be made to the
rt table code. First, accept_ra_types (RA_OPTIMAL/RA_ANY) is currently
a property of a protocol; it needs to be a property of an announce hook
(as LDP would have two hooks with RA_OPTIMAL and one hook with RA_ANY).
Second, rte_announce() for both types in rte_recalculate() should be
moved after the route list is updated/relinked.
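
The first change can be pictured as moving one field. In this sketch
struct announce_hook is a reduced stand-in, the numeric values of
RA_OPTIMAL and RA_ANY are assumptions, and announce() is a hypothetical
simplification of what rte_announce() does when walking the hooks.

```c
#include <assert.h>
#include <stddef.h>

#define RA_OPTIMAL 1   /* values assumed for the sketch */
#define RA_ANY     2

/* Reduced stand-in: the accepted type lives in the hook, not in the
 * protocol, so one protocol can own hooks of different kinds. */
struct announce_hook {
  const char *role;
  int accept_ra_types;
  int notified;         /* counter standing in for calling rt_notify() */
};

/* Deliver an announcement of the given type to the matching hooks;
 * returns how many hooks were notified. */
static int announce(struct announce_hook *hooks, size_t n, int type)
{
  assert(type == RA_OPTIMAL || type == RA_ANY);
  int count = 0;
  for (size_t i = 0; i < n; i++)
    if (hooks[i].accept_ra_types == type) {
      hooks[i].notified++;
      count++;
    }
  return count;
}
```

For LDP the three hooks would be the IP table and MPLS table
connections (RA_OPTIMAL) and the IGP tracking connection (RA_ANY).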

BTW, this whole dependency 'IGP table -> LDP function' is a bit similar
to the situation with recursive nexthops in IBGP, where an IGP change
also leads to a change of the IBGP route nexthop. In the IBGP case it is
handled automatically by the rtable code (see the
rta_set_recursive_next_hop() discussion in route.h, hostcache and
hostentry). The LDP situation is a bit different, but perhaps the same
mechanism could be extended to call a protocol hook instead of just
updating the nexthop. This mechanism is useful if a protocol waits for a
change in the result of some recursive lookup in a tracked table. But
the LDP situation is much simpler: it just waits for an exact-match
change in the tracked table.
...
* There is no need to call all other protocols since they should not be
interested in such update
Not true, other protocols may have filters whose answer changes if you
change some route attributes. Ignoring that would lead to
inconsistencies in route propagation.
...
* By upgrading FIB / rtable:
If (from the point of user) config tables will be not AF_bound (e.g.
IPv4+IPv6) we will have to do enhance FIB api.
My vision is the following:
* make fib AF_ bound, specifying AF and sizeof(object) at fib_init (or
fib2_init)
* pass pointers to all fib_* related functions instead of addresses
* do compare by memcpy() for searching (and use AF-dependent hash based
on value passed in _init)
* Pass AF in appropriate protocol hooks
As I wrote above, if we consider just IP (v4 and v6) and MPLS routes,
I think that a fixed-size fib would be enough. But the problems are
with the VPNvX AFs.

Originally I thought that having a FIB / rtable with VPNvX routes is not
a good idea: these AFs are just a wire representation of multiple
independent IP spaces, and we already have a better representation of
that, namely multiple routing tables. Having both representations seemed
unnecessary and would require some conversion between the parts that
require the first representation and the parts that require the second
one. But not having VPNvX routes is also cumbersome: protocols that use
them would have to be bound to multiple routing tables through some
multiplexer. So it is probably easier to have tables with VPNvX AFs.

Therefore, it is probably a good idea to extend FIBs in the way you
suggested, with minor details changed. FIBs / rtables would be uniform
(AF_ bound), but there would be just three AFs (IP, MPLS, VPN): IPv4 and
IPv6 could be handled as one AF, with IPv4 embedded, and the same for
VPNv4 and VPNv6. To minimize code changes, struct fib_node would keep
the ip_addr prefix, but might be allocated larger.
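
A rough sketch of what 'allocated larger' could look like. The names
fib_init_af(), fib_node_new() and fib_node_match(), the FIB_AF_*
constants, the 8-byte route distinguisher appended for VPN keys, and the
reduced struct layouts are all assumptions for illustration. Keys are
compared with memcmp() over an AF-dependent length fixed at init time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { uint32_t addr[4]; } ip_addr;   /* IPv6-sized, assumed */

enum fib_af { FIB_AF_IP, FIB_AF_MPLS, FIB_AF_VPN };

#define VPN_RD_LEN 8   /* route distinguisher appended to VPN keys */

struct fib_node {
  uint8_t pxlen;
  ip_addr prefix;      /* kept last: VPN nodes extend past it */
};

struct fib {
  enum fib_af af;
  size_t key_len;      /* bytes compared, starting at prefix */
};

/* Bind the fib to one AF and fix its key length. */
static void fib_init_af(struct fib *f, enum fib_af af)
{
  assert(f != NULL);
  f->af = af;
  f->key_len = sizeof(ip_addr) + (af == FIB_AF_VPN ? VPN_RD_LEN : 0);
}

/* Allocate a node large enough for this fib's key and copy the key in. */
static struct fib_node *
fib_node_new(const struct fib *f, const void *key, uint8_t pxlen)
{
  size_t size = offsetof(struct fib_node, prefix) + f->key_len;
  if (size < sizeof(struct fib_node))
    size = sizeof(struct fib_node);
  struct fib_node *n = calloc(1, size);
  if (!n)
    return NULL;
  n->pxlen = pxlen;
  memcpy((uint8_t *) n + offsetof(struct fib_node, prefix), key, f->key_len);
  return n;
}

/* Exact-match test on prefix length and the full AF-dependent key. */
static int fib_node_match(const struct fib *f, const struct fib_node *n,
                          const void *key, uint8_t pxlen)
{
  const uint8_t *stored =
    (const uint8_t *) n + offsetof(struct fib_node, prefix);
  return n->pxlen == pxlen && memcmp(stored, key, f->key_len) == 0;
}
```

Code that only touches pxlen and prefix keeps working on every AF; only
allocation, hashing and comparison need to know the real key length.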

Because each protocol and each of its announce hooks has an appropriate
role, it is IMHO unnecessary to have the AF in protocol hooks, but there
should be a check that the protocol/announce_hook is connected to an
appropriate rtable.

-- 
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
