Merging bird and bird6

Fri Jul 22 12:52:07 CEST 2011

On Fri, Jul 22, 2011 at 01:47:14AM +0400, Alexander V. Chernikov wrote:
> > Therefore there would be two types of routing tables - IP and MPLS. I
> > don't think it is a good idea to mix these. This may look inconsistent
> > with idea of embedding IPv4 to IPv6, but IP protocols are much more
> > similar, have a natural way to embed one in the other, have similar
> > roles and protocol structure. MPLS routing table could be used to LDP -
> > kernel interaction (routes imported from LDP and exported to kernel).
> > This solves your Case 2 without any hacks.
> So, from user point of view, I define
> table xxx; for both ipv4 and IPv6 routes and
> mpls table yyy; for MPLS routing table?

Yes.

> There should be base MPLS rtable (mpls_default, for example) as in IP.
> We can also add a hack for automatically subscribe protocols for MPLS
> routing table by type and other attributes. For example, every LDP
> instance gets connected to an MPLS table (default or defined in config).
> Kernel protocol instance gets connected to MPLS table only if its IP
> table is the default one (GRT) or 'mpls table' keyword is supplied
> explicitly. What about VPNv4/VPNv6? The same approach?

Perhaps even the default MPLS table should be explicitly configured [*] (as I
guess not many BIRD users would use MPLS). Protocols requiring an MPLS table
would fail if it is not configured; protocols with optional MPLS support
(kernel, static?) would just not connect to MPLS in that case. The same
approach would apply to VPNvX tables.

[*] probably like: mpls table XXX default;
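
A rough sketch of that binding rule, just to make the three outcomes
concrete. All names here (bind_mpls_table, mpls_need, rtable_sketch, ...)
are made up for illustration and are not existing BIRD code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rtable; only the name is kept. */
struct rtable_sketch { const char *name; };

/* How much a protocol cares about MPLS (illustrative classification). */
enum mpls_need { MPLS_UNUSED, MPLS_OPTIONAL, MPLS_REQUIRED };

enum bind_result { BIND_CONNECTED, BIND_SKIPPED, BIND_FAILED };

/*
 * Pick the MPLS table for a protocol instance.
 * explicit_tab: table named in the protocol's config, or NULL.
 * default_tab:  table declared as the default MPLS table, or NULL if the
 *               user did not configure one.
 */
static enum bind_result
bind_mpls_table(enum mpls_need need,
                struct rtable_sketch *explicit_tab,
                struct rtable_sketch *default_tab,
                struct rtable_sketch **out)
{
  assert(out != NULL);
  *out = NULL;

  if (need == MPLS_UNUSED)
    return BIND_SKIPPED;

  struct rtable_sketch *t = explicit_tab ? explicit_tab : default_tab;
  if (t) {
    *out = t;
    return BIND_CONNECTED;
  }

  /* No MPLS table at all: required protocols fail, optional ones skip. */
  return (need == MPLS_REQUIRED) ? BIND_FAILED : BIND_SKIPPED;
}
```

So LDP (required) without any MPLS table would give BIND_FAILED, while
kernel or static (optional) would give BIND_SKIPPED and run IP-only.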

>  Btw, how we will distinguish inet/inet6 rtes? (I'm talking about MP-BGP
> / IPv4-mapped cases)

I planned to use the IPv4-mapped prefix (::ffff:0:0/96), which is used for
similar purposes in the IP stack. But this should not be checked directly
in protocols; there should be some macros in lib/ipv6.h for that.
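
Something like the following is what I mean by such macros. This is only
a sketch: the type ip6_addr_sketch (four 32-bit words in host order) and
the names IP6_IS_V4MAPPED, ip6_from_v4, ip6_to_v4 and ip6_pxlen_from_v4
are assumptions for illustration, not what lib/ipv6.h currently provides:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IPv6-sized address: four 32-bit words, host byte order. */
typedef struct { uint32_t addr[4]; } ip6_addr_sketch;

/* True if the address lies inside ::ffff:0:0/96 (IPv4-mapped range). */
#define IP6_IS_V4MAPPED(a) \
  ((a).addr[0] == 0 && (a).addr[1] == 0 && (a).addr[2] == 0x0000ffffu)

/* Embed an IPv4 address (host order) as ::ffff:a.b.c.d. */
static inline ip6_addr_sketch ip6_from_v4(uint32_t v4)
{
  ip6_addr_sketch a = { { 0, 0, 0x0000ffffu, v4 } };
  return a;
}

/* Extract the embedded IPv4 address; only valid for mapped addresses. */
static inline uint32_t ip6_to_v4(ip6_addr_sketch a)
{
  assert(IP6_IS_V4MAPPED(a));
  return a.addr[3];
}

/* An IPv4 prefix of length n becomes an embedded prefix of length 96+n. */
static inline int ip6_pxlen_from_v4(int v4_pxlen)
{
  assert(v4_pxlen >= 0 && v4_pxlen <= 32);
  return 96 + v4_pxlen;
}
```

For example, 192.0.2.0/24 would be stored as ::ffff:192.0.2.0/120, and
protocols would only ever ask IP6_IS_V4MAPPED() instead of comparing
against the prefix themselves.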

> > [*] when I wrote that I thought that labels are distributed just by LDP
> > and the purpose of label request is to propagate the label through LDP
> > area. I didn't notice that BGP/MPLS also distributes labels, so they
> > need to know assigned labels. So the idea would need some modifications.
> Not sure this will work. Since t1 is an IP table cases when we need to
> request specific label for:
> * AToM
> * RSVP-TE tunnels
> will not work since there are no prefixes that can be mapped to such
> request.

You are probably right. I originally thought about a specific
'request table' (where requests are coded as routes with a specific AF),
but perhaps some other mechanism / other protocol hook should be used.
But it should be generic enough (some kind of bus that allows at least
multiple 'producers' and perhaps multiple 'consumers').

> > Internal LMAP table is examined, tracked IGP table is examined. If both
> > are ready (for given prefix), appropriate encapsulating and MPLS routes
> > are generated and propagated using rte_update(), otherwise nothing is
> > generated and the previously generated route is withdrawn (rte_update()
> > with NULL is called) (or perhaps an unreachable route is generated if
> > LMAP is here but IGP route is missing). Simple and elegant.
> .. and in case of label release we should remove label only and keep
> original route

Yes.
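
To spell out the per-prefix decision discussed above, here is a minimal
sketch. The names (ldp_decide, prefix_state, the LDP_* actions) are
hypothetical; in real code LDP_ANNOUNCE would end in rte_update() with the
generated routes and LDP_WITHDRAW in rte_update() with NULL:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What LDP should do for one prefix after re-examining its inputs. */
enum ldp_action { LDP_ANNOUNCE, LDP_UNREACHABLE, LDP_WITHDRAW };

struct prefix_state {
  bool have_lmap;  /* label mapping present in the internal LMAP table */
  bool have_igp;   /* non-encapsulating IGP route present in tracked table */
};

/*
 * unreachable_without_igp selects the optional variant where a missing
 * IGP route (with LMAP present) yields an unreachable route instead of
 * a withdrawal.
 */
static enum ldp_action
ldp_decide(const struct prefix_state *s, bool unreachable_without_igp)
{
  assert(s != NULL);

  if (s->have_lmap && s->have_igp)
    return LDP_ANNOUNCE;

  if (s->have_lmap && unreachable_without_igp)
    return LDP_UNREACHABLE;

  /* Covers label release too: only the generated route goes away,
     the original IGP route is not touched by LDP. */
  return LDP_WITHDRAW;
}
```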

> > There are some tricky parts of IGP tracking - it is problematic
> > to use standard RA_OPTIMAL update for this purpose, because if
> > generated encapsulating routes are imported to the same table,
> > these would probably become the optimal ones and IGP routes would be
> > shadowed. The solution would be to use RA_ANY and ignore notifications
> > containing encapsulating routes. Similarly, 'examining the tracked
> > IGP table' means looking up the fib node and finding the best route,
> > ignoring encapsulating ones.
> > 
> > For implementation of this behavior, there are two minor changes that
> > need to be done to the rt table code: First, currently accept_ra_types
> > (RA_OPTIMAL/RA_ANY) is a property of a protocol, it needs to be a
> > property of an announce hook (as LDP would have two hooks with
> > RA_OPTIMAL and one hook with RA_ANY). Second, rte_announce() for
> > both types in rte_recalculate() should be moved after the route list
> > is updated/relinked.

> Agreed. Distinguishing RA_OPTIMAL and RA_ANY in current code is not a
> trivial task and requires internals understanding. Either announce type
> should be passed to announce hook or new hook should be added for RA_ANY
>  event. The latter is more appropriate IMHO since RA_ANY is used by pipe
> protocol only.

I thought about that when I created RA_ANY and chose this approach.
Probably the best way is just to change rt_notify to take the appropriate
struct announce_hook as its second argument instead of struct rtable.
struct announce_hook would contain RA_ANY/RA_OPTIMAL and possibly
some protocol-specific data. As (probably) all protocols are in-tree,
doing some wide but trivial changes is not a problem.
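
Roughly like this. It is a simplified sketch, not the real prototypes:
the *_sketch types, the proto_data field and ldp_rt_notify are
illustrative, the real rt_notify has more arguments, and the RA_* values
are defined locally just to keep the example self-contained:

```c
#include <assert.h>
#include <stddef.h>

/* Local definitions for the sketch; values are illustrative. */
enum { RA_OPTIMAL = 1, RA_ANY = 2 };

struct rtable_sketch { const char *name; };
struct rte_sketch { int id; };

/* The announce type lives in the hook, not in the protocol. */
struct announce_hook_sketch {
  struct rtable_sketch *table;
  int ra_type;        /* RA_OPTIMAL or RA_ANY */
  void *proto_data;   /* optional protocol-specific data */
};

/* Example protocol state: how many notifications of each kind arrived. */
struct ldp_counters { int any_seen; int optimal_seen; };

/*
 * Notify callback receiving the hook instead of the table. The table is
 * still reachable as ah->table, and the protocol can tell which of its
 * hooks fired without any extra lookup.
 */
static void
ldp_rt_notify(struct announce_hook_sketch *ah,
              const struct rte_sketch *new_rte,
              const struct rte_sketch *old_rte)
{
  assert(ah != NULL && ah->proto_data != NULL);
  struct ldp_counters *c = ah->proto_data;

  (void) new_rte;
  (void) old_rte;

  if (ah->ra_type == RA_ANY)
    c->any_seen++;       /* IGP tracking hook */
  else
    c->optimal_seen++;   /* ordinary best-route hook */
}
```

A protocol like LDP could then register several hooks on the same table
with different ra_type values and share one callback.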

> Kernel protocol should track RA_ANY protocol hooks
> looking for update source (LDP / RSVP) and re-install appropriate
> routes.

I think the kernel protocol should use RA_OPTIMAL as usual. This kind
of RA_ANY usage is for protocols that export routes to the same
table they listen to (so their 'source' routes would be shadowed by the
routes they generate). These routes (LDP / RSVP) should just have the
highest priority.

> The only downside is situation when LDP signalling starts faster
> than IGP. In that case we will get 3 updates instead of one (at least in
> RTSOCK):
> * RTM_ADD for original prefix
> * RTM_DEL for this prefix (as part of krt_set_notify())
> * RTM_ADD for modified prefix
> 
> RTM_CHANGE can be used in notify, but still: this gives 2 updates
> instead of one.

No, because RA_ANY is handled strictly before RA_OPTIMAL and routes
are propagated synchronously depth-first:

OSPF --RA_ANY--> LDP
LDP --RA_OPTIMAL--> kernel
OSPF --RA_OPTIMAL--> kernel

But it is true that this depends heavily on the internal implementation
of route propagation. The first idea I had was to use separate
tables for original and labeled routes (with just RA_OPTIMAL hooks),
but that looks too cumbersome for users, and the ability to push a better
route to the same (input) table has other possible uses.

> > Therefore, it is probably a good idea to extend FIBs in a way you
> > suggested, with minor details changed. FIB / rtables would be uniform
> > (AF_ bound), but there are just three AFs (IP, MPLS, VPN) - IPv4 and IPv6
> > could be handled as one AF, embedded, the same for VPNv4 and VPNv6). To
> > minimize code changes, struct fib_node would have ip_addr prefix, but
> > might be allocated larger. 
> Okay, so for IPv4+IPv6-enabled daemon we will allocate an ip_addr large
> enough for holding IPv6 address? This can bump memory consumption for
> setups with several full-views significantly.

It increases memory consumption, but not so much in relative terms: for
each struct network there is at least one struct rte, in both of them
there is just one ip_addr, and both structures are nontrivial. So the
relative increase would be a factor of about 1.15-1.2. Really big users
would probably keep the current split setup.

> > Because each protocol and each its announce_hook have appropriate role,
> > it is IMHO unnecessary to have AF in protocol hooks, but there should be
> > check whether protocol/announce_hook is connected to appropriate rtable.
> > 
> 
> To summarize required changes (please correct me):
> 1) Differentiate between RA_ANY and RA_OPTIMAL (new hook, possibly)
> 2) Add 3 AFs (AF_IP, AF_MPLS, AF_VPN) to the following structures:
> * rtable
> * fib
> * rte
> 3) Add fib2_init with sizeof(AF object) supplied. Add appropriate field
> to struct fib to hold this value.
> 4) Move to memcmp() in fib_find / fib_get
> 5) Set up default rtable for every supported AF. Connect protocol
> instances to such default AFs based on protocol types

1a) other changes in rte_recalculate() related to propagation
(clean up the table before calling the RA_ANY hook).

1) and 1a) I will do myself and send you the patch, and I will also make
a trivial example of exporting to the same table.

2) I am not sure there is a reason to put explicit AF info
into struct fib; AF compatibility could be handled at a higher level
(struct rtable in general; other direct users probably use just
one AF).

3) plus a hashing callback (and perhaps fib_route, but I am not sure
whether this is needed).

4) probably encapsulate that in some static inline key_equal() function.

5) see my related note above. Protocol binding to tables should check AFs.
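
For points 3) and 4), a minimal sketch of what I have in mind. The
structures are simplified stand-ins (fib_sketch, fib_node_sketch), and
fib2_init_sketch / fib_node_new are hypothetical helpers; the hashing
callback from 3) is left out. The node is 'allocated larger' via a
flexible array member:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified FIB header: remembers the key size given at init time. */
struct fib_sketch {
  size_t key_size;   /* bytes per address key (e.g. 4, 16, or more) */
};

/* Simplified node: fixed part followed by key_size bytes of key. */
struct fib_node_sketch {
  unsigned char pxlen;
  unsigned char key[];   /* allocated larger, depending on the FIB */
};

static void fib2_init_sketch(struct fib_sketch *f, size_t key_size)
{
  assert(f != NULL && key_size > 0);
  f->key_size = key_size;
}

/* Allocate a node sized for this FIB and copy the key into it. */
static struct fib_node_sketch *
fib_node_new(const struct fib_sketch *f, const void *key, int pxlen)
{
  struct fib_node_sketch *n = malloc(sizeof(*n) + f->key_size);
  if (!n)
    return NULL;
  n->pxlen = (unsigned char) pxlen;
  memcpy(n->key, key, f->key_size);
  return n;
}

/* Byte-wise key comparison, independent of the address family. */
static inline bool
key_equal(const struct fib_sketch *f, const struct fib_node_sketch *n,
          const void *key, int pxlen)
{
  return n->pxlen == pxlen && memcmp(n->key, key, f->key_size) == 0;
}
```

Note that memcmp() only works if keys are stored in a canonical form
(host bits beyond pxlen zeroed, no uninitialized padding inside the key).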

more:

6) RTD_MPLS in dest field, struct rta_mpls, as i wrote in the previous mail:

> > i think encapsulation
> > routes should be represented by routes with new destination type
> > (RTD_MPLS in dest field of struct rta) and whole NHLFE should be stored
> > in new struct rta_mpls (or rta_nhlfe), which would be extension of
> > struct rta (containing struct rta in the first field and NHLFE after
> > that). Such structure could be easily passed as struct rta and functions
> > from rt-attr.c can work with that, with some minor modifications
> > (allocating, freeing and printing) dispatched based on dest field.

> > This rta could be used without changes also for MPLS routes.
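
A sketch of that layout, assuming a much reduced struct rta and a
fixed-size label stack as the NHLFE. The field names, the RTD_* values
and the helpers rta_size / rta_clone / rta_as_mpls are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative destination types; only RTD_MPLS matters here. */
enum { RTD_ROUTER = 0, RTD_MPLS = 1 };

/* Heavily reduced stand-in for struct rta. */
struct rta_sketch {
  uint8_t dest;    /* RTD_* */
  uint32_t gw;     /* next hop, simplified to one word */
};

/* Extension: struct rta first, NHLFE after it. */
struct rta_mpls_sketch {
  struct rta_sketch a;      /* must stay the first field */
  uint8_t n_labels;
  uint32_t out_labels[4];   /* assumed fixed-size label stack */
};

/* Size dispatch on dest, as needed for allocating and copying. */
static size_t rta_size(const struct rta_sketch *a)
{
  return (a->dest == RTD_MPLS) ? sizeof(struct rta_mpls_sketch)
                               : sizeof(struct rta_sketch);
}

/* Generic copy that works for plain and MPLS attributes alike. */
static struct rta_sketch *rta_clone(const struct rta_sketch *a)
{
  size_t sz = rta_size(a);
  struct rta_sketch *copy = malloc(sz);
  if (copy)
    memcpy(copy, a, sz);
  return copy;
}

/* Downcast, valid because the rta is the first member of the extension. */
static const struct rta_mpls_sketch *rta_as_mpls(const struct rta_sketch *a)
{
  assert(a->dest == RTD_MPLS);
  return (const struct rta_mpls_sketch *) a;
}
```

Code that does not care about MPLS keeps passing struct rta pointers
around; only allocation, freeing and printing need to look at dest.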

> Most of this are more or less trivial changes not MPLS-bound (VPNv4/6
> can be used in case of bird used as RR in MPLS network, for example).
> Should I supply patches for these? What are your plans about commit
> routemap ?

I will create a Git branch 'mpls' and merge these patches into it
soon. When we have a major release, we could merge the 'mpls' branch
into master if there is sufficient usage (I think that even just
static and kernel protocol support for MPLS would be a good example
usage). Other protocols (LDP, ...) should probably be merged when they
are reasonably ready.

-- 
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: santiago at crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."