bird6: "Netlink: No such process" error from kernel proto on OSPF multipath prefixes

Israel G. Lugo israel.lugo at tecnico.ulisboa.pt
Thu Jul 23 21:43:27 CEST 2015


Hello,

I am seeing periodic "Netlink: No such process" messages from bird6,
apparently related to prefixes learned from OSPF where there are two
equal cost paths. I've got ECMP on, and am installing routes to the
kernel via kernel protocol.

Operating system is GNU/Linux (Debian), kernel 3.16.

The problem appears when the kernel protocol starts pruning table master.
On every pass it apparently decides to update these multipath routes,
and the update fails. Log excerpt:


bird6: kernel1: fec0:0:0:ffff::2/128: seen
bird6: kernel1: fec0:0:0:ffff::3/128: seen
bird6: kernel1: ::/0: seen
bird6: kernel1: Pruning table master
bird6: kernel1: 2001:xxx:yyy:z1::/64: updating
bird6: Netlink: No such process
bird6: kernel1: 2001:xxx:yyy:z2::/64: updating
bird6: Netlink: No such process
...
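
For anyone who wants to list the affected prefixes from a longer log, here
is a minimal sketch. It assumes the log lines look exactly like the excerpt
above (an "updating" line directly followed by the Netlink error); the
function name failed_updates is made up for illustration and is not part
of BIRD.

```python
import re

# Matches lines such as "bird6: kernel1: 2001:xxx:yyy:z1::/64: updating".
# The format is assumed from the excerpt above, not from BIRD documentation.
UPDATING = re.compile(r"^bird6: (?P<proto>\S+): (?P<prefix>\S+): updating$")
NETLINK_ERROR = "bird6: Netlink: No such process"


def failed_updates(lines):
    """Return prefixes whose "updating" line is directly followed by the error."""
    failed = []
    pending = None  # prefix from the most recent "updating" line, if any
    for raw in lines:
        line = raw.strip()
        match = UPDATING.match(line)
        if match:
            pending = match.group("prefix")
        elif line == NETLINK_ERROR and pending is not None:
            failed.append(pending)
            pending = None
        else:
            # Any other line breaks the "updating" -> error pairing.
            pending = None
    return failed


sample = """\
bird6: kernel1: ::/0: seen
bird6: kernel1: Pruning table master
bird6: kernel1: 2001:xxx:yyy:z1::/64: updating
bird6: Netlink: No such process
bird6: kernel1: 2001:xxx:yyy:z2::/64: updating
bird6: Netlink: No such process
""".splitlines()

print(failed_updates(sample))
```

If other messages can appear between the two lines in a real log, the
pairing logic would need to be relaxed accordingly.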

This is what "show route" has to say about one of these routes:

bird> show route 2001:xxx:yyy:z1::/64 all
2001:xxx:yyy:z1::/64 multipath [backbone 2015-07-21] * IA (150/20) [193.aa.bb.137]
        via fe80::21e:bff:fec1:8c4a on eth1 weight 1
        via fe80::21e:bff:fec1:8c50 on eth1 weight 1
        Type: OSPF-IA unicast univ
        OSPF.metric1: 20
        OSPF.metric2: 16777215
        OSPF.tag: 0x00000000
        OSPF.router_id: 193.aa.bb.137

"show route" periodically displays these routes with a '!' instead of
'*', indicating a synchronization error.

These "No such process" messages seem to occur every 40 seconds or so.

I am seeing this error both on BIRD 1.4.5 and BIRD 1.5.0.

Relevant topology: Four BIRD routers, A, B, C and D. All in area 0. A+B
are ABR for area 194, and C+D are ABR for area 165.

A and B announce IA prefixes z1, z2, z3, z4, z5, z6, z7 and z8 with
equal cost. C and D announce IA prefixes z9 and z10 with equal cost. A
and B have the error on prefixes from C and D, and vice-versa.

Routes seem to disappear from the kernel periodically and are then
reinstalled (monitoring with "ip -6 r").
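
A rough sketch of how the disappearing routes could be spotted by comparing
two snapshots of "ip -6 r" output. The sample routes below are invented
(documentation prefix 2001:db8::/32), and the helper names destinations and
vanished are hypothetical. The parsing assumes that the destination is the
first field of every non-indented line.

```python
def destinations(ip_route_output):
    """Collect the destination (first field) of each route line."""
    dests = set()
    for line in ip_route_output.splitlines():
        # Skip blank lines and indented continuation lines (e.g. "nexthop ...").
        if not line or line[0].isspace():
            continue
        dests.add(line.split()[0])
    return dests


def vanished(before, after):
    """Return destinations present in the first snapshot but not the second."""
    return sorted(destinations(before) - destinations(after))


# Invented sample snapshots; real ones could be captured with
# subprocess.run(["ip", "-6", "route"], capture_output=True, text=True).stdout
before = (
    "2001:db8:1::/64 via fe80::1 dev eth1 proto bird metric 1024\n"
    "2001:db8:1::/64 via fe80::2 dev eth1 proto bird metric 1024\n"
    "2001:db8:2::/64 dev eth0 proto kernel metric 256\n"
)
after = "2001:db8:2::/64 dev eth0 proto kernel metric 256\n"

print(vanished(before, after))
```

Taking a snapshot every few seconds and comparing consecutive pairs would
show whether the missing destinations are always the multipath ones.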

Other, non-multipath routes from other routers do not seem to be affected.

Relevant config, from routers C and D:

# common to all routers
protocol kernel {
        learn;
        persist;
        scan time 20;
        export all;
}

protocol ospf backbone {
        tick 1;
        ecmp yes;
        area 0.0.0.0 {
                stub no;
                interface "eth1" { check link yes; };
        };
        area 0.0.165.0 {
                stub yes;
                summary yes;
                interface "eth0.2000" {
                        type ptp;
                        check link yes;
                };
                interface "eth0.141", "eth0.165", "eth0.1411" {
                        stub;
                        check link yes;
                };
                networks {
                        2001:xxx:yyy:z9::/64;
                        2001:xxx:yyy:z10::/64;
                };
        };
}

Routers A and B have a similar configuration, but with different prefixes
of course.

I do not see this problem with IPv4 bird (also OSPF, similar
configuration). Could this be some bug with kernel protocol and
multipath routes?

I am available for further explanations or more details (logs, configs).

Best regards,

-- 
Israel G. Lugo
Núcleo de Redes e Comunicações
Direção de Serviços de Informática
Instituto Superior Técnico
