Long io loop / missing netlink messages

Nico Schottelius nico.schottelius at ungleich.ch
Sun Dec 22 16:52:20 CET 2019


Hello Maria,

Maria Matějka <maria.matejka at nic.cz> writes:

> Hello!
>
> How much time does it take to list the kernel table?
>
> time ip r > /dev/null

[16:47] router1.place6:~# time ip r >/dev/null
real	0m 6.06s
user	0m 4.90s
sys	0m 1.14s
[16:47] router1.place6:~# time ip r >/dev/null
real	0m 5.92s
user	0m 4.73s
sys	0m 1.17s
[16:47] router1.place6:~#

Question: is a 6-second listing expected with a full IPv4 table?

> How many routes do you have in bird table?
>
> show route count

[16:47] router1.place6:~# birdc show route count
BIRD 2.0.7 ready.
778706 of 778706 routes for 778704 networks in table master4
89166 of 89166 routes for 78105 networks in table master6
Total: 867872 of 867872 routes for 856809 networks in 2 tables
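
As a quick sanity check, the per-table figures can be summed and compared with the "Total:" line. The following is only a sketch: the sample text is copied from the output above, while the regular expression and the function name parse_route_count are my own assumptions about the line format, not part of BIRD.

```python
import re

# Sample output copied verbatim from the birdc call above.
SAMPLE = """\
BIRD 2.0.7 ready.
778706 of 778706 routes for 778704 networks in table master4
89166 of 89166 routes for 78105 networks in table master6
Total: 867872 of 867872 routes for 856809 networks in 2 tables
"""

# Assumed line format, derived only from the sample above.
LINE_RE = re.compile(
    r"^(\d+) of (\d+) routes for (\d+) networks in table (\S+)$"
)


def parse_route_count(text):
    """Return {table: (routes, networks)} for each per-table line.

    Both route figures are identical in the sample; this sketch keeps
    the first one. The "Total:" line does not match and is skipped.
    """
    tables = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            tables[m.group(4)] = (int(m.group(1)), int(m.group(3)))
    return tables


tables = parse_route_count(SAMPLE)
total_routes = sum(routes for routes, _ in tables.values())
total_networks = sum(networks for _, networks in tables.values())
print(tables)
print(total_routes, total_networks)
```

The sums agree with the "Total:" line printed by birdc (867872 routes, 856809 networks).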

> And what export filter do you have for the kernel protocol in bird?

In short:

every route <= /24 (IPv4) and <= /48 (IPv6) of the global routing table.

In detail:

define net_to_router_v4 = [  0.0.0.0/0+ ];
define net_to_router_v6 = [ ::/0+ ];

protocol kernel kernel_v6 {
    metric 64;
    learn;
    scan time 20;

    ipv6 {
      import filter ungleich_networks;
      export filter to_ungleich_routers;
    };
}

function is_for_router_kernel()
{
  if ((net.type = NET_IP6 && net ~ net_to_router_v6) ||
      (net.type = NET_IP4 && net ~ net_to_router_v4)) then {
      return true;
  }
  if ((net.type = NET_IP6 && net.len <= 29) ||
      (net.type = NET_IP4 && net.len <= 10)) then {
      return true;
  }


  return false;
}

/* skipping the other helper functions, which are very small */

filter to_ungleich_routers {
  if(is_inside_ungleich()) then accept;

  if(is_from_netstream()) then accept;
  if(is_from_sunrise()) then accept;

  if(is_default_route()) then accept;
  if(is_for_router_kernel()) then accept;

  reject;
}
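
To see what is_for_router_kernel() lets through, here is a rough Python model of it. It is a sketch under one assumption: that a prefix-set entry written as "prefix+" matches the prefix itself and all of its subnets. The helper matches_plus and the test prefixes are hypothetical and only illustrate the logic; this is not how BIRD evaluates filters internally, and the other functions used by to_ungleich_routers are not modelled.

```python
import ipaddress


def matches_plus(net, pattern):
    """Model of a prefix-set entry 'pattern+': net is pattern or a subnet of it."""
    net = ipaddress.ip_network(net)
    pattern = ipaddress.ip_network(pattern)
    # subnet_of() raises TypeError on mixed versions, so compare versions first.
    return net.version == pattern.version and net.subnet_of(pattern)


def is_for_router_kernel(net):
    """Python rendering of the two checks in the filter function above."""
    n = ipaddress.ip_network(net)
    # First check: net ~ net_to_router_v6 / net_to_router_v4
    if (n.version == 6 and matches_plus(net, "::/0")) or (
        n.version == 4 and matches_plus(net, "0.0.0.0/0")
    ):
        return True
    # Second check: short prefixes only (<= /29 for IPv6, <= /10 for IPv4)
    if (n.version == 6 and n.prefixlen <= 29) or (
        n.version == 4 and n.prefixlen <= 10
    ):
        return True
    return False


# Hypothetical test prefixes (documentation ranges), not taken from the router.
for prefix in ["10.0.0.0/8", "203.0.113.0/24", "198.51.100.7/32", "2001:db8::/32"]:
    print(prefix, is_for_router_kernel(prefix))
```

Under that assumption the first check already matches every IPv4 and IPv6 prefix, so the length check after it never changes the result and the function accepts the whole table.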

Best,

Nico

> On December 21, 2019 1:07:25 PM GMT+01:00, Nico Schottelius <nico.schottelius at ungleich.ch> wrote:
>>
>>Good morning,
>>
>>on a fresh new router running the full routing table,
>>with Alpine, Linux 4.19.80-0-vanilla, bird-2.0.7
>>
>>I see a lot of these messages:
>>
>>Dec 21 12:37:15 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan.
>>Dec 21 12:37:34 router1 daemon.warn bird: I/O loop cycle took 5000 ms for 1 events
>>Dec 21 12:38:14 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan.
>>Dec 21 12:38:35 router1 daemon.warn bird: I/O loop cycle took 5328 ms for 1 events
>>Dec 21 12:38:54 router1 daemon.warn bird: I/O loop cycle took 5013 ms for 1 events
>>Dec 21 12:39:07 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan.
>>Dec 21 12:39:14 router1 daemon.warn bird: I/O loop cycle took 5053 ms for 1 events
>>Dec 21 12:39:34 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan.
>>Dec 21 12:40:14 router1 daemon.warn bird: I/O loop cycle took 5041 ms for 1 events
>>
>>[12:49] router1.place6:~# ip -6 r | wc -l; ip r | wc -l
>>78212
>>779342
>>
>>With "debug latency;" I get the following additional messages:
>>
>>Dec 21 12:54:31 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4449 ms
>>Dec 21 12:54:52 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 5608 ms
>>
>>The system is idle overall, with bird spiking to 50-100% CPU usage every
>>couple of seconds. I first thought they were only logged after starting
>>bird (where it might make sense), but the events continue to be logged
>>around every 30s:
>>
>>Dec 21 13:04:11 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4596 ms
>>Dec 21 13:04:32 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 5096 ms
>>Dec 21 13:04:52 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 5102 ms
>>Dec 21 13:05:11 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4676 ms
>>Dec 21 13:05:31 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4645 ms
>>
>>This might loosely correlate with the scan time "20" that is set up for
>>the device and kernel protocols.
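
The correlation with the scan time can be checked directly from the timestamps of the five "Event ... took" lines quoted just above. This is a minimal sketch: the timestamps are copied from those log lines, and the helper name intervals_seconds is made up for illustration.

```python
from datetime import datetime

# Timestamps of the five "Event ... took" log lines quoted above.
LOG_TIMES = ["13:04:11", "13:04:32", "13:04:52", "13:05:11", "13:05:31"]


def intervals_seconds(times):
    """Return the gaps in seconds between consecutive HH:MM:SS timestamps."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


gaps = intervals_seconds(LOG_TIMES)
print(gaps)                   # [21, 20, 19, 20]
print(sum(gaps) / len(gaps))  # 20.0
```

The gaps average 20 seconds, which is the same value as the configured "scan time 20".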
>>
>>How do I best debug this issue?
>>
>>Best,
>>
>>Nico
>>
>>
>>
>>--
>>Modern, affordable, Swiss Virtual Machines. Visit
>>www.datacenterlight.ch


--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch


