Hello Maria, Maria Matějka <maria.matejka@nic.cz> writes:
Hello!
How much time does it take to list the kernel table?
time ip r > /dev/null
[16:47] router1.place6:~# time ip r >/dev/null real 0m 6.06s user 0m 4.90s sys 0m 1.14s [16:47] router1.place6:~# time ip r >/dev/null real 0m 5.92s user 0m 4.73s sys 0m 1.17s [16:47] router1.place6:~# Question: is a 6 second lookup expected with full IPv4 table?
How many routes do you have in bird table?
show route count
[16:47] router1.place6:~# birdc show route count BIRD 2.0.7 ready. 778706 of 778706 routes for 778704 networks in table master4 89166 of 89166 routes for 78105 networks in table master6 Total: 867872 of 867872 routes for 856809 networks in 2 tables
And what export filter do you have for the kernel protocol in bird?
In short: every route <= /24 (IPv4) and <= /48 (IPv6) of the global routing table. In long: define net_to_router_v4 = [ 0.0.0.0/0+ ]; define net_to_router_v6 = [ ::/0+ ]; protocol kernel kernel_v6 { metric 64; learn; scan time 20; ipv6 { import filter ungleich_networks; export filter to_ungleich_routers; }; } function is_for_router_kernel() { if ((net.type = NET_IP6 && net ~ net_to_router_v6) || (net.type = NET_IP4 && net ~ net_to_router_v4)) then { return true; } if ((net.type = NET_IP6 && net.len <= 29) || (net.type = NET_IP4 && net.len <= 10)) then { return true; } return false; } /* skipping functions that are very small routes */ filter to_ungleich_routers { if(is_inside_ungleich()) then accept; if(is_from_netstream()) then accept; if(is_from_sunrise()) then accept; if(is_default_route()) then accept; if(is_for_router_kernel()) then accept; reject; } Best, Nico
On December 21, 2019 1:07:25 PM GMT+01:00, Nico Schottelius <nico.schottelius@ungleich.ch> wrote:
Good morning,
on a fresh new router running the full routing table, with Alpine, Linux 4.19.80-0-vanilla, bird-2.0.7
I see a lot of these messages:
Dec 21 12:37:15 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan. Dec 21 12:37:34 router1 daemon.warn bird: I/O loop cycle took 5000 ms for 1 events Dec 21 12:38:14 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan. Dec 21 12:38:35 router1 daemon.warn bird: I/O loop cycle took 5328 ms for 1 events Dec 21 12:38:54 router1 daemon.warn bird: I/O loop cycle took 5013 ms for 1 events Dec 21 12:39:07 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan. Dec 21 12:39:14 router1 daemon.warn bird: I/O loop cycle took 5053 ms for 1 events Dec 21 12:39:34 router1 daemon.warn bird: Kernel dropped some netlink messages, will resync on next scan. Dec 21 12:40:14 router1 daemon.warn bird: I/O loop cycle took 5041 ms for 1 events
[12:49] router1.place6:~# ip -6 r | wc -l; ip r | wc -l 78212 779342
With "debug latency;" I get the following additional messages:
Dec 21 12:54:31 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4449 ms Dec 21 12:54:52 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 5608 ms
The system is overall idle with bird spiking to 50-100% cpu usage every couple of seconds. I first thougt they are only logged after stating bird (where it might make sense), but the events continue to be logged around every 30s:
Dec 21 13:04:11 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4596 ms Dec 21 13:04:32 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 5096 ms Dec 21 13:04:52 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 5102 ms Dec 21 13:05:11 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4676 ms Dec 21 13:05:31 router1 daemon.warn bird: Event 0x000055a21afb8144 0x0000000000000000 took 4645 ms
Which might loosely correlate to the scan time "20" that is setup for device and kernel protocols.
How do I best debug this issue?
Best,
Nico
-- Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
-- Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch