Running BIRD 1.3.4 on custom config Linux 3.8.13 kernel
We are testing BIRD on a custom 3.8.13 Linux kernel and see the message below roughly every 20 seconds. We did not see it on the 3.2 kernel shipped with Ubuntu 12.04 LTS. Is there any chance that BIRD needs a kernel config option that we are not aware of? I notice that "wait 20" is the only timer in our config that matches this interval in the log...

    May 20 23:29:05 sja15 bird: Netlink: Invalid argument
    May 20 23:29:26 sja15 bird: Netlink: Invalid argument
    May 20 23:29:46 sja15 bird: Netlink: Invalid argument
    May 20 23:30:06 sja15 bird: Netlink: Invalid argument
    May 20 23:30:25 sja15 bird: Netlink: Invalid argument
    May 20 23:30:45 sja15 bird: Netlink: Invalid argument
    May 20 23:31:06 sja15 bird: Netlink: Invalid argument
    May 20 23:31:26 sja15 bird: Netlink: Invalid argument

Here's our OSPF config.

    protocol ospf {
        import filter import_OSPF;
        export filter export_OSPF;
        ecmp 4;
        area 0 {
            interface "eth0", "eth1" {
                cost 100;
                type broadcast;
                hello 10;
                retransmit 5;
                wait 20;
                dead 40;
            };
            interface "dummy0" {
                stub;
            };
        };
    }

Here's the kernel config related to the NETLINK stuff.

    root@sja15:~# cat config | grep NETLINK
    CONFIG_NETFILTER_NETLINK=m
    CONFIG_NETFILTER_NETLINK_ACCT=m
    CONFIG_NETFILTER_NETLINK_QUEUE=m
    CONFIG_NETFILTER_NETLINK_LOG=m
    CONFIG_NF_CT_NETLINK=m
    CONFIG_NF_CT_NETLINK_TIMEOUT=m
    CONFIG_NF_CT_NETLINK_HELPER=m
    CONFIG_NETFILTER_NETLINK_QUEUE_CT=y
    CONFIG_SCSI_NETLINK=y
    CONFIG_QUOTA_NETLINK_INTERFACE=y

-bn 0216331C
On Mon, May 20, 2013 at 04:36:02PM -0700, Bao Nguyen wrote:
We are testing BIRD on a custom 3.8.13 Linux kernel and seeing this message every 20 seconds. We did not see this on the 3.2 kernel shipped with Ubuntu 12.04 LTS. Is there any chance that there's a kernel config that's needed by BIRD that we are not aware of? I notice that the "wait 20" is the only time that matches this interval in the log...
I don't think the issue is related to OSPF; that would be strange. Most probably it is related to the kernel protocol. But the default scan time is 60 seconds, so unless you have used the 'scan time' option, it is probably related to some route exports.

I see that you use 'ecmp'. Do you have ECMP support in the kernel?

Could you enable 'debug all' for the ospf and kernel protocols and send me a log with the netlink errors interleaved with the trace messages from debug all?

Is your custom 3.8.13 Linux kernel just custom-compiled, or does it carry some local code modifications?

--
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
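The ECMP question above can be checked against the kernel build configuration. A minimal sketch, assuming the relevant option is CONFIG_IP_ROUTE_MULTIPATH (the IPv4 multipath option in mainline kernels) and that the file to inspect is the same "config" file grepped in the original message. The helper name kconfig_state is made up for illustration and is not part of BIRD or the kernel tree.

```shell
#!/bin/sh
# kconfig_state FILE OPTION
# Prints "y", "m", or "unset" for OPTION in a kernel .config style FILE.
# Hypothetical helper for illustration only.
kconfig_state() {
    file=$1
    opt=$2
    # Enabled options appear as "CONFIG_FOO=y" or "CONFIG_FOO=m".
    # Disabled ones appear as "# CONFIG_FOO is not set" or are absent.
    val=$(sed -n "s/^${opt}=\([ym]\)\$/\1/p" "$file" | head -n 1)
    if [ -n "$val" ]; then
        echo "$val"
    else
        echo "unset"
    fi
}

# Example (file name taken from the original message):
#   kconfig_state config CONFIG_IP_ROUTE_MULTIPATH
```

A result of "unset" would be one candidate explanation for the kernel rejecting routes with several next hops, but it is only a hint, not a diagnosis.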
Ondrej,

You are correct, the issue is not related to OSPF but to BIRD's ability to install a route into the kernel routing table. It looks like the issue was that the new kernel doesn't have this option enabled (it is hidden under the kernel wireless options):

    CONFIG_COMPAT_NETLINK_MESSAGES=y

Enabling this option fixes the issue we were seeing. Reading up on this [1], it seems to be a very old mode:

"This option makes it possible to send different netlink messages to tasks depending on whether the task is a compat task or not. To achieve this, you need to set skb_shinfo(skb)->frag_list to the compat skb before sending the skb, the netlink code will sort out which message to actually pass to the task. Newly written code should NEVER need this option but do compat-independent messages instead!"

[1] http://cateee.net/lkddb/web-lkddb/COMPAT_NETLINK_MESSAGES.html

-bn 0216331C
On Wed, May 22, 2013 at 10:29:49AM -0700, Bao Nguyen wrote:
Ondrej,
You are correct, the issue is not related to OSPF but to BIRD's ability to install a route into the kernel routing table. It looks like the issue was that the new kernel doesn't have this option enabled (it is hidden under the kernel wireless options)
CONFIG_COMPAT_NETLINK_MESSAGES=y
Well, this is a bit strange, because this option has been there since 2.6.32, and BIRD works for me on both 2.6.32 and 3.0 without it. I will check with 3.8 to see if I get the same problem.

--
Ondrej 'SanTiago' Zajicek
On Thu, May 23, 2013 at 11:57:07AM +0200, Ondrej Zajicek wrote:
Well, this is a bit strange, because this option has been there since 2.6.32, and BIRD works for me on both 2.6.32 and 3.0 without it. I will check with 3.8 to see if I get the same problem.

Hello,

I tested with 3.9.2 and it worked for me with COMPAT_NETLINK_MESSAGES disabled, so it is probably some different problem.

--
Ondrej 'SanTiago' Zajicek
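Since the same BIRD setup reportedly works on the Ubuntu 3.2 kernel and on a 3.9.2 kernel, one way to narrow the search is to list the options that differ between a working and a failing kernel config. A rough sketch, assuming both configs are available as plain text files. The function name kconfig_diff and the file names in the usage comment are placeholders, not anything from this thread.

```shell
#!/bin/sh
# kconfig_diff GOOD BAD
# Prints every CONFIG_* assignment that appears in exactly one of the files,
# prefixed with "< " (only in GOOD) or "> " (only in BAD).
# Hypothetical helper for illustration only.
kconfig_diff() {
    good=$(mktemp) || return 1
    bad=$(mktemp) || { rm -f "$good"; return 1; }
    # Keep only active assignments; "# ... is not set" lines are ignored.
    grep '^CONFIG_' "$1" | LC_ALL=C sort > "$good"
    grep '^CONFIG_' "$2" | LC_ALL=C sort > "$bad"
    # comm -23: lines unique to the first file; comm -13: unique to the second.
    LC_ALL=C comm -23 "$good" "$bad" | sed 's/^/< /'
    LC_ALL=C comm -13 "$good" "$bad" | sed 's/^/> /'
    rm -f "$good" "$bad"
}

# Example (placeholder file names):
#   kconfig_diff config-working config-3.8.13 | grep -i -e NETLINK -e ROUTE
```

Filtering the output for networking-related options keeps the list short enough to review by hand.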
Interesting. Did you happen to test installing a route from OSPF (in our case the default route) into the kernel routing table?

-bn 0216331C
On Fri, May 24, 2013 at 08:52:23AM -0700, Bao Nguyen wrote:
Interesting. Did you happen to test installing a route from OSPF (in our case the default route) into the kernel routing table?

No, I tried static routes, but for the kernel protocol this makes no difference.

--
Ondrej 'SanTiago' Zajicek
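With "ecmp 4" in the OSPF configuration, an exported route may carry several next hops, so one way to take BIRD out of the picture is to install a comparable multipath route by hand with iproute2 and see whether the kernel rejects that as well. This is a sketch under assumptions: the prefix and gateway addresses are documentation placeholders, the helper name multipath_cmd is invented, and the function only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# multipath_cmd PREFIX GW:DEV [GW:DEV ...]
# Prints an "ip route add" command with one nexthop clause per GW:DEV pair.
# Hypothetical helper; IPv4 gateways only, since the pair is split on ":".
multipath_cmd() {
    prefix=$1
    shift
    cmd="ip route add $prefix"
    for hop in "$@"; do
        gw=${hop%%:*}    # text before the first colon
        dev=${hop#*:}    # text after the first colon
        cmd="$cmd nexthop via $gw dev $dev"
    done
    echo "$cmd"
}

# Example with placeholder addresses on the interfaces from the OSPF config:
#   multipath_cmd 192.0.2.0/24 198.51.100.1:eth0 198.51.100.2:eth1
```

If the printed command fails with "Invalid argument" when run by hand, that would suggest the rejection comes from the kernel's handling of multipath routes rather than from anything specific to BIRD.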
OK. I will try to test this more thoroughly with the other kernel at some point and see if I can report anything. Right now the issue has gone away with that single CONFIG change.
Participants (2): Bao Nguyen, Ondrej Zajicek