Fwd: ECMP BGP issue of more than 16 paths
Hi,

I have an issue with the bird routing daemon where I have EBGP between S1 and S2 with 62 paths. I am able to learn all the BGP routes with 62 paths in bird, but the kernel is only showing 16 paths.

I configured “merge paths 255” under protocol kernel, but it still shows only 16 paths. Can somebody help me with this?

120.100.104.96/28  via 100.1.2.0 on br1003 [AG1-CR1-link0 17:06:48] * (100) [i]
                   via 100.1.62.0 on br1063 [AG1-CR1-link61 17:07:14] (100) [i]
                   via 100.1.59.0 on br1060 [AG1-CR1-link58 17:07:14] (100) [i]
                   via 100.1.57.0 on br1058 [AG1-CR1-link56 17:07:14] (100) [i]
                   via 100.1.56.0 on br1057 [AG1-CR1-link55 17:07:12] (100) [i]
                   via 100.1.55.0 on br1056 [AG1-CR1-link54 17:07:12] (100) [i]
                   via 100.1.54.0 on br1055 [AG1-CR1-link53 17:07:12] (100) [i]
                   via 100.1.53.0 on br1054 [AG1-CR1-link52 17:07:12] (100) [i]
                   via 100.1.52.0 on br1053 [AG1-CR1-link51 17:07:10] (100) [i]
                   via 100.1.50.0 on br1051 [AG1-CR1-link49 17:07:10] (100) [i]
                   via 100.1.47.0 on br1048 [AG1-CR1-link46 17:07:10] (100) [i]
                   via 100.1.46.0 on br1047 [AG1-CR1-link45 17:07:10] (100) [i]
                   via 100.1.45.0 on br1046 [AG1-CR1-link44 17:07:08] (100) [i]
                   via 100.1.44.0 on br1045 [AG1-CR1-link43 17:07:08] (100) [i]
                   via 100.1.42.0 on br1043 [AG1-CR1-link41 17:07:08] (100) [i]
                   via 100.1.41.0 on br1042 [AG1-CR1-link40 17:07:08] (100) [i]
                   via 100.1.37.0 on br1038 [AG1-CR1-link36 17:07:04] (100) [i]
                   via 100.1.35.0 on br1036 [AG1-CR1-link34 17:07:04] (100) [i]
                   via 100.1.34.0 on br1035 [AG1-CR1-link33 17:07:04] (100) [i]
                   via 100.1.32.0 on br1033 [AG1-CR1-link31 17:07:04] (100) [i]
                   via 100.1.31.0 on br1032 [AG1-CR1-link30 17:07:01] (100) [i]
                   via 100.1.30.0 on br1031 [AG1-CR1-link29 17:07:01] (100) [i]
                   via 100.1.29.0 on br1030 [AG1-CR1-link28 17:07:01] (100) [i]
                   via 100.1.27.0 on br1028 [AG1-CR1-link25 17:07:01] (100) [i]
                   via 100.1.26.0 on br1027 [AG1-CR1-link24 17:06:57] (100) [i]
                   via 100.1.25.0 on br1026 [AG1-CR1-link23 17:06:57] (100) [i]
                   via 100.1.21.0 on br1022 [AG1-CR1-link19 17:06:57] (100) [i]
                   via 100.1.19.0 on br1020 [AG1-CR1-link17 17:06:57] (100) [i]
                   via 100.1.18.0 on br1019 [AG1-CR1-link16 17:06:54] (100) [i]
                   via 100.1.17.0 on br1018 [AG1-CR1-link15 17:06:54] (100) [i]
                   via 100.1.15.0 on br1016 [AG1-CR1-link13 17:06:54] (100) [i]
                   via 100.1.13.0 on br1014 [AG1-CR1-link11 17:06:54] (100) [i]
                   via 100.1.12.0 on br1013 [AG1-CR1-link10 17:06:51] (100) [i]
                   via 100.1.9.0 on br1010 [AG1-CR1-link7 17:06:51] (100) [i]
                   via 100.1.7.0 on br1008 [AG1-CR1-link5 17:06:51] (100) [i]
                   via 100.1.5.0 on br1006 [AG1-CR1-link3 17:06:51] (100) [i]
                   via 100.1.3.0 on br1004 [AG1-CR1-link1 17:04:43] (100) [i]
                   via 100.1.63.0 on br1064 [AG1-CR1-link62 17:05:54] (100) [i]
                   via 100.1.61.0 on br1062 [AG1-CR1-link60 17:05:54] (100) [i]
                   via 100.1.60.0 on br1061 [AG1-CR1-link59 17:05:54] (100) [i]
                   via 100.1.58.0 on br1059 [AG1-CR1-link57 17:05:49] (100) [i]
                   via 100.1.51.0 on br1052 [AG1-CR1-link50 17:05:49] (100) [i]
                   via 100.1.49.0 on br1050 [AG1-CR1-link48 17:05:49] (100) [i]
                   via 100.1.48.0 on br1049 [AG1-CR1-link47 17:05:49] (100) [i]
                   via 100.1.43.0 on br1044 [AG1-CR1-link42 17:05:43] (100) [i]
                   via 100.1.40.0 on br1041 [AG1-CR1-link39 17:05:43] (100) [i]
                   via 100.1.39.0 on br1040 [AG1-CR1-link38 17:05:43] (100) [i]
                   via 100.1.38.0 on br1039 [AG1-CR1-link37 17:05:43] (100) [i]
                   via 100.1.36.0 on br1037 [AG1-CR1-link35 17:05:39] (100) [i]
                   via 100.1.33.0 on br1034 [AG1-CR1-link32 17:05:39] (100) [i]
                   via 100.1.28.0 on br1029 [AG1-CR1-link27 17:05:39] (100) [i]
                   via 100.1.24.0 on br1025 [AG1-CR1-link22 17:05:39] (100) [i]
                   via 100.1.23.0 on br1024 [AG1-CR1-link21 17:05:36] (100) [i]
                   via 100.1.22.0 on br1023 [AG1-CR1-link20 17:05:36] (100) [i]
                   via 100.1.20.0 on br1021 [AG1-CR1-link18 17:05:36] (100) [i]
                   via 100.1.16.0 on br1017 [AG1-CR1-link14 17:05:36] (100) [i]
                   via 100.1.14.0 on br1015 [AG1-CR1-link12 17:05:35] (100) [i]
                   via 100.1.11.0 on br1012 [AG1-CR1-link9 17:05:35] (100) [i]
                   via 100.1.10.0 on br1011 [AG1-CR1-link8 17:05:35] (100) [i]
                   via 100.1.8.0 on br1009 [AG1-CR1-link6 17:05:35] (100) [i]
                   via 100.1.6.0 on br1007 [AG1-CR1-link4 17:05:35] (100) [i]
                   via 100.1.4.0 on br1005 [AG1-CR1-link2 17:05:35] (100) [i]

# ip route list 120.100.104.96/28
120.100.104.96/28 proto bird
        nexthop via 100.1.2.0 dev br1003 weight 1
        nexthop via 100.1.3.0 dev br1004 weight 1
        nexthop via 100.1.4.0 dev br1005 weight 1
        nexthop via 100.1.5.0 dev br1006 weight 1
        nexthop via 100.1.6.0 dev br1007 weight 1
        nexthop via 100.1.7.0 dev br1008 weight 1
        nexthop via 100.1.8.0 dev br1009 weight 1
        nexthop via 100.1.9.0 dev br1010 weight 1
        nexthop via 100.1.10.0 dev br1011 weight 1
        nexthop via 100.1.11.0 dev br1012 weight 1
        nexthop via 100.1.12.0 dev br1013 weight 1
        nexthop via 100.1.13.0 dev br1014 weight 1
        nexthop via 100.1.14.0 dev br1015 weight 1
        nexthop via 100.1.15.0 dev br1016 weight 1
        nexthop via 100.1.16.0 dev br1017 weight 1
        nexthop via 100.1.17.0 dev br1018 weight 1

router id 100.0.2.1;
log syslog all;

protocol kernel {
        scan time 60;
        import all;
        merge paths 255;
        export all;   # Actually insert routes into the kernel routing table
}

Thanks,
Madhu
On Tue, Oct 04, 2016 at 04:30:50PM -0700, Madhu wrote:
Hi,
I have an issue with the bird routing daemon where I have EBGP between S1 and S2 with 62 paths. I am able to learn all the BGP routes with 62 paths in bird, but the kernel is only showing 16 paths.
I configured “merge paths 255” under protocol kernel, but it still shows only 16 paths. Can somebody help me with this?
Hi

Proper syntax is 'merge paths yes limit 255'. Does it work with this?

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
Thanks. I will try it out.

Looks like the command is different in the bird documentation.
On Wed, Oct 05, 2016 at 10:22:03AM -0700, Madhu wrote:
Hi
Proper syntax is 'merge paths yes limit 255'. Does it work with this?
Thanks. I will try it out.
Looks like the command is different in the bird documentation.
IMHO it is the same as in the BIRD documentation:

        merge paths <switch> [limit <number>]
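For reference, here is a minimal sketch of how the kernel protocol block from the first message might look with the documented 'merge paths <switch> [limit <number>]' form applied. It reuses the options already posted in this thread and changes only the 'merge paths' line; nothing else is assumed. The BIRD 1.6 documentation also describes a default limit of 16 when merging is enabled without an explicit limit, which would be consistent with the 16 nexthops reported above, though the thread itself does not confirm that this is the cause.

```
# Sketch only: same kernel protocol block as in the original report,
# with 'merge paths' written as <switch> followed by the optional limit.
protocol kernel {
        scan time 60;
        import all;
        merge paths yes limit 255;   # switch first, then the maximum number of nexthops
        export all;                  # Actually insert routes into the kernel routing table
}
```

After reloading the configuration, the number of installed nexthops can be checked with the same 'ip route list 120.100.104.96/28' command used in the original report.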
Thanks, but when I use 1.6.0, only one path is installed in the kernel. Is the command introduced in the latest version, 1.6.2?

Thanks,
Madhu
On Thu, Oct 06, 2016 at 10:31:57PM -0700, Madhu wrote:
Thanks, but when I use 1.6.0, only one path is installed in the kernel.
Is the command introduced in the latest version 1.6.2?
No, for IPv4 it should work since 1.6.0.

Do I understand correctly that 'merge paths' works with 16 paths, but 'merge paths yes limit 255' installed only one path?

Could you try some different value, like 32?
Yeah, in version 1.6.0 it installed only one path, even if I just specify "merge paths". After killing the bird process, it showed "netlink receive error No buffer space available (105)".

When I upgraded to 1.6.2, the "merge paths yes limit 255" command worked, with all the paths installed. But the bird process was consuming CPU and memory, and the kernel crashed with an out-of-memory issue.

Not sure how to proceed with this :)
My bad. I had configured MED to influence just one path on the other side. I removed it and reconfigured. Now the route is learnt with all the paths. But I still see the out-of-memory crash. Below is the trace.

[ 1098.912753] Call Trace:
[ 1098.915481] [<ffffffff81b96521>] dump_stack+0x64/0x82
[ 1098.921225] [<ffffffff81b91a46>] dump_header+0x7f/0x1f1
[ 1098.927163] [<ffffffff810b3e76>] ? put_online_cpus+0x56/0x80
[ 1098.933587] [<ffffffff811110ac>] ? rcu_oom_notify+0xcc/0xf0
[ 1098.939914] [<ffffffff811a2735>] oom_kill_process+0x205/0x360
[ 1098.946435] [<ffffffff814595b5>] ? security_capable_noaudit+0x15/0x20
[ 1098.953733] [<ffffffff811a2ee2>] out_of_memory+0x492/0x4d0
[ 1098.959962] [<ffffffff811a8f30>] __alloc_pages_nodemask+0xa00/0xb60
[ 1098.967067] [<ffffffff811eb1fa>] alloc_pages_vma+0x9a/0x160
[ 1098.973393] [<ffffffff811cd5c8>] handle_mm_fault+0xd38/0x1080
[ 1098.979914] [<ffffffff811d4b74>] ? change_protection+0x594/0x850
[ 1098.986726] [<ffffffff8109f5ae>] __do_page_fault+0x19e/0x560
[ 1098.993149] [<ffffffff811d4f81>] ? mprotect_fixup+0x151/0x290
[ 1098.999669] [<ffffffff8109f9a1>] do_page_fault+0x31/0x70
[ 1099.005704] [<ffffffff81ba40e8>] page_fault+0x28/0x30

Thanks