BIRD version 2.0.4 - peering with Cisco IOS XE 16.3.5 - invalid OPEN message
Hi there,

I'm trying to set up iBGP peering with a Cisco router, which is giving me an "invalid open message" that I can't seem to fix.
- The same router is peering with Juniper and other BIRD 1.6.3 routers without issues.
- The router is also doing IPv6-based peering with all the other routers, which gives the exact same scenario as with IPv4.

The relevant parts of the config look as follows (actual IP and AS spaces replaced with private spaces):

-----< cut here >-----
router id 10.0.0.1;

protocol static default_v4 {
  ipv4 { preference 50; };
  route 0.0.0.0/0 unreachable;
}

filter bgp_in_v4 {
  if net ~ [ 10.1.0.0/28 ] then accept;
  reject;
}

template bgp type1_v4 {
  direct;
  local 10.0.0.1 as 64512;
  ipv4 {
    export where proto = "default_v4";
    import filter bgp_in_v4;
    import keep filtered;
  };
}

protocol bgp peer_type1a_v4 from type1_v4 {
  neighbor 10.0.0.2 as 64512;
}
-----< cut here >-----

The debug messages I'm getting are these:

-----< cut here >-----
2019-03-08 11:02:33.199 <TRACE> peer_type1a_v4: Incoming connection from 10.0.0.2 (port 18581) accepted
2019-03-08 11:02:33.199 <TRACE> peer_type1a_v4: Sending OPEN(ver=4,as=64512,hold=240,id=0a000001)
2019-03-08 11:02:33.200 <TRACE> peer_type1a_v4: Got OPEN(as=64512,hold=180,id=10.0.0.2)
2019-03-08 11:02:33.200 <TRACE> peer_type1a_v4: Sending KEEPALIVE
2019-03-08 11:02:33.201 <RMT> peer_type1a_v4: Received: Invalid OPEN message
2019-03-08 11:02:33.201 <TRACE> peer_type1a_v4: State changed to stop
2019-03-08 11:02:33.201 <TRACE> peer_type1a_v4: Down
-----< cut here >-----

My suspicion is that the IPv6-like representation of the router ID in the sent OPEN message might confuse the Cisco. The OPEN message we're sending seems to have the router ID represented in hex form, like 32 bits of an IPv6 address.

Is this correct, and is there a way to either fix or work around this?

Many thanks!

Marco van Tol
On Fri, Mar 08, 2019 at 12:35:30PM +0100, Marco van Tol wrote:
> I'm trying to setup iBGP peering with a cisco router which is giving me an "invalid open message" that I can't seem to fix.
> - The same router is peering with Juniper and other bird 1.6.3 routers without issues.

Hi,

You mean the same BIRD router or the same Cisco router?
> My suspicion is that the ipv6 like address representation in the sent open message router id might confuse the cisco. So the opening message we're sending seems to be having the router id represented in hex form, like 32 bits of an IPv6 address.

That is just a textual representation in the logs; there is no difference in the packet. For historical reasons, the 'Sending OPEN' and 'Got OPEN' log messages are formatted differently.

Could you try versions 2.0.2 or 2.0.3 to see whether they work with the Cisco router?

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
On 8 Mar 2019, at 13:04, Ondrej Zajicek <santiago@crfreenet.org> wrote:
> You mean the same BIRD router or the same Cisco router?
Hi,

Many thanks for your quick answer!

Good point, I meant the same BIRD router.

So we have a BIRD 2.0.4 router that:
- fails to peer with a Cisco IOS XE 16.3.5 router, on both IPv4 and IPv6, in exactly the same way;
- has no issues peering with a Juniper router and another BIRD 1.6.3 router, on both IPv4 and IPv6.
> That is just a textual representation in logs, there is no difference in the packet. For some historical reasons there is a different formatting for 'Sending OPEN' and 'Got OPEN' log messages.

Okay, that makes sense, thanks for confirming.

> Could you try the 2.0.2 or 2.0.3 versions if they work with the Cisco router?

I will have a go at that and let you know.

Many thanks!

--
Marco van Tol
On 8 Mar 2019, at 13:31, Marco van Tol <marco@tols.org> wrote:
>> Could you try the 2.0.2 or 2.0.3 versions if they work with the Cisco router?
> I will have a go at that and let you know.
Hi,

So I replaced BIRD 2.0.4, for which I downloaded the RPM from your site, with BIRD 2.0.2, which I installed using yum on a CentOS 7 system.

I noticed two things:
- The CentOS 7 2.0.2 RPM makes the bird daemon drop privileges and continue as user bird, whereas the 2.0.4 package from your site, when started with the supplied .service file, keeps running as root with the exact same bird.conf.
- Version 2.0.2 has no issues peering with the Cisco router; it peers fine.

So now I'm curious how worried I should be about the "import bgp fixes" in the 2.0.4 release notes. :-)

Many thanks!

--
Marco van Tol
On Fri, Mar 08, 2019 at 01:51:31PM +0100, Marco van Tol wrote:
> I noticed 2 things:
> - The Centos 7 2.0.2 rpm makes the bird daemon drop privileges and resumes as user bird
> - The 2.0.4 package from your site, when started with the supplied .service file, remains as root, with the exact same bird.conf

Hi,

That is defined by command-line options, so they are probably missing from the .service file.
> - Version 2.0.2 has no issues to peer with the cisco router, it peers fine.
>
> So now I'm curious how worried I should be about the "import bgp fixes" in the 2.0.4 release notes. :-)

That is unlikely, as those fixes do not change the OPEN message.

Could you try 2.0.4 with the 'long lived graceful restart off;' option?

--
Ondrej 'Santiago' Zajicek
On 8 Mar 2019, at 14:45, Ondrej Zajicek <santiago@crfreenet.org> wrote:
> That is defined by cmdline options, so they are probably missing in the .service file.

Ah right, okay, thanks.

> Could you try 2.0.4 with 'long lived graceful restart off;' option?
Hi,

This fixed it. I added this option only to the Cisco neighbors, which made them accept the peering.

Much appreciated, and let me know if you need more information from me.

--
Marco van Tol
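
A minimal sketch of what the per-neighbor workaround might look like, reusing the template and protocol names from the first message. The thread does not show the final config, so placing the option directly in the protocol block of the Cisco neighbor is an assumption based on how the option was quoted above. The other neighbors are left untouched.

-----< cut here >-----
# Cisco IOS XE 16.3.5 neighbor: long-lived graceful restart disabled
# (assumed placement; the config actually used is not shown in this thread)
protocol bgp peer_type1a_v4 from type1_v4 {
  neighbor 10.0.0.2 as 64512;
  long lived graceful restart off;
}
-----< cut here >-----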
On Fri, Mar 08, 2019 at 04:09:58PM +0100, Marco van Tol wrote:
> This fixed it. I added this option only to the cisco neighbors which made them accept peering.
Hi,

What version of Cisco is that?

Could you check whether you get the same result with 1.6.6?

Could you capture the failed session initiation attempt with tcpdump? (e.g. tcpdump -s 0 -w file.pcap ...)

--
Ondrej 'Santiago' Zajicek
On 11 Mar 2019, at 02:55, Ondrej Zajicek <santiago@crfreenet.org> wrote:
> What version of Cisco is that?

What I wrote in the subject and the first message: IOS XE 16.3.5 :-)
If you need more info in this area, let me know. I have close to no experience with Cisco, but I have someone around who does.

> Could you try if you get the same result with 1.6.6?

The sessions come up with version 1.6.4. Is that good enough, or do you need me to try with 1.6.6?

> Could you save the failed session initiation attempt by tcpdump? (e.g. tcpdump -s 0 -w file.pcap ...)

I can, but I would like to exchange the file directly rather than on this list. Do you have a proposal for how to exchange the file?

Thanks!

Marco van Tol

P.S. About my other message about the "protocol rpki" on CentOS 7: the exact same config file works fine on a "BIRD on FreeBSD" system, so I assume that means I have the syntax okay :-)

--
Marco van Tol
On Mon, Mar 11, 2019 at 09:44:16AM +0100, Marco van Tol wrote:
> The sessions come up with version 1.6.4. Is that good enough or do you need me to try with 1.6.6?

Yes, I would like 1.6.6, because in this respect 1.6.4 matches 2.0.2 (which worked).

> I can but I would like to exchange the file personally rather than on this list. Do you have a proposal on how to exchange the file?

Send it directly to my personal e-mail instead of the mailing list?

--
Ondrej 'Santiago' Zajicek
On 11 Mar 2019, at 12:38, Ondrej Zajicek <santiago@crfreenet.org> wrote:
> Yes, i would like 1.6.6, because 1.6.4 is in this matter matching 2.0.2 (which worked).

Ah right, okay. I will try this and let you know.

> Send it directly to my personal e-mail instead of mailing list?

That's okay, will do.

Many thanks!

--
Marco van Tol
On 11 Mar 2019, at 14:11, Marco van Tol <marco@tols.org> wrote:
> Ah right, okay. I will try this and let you know.
> That's okay, will do.

Unfortunately I didn't get around to it today, and I have a day off tomorrow. I'll try to answer on Wednesday.

Sorry,

--
Marco van Tol
On 11 Mar 2019, at 14:11, Marco van Tol <marco@tols.org> wrote:
>> Yes, i would like 1.6.6, because 1.6.4 is in this matter matching 2.0.2 (which worked).
> Ah right, okay. I will try this and let you know.
So the result is this:
- version 1.6.6 or version 2.0.4 without "long lived graceful restart off" => fails
- version 1.6.6 or version 2.0.4 with "long lived graceful restart off" => succeeds

I generated a pcap of the failing session for both version 1.6.6 and version 2.0.4. I will send the pcaps to you directly after submitting this email.

Thanks!

--
Marco van Tol
On Mon, Mar 11, 2019 at 09:44:16AM +0100, Marco van Tol wrote:
>> What version of Cisco is that?
> What I wrote in the subject and the first message, ios xe 16.3.5 :-)

Hi,

It is likely this Cisco bug:

https://quickview.cloudapps.cisco.com/quickview/bug/CSCva92216
https://www.reddit.com/r/Cisco/comments/7lm3pj/avoid_denali_1635_bug_cscva92...

It seems to be specific to 16.3.5 and 16.4, while 16.3.4 should work.

--
Ondrej 'Santiago' Zajicek
On 13 Mar 2019, at 12:58, Ondrej Zajicek <santiago@crfreenet.org> wrote:
> It is likely this Cisco bug:
> https://quickview.cloudapps.cisco.com/quickview/bug/CSCva92216
> https://www.reddit.com/r/Cisco/comments/7lm3pj/avoid_denali_1635_bug_cscva92...
> Seems like specific for 16.3.5 and 16.4, while 16.3.4 should work.

Ah, that's good to know.

Many thanks for your time, much appreciated!

--
Marco van Tol