segfault when adding OSPF virtual link on Bird 1.6.4
Hello everybody,

Today I was experimenting with a Bird setup. All the Bird machines are CentOS 7.6 VMs running in VirtualBox on my laptop. When I add an OSPF virtual link to hook up two area 0's together, Bird 1.6.4 itself segfaults. All the links between the VMs are defined as "Internal Network" links. Each connection between routers has its own internal network, so there should be no overlap between those connections.

Steps to reproduce:
1. Create a bird.conf with the contents of the attached bird.conf
2. Start Bird
3. Add the following line to the area 1 config: virtual link 2.2.2.2;
4. Run `birdc configure`
5. Observe the following error in your syslog:
   Dec 8 21:01:35 r3 kernel: bird[3469]: segfault at 32 ip 0000000000429fe0 sp 00007fff599d3350 error 4 in bird[400000+73

I ran Bird 1.6.4 in a gdb session with the debug symbols installed and captured a stack trace after the crash. The output can be found in the attached gdb.txt.

What is interesting to note is that the segfault only occurs when I reconfigure Bird. I can start it just fine right after the crash, and the OSPF sessions come back online.

I hope this gives some useful information about what is happening. If you need any more information, please don't hesitate to ask. It is a test network, so I can tell you absolutely everything about it; nothing is a company secret or anything like that :).

Kind regards,
Cybertinus
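As a rough aid for reading the kernel line in step 5, the sketch below pulls the fields out of the syslog message and decodes the error code. The function name `decode_segfault` and the regular expression are illustrative assumptions, not part of Bird or the kernel. The bit meanings follow the x86 page-fault error code (bit 0: protection fault rather than page not present, bit 1: write access, bit 2: user mode). The mapping size after the `+` is truncated in the report, so it is deliberately not parsed.

```python
import re

# Assumed shape of the x86 Linux "segfault at ..." syslog line (all numbers are hex).
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+)?"
)

# The line from the report, copied as-is (its tail is truncated in the original).
SAMPLE = (
    "Dec 8 21:01:35 r3 kernel: bird[3469]: segfault at 32 "
    "ip 0000000000429fe0 sp 00007fff599d3350 error 4 in bird[400000+73"
)


def decode_segfault(line):
    """Return the decoded fields of a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    err = int(m["err"], 16)
    info = {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),   # address the process tried to access
        "ip": ip,                           # instruction that faulted
        "protection_fault": bool(err & 1),  # False: the page was not present
        "write": bool(err & 2),             # False: the access was a read
        "user_mode": bool(err & 4),         # True: fault happened in user space
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["offset_in_object"] = ip - int(m["base"], 16)
    return info


if __name__ == "__main__":
    for key, value in decode_segfault(SAMPLE).items():
        print(key, hex(value) if isinstance(value, int) and key != "pid" and not isinstance(value, bool) else value)
```

For the line above this gives a fault address of 0x32 and a user-mode read of a page that is not present, which is the usual signature of reading a struct field through a NULL pointer; that reading is a guess until it is checked against the stack trace in gdb.txt. With the mapping base at 0x400000, the instruction pointer 0x429fe0 lies 0x29fe0 bytes into the bird binary.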
On Sat, Dec 08, 2018 at 09:48:52PM +0100, Cybertinus wrote:
> Hello everybody,
>
> Today I was experimenting with a Bird setup. All the Bird machines are CentOS 7.6 VMs running in VirtualBox on my laptop. When I add an OSPF virtual link to hook up two area 0's together, Bird 1.6.4 itself segfaults. All the links between the VMs are defined as "Internal Network" links. Each connection between routers has its own internal network, so there should be no overlap between those connections.

Hello,

Thanks for the thorough bug report; the attached patch should fix the issue.

> Steps to reproduce:
> 1. Create a bird.conf with the contents of the attached bird.conf
> 2. Start Bird
> 3. Add the following line to the area 1 config: virtual link 2.2.2.2;
> 4. Run `birdc configure`
> 5. Observe the following error in your syslog:
>    Dec 8 21:01:35 r3 kernel: bird[3469]: segfault at 32 ip 0000000000429fe0 sp 00007fff599d3350 error 4 in bird[400000+73

Technically, it would crash during any reconfiguration when a vlink is already defined, even if the reconfiguration changes nothing.

--
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
Hello Ondrej,

Thanks for the quick response and patch! I just applied it to a stock 1.6.4 source tree, compiled it on my CentOS 7.6 test machines (with ./configure --prefix=/ --enable-debug; make; make install) and loaded my config with it. My console/logfile is now being flooded with messages like:

   bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - MS-bit mismatch (7)
   bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - DD sequence number mismatch (4281812757)
   bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - DD sequence number mismatch (4177508748)
   bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - MS-bit mismatch (7)
   bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - too late for DD exchange (7)

But I suspect that this is because I have something wrong in my Bird config (it is a test network for a reason, of course ;) ).

Will this patch be included in a 1.6.5 release? And do you know when such a version will be available? Does 2.0.x have the same problem, and can this patch fix the issue there too?

Kind regards,
Cybertinus

On 2018-12-10 02:04, Ondrej Zajicek wrote:
> Thanks for the thorough bug report; the attached patch should fix the issue.
>
> Technically, it would crash during any reconfiguration when a vlink is already defined, even if the reconfiguration changes nothing.
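When the log is flooded like this, a quick tally per neighbor, interface and reason makes it easier to see whether one adjacency is responsible. The sketch below is a minimal example that assumes the message layout shown in the lines above; the function name `tally_dbdes` and the regular expression are hypothetical and not part of Bird. The number in parentheses is matched but not interpreted.

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines quoted in this thread.
DBDES_RE = re.compile(
    r"(?P<proto>\S+): Bad DBDES packet from nbr (?P<nbr>\S+) "
    r"on (?P<iface>\S+) - (?P<reason>.+?) \((?P<value>\d+)\)"
)

# The five sample lines from the message above.
SAMPLE_LOG = [
    "bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - MS-bit mismatch (7)",
    "bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - DD sequence number mismatch (4281812757)",
    "bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - DD sequence number mismatch (4177508748)",
    "bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - MS-bit mismatch (7)",
    "bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - too late for DD exchange (7)",
]


def tally_dbdes(lines):
    """Count bad DBDES messages per (neighbor, interface, reason)."""
    counts = Counter()
    for line in lines:
        m = DBDES_RE.search(line)
        if m is not None:
            counts[(m["nbr"], m["iface"], m["reason"])] += 1
    return counts


if __name__ == "__main__":
    for (nbr, iface, reason), n in tally_dbdes(SAMPLE_LOG).most_common():
        print(f"{n:4d}  nbr {nbr} on {iface}: {reason}")
```

If every entry points at the same neighbor on the same interface, as it does in the sample, that single adjacency is the place to look first.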
Hello Ondrej,

I think I found the root cause of the messages on my console/log: I think I managed to run Bird twice from the same config at the same time. This would cause horrible things to happen. When I killed Bird on all my routers (VMs) and started everything in an orderly manner, the messages were gone.

I had an error in my config which prevented the virtual link from starting. I had defined all the connections between three routers in area 1 as pointopoint links, which doesn't work, of course. When I fixed that, the virtual link came up, and everything works now, as can be seen in the output attached to this e-mail in the file bird_output.txt.

To be absolutely clear: thanks for the patch, it fixes the original segfault issue.

Kind regards,
Cybertinus

On 2018-12-10 21:17, Cybertinus wrote:
> My console/logfile is now being flooded with messages like:
>    bird: ospf1: Bad DBDES packet from nbr 2.2.2.2 on enp0s8 - MS-bit mismatch (7)
>
> But I suspect that this is because I have something wrong in my Bird config (it is a test network for a reason, of course ;) ).