BIRD 2.0.4 segfaulting on ARM
Hello bird-users,
bird crashes a few seconds after startup on my Hardkernel ODROID HC-2 running Ubuntu. The kernel version is 4.14.111-158.
How can I debug this?
I tried:
$ bird -c bird.conf -d -D debug.log
--> debug.log is 0 bytes.
Next try:
$ bird -c bird.conf
$ birdc
# debug all all
# echo all
The last lines of output are:
MyOSPF: HELLO packet sent via eth0.1000
MyOSPF: HELLO packet received from nbr 10.99.0.1 on eth0.1000
MyOSPF: HELLO packet received from nbr 10.2.0.0 on eth0.1000
MyOSPF: LSACK packet sent via eth0.1000
MyOSPF: length 56
MyOSPF: router 33.3.0.0
MyOSPF: LSA Type: 4005, Id: 0.0.0.1, Rt: 10.99.0.1, Seq: 8000068a, Age: 1, Sum: 97a9
MyOSPF: LSA Type: 0008, Id: 0.0.0.40, Rt: 10.99.0.1, Seq: 80000633, Age: 1, Sum: e252
Connection closed by server
I would be glad to help fix this bug.
Thanks a lot,
Lorenz
Hello!
ulimit -c unlimited
bird ...
(after it crashes)
gdb bird core <<EOF
backtrace full
EOF
Or, alternatively, try running it in Valgrind.
Maria
On April 25, 2019 7:50:32 PM GMT+02:00, lorenz@irmhil.de wrote:
How can I debug this?
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
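The gdb steps above can also be scripted, so that the full backtrace is captured in one go without gdb's interactive pager prompts. The following Python helper is only a sketch: the function names `build_gdb_command` and `collect_backtrace` and the default file names `bird` and `core` are assumptions for illustration, while `-batch` and `-ex` are standard gdb options.

```python
import shlex
import subprocess


def build_gdb_command(binary, core):
    """Build a gdb invocation that prints a full backtrace and then exits."""
    # -batch makes gdb exit after running the -ex commands and disables paging.
    return ["gdb", "-batch", "-ex", "backtrace full", binary, core]


def collect_backtrace(binary="bird", core="core"):
    """Run gdb on an existing core file and return whatever it printed."""
    result = subprocess.run(
        build_gdb_command(binary, core),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Only print the command here; call collect_backtrace() on the machine
    # that actually holds the binary and the core file.
    print(shlex.join(build_gdb_command("bird", "core")))
```

The output of `collect_backtrace()` can then be pasted into a mail as is.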
Or, probably even better: could you please help me run QEMU with this specific guest architecture? This is definitely an architecture which we would like to include in our test suite.
Thanks
Maria
On April 25, 2019 11:36:11 PM GMT+02:00, "Maria Matějka" <jan.matejka@nic.cz> wrote:
Or, alternatively, try running it in Valgrind.
My knowledge of QEMU is quite limited. On the QEMU wiki, I don't see specific support for Exynos, but I do see a lot of ARM boards. Perhaps the Odroid XU4 image boots:
https://wiki.odroid.com/odroid-xu4/os_images/linux/ubuntu_4.14/ubuntu_4.14
The hardware specs of the XU4 are described at
https://wiki.odroid.com/odroid-xu4/hardware/hardware
Good luck :-)
Lorenz
On 26.04.19 at 00:38, Maria Matějka wrote:
This is definitely an architecture which we would like to include in our test suite.
Hello again!
I'm new to gdb - thank you for your quick advice.
I ran bird again; about 10 seconds later it segfaulted again and dumped core.
Looks like some strange metrics?
I tried running bird on another ARMv7 box (Odroid XU4, nearly the same hardware as the Odroid HC-2) on the same network with a similar config. That bird doesn't crash. Perhaps something went wrong while compiling or installing bird; I'll try recompiling and reinstalling it.
Thanks for any support!
Lorenz
The backtrace is:
--- snip ---
Core was generated by `bird -c bird.conf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ea__find (id=1554, e=0x81000601, e@entry=0x0) at nest/rt-attr.c:389
389       if (e->flags & EALF_BISECT)
(gdb) #0  ea__find (id=1554, e=0x81000601, e@entry=0x0) at nest/rt-attr.c:389
        a = <optimized out>
        l = <optimized out>
        r = <optimized out>
        m = <optimized out>
        a = <optimized out>
        l = <optimized out>
        r = <optimized out>
        m = <optimized out>
#1  ea_find (e=e@entry=0x0, id=id@entry=1554) at nest/rt-attr.c:426
        a = <optimized out>
#2  0x005367ba in nl_send_route (p=p@entry=0x585748, e=e@entry=0x597170, op=op@entry=1536, dest=<optimized out>, nh=<optimized out>, nh@entry=0x5acf3c) at sysdep/linux/netlink.c:1269
        ea = <optimized out>
        net = 0x598174
        a = 0x5acf08
        eattrs = <optimized out>
        bufsize = 284
        priority = <optimized out>
        r = 0xbe90c190
        rsize = 312
        metrics = {16, 0, 0, 3197158292, 12, 5747316, 5862368, 5866468, 3197158468, 5848760, 5480477, 2147483648, 5862680, 5866468, 3197158468, 0}
        ews = {eattrs = 0xfc754f7b, ea = 0xb6fd1968 <__stack_chk_guard>, visited = {5789512, 5862680, 5862420, 8192}}
#3  0x005371b8 in nl_add_rte (e=0x597170, p=0x585748) at sysdep/linux/netlink.c:1351
        a = 0x5acf08
        err = 0
        a = <optimized out>
        err = <optimized out>
        nh = <optimized out>
#4  krt_replace_rte (p=p@entry=0x585748, n=n@entry=0x598174, new=new@entry=0x597170, old=old@entry=0x5974e4) at sysdep/linux/netlink.c:1387
        err = 0
#5  0x0053a4d2 in krt_prune (p=0x585748) at sysdep/unix/krt.c:751
        verdict = 2
        new = <optimized out>
        old = 0x5974e4
        rt_free = 0x0
        fn_ = 0x59817c
        ff_ = 0x584694
        count_ = <optimized out>
        n = <optimized out>
        t = 0x584330
        t = <optimized out>
        fn_ = <optimized out>
        ff_ = <optimized out>
        count_ = <optimized out>
        n = <optimized out>
        verdict = <optimized out>
        new = <optimized out>
        old = <optimized out>
        rt_free = <optimized out>
#6  krt_scan (t=<optimized out>) at sysdep/unix/krt.c:838
        p = 0x585748
        q = 0x5858a8
#7  0x00502b86 in timers_fire (loop=loop@entry=0x57b7f0 <main_timeloop>) at lib/timer.c:235
        base_time = 55775154043
        t = <optimized out>
#8  0x0053976e in io_loop () at sysdep/unix/io.c:2193
        poll_tout = <optimized out>
        timeout = <optimized out>
        nfds = <optimized out>
        events = 0
        pout = <optimized out>
        t = <optimized out>
        s = <optimized out>
        n = <optimized out>
        fdmax = 256
        pfd = 0x595df0
#9  0x004eabc6 in main (argc=<optimized out>, argv=<optimized out>) at sysdep/unix/main.c:884
        use_uid = <optimized out>
        use_gid = <optimized out>
        conf = <optimized out>
(gdb) quit
--- snap ---
On 25.04.19 at 23:36, Maria Matějka wrote:
gdb bird core <<EOF
backtrace full
EOF
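One way to look at the "strange metrics" in frame #2 of the backtrace above is to print the values in hexadecimal and compare them with addresses that appear elsewhere in the same dump. The Python sketch below does only that. The classification windows and the helper names `classify` and `is_word_aligned` are assumptions chosen for illustration; the result is a hint, not a diagnosis.

```python
# Values copied from frame #2 (nl_send_route) of the first backtrace.
METRICS = [16, 0, 0, 3197158292, 12, 5747316, 5862368, 5866468,
           3197158468, 5848760, 5480477, 2147483648, 5862680, 5866468,
           3197158468, 0]

STACK_HINT = 0xBE90C190    # local variable r in the same frame
PROGRAM_LOW = 0x004EABC6   # lowest bird address in the backtrace (main)
PROGRAM_HIGH = 0x005ACF3C  # highest bird address in the backtrace (nh)


def classify(value, stack_window=0x10000):
    """Guess what a 32-bit word resembles, based on addresses in the dump."""
    if abs(value - STACK_HINT) < stack_window:
        return "stack-like"
    if PROGRAM_LOW <= value <= PROGRAM_HIGH:
        return "program-or-heap-like"
    return "other"


def is_word_aligned(address, alignment=4):
    """True if the address is a multiple of the given alignment."""
    return address % alignment == 0


if __name__ == "__main__":
    for value in METRICS:
        print(f"{hex(value):>12}  {classify(value)}")
    # The faulting pointer from frame #0.
    print("0x81000601 word-aligned:", is_word_aligned(0x81000601))
```

Several entries fall next to the stack address of `r` or inside the range of bird's own addresses, which would be consistent with an array that simply had not been filled in yet when the crash happened. The faulting pointer `0x81000601` is not word-aligned, so it looks more like garbage than like a real structure address.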
Hello,
after a "make clean", "./configure" and "make" I got this compile-time warning:
--- snip ---
sysdep/unix/io.c: In function ‘times_init’:
sysdep/unix/io.c:135:45: warning: comparison is always false due to limited range of data type [-Wtype-limits]
  if ((ts.tv_sec < 0) || (((s64) ts.tv_sec) > ((s64) 1 << 40)))
                                            ^
--- snap ---
But unfortunately the segmentation fault is still there. Is there anything I can do?
--- snip ---
Core was generated by `bird -c bird.conf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ea__find (id=1554, e=0x81000601, e@entry=0x0) at nest/rt-attr.c:389
389       if (e->flags & EALF_BISECT)
(gdb) #0  ea__find (id=1554, e=0x81000601, e@entry=0x0) at nest/rt-attr.c:389
        a = <optimized out>
        l = <optimized out>
        r = <optimized out>
        m = <optimized out>
        a = <optimized out>
        l = <optimized out>
        r = <optimized out>
        m = <optimized out>
#1  ea_find (e=e@entry=0xf1ede200, id=id@entry=1554) at nest/rt-attr.c:426
        a = <optimized out>
#2  0x005267ba in nl_send_route (p=p@entry=0x575748, e=e@entry=0x587170, op=op@entry=1536, dest=<optimized out>, nh=<optimized out>, nh@entry=0x59c664) at sysdep/linux/netlink.c:1269
        ea = <optimized out>
        net = 0x588174
        a = 0x59c630
        eattrs = <optimized out>
        bufsize = 284
        priority = <optimized out>
        r = 0xbee4f180
        rsize = 312
        metrics = {16, 0, 0, 3202675588, 12, 5681780, 5796832, 5800932, 3202675764, 5783224, 5414941, 2147483648, 5797092, 5800932, 3202675764, 0}
        ews = {eattrs = 0x58e174, ea = 0x52aafb <krt_got_route+574>, visited = {5585652, 5797092, 5796884, 5585808}}
#3  0x005271b8 in nl_add_rte (e=0x587170, p=0x575748) at sysdep/linux/netlink.c:1351
        a = 0x59c630
        err = 0
        a = <optimized out>
        err = <optimized out>
        nh = <optimized out>
#4  krt_replace_rte (p=p@entry=0x575748, n=n@entry=0x588174, new=new@entry=0x587170, old=old@entry=0x5874b0) at sysdep/linux/netlink.c:1387
        err = 0
#5  0x0052a4d2 in krt_prune (p=0x575748) at sysdep/unix/krt.c:751
        verdict = 2
        new = <optimized out>
        old = 0x5874b0
        rt_free = 0x0
        fn_ = 0x58817c
        ff_ = 0x574694
        count_ = <optimized out>
        n = <optimized out>
        t = 0x574330
        t = <optimized out>
        fn_ = <optimized out>
        ff_ = <optimized out>
        count_ = <optimized out>
        n = <optimized out>
        verdict = <optimized out>
        new = <optimized out>
        old = <optimized out>
        rt_free = <optimized out>
#6  krt_scan (t=<optimized out>) at sysdep/unix/krt.c:838
        p = 0x575748
        q = 0x5758a8
#7  0x004f2b86 in timers_fire (loop=loop@entry=0x56b7f0 <main_timeloop>) at lib/timer.c:235
        base_time = 74049570330
        t = <optimized out>
#8  0x0052976e in io_loop () at sysdep/unix/io.c:2193
        poll_tout = <optimized out>
        timeout = <optimized out>
        nfds = <optimized out>
        events = 1
        pout = <optimized out>
        t = <optimized out>
        s = <optimized out>
        n = <optimized out>
        fdmax = 256
        pfd = 0x585df0
#9  0x004dabc6 in main (argc=<optimized out>, argv=<optimized out>) at sysdep/unix/main.c:884
        use_uid = <optimized out>
        use_gid = <optimized out>
        conf = <optimized out>
(gdb) quit
--- snap ---
On 26.04.19 at 08:09, lorenz@irmhil.de wrote:
Perhaps something happend on compiling or installing bird, I'll try recompiling and reinstalling it.
On Fri, Apr 26, 2019 at 01:08:24PM +0200, lorenz@irmhil.de wrote:
Hello,
after a "make clean", "./configure" and "make" I got this compile-time warning:
Hello
Could you try to build it with "./configure --enable-debug" ?
--- snip ---
sysdep/unix/io.c: In function ‘times_init’:
sysdep/unix/io.c:135:45: warning: comparison is always false due to limited range of data type [-Wtype-limits]
  if ((ts.tv_sec < 0) || (((s64) ts.tv_sec) > ((s64) 1 << 40)))
                                            ^
--- snap ---
This warning is OK.
--
Elen sila lumenn' omentielvo
Ondrej 'Santiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
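A short arithmetic check suggests why the -Wtype-limits warning above is harmless. The Python sketch below assumes that `tv_sec` is a 32-bit signed integer on this ARMv7 build; the thread does not state this explicitly, but it would match the wording of the warning. The helper names are made up for illustration.

```python
LIMIT = 1 << 40  # the constant from the check at sysdep/unix/io.c:135


def max_signed(bits):
    """Largest value a signed two's-complement integer of this width holds."""
    return (1 << (bits - 1)) - 1


def upper_check_can_fire(time_t_bits):
    """True if some tv_sec value of this width could be larger than LIMIT."""
    return max_signed(time_t_bits) > LIMIT


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, max_signed(bits), upper_check_can_fire(bits))
```

With a 32-bit `tv_sec` the largest possible value is far below 2^40, so the compiler can tell that the comparison never succeeds. With a 64-bit `tv_sec` the same check is meaningful, which is presumably why it is in the code.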
Hello again,
new core dump with "--enable-debug":
Core was generated by `bird -c bird.conf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00475b48 in ea__find (e=0x81000601, id=1554) at nest/rt-attr.c:389
389       if (e->flags & EALF_BISECT)
(gdb) #0  0x00475b48 in ea__find (e=0x81000601, id=1554) at nest/rt-attr.c:389
        a = 0x4b7347 <nl_add_attr_u32+32>
        l = 8
        r = -1097182544
        m = 2
#1  0x00475c14 in ea_find (e=0x52b068, id=1554) at nest/rt-attr.c:426
        a = 0xbe9a5280
#2  0x004b87f6 in nl_send_route (p=0x50e748, e=0x5200d4, op=1536, dest=1, nh=0x52b09c) at sysdep/linux/netlink.c:1269
        ea = 0xbe9a5288
        net = 0x5210a4
        a = 0x52b068
        eattrs = 0x52b068
        bufsize = 284
        priority = 32
        r = 0xbe9a5280
        rsize = 312
        metrics = {5265448, 5419112, 5361128, 3197785152, 0, 5263816, 5361128, 24, 5263816, 0, 3197785260, 5162725, 2, 3197784136, 3197784136, 3197785160}
        ews = {eattrs = 0x0, ea = 0x101, visited = {5374040, 80, 5263816, 5302088}}
#3  0x004b8a80 in nl_add_rte (p=0x50e748, e=0x5200d4) at sysdep/linux/netlink.c:1351
        a = 0x52b068
        err = 0
#4  0x004b8af2 in krt_replace_rte (p=0x50e748, n=0x5210a4, new=0x5200d4, old=0x5203e0) at sysdep/linux/netlink.c:1387
        err = 0
#5  0x004bf532 in krt_prune (p=0x50e748) at sysdep/unix/krt.c:751
        verdict = 2
        new = 0x5200d4
        old = 0x5203e0
        rt_free = 0x0
        fn_ = 0x5210ac
        ff_ = 0x50d698
        count_ = 872
        n = 0x5210a4
        t = 0x50d330
#6  0x004bf6e6 in krt_scan (t=0x517490) at sysdep/unix/krt.c:838
        p = 0x50e748
        q = 0x50e8a8
#7  0x0046b062 in timers_fire (loop=0x504b80 <main_timeloop>) at lib/timer.c:235
        base_time = 1435815133
        t = 0x517490
#8  0x004bdcc4 in io_loop () at sysdep/unix/io.c:2193
        poll_tout = 3
        timeout = 3
        nfds = 4
        events = 1
        pout = 0
        t = 0x517490
        s = 0x530900
        n = 0x5046e4 <sock_list+4>
        fdmax = 256
        pfd = 0x51ed20
#9  0x004c2030 in main (argc=3, argv=0xbe9a5734) at sysdep/unix/main.c:884
        use_uid = 0
        use_gid = 0
        conf = 0x505af0
(gdb) quit
On 26.04.19 at 13:26, Ondrej Zajicek wrote:
Could you try to build it with "./configure --enable-debug" ?
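Two of the integer arguments in the backtraces are easier to read in hexadecimal. The Python sketch below rests on two assumptions that the thread does not confirm: that `op` carries netlink request flags (the flag values are the ones defined in `<linux/netlink.h>`), and that the attribute `id` packs a class number in the high byte and an attribute number in the low byte. The function names are hypothetical.

```python
# Modifier flags for netlink "new" requests, as defined in <linux/netlink.h>.
NLM_F_REPLACE = 0x100
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
NLM_F_APPEND = 0x800

FLAG_NAMES = {
    NLM_F_REPLACE: "NLM_F_REPLACE",
    NLM_F_EXCL: "NLM_F_EXCL",
    NLM_F_CREATE: "NLM_F_CREATE",
    NLM_F_APPEND: "NLM_F_APPEND",
}


def decode_op(op):
    """List the names of all known flag bits set in op, lowest bit first."""
    return [name for bit, name in sorted(FLAG_NAMES.items()) if op & bit]


def split_attribute_id(ea_id):
    """Split an id into (high byte, low byte); the packing is an assumption."""
    return ea_id >> 8, ea_id & 0xFF


if __name__ == "__main__":
    print(hex(1536), decode_op(1536))    # op from frame #2
    print(hex(1554), split_attribute_id(1554))  # id from frames #0 and #1
```

Under these assumptions, `op=1536` reads as a plain "create, must not exist yet" request, and `id=1554` is the same attribute in all three dumps, so the lookup itself is unremarkable; the problem is the list pointer it is given.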
On 4/26/19 1:08 PM, lorenz@irmhil.de wrote:
But unfortunately the segmentation fault is still there. Is there anything I can do?
Thank you for the investigation; anyway, I have no clue what may be happening. I'll try to install a local QEMU host to simulate this, and then I'll get back to you off-list if I don't happen to find any problem that may be related to this.
Sadly, this looks very much like some strange use-after-free (which may be caused by some architecture-specific misbehaviour), which I'm probably unable to debug from the core alone.
Thank you
Maria
Thank you for your time. I would be happy to assist in any way. As I'm running bird on other Odroids without any problems, I'm not sure whether it would be possible to reproduce this on QEMU. But if you've got any idea, I would be glad to help fix any bugs.
Thanks again
Lorenz
On 26.04.19 at 13:39, Maria Jan Matejka wrote:
I'll try to install a local QEMU host to simulate this and then I'll return to you off-list if I don't happen to find any problem that may be related to this.
Hello again,
I narrowed the bug down to the kernel protocol. The config for debugging is as follows:
--- snip ---
log syslog all;
router id 10.33.0.0;
protocol device {
  scan time 15;
}
filter myrange {
  if net ~ fd22:9c28:6cf6::/48 then accept;
  reject;
};
protocol kernel {
  # scan time 10;
  ipv6 {
    export all;
  };
  # learn;
}
protocol ospf v3 MyOSPF {
  ipv6 {
    import filter myrange;
    export filter myrange;
  };
  area 0 {
    networks {
      fd22:9c28:6cf6::/48;
    };
    interface "eth0.1000" {
      hello 2;
      dead 5;
      cost 10;
    };
  };
}
--- snap ---
If I uncomment "scan" or "learn" in the "protocol kernel" section, I get the segfault.
Thanks for the code so far,
Lorenz
On 26.04.19 at 13:39, Maria Jan Matejka wrote:
Sadly, this seems too much to be some strange use-after-free (which may be caused by some architecture-specific misbehaviour) which I'm probably unable to debug only from core.
participants (4)
- lorenz@irmhil.de
- Maria Jan Matejka
- Maria Matějka
- Ondrej Zajicek