[PATCH] Bus error on ARMv7 when using OSPF

Matthew Reeve webmail at mreeve.com
Mon Jun 28 10:46:55 CEST 2021


On 24/06/2021 13:08, Ondrej Zajicek wrote:
> On Fri, Jun 18, 2021 at 05:06:27PM +0100, Matthew Reeve wrote:
>> Hi, yes sure, here it is. Please let me know if this does not give you what
>> you need.
>>
>> Thanks!
>
> Thanks, that looks like an issue with slists. We had a similar issue with
> the lists code in the past and reworked it to be more conservative. I will
> check that.
Great, thanks. If you want to make any changes on a branch, I can build
and test them on my hardware if that would help.
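
For reference, here is a minimal sketch of the alignment arithmetic behind
the kernel log quoted below. It rests on assumptions: struct demo_entry is a
hypothetical stand-in and not the real top_hash_entry, the helper names are
invented for illustration, and I am assuming the PACKED macro behaves like
the GCC/Clang packed attribute. It shows two things: the fault address
0x007e0624 from the log is 4-byte aligned but not 8-byte aligned, and
packing a struct lowers its required alignment to 1, so the compiler can no
longer assume a wider alignment when accessing it. Whether this is the
actual mechanism in the OSPF code is not confirmed.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for illustration; NOT the real BIRD struct. */
struct demo_entry {
  void *next;
  uint64_t stamp;   /* a 64-bit member raises the struct's natural alignment */
  uint16_t flags;
};

/* Same members, but packed (assumed equivalent to BIRD's PACKED macro). */
struct demo_entry_packed {
  void *next;
  uint64_t stamp;
  uint16_t flags;
} __attribute__((packed));

/* With GCC and Clang, a packed struct only requires byte alignment. */
static_assert(alignof(struct demo_entry_packed) == 1,
              "packed struct should have alignment 1");

/* Alignment the compiler may assume for the unpacked struct. */
size_t demo_align_natural(void)
{
  return alignof(struct demo_entry);
}

/* Alignment the compiler may assume for the packed struct. */
size_t demo_align_packed(void)
{
  return alignof(struct demo_entry_packed);
}

/* Returns nonzero if addr is a multiple of boundary (boundary must be > 0). */
int demo_is_aligned(uintptr_t addr, size_t boundary)
{
  return (addr % boundary) == 0;
}
```

Calling demo_is_aligned(0x007e0624, 4) and demo_is_aligned(0x007e0624, 8)
shows the address from the log passing the first check and failing the
second. Packing avoids the trap at the cost of slower member access, so it
is closer to a workaround than a fix for wherever the misaligned pointer
comes from.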
>
>> root@OpenWrt:/tmp# gdb debug/bird bird.1623776146.6869.7.core
>> GNU gdb (GDB) 10.1
>> Copyright (C) 2020 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>> Type "show copying" and "show warranty" for details.
>> This GDB was configured as "arm-openwrt-linux".
>> Type "show configuration" for configuration details.
>> For bug reporting instructions, please see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>> Find the GDB manual and other documentation resources online at:
>>      <http://www.gnu.org/software/gdb/documentation/>.
>>
>> For help, type "help".
>> Type "apropos word" to search for commands related to "word"...
>> Reading symbols from debug/bird...
>> [New LWP 6869]
>> Core was generated by `./bird'.
>> Program terminated with signal SIGBUS, Bus error.
>> #0  ospf_rt_reset (p=0x1d610a0) at proto/ospf/rt.c:1646
>> 1646    proto/ospf/rt.c: No such file or directory.
>> (gdb) bt
>> #0  ospf_rt_reset (p=0x1d610a0) at proto/ospf/rt.c:1646
>> #1  ospf_rt_spf (p=0x1d610a0) at proto/ospf/rt.c:1698
>> #2  ospf_rt_spf (p=0x1d610a0) at proto/ospf/rt.c:1688
>> #3  ospf_disp (timer=<optimized out>) at proto/ospf/ospf.c:468
>> #4  0x00061574 in timers_fire (loop=0xc4878 <main_timeloop>) at lib/timer.c:235
>> #5  0x00012ca8 in io_loop () at sysdep/unix/io.c:2195
>> #6  main (argc=<optimized out>, argv=<optimized out>) at sysdep/unix/main.c:939
>> (gdb)
>>
>> On 18/06/2021 16:16, Ondrej Zajicek wrote:
>>> On Mon, Jun 14, 2021 at 04:25:04PM +0100, Matthew Reeve wrote:
>>>> Hi,
>>>>
>>>> When using BIRD 2.0.8 on OpenWrt 21.02 (and other versions) on a Netgear
>>>> R7800 router, if the OSPF protocol is used (either v2 or v3), BIRD
>>>> crashes immediately on startup with:
>>>>
>>>> Fri Jun 11 14:41:11 2021 daemon.info bird: Started
>>>> Fri Jun 11 14:41:11 2021 kern.err kernel: [ 3500.853248] Alignment trap: not handling instruction f44c0a1f at [<00035848>]
>>>> Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.853283] 8<--- cut here ---
>>>> Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.859363] Unhandled fault: alignment exception (0x801) at 0x007e0624
>>>> Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.862443] pgd = 0bbef4fd
>>>> Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.868821] [007e0624] *pgd=5d6ca835, *pte=5c40b75f, *ppte=5c40bc7f
>>>>
>>>>
>>>> This router uses an ARMv7 processor, and the problem appears to be a
>>>> memory alignment issue. I've debugged it and traced it to an access to the
>>>> top_hash_entry struct. Adding the PACKED macro to the struct definition
>>>> fixes the problem, as per this patch:
>>> Hi
>>>
>>> Thanks, could you try to get a backtrace from the coredump using gdb to
>>> see where the invalid access is?
>>>
>>>

