Can you give me a quick guide on how to enable the debug output? I tried for example setting in the config file:
log "/tmp/birddebug.log" all; debug protocols all;
I am not able to see any output from ALIGN. On 12/11/24 4:40 AM, Ondrej Zajicek wrote:
On Wed, Dec 11, 2024 at 02:14:39AM +0100, nick wrote:
Thank you! I just ran a quick test and encountered the same crash at the same line. I’ll have more time to investigate tomorrow and can provide additional details then. Do you have any other ideas I could try in the meantime? A part of my previous explanation was wrong, the issue is not related to net_copy(), as the crash happened even before.
For some reason, struct bgp_prefix is u64-aligned. That makes sense in the original code, where u64 alignment is forced by struct net_addr, but i have no idea why this is true even after the patch removing the forced alignment.
The slab allocator in sl_alloc() just returns u32-aligned memory, because it is configured for sizeof(struct bgp_prefix) + sizeof(net_addr_ip6) and the second is 20 as net_addr_ip6 is u32-aligned. The sl_alloc() internally only enforces word-size alignment, wich is u32 in your case.
Please disregard the previous patch. Can you try the first attached patch (log-alignment.patch) to log alignment of different structures with or without the second patch (net-addr-u32-align.patch) and report me results?
On 12/11/24 1:41 AM, Ondrej Zajicek wrote:
On Tue, Dec 10, 2024 at 09:15:46PM +0100, nick via Bird-users wrote:
I also uploaded the coredumpfile: https://github.com/PolynomialDivision/coredumpupload/blob/main/bird_coredump Thanks. This seems like an interesting issue. In BIRD, generic net_addr structure is explicitly u64-aligned (to accomodate VPN variants), while specific net_addr_ip4 and net_addr_ip6 are just u32-aligned. In this case net_addr_ip6 is allocated with u32 alignment, but then copied with net_copy(), which assumes generic net_addr for arguments, and compiler probably used some u64-optimized copying, which required 64-bit alignment despite being on 32-bit platform,
For starters, try the attached patch. But it is preliminary, we will revisit alignment of these structures.
> The root cause appears to be insufficient alignment of memory > allocated for > structures, specifically in this line: > > ```c > px = mb_alloc(c->pool, sizeof(struct bgp_prefix) + net->length); > ``` Note that it is really allocated two lines above, here:
px = sl_alloc(c->prefix_slab);