[PATCH] crash in ospf DBDESC
Hello.

I've got the following core for bird 1.4.5:

(gdb) bt
#0  0x000000000043ecfa in ospf_dbdes_send (n=0x8011018a0, next=1) at ../../../proto/ospf/dbdes.c:145
#1  0x000000000043f6b2 in ospf_dbdes_receive (ps_i=0x8010c9000, ifa=0x8011131a0, n=0x8011018a0) at ../../../proto/ospf/dbdes.c:386
#2  0x0000000000438fdd in ospf_rx_hook (sk=0x80101b8c0, size=28) at ../../../proto/ospf/packet.c:485
#3  0x000000000045f972 in sk_read (s=0x80101b8c0) at io.c:1760
#4  0x000000000046034b in io_loop () at io.c:1975
#5  0x0000000000467da3 in main (argc=3, argv=0x7fffffffed30) at main.c:825
..
(gdb) p n->dbsi
$20 = {prev = 0x0, null = 0x0, next = 0x0, node = 0x0}
(gdb) p sn
$22 = (snode *) 0x0

Investigation has shown that there was major OSPF instability in that area (~20 quagga boxes and a Juniper device) at that moment, with either a re-election or a DR/BDR hang. Unfortunately, I don't have many logs for that.

We also had a (typical) issue with this particular quagga peer just prior to the crash:

Feb 19 18:28:34 XXX ospf6d[8387]: SLOW THREAD: task ospf6_receive (7f793a115810) ran for 5044ms (cpu time 5032ms)

My guess is that:

1) we started to send our DB to the peer and it stopped confirming DD packets for a while
2) a flap happened, so part of (or most of) the LSADB got flushed
3) Quagga finally awoke from sleep and confirmed the last packet
4) we tried to get the next chunk of LSAs, but there were no more (unsent) LSAs in the DB
5) this message appeared on the list

Something similar to the attached patch should fix this particular issue (at least I hope so).
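To illustrate the idea behind steps 1-4 (this is not the attached patch and not BIRD's real code): a minimal sketch in which the sender checks whether its database iterator is still attached before reading the next chunk. All names here (`struct lsa_node`, `struct db_iter`, `next_dd_chunk`, `flush_db`) are hypothetical stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not BIRD's real types. */
struct lsa_node {
  struct lsa_node *next;
  int lsa_id;
};

struct db_iter {
  struct lsa_node *node;  /* NULL once the iterator is detached (compare node = 0x0 in the core) */
};

/* Simulates the flap in step 2: the DB is flushed and the iterator is left detached. */
void flush_db(struct db_iter *it)
{
  it->node = NULL;
}

/*
 * Copies up to max LSA ids into buf, advancing the iterator.
 * Returns the number of ids copied; 0 means there is nothing left to send,
 * which also covers the case where the DB was flushed between two chunks.
 */
int next_dd_chunk(struct db_iter *it, int *buf, int max)
{
  int n = 0;

  assert(it != NULL && buf != NULL && max >= 0);

  /* The NULL check is the point: never dereference a detached iterator. */
  while (it->node != NULL && n < max) {
    buf[n++] = it->node->lsa_id;
    it->node = it->node->next;
  }
  return n;
}
```

In this sketch, a late acknowledgement that arrives after the flush simply gets an empty chunk back instead of a NULL dereference.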
Александр Черников