ospf: bird stuck in R state
Hi :) The following is a bird bug with some investigation but no good solution yet. Could bird developers please take a look and advise? Thanks, -- sizif We are running ospf with bird 1.0.12 and from time to time discover bird stuck in R state: # ps auxw | grep bird root 14269 56.9 0.0 3088 2064 ? Rs Dec16 690:56 /usr/sbin/bird I have approached the process with gdb and have found the following: 0x0807b742 in s_merge (from=0x80e0e54, to=0x80d118c) at slists.c:35 (gdb) bt #0 0x0807b742 in s_merge (from=0x80e0e54, to=0x80d118c) at slists.c:35 #1 0x0807b6e3 in s_rem_node (n=0x80e0e54) at slists.c:125 #2 0x08066fdb in lsa_install_new (lsa=0xbf983c80, body=0x80c8fc0, oa=0x8096340) at /home/sizif/bird/./proto/ospf/lsalib.c:459 #3 0x0806568a in ospf_lsupd_receive (ps=0x80b56e4, ifa=0x8099570, n=0x80b3bf8) at /home/sizif/bird/./proto/ospf/lsupd.c:487 #4 0x08060553 in ospf_rx_hook (sk=0x80b4310, size=64) at /home/sizif/bird/./proto/ospf/packet.c:346 #5 0x08074c27 in sk_read (s=0x80b4310) at io.c:1212 #6 0x080751af in io_loop () at io.c:1378 #7 0x080779eb in main (argc=3, argv=0xbf983fd4) at main.c:458 Here is the spot in s_merge where we cycle: /* Really merging */ while (g->next)
g = g->next;
(gdb) p /x *g $1 = {prev = 0x80e0e54, null = 0x0, next = 0x80ab1d0, node = 0x0} (gdb) display /x g $2 = 0x80ab1d0 That is, the list of slist readers is trivially cycled. 0x80ab1d0 is the address of n->dbsi field of one of the neighbors: (gdb) p /x *(struct ospf_neighbor *)0x80ab190 $31 = {n = {next = 0x80abc38, prev = 0x80aa6e8}, pool = 0x80ab160, ifa = 0x8099570, state = 0x4, inactim = 0x80ab848, imms = {byte = 0x2, bit = {ms = 0x0, m = 0x1, i = 0x0, padding = 0x0}}, dds = 0xcf4d0c79, ddr = 0xb846f866, myimms = {byte = 0x7, bit = {ms = 0x1, m = 0x1, i = 0x1, padding = 0x0}}, rid = 0xc0a84909, ip = 0xc0a84909, priority = 0xa, options = 0x2, dr = 0xc0a8491e, bdr = 0xc0a84930, adj = 0x0, dbsi = { prev = 0x80e0e54, null = 0x0, next = 0x80ab1d0, node = 0x0}, lsrql = { head = 0x80ab1e4, null = 0x0, tail = 0x80ab1e0, tail_readers = 0x80ab1f4}, lsrqh = 0x80ab8b8, lsrqi = {prev = 0x80ab1e4, null = 0x0, next = 0x0, node = 0x80ab1e4}, lsrtl = {head = 0x80e9f8c, null = 0x0, tail = 0x80e9e4c, tail_readers = 0x80ab214}, lsrti = { prev = 0x80ab208, null = 0x0, next = 0x0, node = 0x80ab208}, lsrth = 0x80aba50, ldbdes = 0x80ab268, rxmt_timer = 0x80ab878, ackl = {{ head = 0x80ab234, null = 0x0, tail = 0x80ab230}, {head = 0x80ab240, null = 0x0, tail = 0x80ab23c}}, ackd_timer = 0x80abbd8, csn = 0x0} Another interesing observation is that both nodes given as arguments to s_merge have identical readers field: #1 0x0807b6e3 in s_rem_node (n=0x80e0e54) at slists.c:125 (gdb) p /x *n $38 = {next = 0x80d118c, prev = 0x809a7b4, readers = 0x80ab1d0} (gdb) p /x *(snode *)0x80d118c $40 = {next = 0x80e0f44, prev = 0x809a7b4, readers = 0x80ab1d0} (gdb) p /x *(snode *)0x80e0f44 $41 = {next = 0x80bcdec, prev = 0x80d118c, readers = 0x0} (gdb) p /x *(snode *)0x80bcdec $42 = {next = 0x80b6db4, prev = 0x80e0f44, readers = 0x0} (gdb) p /x *(snode *)0x80b6db4 $43 = {next = 0x80e1174, prev = 0x80bcdec, readers = 0x0} (gdb) p /x *n $44 = {next = 0x80d118c, prev = 0x809a7b4, readers = 0x80ab1d0} (gdb) p /x *(snode *)0x809a7b4 $45 = {next = 0x80d118c, prev = 0x809a764, readers = 0x0} (gdb) p /x *(snode *)0x809a764 $46 = {next = 0x809a7b4, prev = 0x809a714, readers = 0x0} How can an iterator be referenced by more than one slist node? It looks like someone did s_put(&(n->dbsi), ...) while the previous iteration has not been completed. The s_get...s_put pair in ospf_dbdes_send seems sound and safe. I think the culprit is ospf_neigh_sm who does s_init(&(n->dbsi)) when entering NEIGHBOR_EXCHANGE: case INM_NEGDONE: if (n->state == NEIGHBOR_EXSTART) { neigh_chstate(n, NEIGHBOR_EXCHANGE); s_init(&(n->dbsi), &po->lsal); The s_init does good when the neighbor enters NEIGHBOR_EXCHANGE for the first time. But if the neighbor enters NEIGHBOR_EXCHANGE again without finishing the n->dbsi run through po->lsal, we will get exactly the effect we observe: n->dbsi relinked to the head node of po->lsal without unlinking it from the node it currently points to. And a neighbor can indeed leave NEIGHBOR_EXCHANGE with unfinished run through po->lsal for a number of reasons, with INM_SEQMIS or INM_BADLSREQ. One solution I can think of is to keep n->dbsi in s_put state *always*, except for a short s_get...s_put fragment in ospf_dbdes_send where there are no state transitions. Do initial s_init(&(n->dbsi)) in ospf_neighbor_new. Do s_get before s_init in the ospf_neigh_sm fragment cited above. Do s_put unconditionally in ospf_dbdes_send. Your thoughts?
On Thu, Dec 18, 2008 at 05:53:50PM +0300, Yury Shevchuk wrote:
Hi :)
The following is a bird bug with some investigation but no good solution yet. Could bird developers please take a look and advise?
Thank you for bug report. I will look at it soon. What are your other experiences with running ospf with bird? -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
On Sun, Dec 21, 2008 at 12:46:22AM +0100, Ondrej Zajicek wrote:
What are your other experiences with running ospf with bird?
Thank you for asking :) Ospf mostly works (18 routers running BIRD OSPF in a single area) but bites from time to time. Some experiences are below. But despite the problems, I feel BIRD worth the efforts put in debugging. It does the job while being really small, like a real bird -- only the essential feathers, no fat body. -- sizif The experiences: 1) Cryptographic authentication is broken. I have fixed something (*) but have put the problem on hold after encountering the sequence number problem. The problem: BIRD sometimes reorders outgoing ospf packets and when this happens, the peer reports: <TRACE> ospf1: OSPF_auth: lower sequence number. 2) There's a bug in LSDB synchronisation process which prevents neighborhood establishment in the presence of flapping OSPF ASE. The process advances to LOADING and then drops back to EXSTART, the slave never reaching the FULL state. I observed the sympthom while trying to handle 1000+ ASEs exported to OSPF from BGP. The bug manifests itself like this: 03-12-2008 21:04:33 <WARN> Received bad LS req from: 192.168.73.48 looking: RT: 192.168.73.120, ID: 209.85.128.1, Type: 5 I have worked around the bug by reducing the number of routes exported from BGP to OSPF. This spot within ospf_lsupd_send_list seems to be relevant to the problem: if ((en = ospf_hash_find(po->gr, oa->areaid, llsh->lsh.id, llsh->lsh.rt, llsh->lsh.type)) == NULL) continue; /* Probably flushed LSA */ /* FIXME This is a bug! I cannot flush LSA that is in lsrt */ 3) (not ospf bug though) netlink problem described here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=428865 The patch attached to the page seems to work well. (*) a patch to proto/ospf/packet.c --- /tmp/packet.c 2008-12-22 22:00:57.574539077 +0300 +++ packet.c 2008-11-17 23:44:21.000000000 +0300 @@ -180,12 +180,13 @@ if(n) { - if(ntohs(pkt->u.md5.csn) < n->csn) + if(ntohl(pkt->u.md5.csn) < n->csn) { - OSPF_TRACE(D_PACKETS, "OSPF_auth: lower sequence number"); + OSPF_TRACE(D_PACKETS, "OSPF_auth: lower sequence number 0x%08x < 0x%08x", + ntohl(pkt->u.md5.csn), n->csn); return 0; } - n->csn = ntohs(pkt->u.md5.csn); + n->csn = ntohl(pkt->u.md5.csn); } MD5Init(&ctxt);
participants (2)
-
Ondrej Zajicek -
Yury Shevchuk