bird6 crashing when restarting pipes
Hi all,

DE-CIX has been running BIRD for a couple of days. So far it's behaving almost perfectly, though it has to carry a heavy load ;-)

Has anyone seen bird6 or bird crash when restarting pipes? That's what I've seen:

rs6-l:~> birdc6
BIRD 1.2.1 ready.
bird> restart P13301_unitedcolov6
P13301_unitedcolov6: restarted
bird> restart P2914_2914asglobal
P2914_2914asglobal: restarted
bird> restart P3249_estpak
P3249_estpak: restarted
bird> restart P3292_tdcnetipv6        <--- here bird6 crashes
Connection closed by server.
rs6-l:~> restart P4589_easynet6
restart: Command not found.
rs6-l:~> restart P8781_qtelset
restart: Command not found.
rs6-l:~> restart P9031_edpnet
restart: Command not found.

Logfile says:

Feb 28 12:42:59 rs6-l bird: Restarting protocol P13301_unitedcolo
Feb 28 12:43:01 rs6-l bird: Restarting protocol P3249_estpak
Feb 28 12:43:02 rs6-l bird: Restarting protocol P3292_tdcnet
Feb 28 12:43:03 rs6-l bird: Restarting protocol P3549_gblx
Feb 28 12:43:05 rs6-l bird: Restarting protocol P8222_netplace
Feb 28 12:45:25 rs6-l bird: Restarting protocol P13301_unitedcolov6
Feb 28 12:45:25 rs6-l bird: Restarting protocol P2914_2914asglobal
Feb 28 12:45:25 rs6-l bird: Restarting protocol P3249_estpak
Feb 28 12:46:21 rs6-l bird: Started        <--- restarted manually

The birdc6 commands were copied and pasted.

Best regards,
Arnold
-- 
Arnold Nipper / nIPper consulting, Sandhausen, Germany
email: arnold@nipper.de phone: +49 6224 9259 299
mobile: +49 172 2650958 fax: +49 6224 9259 333
Arnold,

Never seen that, but we tested 1.1.7 and 1.2.1 and didn't use pipes. In the RSWG test in the lab we used 1.1.6 and had no such issues with pipes.

Wish I could be more helpful.

-Chris

On Mar 1, 2010, at 3:24 PM, Arnold Nipper wrote:
> Did anyone see bird6 or bird crashing when restarting pipes? That's what I've seen.
On Mon, Mar 01, 2010 at 10:24:38PM +0100, Arnold Nipper wrote:
> Did anyone see bird6 or bird crashing when restarting pipes? That's what I've seen.
Hmm, we don't have any such bug reports. I would suggest running bird/bird6 with core dumps enabled (ulimit -c unlimited) and with a non-stripped binary, i.e. one that still has its debug symbols. Stripping is done by 'make install', so just copy the binary after 'make'; a non-stripped binary is ~1 MB, a stripped one ~300 kB. If a crash occurs and you send me the core dump and the bird binary, that usually allows me to locate and fix the problem.

-- 
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
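The core dump setup suggested above could be wired into a start script roughly as follows. This is only a sketch: the helper name with_core_dumps and the paths in the commented usage line are made up for illustration, and the fallback to the hard limit is an addition that is not mentioned in the thread.

```shell
#!/bin/sh
# Sketch: run a command with the core dump size limit raised.
# with_core_dumps is a hypothetical helper name, not part of BIRD.
with_core_dumps() {
    (
        # Limits are per process and inherited by children, so raise the
        # limit in a subshell right before exec'ing the daemon.
        # If "unlimited" is not permitted, fall back to the hard limit.
        ulimit -c unlimited 2>/dev/null || ulimit -S -c "$(ulimit -H -c)"
        exec "$@"
    )
}

# Hypothetical usage in a start script (paths are assumptions). The
# binary should be the non-stripped one copied after 'make':
#   cd /var/tmp/bird-cores && with_core_dumps ./bird -c /etc/bird.conf
```

Because the limit is raised inside a subshell, the calling shell keeps its own limit and only the started daemon inherits the new one.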
On 02.03.2010 08:09 Ondrej Zajicek wrote
> In the case of a crash, if you send me the core dump and the bird binary, it usually allows to locate and fix the problem.

Will do!

Best regards and thank you for your excellent work,
Arnold
Hi Ondrej,

On 02.03.2010 08:09, Ondrej Zajicek wrote:
> I would suggest to run bird/bird6 with enabled core dumps (ulimit -c unlimited) and non-stripped (with debug symbols, stripping is done by 'make install', just copy the binary after 'make', non-stripped binary ~ 1 MB, stripped ~ 300 kB).

Where will the dumps be stored? Is there anything else needed to activate core dumps besides setting a core dump size?

Currently I have limited the dump size to:

core file size (blocks, -c) 10000

Since our daemon is consuming about 4.8 GB of RAM, I don't want to set an unlimited core dump size. Do you think that is sufficient?

Rgds, Stefan

-- 
Stefan Jakob, e-mail: stefan.jakob@de-cix.net
DE-CIX Management GmbH, Phone: +49 69 1730 902-32
Lindleystr. 12, 60314 Frankfurt, Mobile: +49 172 695 8467
Managing Director: Harald A. Summa, Fax: +49 69 4056 2716
Registered at AG Koeln, HRB 51135, http://www.de-cix.net
Head office: Lichtstr. 43i, 50825 Koeln
On Wed, Mar 03, 2010 at 03:48:13PM +0100, Stefan Jakob wrote:
> Where will the dumps be stored?

It depends on the OS and its configuration; usually they go to the current working directory of the dumped process.

> Is there anything else to do to activate core dumps than setting a core dump size?

This should be enough. The ulimit is process-specific (and inherited from the parent), so it should be set for BIRD in the BIRD start scripts. If you have a test environment, you could send SIGABRT (kill -ABRT) to BIRD to check it; this leads to an abort and a core dump.

> Currently I limited the dump size to:
> core file size (blocks, -c) 10000
> Since our daemon is consuming about 4,8GB of RAM I don't want to set unlimited core dump size.
> Do you think that is sufficient?

I think that is not enough, because the most important information (the stack) is at the end of the address space, and I think the system would in that case store only the first 10000 blocks. Therefore I would expect that the full core dump is needed.

BTW, on what OS (Linux or some BSD) are you running BIRD?

-- 
Ondrej 'SanTiago' Zajicek
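The SIGABRT check mentioned above can be tried safely on a throwaway process first. The sketch below uses sleep purely as a stand-in for the daemon (an assumption for illustration); whether and where a core file appears depends on the local core dump configuration.

```shell
#!/bin/sh
# Sketch: check that SIGABRT terminates a process the way a crash would.
# "sleep" is only a stand-in for the daemon; do not try this on a
# production route server.
(ulimit -c unlimited 2>/dev/null || true; exec sleep 60) &
pid=$!
sleep 1    # give the child time to start
kill -ABRT "$pid"

status=0
wait "$pid" || status=$?

# A process killed by a signal is reported as 128 + signal number;
# SIGABRT is 6, hence 134.
echo "exit status: $status"

# Where the core lands depends on the system's core_pattern setting;
# often it is a file named "core" in the working directory.
ls -l core* 2>/dev/null || echo "no core file in $(pwd)"
```

If the status is 134 but no core file shows up, the limit or the core_pattern setting is the likely culprit rather than the signal itself.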
On 03.03.2010 22:46 Ondrej Zajicek wrote
> BTW, on what OS (Linux or some BSD) are you running BIRD?

Debian Lenny 2.6.30-bpo.2-amd64

Best regards,
Arnold
I have introduced "route limit" as a countermeasure against peers leaking full tables. It is an extra safety belt, as we do as-path and prefix filtering anyway. But you never know.

When doing "configure", bird crashed. Same for bird6, I guess.

Where should I put the core and the binary?

The system in question is Debian, running lenny.

Best regards,
Arnold
On 10.03.2010 20:57 Arnold Nipper wrote
> Where to put the core and the binary?

Unfortunately both bird and bird6 dump to a file named core. Is there any way to have them dump to something like

core{4,6}-yyyymmmddHHMMSS

Arnold
On Wed, Mar 10, 2010 at 09:06:29PM +0100, Arnold Nipper wrote:
> Unfortunately both bird as well as bird6 dump to core. Is there any way to have them dump to something like
> core{4,6}-yyyymmmddHHMMSS

The name of the core file can be configured using /proc/sys/kernel/core_pattern. See 'man 5 core'.

-- 
Ondrej 'SanTiago' Zajicek
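A possible way to apply this on Linux is sketched below. The helper name set_core_pattern and the chosen template are assumptions for illustration. Note that %t expands to seconds since the Epoch, not to a yyyymmddHHMMSS string, so the result only approximates the naming asked for above.

```shell
#!/bin/sh
# Sketch: write a core file name template to the kernel's core_pattern.
# set_core_pattern is a hypothetical helper; the optional second argument
# exists only so the function can be tried against a scratch file.
set_core_pattern() {
    pattern=$1
    target=${2:-/proc/sys/kernel/core_pattern}
    printf '%s\n' "$pattern" > "$target"
}

# Specifiers documented in core(5):
#   %e  executable file name (here: bird or bird6)
#   %t  time of dump, in seconds since the Epoch
#   %p  PID of the dumped process
#
# As root, this would yield names such as core-bird6-1268251589:
#   set_core_pattern 'core-%e-%t'
```

The setting is system-wide and does not survive a reboot unless it is also made persistent, for example through the distribution's sysctl configuration.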
On Wed, Mar 10, 2010 at 08:57:25PM +0100, Arnold Nipper wrote:
> When doing "configure", bird crashed. Same for bird6 I guess.
> Where to put the core and the binary?

The best would be if I could just SSH to one of your machines that has the core, the binary and gdb, so that I can avoid downloading gigabytes of core. But you could also put them on some HTTP or FTP server and send me the URL.

BTW, it would be useful to see the logs too.

-- 
Ondrej 'SanTiago' Zajicek
On 11.03.2010 09:57 Ondrej Zajicek wrote
> But you could also put that on some HTTP or FTP server and send me URL.
> BTW, it would be useful to see logs too.

I'll tar everything and will let you know the URL. SSH'ing to the route server is almost impossible, even for me ;-)

Arnold
On Mar 10, 2010, at 20:57 , Arnold Nipper wrote:
> When doing "configure", bird crashed. Same for bird6 I guess.

Arnold,

can you remember whether birdc had a non-zero return status in this case? What does echo $? say after issuing the command that crashes bird? I need this for my deployment scripts, but unfortunately I can't get bird to crash :)

Wolfgang, SO close to releasing the VIX-Route-Servers into production.

-- 
www.vix.at | www.aco.net
wh@univie.ac.at | WH844-RIPE
Vienna University Computer Center
On 17.03.2010 13:39 Wolfgang Hennerbichler wrote
> can you remember if birdc in this case had a non-zero return-status,

No, as this was run by a batch job.

> what does echo $? say after issuing the command to crash the bird?

See above.

> I need this for my deployment scripts, but I can't get bird to crash unfortunately :)

Do 'echo "show route where defined(bgp_atomic_aggr)" | birdc' ;-)

Arnold
On Mar 17, 2010, at 23:31 , Arnold Nipper wrote:
> do 'echo "show route where defined(bgp_atomic_aggr)" | birdc' ;-)

Good call. It didn't crash in my case at first, as we have nothing in the main RIB, but it "worked" (meaning it crashed) when I used that command on a table. When bird crashes, birdc exits with return code 1, which means all my deployment scripts would stop right there and not propagate the change to the second route server.

Thanks,
Wolfgang
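A deployment script that stops on the first failing route server, as described above, might look like the sketch below. The names deploy_in_order and reconfigure_bird and the host names are hypothetical. The sketch relies on birdc exiting non-zero when the daemon dies, as reported in this thread; other failure modes may need additional checks.

```shell
#!/bin/sh
# Sketch: apply a change to route servers one at a time and stop at the
# first failure, so a crash on the first server is not propagated.
# deploy_in_order and reconfigure_bird are hypothetical names.
deploy_in_order() {
    cmd=$1
    shift
    for server in "$@"; do
        if ! "$cmd" "$server"; then
            echo "stopping: $cmd failed on $server" >&2
            return 1
        fi
        echo "ok: $server"
    done
}

# Hypothetical per-server step: feed "configure" to birdc over SSH.
# The exit status of the remote pipeline is that of birdc.
reconfigure_bird() {
    ssh "$1" 'echo configure | birdc'
}

# Usage (host names are placeholders):
#   deploy_in_order reconfigure_bird rs1.example.net rs2.example.net
```

Passing the per-server step as a command name keeps the ordering logic separate from the SSH details, so it can be tried with a harmless stub before touching a real route server.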
participants (5)
- Arnold Nipper
- Chris Malayter
- Ondrej Zajicek
- Stefan Jakob
- Wolfgang Hennerbichler