BIRD crashes with 1000 peers
Dear BIRD community,
I want to connect BIRD 1.5.0 with more than 1000 ExaBGP peers on Ubuntu 14.04 LTS 64-bit. Connecting 1000 peers works, but if I increase the peer count to more than 1024, it fails due to the soft file limit of 1024.
To raise the limit I tried the usual methods (ulimit, /etc/security/limits.conf), but BIRD doesn't pick up the new limits. Later I found the advice to add "ulimit -n 10000" to the init.d script before each "start-stop-daemon" command. After a restart, BIRD gets the correct soft and hard limits (cat /proc/<pid>/limits | grep files). But now only 750 peers work, and connecting 1000 peers crashes bird without any error message in syslog ("log syslog all" is configured). (I did not test 751-999 peers.)
It would be nice if somebody could help me.
Thanks,
Daniel Seidenstuecker
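The soft and hard limits that were checked above through /proc/<pid>/limits can also be read and raised from inside a process. The following is only a minimal sketch of that mechanism using the standard getrlimit()/setrlimit() calls; the helper name raise_nofile_soft_limit is hypothetical and this is not how BIRD itself handles its limits.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of BIRD): raise the soft RLIMIT_NOFILE
 * to `wanted`, capped at the hard limit. On success, stores the
 * resulting soft limit in *result and returns 0; returns -1 on error.
 */
int raise_nofile_soft_limit(rlim_t wanted, rlim_t *result)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    /* An unprivileged process can raise its soft limit only up to the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    /* Never lower an existing soft limit that is already high enough. */
    if (wanted > rl.rlim_cur) {
        rl.rlim_cur = wanted;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }

    *result = rl.rlim_cur;
    return 0;
}
```

A raised limit only lets the process open more descriptors; it does not change how many descriptors select() can track.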
I tried this today:
- took a fresh 1.5.0
- applied the 2 patches manually
- changed the following defines to 65535:
  o /usr/include/linux/posix_types.h:#define __FD_SETSIZE 1024
  o /usr/include/x86_64-linux-gnu/bits/typesizes.h:#define __FD_SETSIZE 1024
- configure …, make, sudo make install
- verified that the binaries I use were the ones that I built today
But the result is the same behavior as before:
- without raising the user limits (/etc/security/limits.conf or ulimit in the init script); the user limit is 1024 and lower than FD_SETSIZE
  o 1000 works, but "too many open files" appears in syslog; 1250 doesn't work
- with raised user limits (/etc/security/limits.conf or ulimit in the init script); the user limit is 10,000 and lower than FD_SETSIZE
  o the limits were accepted by BIRD (cat /proc/<pid>/limits | grep files)
  o 750 works, 1000 crashes without anything in syslog
As far as I understand from my research, some call of select() assumes FD_SETSIZE to be 1024 and does not pick up my FD_SETSIZE changes. If too many peers connect, the number of open files grows beyond 1024 and that crashes BIRD. When I do not raise the user limits, the user limit of 1024 prevents BIRD from opening more than 1024 files and therefore keeps select() from crashing BIRD.
I will try the master branch tomorrow, but any further help will be appreciated.
Thanks,
Daniel Seidenstuecker
From: Alexander V. Chernikov [mailto:melifaro@ipfw.ru]
Sent: Friday, 12 February 2016 17:34
To: Daniel Seidenstücker; Bird-users@network.cz
Subject: Re: BIRD crashes with 1000 peers
12.02.2016, 19:15, "Daniel Seidenstücker" <d.seidenstuecker@googlemail.com>:
To raise that I tried the usual (ulimit, /etc/security/limits.conf) but BIRD doesn't take these new limits. Later I found the advice to add "ulimit -n 10000" to the init.d script before each "start-stop-daemon" command. After restarting BIRD it gets the correct soft and hard limits (cat /proc/<pid>/limits | grep files). But now only 750 peers work and connecting 1000 peers crashes bird without any error message in syslog ("log syslog all" configured). (I did not test 751-999 peers.)
You need to recompile bird with a different FD_SETSIZE limit. For example, https://cs.uwaterloo.ca/~brecht/servers/openfiles.html can be used as a guide.
As for the crashes, they were fixed by https://gitlab.labs.nic.cz/labs/bird/commit/338f85ca7721fac16394ccabd561ddb5ccaacb36 and https://gitlab.labs.nic.cz/labs/bird/commit/3aed0a6ff7b2b811a535202fd787281d2ac33409
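For context on the FD_SETSIZE limit discussed above: an fd_set is a fixed-size bitmap, and POSIX leaves the behavior of FD_SET() undefined for a descriptor that is negative or not below FD_SETSIZE. That would be consistent with a crash that leaves no log message, though the thread does not confirm it as the cause. The sketch below shows the kind of bounds check a select()-based loop needs; both function names are hypothetical and are not taken from the BIRD sources or the patches linked above.

```c
#include <stdbool.h>
#include <sys/select.h>

/* Hypothetical guard: true only if fd can legally be stored in an fd_set. */
bool fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/*
 * Hypothetical wrapper: add fd to `set` only when it is in range, and keep
 * track of the highest descriptor seen (select() needs max_fd + 1 as its
 * first argument). Returns false if the descriptor has to be rejected.
 */
bool fd_set_add_checked(int fd, fd_set *set, int *max_fd)
{
    if (!fd_fits_in_fd_set(fd))
        return false;   /* FD_SET() here would write outside the bitmap */

    FD_SET(fd, set);
    if (fd > *max_fd)
        *max_fd = fd;
    return true;
}
```

With a check like this, a connection whose descriptor is too large can be refused and logged instead of corrupting memory.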
Hello Daniel,
did you find any solution?
Thanks, Milan
On 15.2.2016 at 17:13, Daniel Seidenstücker wrote:
I will try the master branch tomorrow but nevertheless any further help will be appreciated.
No. I tried the following (always on Ubuntu 14.04 64-bit):
- raising FD_SETSIZE in /usr/include/linux/posix_types.h and /usr/include/x86_64-linux-gnu/bits/typesizes.h and recompiling BIRD
- adding "#undef __FD_SETSIZE" and "#define __FD_SETSIZE 2048" to bird-1.5.0/sysdep/unix/io.c and bird-1.5.0/client/client.c after #include <sys/types.h> and recompiling
- adding "#define FD_SETSIZE 2048" to bird-1.5.0/sysdep/unix/io.c and bird-1.5.0/client/client.c before all includes and recompiling
- the master branch
None of it worked for me. I know BIRD gets the higher FD_SETSIZE values because the error messages from the patches of Nov 2015 disappeared. But BIRD crashes on every run at the moment its file count grows over 1024 (it is safe with 750 peers). So I think we have to wait until the developers switch to poll or another alternative to select.
-----Original Message-----
From: bird-users-bounces@network.cz [mailto:bird-users-bounces@network.cz] On behalf of Milan Strakoš
Sent: Monday, 7 March 2016 09:38
To: bird-users@network.cz
Subject: Re: AW: BIRD crashes with 1000 peers
Hello Daniel,
did you find any solution?
Thanks, Milan
On 7.3.2016 11:24, Stuart Henderson wrote:
On 2016/03/07 09:38, Milan Strakoš wrote:
Hello Daniel,
did you find any solution?
Thanks, Milan
Switching from select() to poll() would probably be a better idea here..
I agree. We will look at that. Ondrej
On 03/07/2016 11:33 AM, Ondrej Filip wrote:
I agree. We will look at that.
See the branch "poll" in our Git repo. It seems to work, at least in my simple test case with 2000 BGP sessions.
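To illustrate why the switch from select() to poll() suggested in this thread removes the 1024 ceiling: poll() takes an array of struct pollfd rather than a fixed-size bitmap, so descriptor values are not bounded by FD_SETSIZE. The sketch below is only a minimal illustration under that assumption; wait_readable is a hypothetical helper and does not reproduce the code in the "poll" branch.

```c
#include <poll.h>
#include <stddef.h>

/*
 * Hypothetical helper: wait up to timeout_ms for any of the n descriptors
 * in `fds` to become readable. `pfds` is caller-supplied scratch space with
 * room for n entries. Returns the number of readable descriptors
 * (0 on timeout) or -1 on error.
 */
int wait_readable(const int *fds, size_t n, struct pollfd *pfds, int timeout_ms)
{
    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = fds[i];      /* any valid descriptor value, also above 1024 */
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    int rc = poll(pfds, (nfds_t)n, timeout_ms);
    if (rc <= 0)
        return rc;                /* 0 = timeout, -1 = error (see errno) */

    int readable = 0;
    for (size_t i = 0; i < n; i++)
        if (pfds[i].revents & POLLIN)
            readable++;
    return readable;
}
```

The array grows with the number of sessions, so the only remaining ceiling is the process's open-file limit.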
participants (6)
- Alexander V. Chernikov
- Daniel Seidenstücker
- Jan Matejka
- Milan Strakoš
- Ondrej Filip
- Stuart Henderson