I haven't tried the optmem_max option, but I did some more experimenting. We have a virtual machine running a nearly identical BIRD config that isn't showing this issue. The machine with the issue is physical and has a Mellanox ConnectX NIC. I'm wondering if some limitation in its TCP offload is responsible, although disabling TCP offload didn't seem to help.

On 8/24/2015 4:59 PM, Michael Vallaly wrote:
I saw this problem back in 2013 on BIRD 1.3.6 and 3.6+ kernels (Re: Strange MD5 Auth problem in BIRD 1.3.8).
AFAIK it was related to kernel socket option memory (or lack thereof), and I can only surmise that some sort of memory leak was involved. Ondrej Zajicek seemed to think this was an issue in the kernel itself, but I wasn't able to prove that definitively.
I was able to work around it (without rebooting) by:
<snip> echo 40960 > /proc/sys/net/core/optmem_max # Defaults to 20480 </snip>
That seems to have deferred the issue long enough for us to reboot, or at least to stop running into it constantly.
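A slightly more careful version of that workaround, as a rough sketch only: it reads the current value first and only writes when the target is larger. The function name raise_optmem_max and the OPTMEM_PATH override are made up for illustration (the override exists only so the function can be dry-run against a scratch file). Writing to the real path needs root.

```shell
#!/bin/sh
# Sketch: raise net.core.optmem_max only if it is below the requested value.
# OPTMEM_PATH is a hypothetical override for dry runs; by default it points
# at the same /proc file used in the one-liner above.
OPTMEM_PATH="${OPTMEM_PATH:-/proc/sys/net/core/optmem_max}"

raise_optmem_max() {
    target="$1"
    current="$(cat "$OPTMEM_PATH")" || return 1
    if [ "$current" -ge "$target" ]; then
        echo "optmem_max is already $current (>= $target), leaving it alone"
        return 0
    fi
    echo "$target" > "$OPTMEM_PATH" || return 1
    echo "optmem_max raised from $current to $target"
}
```

Run as root with `raise_optmem_max 40960`. The change does not survive a reboot; adding a line `net.core.optmem_max = 40960` to /etc/sysctl.conf should make it persistent on most distributions.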
If anyone else has any details or info, I would still be interested in a root-cause analysis and, hopefully, a permanent fix.
-Mike
On Mon, 24 Aug 2015 15:59:06 -0400 Brian Rak <brak@gameservers.com> wrote:
I have a machine running BIRD 1.4.5, and I'm seeing a lot of these messages when I start it up:
2015-08-24 15:54:26 <ERR> xxxx: Socket error: TCP_MD5SIG: Cannot allocate memory
2015-08-24 15:54:26 <ERR> yyyy: Socket error: TCP_MD5SIG: Cannot allocate memory
It also seems like the sessions that report that error do not come up, and show a status of 'Error: Kernel MD5 auth failed'.
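To see how many sessions are affected, something along these lines may help. It is only a sketch: the helper names list_md5_failures and count_md5_failures are invented here, and it assumes that `birdc show protocols` prints the protocol name in the first column and the status text on the same line.

```shell
#!/bin/sh
# Sketch: filter `birdc show protocols` output (read from stdin) for sessions
# whose status mentions the MD5 failure quoted above.
# Assumption: the protocol name is the first whitespace-separated field.

list_md5_failures() {
    awk '/Kernel MD5 auth failed/ { print $1 }'
}

count_md5_failures() {
    list_md5_failures | wc -l | tr -d ' '
}
```

Typical use would be `birdc show protocols | list_md5_failures` for the names, or `birdc show protocols | count_md5_failures` for a total to compare before and after changing a kernel setting.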
I'm only trying to configure around 200 BGP sessions here, most of which are advertising a very small number of prefixes.
I don't really see any tunable settings here. Any suggestions as to how I can correct this?