On 18.05.2013 16:43, Ondrej Zajicek wrote:
On Wed, May 15, 2013 at 09:18:10PM +0400, Slawa Olhovchenkov wrote:
So this is a problem with multiple IPs on an iface. On BSD, we support just one primary address in OSPF (others are just handled as stubs). The lexicographically smallest address was chosen. Unfortunately, the kernel chose a different one as the source address.

Why lexicographically smallest? Why not the first from the getifaddrs() list?

Well, this is mainly a consequence of two facts: primary address selection is currently in platform-independent code, and on Linux the order of addresses has no real meaning. Therefore it seemed to be a good idea to make the selection more deterministic, i.e. independent of the order in which the addresses appeared.

There seems to be another "driver" for changing this behavior: currently ifa_recalc_all_primary_addresses() simply shuts the interface down if the primary address disappears. After that, it stays down until the next "device" protocol pass.
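For reference, the two selection policies debated above (smallest address versus first in enumeration order) can be sketched roughly as below. This is only an illustration, not BIRD's actual code: pick_smallest() and pick_first() are hypothetical names, and addresses are assumed to be IPv4 values in host byte order, so that numeric comparison matches byte-wise lexicographic order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from BIRD. Addresses are assumed to be
 * IPv4 values in host byte order.
 */

/* Order-independent policy: always pick the numerically smallest address. */
static uint32_t pick_smallest(const uint32_t *addrs, size_t n)
{
    assert(n > 0);
    uint32_t best = addrs[0];
    for (size_t i = 1; i < n; i++)
        if (addrs[i] < best)
            best = addrs[i];
    return best;
}

/* Order-dependent policy: trust the order in which the platform listed them. */
static uint32_t pick_first(const uint32_t *addrs, size_t n)
{
    assert(n > 0);
    return addrs[0];
}
```

With the list {10.0.0.2, 10.0.0.1}, pick_smallest() returns 10.0.0.1 while pick_first() returns 10.0.0.2. If the kernel sources packets from the first configured address, only the second policy would agree with it.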
This is quite annoying in our scenario: we are using BIRD to announce load balancer /32 VIPs, and _sometimes_, when one (so-called primary) address is removed, all other services are disrupted for 10-15 seconds. It seems that the only protocols interested in such changes are OSPF (and possibly RIP/IS-IS). Maybe we should consider moving the "primary-is-changed" case to a protocol-dependent hook, without doing any nest interface manipulations?
Perhaps the best idea would be to remove the whole primary address selection (as it does more harm than good) and replace it with a platform-dependent flag marking an address as preferred for the interface (and not even use this flag on Linux).
We enumerate addresses on BSD using a CTL_NET / NET_RT_IFLIST sysctl scan. Does anyone know whether there is a reliable way to determine which of the returned addresses is the primary one, or whether we should just add another call to SIOCGIFADDR (as suggested by Alexander V. Chernikov)? I would guess it is the first one, but I am not sure whether that is specified or just a coincidence.