Some of the VMs are in the same data center (latency < 1ms) and some are very far apart (Germany <-> Hong Kong, 300ms-400ms). Which rtt-min, rtt-max and rtt-cost/max-rtt-penalty would be suitable for such a network?
rtt-min should be chosen so that all links below rtt-min are local -- the idea is that below rtt-min, it's not worth optimising the path. The default value (10ms) should be fine in most networks. rtt-max should be larger than all good links in your network -- the idea is that above rtt-max, it's no longer worth optimising the path. In your case, you might want to increase the value to 350ms or so.
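To make the roles of rtt-min and rtt-max concrete, here is a minimal sketch of how an RTT-based penalty of this kind can be computed: zero below rtt-min, the full penalty above rtt-max, and a linear ramp in between. The function name rtt_penalty and the max_rtt_penalty value of 150 are illustrative assumptions, not taken from this thread; the 10ms and 350ms values are the ones discussed above.

```python
def rtt_penalty(rtt_ms, rtt_min=10.0, rtt_max=350.0, max_rtt_penalty=150):
    """Extra cost added to a link for a given RTT in milliseconds.

    Hypothetical helper: 150 is only an example value for max-rtt-penalty.
    """
    if rtt_max <= rtt_min:
        raise ValueError("rtt_max must be greater than rtt_min")
    if rtt_ms <= rtt_min:
        # Local link: not worth optimising, no penalty.
        return 0
    if rtt_ms >= rtt_max:
        # Beyond rtt-max every link is treated as equally bad.
        return max_rtt_penalty
    # Linear interpolation between rtt-min and rtt-max.
    return int(max_rtt_penalty * (rtt_ms - rtt_min) / (rtt_max - rtt_min))


if __name__ == "__main__":
    for rtt in (0.5, 50, 180, 300, 400):
        print(rtt, "ms ->", rtt_penalty(rtt))
```

With rtt-max at 350ms, a 300ms link still receives a smaller penalty than a 400ms one. If rtt-max were left well below 300ms, both would receive the same maximum penalty and Babel could no longer tell them apart.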
We are running re6st, a mesh network based on Babel. We have almost the same setup as you: around 400 machines connected worldwide. Our default options for Babel are:
max-rtt-penalty 5000 rtt-max 500 rtt-decay 125
That's a little bit extreme, but since Nexedi's network is uncongested, Babel should be fairly stable even with such extreme parameters. -- Juliusz
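
As a rough illustration of why a high rtt-decay counts as aggressive, the sketch below smooths RTT samples with an exponential moving average. It assumes, following the babeld documentation, that rtt-decay is the weight given to each new sample in units of 1/256. The function name smooth_rtt and the sample values are hypothetical.

```python
def smooth_rtt(samples_ms, rtt_decay=125):
    """Exponentially smoothed RTT after each sample.

    Assumption: rtt_decay is the weight of the newest sample in units
    of 1/256, so 125 gives new samples roughly half the weight.
    """
    if not 1 <= rtt_decay <= 256:
        raise ValueError("rtt_decay must be between 1 and 256")
    weight = rtt_decay / 256.0
    smoothed = None
    history = []
    for sample in samples_ms:
        if smoothed is None:
            # The first sample initialises the average.
            smoothed = float(sample)
        else:
            smoothed = weight * sample + (1.0 - weight) * smoothed
        history.append(smoothed)
    return history


if __name__ == "__main__":
    # A link that jumps from 20ms to 300ms (made-up samples).
    samples = [20, 20, 300, 300, 300]
    print("decay 125:", [round(x) for x in smooth_rtt(samples, 125)])
    print("decay 42: ", [round(x) for x in smooth_rtt(samples, 42)])
```

A larger decay value makes the smoothed RTT, and therefore the route metric, follow changes more quickly. That is acceptable on an uncongested network but can cause route flapping where RTT fluctuates with load.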