multithread support

Tue Mar 2 18:54:54 CET 2021

Hello!

On 3/2/21 4:34 PM, Douglas Fischer wrote:
> This is very good news!
> 
> I know you said "This is a ball park guess", but I confess that I was a 
> little scared by the proportion of extra CPU usage (30/48 -> +60%).

This depends a lot on what kind of load we are talking about. Generally, 
if you are a big route server, then 98% of CPU time is probably eaten by 
complex filters. I would estimate that the total CPU time may end up 
anywhere between +10% and -10% due to other structural changes. The 
parallelization overhead would be minimal.

However, if you are a big route reflector, then you're constantly just 
recomputing the best route, accessing the same table. Then we may get to 
the +60% estimate. Long story short, the more work you do with one 
route, the less overhead you get.

Remember that BIRD is currently extremely well optimized for 
single-threaded execution and some parts still heavily depend on being 
executed that way. We chose to first allow parallel execution of those 
parts that can be parallelized well, at the cost of adding some overhead 
to other parts.

The most critical part of this is route export (from tables to 
protocols), which is currently done synchronously right after route 
import. We decided to decouple it in the multithreaded code, which 
involves having a route export queue. Hence more memory stores and 
loads, more cache misses etc.
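
To illustrate the decoupling, here is a minimal sketch in Python (BIRD 
itself is C, and this is not BIRD code). All names here (import_route, 
exporter, export_queue, the prefixes) are made up for the example. It 
only shows the shape of the idea: the importer updates the table, puts 
the change on a queue and returns, and a separate thread does the export 
later.

```python
import queue
import threading

# Hypothetical names throughout; this is an illustration, not BIRD code.
export_queue = queue.Queue()
exported = []       # what the "protocol" side has received so far
STOP = object()     # sentinel telling the exporter thread to finish


def import_route(table, prefix, route):
    """Importer: update the table, enqueue the change, return at once."""
    table[prefix] = route
    export_queue.put((prefix, route))   # the extra store: a queue entry


def exporter():
    """Exporter: drain the queue in its own thread."""
    while True:
        item = export_queue.get()       # the extra load on the other side
        if item is STOP:
            break
        exported.append(item)


table = {}
worker = threading.Thread(target=exporter)
worker.start()
for i in range(3):
    import_route(table, f"10.0.{i}.0/24", f"via peer{i}")
export_queue.put(STOP)
worker.join()
```

In the synchronous variant, import_route would call the export code 
directly and no queue entry would ever be written or read; the queue is 
where the additional memory traffic in this sketch comes from.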

Well … maybe the +60% is too much, reconsidering that guess. Let's hope 
it's overestimated. I'd be more concerned about memory usage. Some 
worst-case estimates of peak memory usage go as high as +100% (for a 
short time). If we run into these problems in the real world, we'd 
definitely have to implement algorithms to limit these peaks, as 
swapping to disk is not desirable here at all. Anyway, this is not 
today's problem; we first need to get to code which at least builds and 
runs without spitting out one core file after another.
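
For reference, the arithmetic behind the numbers in this thread can be 
written down as a short Python snippet. The inputs are the ball park 
figures quoted below (20G, 30 minutes, 16 cores, 3 minutes), not 
measurements. Treating the +100% peak as relative to the single-threaded 
baseline is an assumption made for this example.

```python
# Ball park figures from this thread, not measurements.
single_core_minutes = 30   # single-threaded convergence time
cores = 16
wall_minutes = 3           # guessed multithreaded wall-clock time
baseline_mem_g = 20        # current memory usage

cpu_minutes = cores * wall_minutes                     # total CPU time
cpu_overhead = cpu_minutes / single_core_minutes - 1   # relative extra CPU
speedup = single_core_minutes / wall_minutes           # wall-clock gain

# The "+10%" estimate is a lower bound (">22G" in the original mail).
steady_mem_g = baseline_mem_g * 1.10
# Assumption: the worst-case "+100%" peak is relative to the baseline.
peak_mem_g = baseline_mem_g * 2.00

print(f"CPU time: {cpu_minutes} min ({cpu_overhead:+.0%}), "
      f"wall clock {speedup:.0f}x faster")
print(f"Memory: over {steady_mem_g:.0f}G steady, "
      f"up to {peak_mem_g:.0f}G at peak")
```

So the +60% applies to summed CPU time only; in the same guess, the 
wall-clock convergence time drops from 30 minutes to 3.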

> I also know that you said that the code is still "currently not 
> releasable", but I'm curious to know a little more about how this 
> multi-threading was handled.

Basically, one thread per receiving socket, one thread per exporting 
channel, with some exceptions. One lock per protocol instance, one lock 
per table. You can hold only one table lock and one protocol instance 
lock at a time; the protocol lock is taken first.
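
The locking rule can be sketched like this in Python (again, not BIRD 
code; BIRD is C and the real implementation is still WIP). The classes 
Protocol and Table, the helper locked() and the names "bgp1" and 
"master4" are hypothetical. The helper refuses to take a second lock of 
the same kind and refuses to take a protocol lock while a table lock is 
held.

```python
import threading
from contextlib import contextmanager


class Protocol:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


class Table:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.routes = {}


_held = threading.local()   # per-thread list of lock kinds currently held


@contextmanager
def locked(obj):
    """Take obj.lock, enforcing: one lock per kind, protocol before table."""
    kind = type(obj).__name__
    held = getattr(_held, "kinds", [])
    if kind in held:
        raise RuntimeError(f"a {kind} lock is already held by this thread")
    if kind == "Protocol" and "Table" in held:
        raise RuntimeError("lock order violation: protocol goes first")
    obj.lock.acquire()
    _held.kinds = held + [kind]
    try:
        yield obj
    finally:
        _held.kinds = held
        obj.lock.release()


bgp = Protocol("bgp1")
master = Table("master4")

# Allowed order: protocol first, then table.
with locked(bgp), locked(master) as t:
    t.routes["192.0.2.0/24"] = "via bgp1"

# Wrong order: table first, then protocol. The helper rejects it.
violation = None
try:
    with locked(master), locked(bgp):
        pass
except RuntimeError as err:
    violation = str(err)
```

A fixed acquisition order like this is a common way to rule out 
deadlocks between two threads that each need both locks.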

We'll publish more documentation; it's still WIP. For now, I'm just 
answering a question to say "yes, we're going multithreaded and we're 
actively working on it".

> Just to illustrate:
> 
> Single-Core CPU on BGP is known to be a problem for many engines and 
> vendors.
> 
> One of the vendors developed a "creative" way to do this load 
> distribution across multiple cores.
> As I understood it, they implemented a kind of CPU affinity per BGP 
> peer, so that each peer has a BGP process, and that process is 
> "semi-tied" to a core.
> And they created a mechanism to redistribute these affinities from 
> time to time based on the number of BGP messages per second exchanged 
> with each peer.

If this turns out to be a problem, we'll consider it. For now, it 
seems that the most critical part is the route itself as it is 
propagated through BIRD -- it should stay in one thread as long as 
possible, and each thread should keep its CPU (on a well-behaved system) 
unless moved for a good reason.

Maria

> 
> On Tue, Mar 2, 2021 at 10:13, Maria Matejka <maria.matejka at nic.cz> 
> wrote:
> 
>     Hi!
> 
>     On 3/1/21 1:26 PM, Marcelo Balbinot wrote:
>      >
>      > Hi, I already asked this question at some point,
>      > but I am curious about the evolution ..
>      > About multi thread support (multi-core cpu use).
>      > Is this still a possibility?
> 
>     Yes, it is. Be prepared that this will also raise memory usage (current
>     estimates are about >+10% memory) and overall CPU usage (compared to
>     single-thread execution) due to needed synchronization and buffers.
> 
>     This means that if you now consume 20G of memory and 30 minutes of
>     single core time to converge the main table on a rather big node,
>     you're going to consume, let's say, >22G of memory and 3 minutes of
>     16-core CPU (summing to 48 minutes of CPU time). This is a ball park
>     guess, do not take it too seriously. It may be better, it may be
>     worse.
> 
>     Anyway, there is some code (currently not releasable) that will get
>     to a preview release soon. We'd highly appreciate testing by any
>     user around. Stay tuned!
> 
>     Maria
> 
> 
> 
> -- 
> Douglas Fernando Fischer
> Control and Automation Engineer