ECMP on 2 x 1ge Linux hosts

Bao Nguyen ngqbao at gmail.com
Wed May 8 18:58:03 CEST 2013


Ondrej,

Thanks for pointing me toward the route cache. I think you might be right
about the cache: even though there are multiple cache entries for
10.25.14.20 (the server) in this case, only a single path is still selected
for each destination, rather than multiple paths per destination...

10.25.14.20 via 10.25.206.69 dev eth1  src 10.25.206.70
    cache  used 3 age 38sec ipid 0xfe2e rtt 40ms rttvar 10ms ssthresh 4081
cwnd 7935

10.25.14.20 via 10.25.205.69 dev eth0  src 10.25.205.70
    cache  used 13 age 4sec ipid 0xfe2e rtt 40ms rttvar 10ms ssthresh 4081
cwnd 7935

I think the problem here is more with the way the client selects the
interface to use than with the route cache. For example, you can see below
that the 10.25.205.70 interface is selected for all four threads instead of
dividing them between 10.25.205.70 and 10.25.206.70, even though this host
has two physical connections. I'm wondering if there is a way to tie
traffic to, say, the dummy0 interface, so that anything pointed at it gets
hashed across the two physical links...
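For reference, the kind of multipath route I have in mind would look
something like this (gateways taken from the cache entries above; untested
here, and on route-cache kernels the first lookup would still pin a single
nexthop per destination, as discussed below):

```shell
# Equal-cost multipath route toward the server, one nexthop per NIC.
# Requires root; gateways are the ones shown in the cache entries above.
ip route add 10.25.14.20/32 \
    nexthop via 10.25.205.69 dev eth0 weight 1 \
    nexthop via 10.25.206.69 dev eth1 weight 1
```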

------------------------------------------------------------
Client connecting to 10.25.14.20, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  6] local 10.25.205.70 port 52606 connected with 10.25.14.20 port 5001
[  4] local 10.25.205.70 port 52603 connected with 10.25.14.20 port 5001
[  5] local 10.25.205.70 port 52605 connected with 10.25.14.20 port 5001
[  3] local 10.25.205.70 port 52604 connected with 10.25.14.20 port 5001

These nodes are not set up for transit traffic, but I'm wondering how
forwarded transit traffic would look: would the kernel balance it across
both paths, or just pick one of the two?
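As a workaround in the spirit of Ondrej's suggestion below (different
source addresses for different flows), one could run one iperf client per
interface and bind each with -B; an untested sketch:

```shell
# Bind one client to each local address so each set of flows
# leaves via its matching physical interface.
iperf -c 10.25.14.20 -P 2 -B 10.25.205.70 &
iperf -c 10.25.14.20 -P 2 -B 10.25.206.70 &
wait
```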


-bn


On Tue, May 7, 2013 at 1:13 AM, Ondrej Zajicek <santiago at crfreenet.org>wrote:

> On Mon, May 06, 2013 at 06:50:10PM -0700, Bao Nguyen wrote:
> > From host A I do "iperf -c x.x.x.x -P 4" to host B's dummy0 interface;
> > host A's iperf seems to pick a single interface, eth0 or eth1, but not
> > both, and uses it to send out traffic for all 4 different processes.
> > Ideally, how would one allow a single application (in this case iperf)
> > with multiple threads/processes to send 2 Gbit worth of traffic?
> > Ideally to a logical interface, with the traffic hashed automatically
> > across the two interfaces eth0 and eth1?
>
> AFAIK the Linux kernel implements ECMP in a way that for each route cache
> entry just one path is used (and fixed in the route cache). I heard that
> there are patches for different behavior. I don't know how this behavior
> differs on newer kernels without the route cache. With the current
> behavior, an application could use different addresses or ports for each
> thread to use multiple paths, I guess. Or use something completely
> different, like link-level interface bonding.
>
> > I've looked at setting (krt_prefsrc) to source the address as the dummy0
> > interface on each host. Would that be the answer?
>
> Probably not (although it is probably useful anyway).
>
> --
> Elen sila lumenn' omentielvo
>
> Ondrej 'SanTiago' Zajicek (email: santiago at crfreenet.org)
> OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
> "To err is human -- to blame it on a computer is even more so."
>
>
>


More information about the Bird-users mailing list