Re: OSPF performance/SPF calculations
Joakim Tjernlund/Transmode wrote on 2010/04/21 22:04:06:
Ondrej Zajicek <santiago@crfreenet.org> wrote on 2010/04/21 20:15:07:
On Wed, Apr 21, 2010 at 09:41:47AM +0200, Joakim Tjernlund wrote:
I am using Quagga ATM but I had a quick look at BIRD and I got a few observations.
Hello. Thank you for your tips and notes.
The LSA/checksum code seem very inefficient. LSAs are built allocating/reallocing bits of memory. This is slow and will case memory fragmentation.
You mean lsab_alloc() in originate_rt_lsa_body()? This allocation is just a sequential allocation in a persistent memory buffer, therefore it is very efficient (in most cases just increase of lsab_used counter) and there si no memory fragmentation (all is done inside a persistent memory buffer).
Yes, you do realloc on small amounts of memory. Also, receives LSAs seems to be impl. differently so you need handle these somewhat differently.
The fletcher checksum is very slow as it has extra tests in the hot path and also flips endian in the LSA back and forth.
Do you have some suggestions for a better algorithm, or just a better implementation? Yes, the problem with flipping endianity is known. I do not studied this part of the code yet (the checksum algorithm).
Yes, but not at the moment. The endian problem should be addressed when you build the lsa.
Does this help at all? In any case, the (int) cast should be there.
On Fri, Apr 23, 2010 at 08:27:40AM +0200, Joakim Tjernlund wrote:
Yes, but not at the moment. The endian problem should be addressed when you build the lsa.
Does this help at all? In any case, the (int) cast should be there.
Removing of endianity swap is correct only if the Fletcher checksum would return the same value regardless of endianity swap. Is this a property of the Fletcher checksum? I don't see that. -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> wrote on 2010/04/23 10:54:26:
On Fri, Apr 23, 2010 at 08:27:40AM +0200, Joakim Tjernlund wrote:
Yes, but not at the moment. The endian problem should be addressed when you build the lsa.
Does this help at all? In any case, the (int) cast should be there.
Removing of endianity swap is correct only if the Fletcher checksum would return the same value regardless of endianity swap. Is this a property of the Fletcher checksum? I don't see that.
Assuming the LSA's are in the same endian(Big Endian) the sum should be the same. You might have to swap the sum before returning it to the caller. Quagga does NOT do what BIRD does and it works as it should. Jocke
On Fri, Apr 23, 2010 at 10:54:33AM +0200, Joakim Tjernlund wrote:
Removing of endianity swap is correct only if the Fletcher checksum would return the same value regardless of endianity swap. Is this a property of the Fletcher checksum? I don't see that.
Assuming the LSA's are in the same endian(Big Endian) the sum should be the same. You might have to swap the sum before returning it to the caller.
As i looked on the Fletcher checksum, it seems that you cannot just swap the result instead of swapping the checked data.
Quagga does NOT do what BIRD does and it works as it should.
Berhaps Quagga stores the LSAs in the network endianity (big endian) and BIRD stores the LSAs in the host endianity? -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> wrote on 2010/04/23 16:09:00:
On Fri, Apr 23, 2010 at 10:54:33AM +0200, Joakim Tjernlund wrote:
Removing of endianity swap is correct only if the Fletcher checksum would return the same value regardless of endianity swap. Is this a property of the Fletcher checksum? I don't see that.
Assuming the LSA's are in the same endian(Big Endian) the sum should be the same. You might have to swap the sum before returning it to the caller.
As i looked on the Fletcher checksum, it seems that you cannot just swap the result instead of swapping the checked data.
Then there is a bug else where. Fletcher as such does not depend on host endian. It operates on bytes and those are always the same endian. Did you try my patch?
Quagga does NOT do what BIRD does and it works as it should.
Berhaps Quagga stores the LSAs in the network endianity (big endian) and BIRD stores the LSAs in the host endianity?
Quagga does store its own LSAs in Big Endian, you have to make them BE before transmitting them anyway so you might as well store them directly in BE. Then one can use the same code to manage common LSA parts for both received LSAs and your own LSAs. Jocke
On Fri, Apr 23, 2010 at 04:12:27PM +0200, Joakim Tjernlund wrote:
As i looked on the Fletcher checksum, it seems that you cannot just swap the result instead of swapping the checked data.
Then there is a bug else where. Fletcher as such does not depend on host endian. It operates on bytes and those are always the same endian.
But it also takes in consideration the position of the bytes. If you put endianity-swapped data into it, you have a different byte sequence, and the algorithm returns a different result.
Did you try my patch?
I looked at the Fletcher algorithm and found that it wouldn't work (because of the reason above). Perhaps i should try it to prove that.
Quagga does NOT do what BIRD does and it works as it should.
Berhaps Quagga stores the LSAs in the network endianity (big endian) and BIRD stores the LSAs in the host endianity?
Quagga does store its own LSAs in Big Endian, you have to make them BE before transmitting them anyway so you might as well store them directly in BE.
But you also need LSAs in host endianity when doing SPF calculation. Although it would be probably possible to change SPF calculation to use directly BE values it would be huge work and it is questionable whether it wouldn't just move endianity swaps deeper in the code. -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
On Fri, Apr 23, 2010 at 04:12:27PM +0200, Joakim Tjernlund wrote:
As i looked on the Fletcher checksum, it seems that you cannot just swap the result instead of swapping the checked data.
Then there is a bug else where. Fletcher as such does not depend on host endian. It operates on bytes and those are always the same endian.
But it also takes in consideration the position of the bytes. If you put endianity-swapped data into it, you have a different byte sequence, and the algorithm returns a different result.
Yes, the sequence of bytes must be the same, but host endian doesn't matter.
Did you try my patch?
I looked at the Fletcher algorithm and found that it wouldn't work (because of the reason above). Perhaps i should try it to prove that.
Quagga does NOT do what BIRD does and it works as it should.
Berhaps Quagga stores the LSAs in the network endianity (big endian) and BIRD stores the LSAs in the host endianity?
Quagga does store its own LSAs in Big Endian, you have to make them BE before transmitting them anyway so you might as well store them directly in BE.
But you also need LSAs in host endianity when doing SPF calculation. Although it would be probably possible to change SPF calculation to use directly BE values it would be huge work and it is questionable whether it wouldn't just move endianity swaps deeper in the code.
But how do you know when swap endian or not? It seems to me that swapping endian back and forth for some LSAs and not for others is more work. Meanwhile, perhaps you could add something like this so only LE CPU suffer: diff --git a/proto/ospf/lsalib.c b/proto/ospf/lsalib.c index 35f02dc..2e40fb8 100644 --- a/proto/ospf/lsalib.c +++ b/proto/ospf/lsalib.c @@ -187,6 +187,11 @@ lsasum_calculate(struct ospf_lsa_header *h, void *body) u16 length = h->length; // log(L_WARN "Checksum %R %R %d start (len %d)", h->id, h->rt, h->type, length); + + if (1 == htonl(1)) { + (void) lsasum_check(h, body); + return; + } htonlsah(h, h); htonlsab(body, body, length - sizeof(struct ospf_lsa_header));
On Fri, Apr 23, 2010 at 05:29:09PM +0200, Joakim Tjernlund wrote:
But you also need LSAs in host endianity when doing SPF calculation. Although it would be probably possible to change SPF calculation to use directly BE values it would be huge work and it is questionable whether it wouldn't just move endianity swaps deeper in the code.
But how do you know when swap endian or not? It seems to me that swapping endian back and forth for some LSAs and not for others is more work.
In BIRD, we do net->host 4B endianity swap when receiving every LSA and do host->net 4B endianity swap when transmitting every LSA. If LSA contains for example 2x u16 fields, we have such field swapped in structure declaration on little endian systems (see ospf.h for example struct ospf_lsa_rt). When we do some LSA processing (for example in SPF), we can assume that everything is in host endianity.
Meanwhile, perhaps you could add something like this so only LE CPU suffer:
I would make some ifdefs to lsalib.c / lsalib.h to make htonlsah() (and others) an empty operation on big endians. -- Elen sila lumenn' omentielvo Ondrej 'SanTiago' Zajicek (email: santiago@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Ondrej Zajicek <santiago@crfreenet.org> wrote on 2010/04/23 18:46:19:
On Fri, Apr 23, 2010 at 05:29:09PM +0200, Joakim Tjernlund wrote:
But you also need LSAs in host endianity when doing SPF calculation. Although it would be probably possible to change SPF calculation to use directly BE values it would be huge work and it is questionable whether it wouldn't just move endianity swaps deeper in the code.
But how do you know when swap endian or not? It seems to me that swapping endian back and forth for some LSAs and not for others is more work.
In BIRD, we do net->host 4B endianity swap when receiving every LSA and do host->net 4B endianity swap when transmitting every LSA.
Ah, I haven't noticed that yet. I do think it is the wrong way though. You will must convert each LSA multiple times and will generate a lot memory traffic to and from main RAM. I don't think it would cost that much to keep the PDUs in BE, == and != ops don't care about endian so it is only when need to do arith. and > or < that you need host endian.
If LSA contains for example 2x u16 fields, we have such field swapped in structure declaration on little endian systems (see ospf.h for example struct ospf_lsa_rt).
When we do some LSA processing (for example in SPF), we can assume that everything is in host endianity.
Meanwhile, perhaps you could add something like this so only LE CPU suffer:
I would make some ifdefs to lsalib.c / lsalib.h to make htonlsah() (and others) an empty operation on big endians.
Yes, it would be great if on BE CPUs all the endian stuff were NOPs
participants (2)
-
Joakim Tjernlund -
Ondrej Zajicek