[PATCH 0/5] IP checksum improvements

Martin Mares mj at ucw.cz
Mon Apr 26 10:31:31 CEST 2010


Hello!

> while(buf != end) got worse on ppc. gcc 4.3.4 did even worse
> than gcc 3.4.6. I think it is safe to say that gcc 4.3.4 is busted when
> it comes to optimization, even on x86. I have seen -O1 do better than -O2
> on x86 with gcc 3.4.3.

BTW have you tried unrolling the loops or using __attribute__((hot))?
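
For illustration, a minimal sketch of what manual unrolling combined with
__attribute__((hot)) might look like. The body of add32() (32-bit addition
with end-around carry) and the names sum_simple() and sum_unrolled() are
assumptions made for this example, not the code under discussion:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed helper: 32-bit ones'-complement addition (end-around carry). */
static inline uint32_t add32(uint32_t sum, uint32_t x)
{
  uint32_t z = sum + x;
  return z + (z < x);          /* add the carry back in if we wrapped */
}

/* Plain loop, in the style of the quoted while(buf != end) variant. */
uint32_t sum_simple(const uint32_t *buf, size_t len)
{
  const uint32_t *end = buf + len;
  uint32_t sum = 0;

  while (buf != end)
    sum = add32(sum, *buf++);
  return sum;
}

/* Hypothetical variant: unrolled by four and marked hot (a GCC hint). */
__attribute__((hot))
uint32_t sum_unrolled(const uint32_t *buf, size_t len)
{
  uint32_t sum = 0;
  size_t i = 0;

  for (; i + 4 <= len; i += 4) {
    sum = add32(sum, buf[i]);
    sum = add32(sum, buf[i + 1]);
    sum = add32(sum, buf[i + 2]);
    sum = add32(sum, buf[i + 3]);
  }
  for (; i < len; i++)         /* leftover 0..3 words */
    sum = add32(sum, buf[i]);
  return sum;
}
```

Both functions return the same value for the same input; whether the
unrolled one is actually faster depends on the compiler and the CPU, so it
has to be measured.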

> Since gcc in general isn't very good at optimization, I think the best bet
> is to have different loops for different archs. I have seen people do that
> based on endianness:
> #ifdef CPU_BIG_ENDIAN
>   for(buf--; len; --len)
>     sum = acc32(sum, *++buf);
> #else
>   while(buf != end)
>     sum = add32(sum, *buf++);
> #endif

Huh, what does endianness have to do with the choice of pre-/postincrement?

However, I would still like to see a full profile of running BIRD, so that
we know the real hot spots.

(If it turns out that the checksum function is a hot spot, it would be
interesting to write an SSE version.)
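
As a rough idea of the shape such a version could take, here is a hedged
sketch using SSE2 intrinsics. It assumes the checksum is a ones'-complement
sum of 32-bit words; the name sum_sse2() and the overall structure are
invented for this example. Words are widened into two 64-bit lanes so that
carries accumulate in the upper halves and are folded back only once at the
end:

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical: ones'-complement sum of len 32-bit words. */
uint32_t sum_sse2(const uint32_t *buf, size_t len)
{
  uint64_t s = 0;
  size_t i = 0;

#if defined(__SSE2__)
  const __m128i zero = _mm_setzero_si128();
  __m128i acc = zero;                       /* two 64-bit accumulators */
  uint64_t lanes[2];

  for (; i + 4 <= len; i += 4) {
    /* Unaligned load of four words, then zero-extend each to 64 bits. */
    __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
    acc = _mm_add_epi64(acc, _mm_unpacklo_epi32(v, zero));
    acc = _mm_add_epi64(acc, _mm_unpackhi_epi32(v, zero));
  }
  _mm_storeu_si128((__m128i *)lanes, acc);
  s = lanes[0] + lanes[1];
#endif

  /* Scalar tail (and the whole input when SSE2 is not available). */
  for (; i < len; i++)
    s += buf[i];

  /* Fold the accumulated carries back in (end-around carry), twice. */
  s = (s & 0xffffffffu) + (s >> 32);
  s = (s & 0xffffffffu) + (s >> 32);
  return (uint32_t)s;
}
```

The 64-bit accumulators cannot overflow for packet-sized inputs, but the
sketch does no alignment handling or tuning, so it says nothing about real
speedups.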

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj at ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Not all rumors are as misleading as this one.


