[PATCH 0/5] IP checksum improvements

Joakim Tjernlund joakim.tjernlund at transmode.se
Mon Apr 26 00:09:55 CEST 2010


> Ondrej Zajicek <santiago at crfreenet.org> wrote on 2010/04/25 23:20:52:
> >
> > On Sun, Apr 25, 2010 at 11:41:17AM +0200, Joakim Tjernlund wrote:
> > > Here are a series of performance improvements on the
> > > Internet checksum. With these changes applied I get about
> > > 20-30% better performance on x86 and PowerPC.
> >
> > Although i agree with Martin Mares that such kind of optimizations
> > should be done mainly if we know (from profiling) that BIRD spends
> > a significant share of time (during update processing) in that function,
> > i did some changes to the checksum function and merged some of these
> > patches.
> >
> > I did some more optimizations (changing the loop condition, removing len
> > decrement) and together with your change to add32 i got two times faster
> > checksum function (on x86) than the old code. Changing postincrement to
> > preincrement leads to worse results (only 1.4 times faster than the old
> > code) so i kept postincrement.
>
> On x86? That is strange. On x86 that should only lead to one
> extra add outside the loop, or so I think.

Ah, now I think I know. The while (buf < end) loop form is optimized for
post-increment, so that is why.

I do think performance is worse on every other arch, as the above is probably
very x86-tuned.
>
> The while (buf < end) loop is definitely slower on any RISC-like CPU. Did you test
> for (; len; --len)
>      sum = add32(sum, *buf++);
> ?
> Were the other archs also faster with that?
>
>  Jocke
>
>
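
For reference, a minimal sketch of the two loop forms compared above, side by side. The body of add32 is not shown in this thread, so the version here is an assumption (a 32-bit ones-complement add with end-around carry); the function names sum_ptr_end and sum_countdown are hypothetical, and len is assumed to count 32-bit words rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed helper, not the actual BIRD code: add two 32-bit words and
 * fold the carry back into the result (ones-complement addition). */
uint32_t add32(uint32_t sum, uint32_t x)
{
    uint32_t z = sum + x;
    return z + (z < x);   /* z < x exactly when the addition wrapped */
}

/* Form 1: compare the pointer against an end pointer, post-increment. */
uint32_t sum_ptr_end(const uint32_t *buf, size_t len)
{
    const uint32_t *end = buf + len;
    uint32_t sum = 0;

    while (buf < end)
        sum = add32(sum, *buf++);
    return sum;
}

/* Form 2: count len down to zero, as in the quoted for loop. */
uint32_t sum_countdown(const uint32_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (; len; --len)
        sum = add32(sum, *buf++);
    return sum;
}

/* Sanity check: both forms must produce the same sum for the same input. */
void check_loops_agree(const uint32_t *buf, size_t len)
{
    assert(sum_ptr_end(buf, len) == sum_countdown(buf, len));
}
```

Both functions compute the same value, so any speed difference comes only from how the compiler and the CPU handle the loop control. Timing each form over the same buffer on every target arch would be the way to settle which one to keep.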




More information about the Bird-users mailing list