[ros-dev] Speed Tests (was: ping Alex regarding log2() for scheduler)
Mark Junker
mjscod at gmx.de
Thu Mar 24 11:25:48 CET 2005
Ash wrote:
> BSR has a latency of 8-12 Cycles on Athlon/P3 but can be pipelined.
> Worse (up to ~80 cycles) on Pentium and other older CPUs.
> http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_739_3748,00.html
>
My tests have shown that you're right and BSR is much too slow.
> Don't know about A64 - maybe someone can test BSR with A64?
I have an AMD64 here but it doesn't run in 64-bit mode.
> It doesn't make much sense to put the optimized ASM in there, nor is
> there much hope of GCC having a good day and doing a lot of optimisation.
> So far the best option would be the macro with a lookup table (only
> one global kernel table though).
I've converted your sources so that they compile with GCC (MinGW). I
attached the sources.
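For readers without the attachment, here is a minimal sketch of what a log2 macro backed by one global lookup table could look like. The names `log2_table`, `init_log2_table` and `LOG2_LOOKUP` are hypothetical and not taken from the attached sources, which may be organised differently.

```c
#include <stdint.h>

/* Hypothetical single global table: log2_table[i] == floor(log2(i)).
 * Entry 0 is set to 0 because log2(0) is undefined. */
static unsigned char log2_table[256];

/* Fill the table once at start-up. */
static void init_log2_table(void)
{
    int i;

    log2_table[0] = 0;
    log2_table[1] = 0;
    for (i = 2; i < 256; i++)
        log2_table[i] = (unsigned char)(log2_table[i / 2] + 1);
}

/* Find the highest non-zero byte of a 32-bit value, then look up the
 * position of its highest set bit. Note that x is evaluated more than
 * once, so the argument must not have side effects. */
#define LOG2_LOOKUP(x) \
    (((uint32_t)(x) >> 16) \
        ? (((uint32_t)(x) >> 24) ? 24 + log2_table[(uint32_t)(x) >> 24] \
                                 : 16 + log2_table[((uint32_t)(x) >> 16) & 0xFF]) \
        : (((uint32_t)(x) >> 8)  ? 8 + log2_table[((uint32_t)(x) >> 8) & 0xFF] \
                                 : log2_table[(uint32_t)(x) & 0xFF]))
```

After calling `init_log2_table()` once, `LOG2_LOOKUP(v)` needs at most two compares and a single table read per call.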
> Here are the updated STATS
> also available at http://hackersquest.org/kerneltest.html
>
> result orig function 46ffffe9
> it took 1526862 18%
> result orig function inlined 46ffffe9
> it took 1041460 12%
> result second proposal inlined 46ffffe9
> it took 1248990 15%
> result optimized asm 46ffffe9
> it took 1321532 16%
> result lookup inlined 46ffffe9
> it took 682264 8%
> result bsr inlined 46ffffe9
> it took 1751088 21%
> result macro 46ffffe9
> it took 653692 7%
These are my results on the AMD64 using your Release-EXE:
STATS
result orig function 46ffffe9
it took 1272638 18%
result orig function inlined 46ffffe9
it took 875751 12%
result second proposal inlined 46ffffe9
it took 1051861 15%
result optimized asm 46ffffe9
it took 1225282 17%
result lookup inlined 46ffffe9
it took 549861 7%
result bsr inlined 46ffffe9
it took 1410179 20%
result macro 46ffffe9
it took 607638 8%
These are my results using the GCC EXE (-O2):
STATS
result orig function 46ffffe9
it took 1321663 24%
result orig function inlined 46ffffe9
it took 879318 16%
result second proposal inlined 46ffffe9
it took 940285 17%
result lookup inlined 46ffffe9
it took 615267 11%
result bsr inlined 46ffffe9
it took 1103432 20%
result macro 46ffffe9
it took 484450 9%
BTW: I had to remove all functions using the __asm() statement. The
"result bsr inlined" uses my GCC BSR macro. You can see that using BSR
seems to be much too slow ...
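As an illustration only, a BSR wrapper for GCC might look roughly like the sketch below. It is not the macro from the attached sources: the name `bsr32` is hypothetical, it is written as an inline function rather than a macro, and the non-x86 fallback loop is an addition so that the sketch builds elsewhere.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* BSR stores the index of the highest set bit of the source operand.
 * The result is undefined for x == 0, so callers must pass a non-zero
 * value. GCC's default AT&T syntax puts the source operand first. */
static inline uint32_t bsr32(uint32_t x)
{
    uint32_t index;

    __asm__ ("bsrl %1, %0" : "=r" (index) : "rm" (x) : "cc");
    return index;
}
#else
/* Portable fallback (assumption, not part of the original test):
 * shift right until the value is exhausted. */
static inline uint32_t bsr32(uint32_t x)
{
    uint32_t index = 0;

    while (x >>= 1)
        index++;
    return index;
}
#endif
```

For a non-zero argument, `bsr32(x)` returns floor(log2(x)), the same value the lookup-table variant computes.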
Regards,
Mark
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SpeedTest.zip
Type: application/x-zip-compressed
Size: 3062 bytes
Desc: not available
Url : http://reactos.com:8080/pipermail/ros-dev/attachments/20050324/1e8955f9/SpeedTest.bin