ce2c95b2a2
problem was that one of the replacement routines had not been working since SSLeay releases. For now the offending routine has been replaced with non-optimised assembler. Even so, this now gives around 95% performance improvement for 1024 bit RSA signs. |
||
---|---|---|
alpha | ||
alpha.works | ||
x86 | ||
.cvsignore | ||
alpha.s | ||
alpha.s.works | ||
bn-586.pl | ||
bn-alpha.pl | ||
bn-win32.asm | ||
ca.pl | ||
co-586.pl | ||
co-alpha.pl | ||
mips1.s | ||
mips3.s | ||
pa-risc.s | ||
pa-risc2.s | ||
r3000.s | ||
README | ||
sparcv8.S | ||
sparcv8plus.S | ||
vms.mar | ||
x86.pl | ||
x86w16.asm | ||
x86w32.asm |
All the assembler files in this directory are just versions of the file crypto/bn/bn_mulw.c.

Quite a few of these files are just the assembler output from gcc, since on quite a few machines that output is 2 times faster than what the system compiler produces.

For the x86, I have hand-written the assembler because of the bad job all compilers seem to do on it. This normally gives a 2 times speedup in the RSA routines.

For the DEC alpha, I also hand-wrote the assembler (except the division, which is just the output from the C compiler pasted on the end of the file). On the 2 alpha C compilers I had access to, it was not possible to do 64b x 64b -> 128b calculations (both the long and the long long data types were 64 bits). So the hand-written assembler gives access to the 128 bit result and a 2 times speedup :-).

The x86xxxx.obj files are the assembled versions of the x86xxxx.asm files. I had such a hard time finding a macro assembler for Microsoft that I decided to include the object files to save others the hassle :-). I have also included uuencoded versions of the .obj files in case they get trashed.

There are 2 versions of the assembler for the HP PA-RISC. pa-risc.s is the original one, which works fine. pa-risc2.s is a new version that often generates warnings, but if the tests pass, it gives performance that is over 2 times faster than pa-risc.s. Both were generated using gcc :-)
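To illustrate the 64b x 64b -> 128b point made for the alpha, here is a minimal sketch of what C code has to do when the widest integer type is 64 bits: split each operand into 32 bit halves and assemble the 128 bit product from four partial products. The helper name mul64x64 is hypothetical and is not taken from bn_mulw.c; it only shows the general technique that a multiply instruction returning the high word makes unnecessary.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the library): compute the full 128 bit
 * product of a and b using only 64 bit arithmetic. The result is returned
 * as two 64 bit words, *hi (upper half) and *lo (lower half). */
static void mul64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    /* Split both operands into 32 bit halves. */
    uint64_t al = a & 0xffffffffULL, ah = a >> 32;
    uint64_t bl = b & 0xffffffffULL, bh = b >> 32;

    /* Four 32 x 32 -> 64 bit partial products. */
    uint64_t ll = al * bl;
    uint64_t lh = al * bh;
    uint64_t hl = ah * bl;
    uint64_t hh = ah * bh;

    /* Sum the middle column. Each term is below 2^32, so the sum of
     * three terms cannot overflow 64 bits. */
    uint64_t mid = (ll >> 32) + (lh & 0xffffffffULL) + (hl & 0xffffffffULL);

    *lo = (mid << 32) | (ll & 0xffffffffULL);
    *hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}
```

This takes four multiplies plus a handful of shifts and adds per word pair, which gives a rough idea of why hand-written assembler with direct access to the high half of the product can be about twice as fast.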