openssl/crypto/bn/asm
Richard Levitte 6ab285bf4c I think I got it now. Apparently, the case of having to shift down
the divisor was a bit more complex than I first saw.  The lost bit
can't just be discarded, as there are cases where it is important.
For example, look at dividing 320000 with 80000 vs. 80001 (all
decimals), the difference is crucial.  The trick here is to check if
that lost bit was 1, and in that case, do the following:

1. subtract the quotient from the remainder
2. as long as the remainder is negative, add the divisor (the whole
   divisor, not the shifted down copy) to it, and decrease the
   quotient by one.

There's probably a nice mathematical proof for this already, but I
won't bother with that, unless someone requests it from me.
2002-12-02 21:31:45 +00:00
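
The correction described in the commit message above can be sketched in Python. This is only an illustration, not a transcription of vms.mar: the function name `div_with_shifted_divisor` is hypothetical, and it assumes the dividend is shifted down by one bit together with the divisor, which the message does not state explicitly.

```python
def div_with_shifted_divisor(n, d):
    """Return (q, r) with n == q*d + r and 0 <= r < d,
    dividing only by the shifted-down divisor d >> 1.
    Hypothetical helper; the dividend scaling is an assumption."""
    if n < 0 or d < 2:
        raise ValueError("expects n >= 0 and d >= 2")
    lost_bit = d & 1            # the bit lost when shifting the divisor down
    d_half = d >> 1             # shifted-down copy of the divisor
    n_half, n_low = n >> 1, n & 1
    q, r_half = divmod(n_half, d_half)
    r = 2 * r_half + n_low      # remainder relative to 2 * d_half
    if lost_bit:
        # Step 1: subtract the quotient from the remainder.
        r -= q
        # Step 2: while the remainder is negative, add the whole
        # divisor back and decrease the quotient by one.
        while r < 0:
            r += d
            q -= 1
    return q, r


print(div_with_shifted_divisor(320000, 80000))  # lost bit is 0
print(div_with_shifted_divisor(320000, 80001))  # lost bit is 1
```

The reasoning behind step 1: with d = 2*d_half + 1 and n = q*(2*d_half) + r, rewriting gives n = q*d + (r - q), so the lost bit costs exactly one quotient's worth of remainder.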
alpha
alpha.works
x86
.cvsignore More CVS ignore stuff... 1999-02-25 09:06:30 +00:00
alpha.s Fix assembler for Alpha (tested only on DEC OSF not Linux or *BSD). The 1999-11-03 14:10:10 +00:00
alpha.s.works
bn-586.pl remove a comment that shouldn't have been there any more 2000-12-06 16:30:23 +00:00
bn-alpha.pl
ca.pl
co-586.pl
co-alpha.pl
ia64.S Support for Intel and HP-UXi assemblers. 2001-07-30 15:54:13 +00:00
mips1.s
mips3.s This fixes "Spurious test failures on IRIX?" reported in April. Apparently 2001-06-22 19:17:42 +00:00
pa-risc.s
pa-risc2.s A compiler warning removed. Thanks to the folks at HP! 2000-09-27 13:54:28 +00:00
pa-risc2.s.old A patch from HP for better performance. 2000-09-17 20:04:42 +00:00
pa-risc2W.s A compiler warning removed. Thanks to the folks at HP! 2000-09-27 13:54:28 +00:00
r3000.s
README Very few in the "README" is up-to-date 2000-12-15 10:42:11 +00:00
sparcv8.S - performance retunes, v8plus bn_*_comba routines are reimplemented; 1999-07-25 12:34:30 +00:00
sparcv8plus.S - performance retunes, v8plus bn_*_comba routines are reimplemented; 1999-07-25 12:34:30 +00:00
vms.mar I think I got it now. Apparently, the case of having to shift down 2002-12-02 21:31:45 +00:00
x86.pl

<OBSOLETE>

All assembler files in this directory are just versions of the file
crypto/bn/bn_asm.c.

Quite a few of these files are just the assembler output from gcc, since on
quite a few machines that output is 2 times faster than the system compiler's.

For the x86, I have hand-written assembler because of the bad job all
compilers seem to do on it.  This normally gives a 2 times speedup in the RSA
routines.

For the DEC Alpha, I also hand-wrote the assembler (except the division, which
is just the output from the C compiler pasted on the end of the file).
On the 2 Alpha C compilers I had access to, it was not possible to do
64b x 64b -> 128b calculations (both the long and the long long data types
were 64 bits).  So the hand-written assembler gives access to the 128 bit
result and a 2 times speedup :-).
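
To show what is lost when only 64-bit types are available, here is a rough Python sketch of a 64b x 64b -> 128b multiply built from 32-bit halves. It is an illustration only and is not taken from the Alpha assembler in this directory; the name `mul64_wide` is hypothetical, and Python's unbounded integers are masked to mimic 64-bit words.

```python
MASK32 = 0xFFFFFFFF


def mul64_wide(a, b):
    """Multiply two 64-bit words and return (hi, lo), the two
    64-bit halves of the 128-bit product. Hypothetical helper."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # Four 32b x 32b -> 64b partial products.
    ll = a_lo * b_lo
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi

    # Sum the middle column; its carry goes into the high word.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)
    lo = (ll & MASK32) | ((mid & MASK32) << 32)
    hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32)
    return hi, lo


hi, lo = mul64_wide(0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF)
print(hex(hi), hex(lo))
```

Doing this in C costs four multiplies plus the carry handling, which is the kind of overhead that direct access to the full 128 bit result avoids.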

There are 3 versions of assembler for the HP PA-RISC.

pa-risc.s is the original one, which works fine and was generated using gcc :-)

pa-risc2W.s and pa-risc2.s are the 64-bit and 32-bit PA-RISC 2.0
implementations, respectively, by Chris Ruemmler from HP (with some help from
the HP C compiler).

</OBSOLETE>