The lazy-initialisation of BN_MONT_CTX was serialising all threads, as
noted by Daniel Sands and co at Sandia. This was to handle the case that
2 or more threads race to lazy-init the same context, but stunted all
scalability in the case where 2 or more threads are doing unrelated
things! We favour the latter case by punishing the former. The init work
gets done by each thread that finds the context to be uninitialised, and
we then lock the "set" logic after that work is done - the winning
thread's work gets used, the losing threads throw away what they've done.
Signed-off-by: Geoff Thorpe <geoff@openssl.org>
It's not clear whether this inconsistency could lead to an actual
computation error, but it involved a BIGNUM being passed around the
montgomery logic in an inconsistent state. This was found using flags
-DBN_DEBUG -DBN_DEBUG_RAND, and working backwards from this assertion
in 'ectest';
ectest: bn_mul.c:960: BN_mul: Assertion `(_bnum2->top == 0) ||
(_bnum2->d[_bnum2->top - 1] != 0)' failed
Signed-off-by: Geoff Thorpe <geoff@openssl.org>
(cherry picked from commit a529261891)
Fix for the attack described in the paper "Recovering OpenSSL
ECDSA Nonces Using the FLUSH+RELOAD Cache Side-channel Attack"
by Yuval Yarom and Naomi Benger. Details can be obtained from:
http://eprint.iacr.org/2014/140
Thanks to Yuval Yarom and Naomi Benger for discovering this
flaw and to Yuval Yarom for supplying a fix.
(cherry picked from commit 2198be3483)
Conflicts:
CHANGES
eliminating them as dead code.
Both volatile and "memory" are used because of some concern that the compiler
may still cache values across the asm block without it, and because this was
such a painful debugging session that I wanted to ensure that it's never
repeated.
(cherry picked from commit 7753a3a684)
rsaz_exp.c: harmonize line terminating;
asm/rsaz-*.pl: minor optimizations.
asm/rsaz-x86_64.pl: sync from master.
(cherry picked from commit 31ed9a2131)
Latest MIPS ISA specification declared 'branch likely' instructions
obsolete. To makes code future-proof replace them with equivalent.
(cherry picked from commit 0c2adb0a9b)
Improve RSA sing performance by 20-30% by:
- switching from floating-point to integer conditional moves;
- daisy-chaining sqr-sqr-sqr-sqr-sqr-mul sequences;
- using MONTMUL even during powers table setup;
(cherry picked from commit 4ddacd9921)