Unlike "upstream", Android NDK's arm64 gcc [but not clang] performs
64x64=128-bit multiplications with library calls, which appears to
have devastating impact on performance. [The condition is reduced to
__ANDROID__ [&& !__clang__], because x86_64 has corresponding
assembly module.]
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5589)
As it turns out gcc -pedantic doesn't seem to consider __uint128_t
as non-standard, unlike __int128 that is.
Fix even MSVC warnings in curve25519.c.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5449)
Currently it's limited to 64-bit platforms only as minimum radix
expected in assembly is 2^51.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/5408)
3 least significant bits of the input scalar are explicitly cleared,
hence swap variable has fixed value [of zero] upon exit from the loop.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/5408)
SPARC ISA doesn't have provisions to back up 128-bit multiplications
and additions. And so multiplications are done with library calls
and carries with comparisons and conditional moves. As result base
2^51 code is >40% slower...
Reviewed-by: Tim Hudson <tjh@openssl.org>
"Double" is in quotes because improvement coefficient varies
significantly depending on platform and compiler. You're likely
to measure ~2x improvement on popular desktop and server processors,
but not so much on mobile ones, even minor regression on ARM
Cortex series. Latter is because they have rather "weak" umulh
instruction. On low-end x86_64 problem is that contemporary gcc
and clang tend to opt for double-precision shift for >>51, which
can be devastatingly slow on some processors.
Just in case for reference, trick is to use 2^51 radix [currently
only for DH].
Reviewed-by: Rich Salz <rsalz@openssl.org>
It's argued that /WX allows to keep better focus on new code, which
motivates its comeback...
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4721)
Fix undefined behaviour in curve25519.c. Prior to this running with
ubsan produces errors like this:
crypto/ec/curve25519.c:3871:18: runtime error: left shift of negative
value -22867
[extended tests]
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3600)
Rename and change ED25519_keypair_from_seed to ED25519_public_from_private
to be consistent with X25519 API.
Modidy ED25519_sign to take separate public key argument instead of
requiring it to follow the private key.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3503)
Reinstate Ed25519 algorithm to curv25519.c this is largely just a copy of
the code from BoringSSL with some adjustments so it compiles under OpenSSL.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3503)
This demystifies two for-loops that do nothing. They were used to write
the ladder in a unified way. Now that the ladder is otherwise commented,
remove the dead loops.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
Appease the sanitizer: avoid left shifts of negative values.
This could've been done entirely with casts to uint and back,
but using masks seemed slightly more readable.
There are also implementation-defined signed right shifts in this
code. Those remain.
Reviewed-by: Rich Salz <rsalz@openssl.org>
- Remove OPENSSL_X25519_X86_64 which never worked, because we don't have
the assembly.
- Also remove OPENSSL_SMALL (which should have been
OPENSSL_SMALL_FOOTPRINT) which isn't a priority at the moment.
Reviewed-by: Rich Salz <rsalz@openssl.org>