This addresses
- request for improvement for faster key setup in RT#3576;
- clearing registers and stack in RT#3554 (this is more of a gesture to
see if there will be some traction from compiler side);
- more commentary around input parameters handling and stack layout
(desired when RT#3553 was reviewed);
- minor size and single block performance optimization (was lying around);
Reviewed-by: Matt Caswell <matt@openssl.org>
(cherry picked from commit 23f6eec71d)
ARM has optimized Cortex-A5x pipeline to favour pairs of complementary
AES instructions. While modified code improves performance of post-r0p0
Cortex-A53 performance by >40% (for CBC decrypt and CTR), it hurts
original r0p0. We favour later revisions, because one can't prevent
future from coming. Improvement on post-r0p0 Cortex-A57 exceeds 50%,
while new code is not slower on r0p0, or Apple A7 for that matter.
[Update even SHA results for latest Cortex-A53.]
Reviewed-by: Richard Levitte <levitte@openssl.org>
(cherry picked from commit 94376cccb4)
Td4 and Te4 are arrays of u8. A u8 << int promotes the u8 to an int first then shifts.
If the mathematical result of a shift (as modelled by lhs * 2^{rhs}) is not representable
in an integer, behaviour is undefined. In other words, you can't shift into the sign bit
of a signed integer. Fix this by casting to u32 whenever we're shifting left by 24.
(For consistency, cast other shifts, too.)
Caught by -fsanitize=shift
Submitted by Nick Lewycky (Google)
Reviewed-by: Andy Polyakov <appro@openssl.org>
(cherry picked from commit 8b37e5c14f)
indent will not alter them when reformatting comments
(cherry picked from commit 1d97c84351)
Conflicts:
crypto/bn/bn_lcl.h
crypto/bn/bn_prime.c
crypto/engine/eng_all.c
crypto/rc4/rc4_utl.c
crypto/sha/sha.h
ssl/kssl.c
ssl/t1_lib.c
Reviewed-by: Tim Hudson <tjh@openssl.org>
This facilitates "universal" builds, ones that target multiple
architectures, e.g. ARMv5 through ARMv7. See commentary in
Configure for details.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(cherry picked from commit c1669e1c20)
Improve CBC decrypt and CTR by ~13/16%, which adds up to ~25/33%
improvement over "pre-Silvermont" version. [Add performance table to
aesni-x86.pl].
(cherry picked from commit 5599c7331b)
Add support for key wrap algorithms via EVP interface.
Generalise AES wrap algorithm and add to modes, making existing
AES wrap algorithm a special case.
Move test code to evptests.txt
(cherry picked from commit 97cf1f6c28)
Conflicts:
CHANGES