It was also found that stich performs suboptimally on AMD Jaguar, hence
execution is limited to XOP-capable and Intel processors.
Reviewed-by: Kurt Roeckx <kurt@openssl.org>
(cherry picked from commit a5fd24d19b)
This leaves behind files with names ending with '.iso-8859-1'. These
should be safe to remove. If something went wrong when re-encoding,
there will be some files with names ending with '.utf8' left behind.
Reviewed-by: Rich Salz <rsalz@openssl.org>
We had updates of certain header files in both Makefile.org and the
Makefile in the directory the header file lived in. This is error
prone and also sometimes generates slightly different results (usually
just a comment that differs) depending on which way the update was
done.
This removes the file update targets from the top level Makefile, adds
an update: target in all Makefiles and has it depend on the depend: or
local_depend: targets, whichever is appropriate, so we don't get a
double run through the whole file tree.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(cherry picked from commit 0f539dc1a2)
Conflicts:
Makefile.org
apps/Makefile
test/Makefile
This addresses
- request for improvement for faster key setup in RT#3576;
- clearing registers and stack in RT#3554 (this is more of a gesture to
see if there will be some traction from compiler side);
- more commentary around input parameters handling and stack layout
(desired when RT#3553 was reviewed);
- minor size and single block performance optimization (was lying around);
Reviewed-by: Matt Caswell <matt@openssl.org>
(cherry picked from commit 23f6eec71d)
ARM has optimized Cortex-A5x pipeline to favour pairs of complementary
AES instructions. While modified code improves performance of post-r0p0
Cortex-A53 performance by >40% (for CBC decrypt and CTR), it hurts
original r0p0. We favour later revisions, because one can't prevent
future from coming. Improvement on post-r0p0 Cortex-A57 exceeds 50%,
while new code is not slower on r0p0, or Apple A7 for that matter.
[Update even SHA results for latest Cortex-A53.]
Reviewed-by: Richard Levitte <levitte@openssl.org>
(cherry picked from commit 94376cccb4)
Td4 and Te4 are arrays of u8. A u8 << int promotes the u8 to an int first then shifts.
If the mathematical result of a shift (as modelled by lhs * 2^{rhs}) is not representable
in an integer, behaviour is undefined. In other words, you can't shift into the sign bit
of a signed integer. Fix this by casting to u32 whenever we're shifting left by 24.
(For consistency, cast other shifts, too.)
Caught by -fsanitize=shift
Submitted by Nick Lewycky (Google)
Reviewed-by: Andy Polyakov <appro@openssl.org>
(cherry picked from commit 8b37e5c14f)
indent will not alter them when reformatting comments
(cherry picked from commit 1d97c84351)
Conflicts:
crypto/bn/bn_lcl.h
crypto/bn/bn_prime.c
crypto/engine/eng_all.c
crypto/rc4/rc4_utl.c
crypto/sha/sha.h
ssl/kssl.c
ssl/t1_lib.c
Reviewed-by: Tim Hudson <tjh@openssl.org>
This facilitates "universal" builds, ones that target multiple
architectures, e.g. ARMv5 through ARMv7. See commentary in
Configure for details.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(cherry picked from commit c1669e1c20)
Improve CBC decrypt and CTR by ~13/16%, which adds up to ~25/33%
improvement over "pre-Silvermont" version. [Add performance table to
aesni-x86.pl].
(cherry picked from commit 5599c7331b)