"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.
AVX-512 results cover even Skylake-X :-)
Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!
Reviewed-by: Rich Salz <rsalz@openssl.org>
The assembler already knows the actual path to the generated file and,
in other perlasm architectures, is left to manage debug symbols itself.
Notably, in OpenSSL 1.1.x's new build system, which allows a separate
build directory, converting .pl to .s as the scripts currently do result
in the wrong paths.
This also avoids inconsistencies from some of the files using $0 and
some passing in the filename.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3431)
Fix some comments too
[skip ci]
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3069)
Even though Apple refers to Procedure Call Standard for ARM Architecture
(AAPCS), they apparently adhere to custom version that doesn't follow
stack alignment constraints in the said standard. [Why or why? If it's
vendor lock-in thing, then it would be like worst spot ever.] And since
bsaes-armv7 relied on standard alignment, it became problematic to
execute the code on iOS.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Initial IV was disregarded on SHAEXT-capable processors. Amazingly
enough bulk AES128-SHA* talk-to-yourself tests were passing.
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/2992)
This removes the fips configure option. This option is broken as the
required FIPS code is not available.
FIPS_mode() and FIPS_mode_set() are retained for compatibility, but
FIPS_mode() always returns 0, and FIPS_mode_set() can only be used to
turn FIPS mode off.
Reviewed-by: Stephen Henson <steve@openssl.org>
Prevent undefined behavior in CRYPTO_cbc128_encrypt: calling this function
with the 'len' parameter being 0 would result in a memcpy where the source
and destination parameters are the same, which is undefined behavior.
Do same for AES_ige_encrypt.
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/2671)
- harmonize handlers with guidelines and themselves;
- fix some bugs in handlers;
- add missing handlers in chacha and ecp_nistz256 modules;
Reviewed-by: Rich Salz <rsalz@openssl.org>
Some of stone-age assembler can't cope with r0 in address. It's actually
sensible thing to do, because r0 is shunted to 0 in address arithmetic
and by refusing r0 assembler effectively makes you understand that.
Reviewed-by: Rich Salz <rsalz@openssl.org>
crypto/evp/e_aes.c: Types of inp and out parameters of
AES_xts_en/decrypt functions need to be changed from char to
unsigned char to avoid build error due to
'-Werror=incompatible-pointer-types'.
crypto/aes/asm/aes-s390x.pl: Comments need to reflect the above
change.
Signed-off-by: Patrick Steuer <psteuer@mail.de>
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
CLA: trivial
The prevailing style seems to not have trailing whitespace, but a few
lines do. This is mostly in the perlasm files, but a few C files got
them after the reformat. This is the result of:
find . -name '*.pl' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
find . -name '*.c' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
find . -name '*.h' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
Then bn_prime.h was excluded since this is a generated file.
Note mkerr.pl has some changes in a heredoc for some help output, but
other lines there lack trailing whitespace too.
Reviewed-by: Kurt Roeckx <kurt@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
MIPS[32|64]R6 is binary and source incompatible with previous MIPS ISA
specifications. Fortunately it's still possible to resolve differences
in source code with standard pre-processor and switching to trap-free
version of addition and subtraction instructions.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Fix some indentation at the same time
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1292)
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Kurt Roeckx <kurt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1264)
This is useful in Linux kernel context, in cases data happens
to be fragmented and processing can take multiple calls.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Add copyright to missing assembler files.
Add copyrights to missing test/* files.
Add copyrights
Various source and misc files.
Reviewed-by: Richard Levitte <levitte@openssl.org>
IBM argues that in certain scenarios capability query is really
expensive. At the same time it's asserted that query results can
be safely cached, because disabling CPACF is incompatible with
reboot-free operation.
Reviewed-by: Tim Hudson <tjh@openssl.org>
The Unix build was the last to retain the classic build scheme. The
new unified scheme has matured enough, even though some details may
need polishing.
Reviewed-by: Rich Salz <rsalz@openssl.org>
As it turns out branch hints grew as kind of a misconception. In
addition their interpretation by GNU assembler is affected by
assembler flags and can end up with opposite meaning on different
processors. As we have to loose quite a lot on misinterprerations,
especially on newer processors, we just omit them altogether.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Since NDEBUG is defined unconditionally on command line for release
builds, we can omit *_DEBUG options in favour of effective "all-on"
in debug builds exercised though CI.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Emilia Käsper <emilia@openssl.org>
Don't have #error statements in header files, but instead wrap
the contents of that file in #ifndef OPENSSL_NO_xxx
This means it is now always safe to include the header file.
Reviewed-by: Richard Levitte <levitte@openssl.org>
The reason to do so is that some of the generators detect PIC flags
like -fPIC and -KPIC, and those are normally delivered in LD_CFLAGS.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Some of these scripts would recognise an output parameter if it looks
like a file path. That works both in both the classic and new build
schemes. Some fo these scripts would only recognise it if it's a
basename (i.e. no directory component). Those need to be corrected,
as the output parameter in the new build scheme is more likely to
contain a directory component than not.
Reviewed-by: Andy Polyakov <appro@openssl.org>