288 commits

Author SHA1 Message Date
Xiaoyin Liu
bac5b39c96 Fix typo in sha1-thumb.pl
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4056)
2017-07-30 21:26:38 -04:00
Andy Polyakov
e3c79f0f19 sha/asm/keccak1600-avx512.pl: improve performance by 17%.
The improvement comes from combining data layout ideas from the
Keccak Code Package with the initial version of this module.

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-24 21:23:01 +02:00
Andy Polyakov
0d7903f83f sha/asm/keccak1600-avx512.pl: absorb bug-fix and minor optimization.
Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-21 14:12:14 +02:00
Andy Polyakov
64d92d7498 x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results.
"Optimize" is in quotes because it's rather a "salvage operation"
for now. The idea is to identify processor capability flags that
drive Knights Landing to suboptimal code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. The former affects
the choice of the AES-NI code path specific to Silvermont (Knights
Landing is of Silvermont "ancestry"), and 64-bit ADCX/ADOX instructions
are effectively mishandled at decode time. In both cases we are looking
at a ~2x improvement.

AVX-512 results cover even Skylake-X :-)

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-21 14:07:32 +02:00
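
As a rough illustration of the flag masking described in 64d92d7498 above, the sketch below clears the XSAVE and ADCX/ADOX capability bits when the processor is identified as Knights Landing. The bit positions follow the Intel CPUID layout (XSAVE is CPUID.1:ECX bit 26, ADX is CPUID.(EAX=7,ECX=0):EBX bit 19). The function name, its parameters, and the `is_knights_landing` flag are hypothetical; the actual change is not shown in this log.

```python
# Illustrative sketch only: names and structure are assumptions,
# not taken from the OpenSSL sources.
XSAVE_BIT = 1 << 26  # CPUID.1:ECX bit 26
ADX_BIT = 1 << 19    # CPUID.(EAX=7,ECX=0):EBX bit 19 (ADCX/ADOX)


def mask_capabilities(leaf1_ecx, leaf7_ebx, is_knights_landing):
    """Return the two capability words, with XSAVE and ADX hidden
    when the caller has identified the CPU as Knights Landing."""
    if not is_knights_landing:
        return leaf1_ecx, leaf7_ebx
    return leaf1_ecx & ~XSAVE_BIT, leaf7_ebx & ~ADX_BIT
```

Code that later selects an implementation by testing these bits would then fall through to the paths that do not rely on XSAVE or ADCX/ADOX, which is the effect the commit message describes.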
Andy Polyakov
d212b98b36 sha/asm/keccak1600-avx2.pl: optimized remodelled version.
The new register usage pattern achieves slightly better
performance, though not as much as I hoped for. Performance is believed
to be limited by irreconcilable write-back conflicts, rather than
lack of computational resources or data dependencies.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-15 23:04:38 +02:00
Andy Polyakov
91dbdc63bd sha/asm/keccak1600-avx2.pl: remodel register usage.
This gives much more freedom to rearrange instructions. This is
the unoptimized version, provided for reference. Compare it to
the initial 29724d0e15 to see the key difference.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-15 23:04:18 +02:00
Andy Polyakov
c7c7a8e601 Optimize sha/asm/keccak1600-avx2.pl.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-10 10:16:42 +02:00
Andy Polyakov
29724d0e15 Add sha/asm/keccak1600-avx2.pl.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-10 10:16:31 +02:00
Andy Polyakov
313fa47fea Add sha/asm/keccak1600-avx512.pl.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/3861)
2017-07-07 10:04:33 +02:00
Andy Polyakov
edbc681d22 sha/asm/keccak1600-x86_64.pl: close gap with Keccak Code Package.
[Also typo and readability fixes. Ryzen result is added.]

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 18:18:02 +02:00
Andy Polyakov
b547aba954 sha/asm/keccak1600-s390x.pl: typo and readability, minor size optimization.
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 18:17:55 +02:00
Andy Polyakov
54f8f9a1ed x86_64 assembly pack: fill some blanks in Ryzen results.
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 18:17:00 +02:00
Andy Polyakov
7807267bed Add sha/asm/keccak1600-s390x.pl.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 21:16:02 +02:00
Andy Polyakov
d6f0c94a65 sha/asm/keccak1600-x86_64.pl: add CFI directives.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 21:15:56 +02:00
Andy Polyakov
a1613840dd sha/asm/keccak1600-x86_64.pl: optimize by re-ordering instructions.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 21:15:51 +02:00
Andy Polyakov
a078d9dfa9 sha/asm/keccak1600-x86_64.pl: remove redundant moves.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 21:15:45 +02:00
Andy Polyakov
64aef3f53d Add sha/asm/keccak1600-x86_64.pl.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2017-06-29 21:15:09 +02:00
Andy Polyakov
a163e60d95 sha/asm/keccak1600-mmx.pl: optimize for Atom and add comparison data.
Curiously enough, out-of-order Silvermont benefited most from the
optimization, 33%. [The originally mentioned "anomaly" turned out to be
a misreported frequency scaling problem. Correct results were
collected under an older kernel.]

Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/3739)
2017-06-24 09:42:14 +02:00
Andy Polyakov
415248e1e1 Add sha/asm/keccak1600-mmx.pl, x86 MMX module.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/3739)
2017-06-24 09:42:08 +02:00
Andy Polyakov
b5cdec2fea sha/asm/sha512p8-ppc.pl: add POWER8 performance data.
[skip ci]

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3705)
2017-06-21 16:26:59 +02:00
Andy Polyakov
53ddf7dd05 Add Keccak-1600 modules for PPC64 and POWER8.
[skip ci]

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3705)
2017-06-21 16:24:36 +02:00
Andy Polyakov
1d23bbccd3 Add sha/asm/keccak1600-c64x.pl
[skip ci]

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/3708)
2017-06-21 15:21:47 +02:00
Andy Polyakov
5eb2dd88b3 Add sha/asm/keccak1600-armv8.pl.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-15 21:53:30 +02:00
Andy Polyakov
6dad1efef7 sha/asm/keccak1600-armv4.pl: switch to more efficient bit interleaving algorithm.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-08 20:21:31 +02:00
Andy Polyakov
367c552790 sha/asm/keccak1600-armv4.pl: add NEON code path.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 19:54:29 +02:00
Andy Polyakov
56676f877d sha/asm/keccak1600-armv4.pl: add SHA3_absorb and SHA3_squeeze.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 19:54:24 +02:00
Andy Polyakov
5371810714 sha/asm/keccak1600-armv4.pl: optimization based on profiler feedback.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 19:54:19 +02:00
Andy Polyakov
aabfd32910 Add sha/asm/keccak1600-armv4.pl.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-06 19:54:12 +02:00
David Benjamin
e195c8a256 Remove filename argument to x86 asm_init.
The assembler already knows the actual path to the generated file and,
in other perlasm architectures, is left to manage debug symbols itself.
Notably, in OpenSSL 1.1.x's new build system, which allows a separate
build directory, converting .pl to .s as the scripts currently do results
in the wrong paths.

This also avoids inconsistencies from some of the files using $0 and
some passing in the filename.

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3431)
2017-05-11 17:00:23 -04:00
FdaSilvaYY
69687aa829 More typo fixes
Fix some comments too
[skip ci]

Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3069)
2017-03-29 07:14:29 +02:00
Andy Polyakov
6cbfd94d08 x86_64 assembly pack: add some Ryzen performance results.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2017-03-22 10:58:01 +01:00
Emilia Kasper
b53338cbf8 Clean up references to FIPS
This removes the fips configure option. This option is broken as the
required FIPS code is not available.

FIPS_mode() and FIPS_mode_set() are retained for compatibility, but
FIPS_mode() always returns 0, and FIPS_mode_set() can only be used to
turn FIPS mode off.

Reviewed-by: Stephen Henson <steve@openssl.org>
2017-02-28 15:26:25 +01:00
Andy Polyakov
399976c7ba sha/asm/*-x86_64.pl: add CFI annotations.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-15 15:43:05 +01:00
Adam Langley
1f9e00a6fc sha/asm/sha1-x86_64.pl: add CFI annotations.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/2590)
2017-02-11 21:33:33 +01:00
Andy Polyakov
384e6de4c7 x86_64 assembly pack: Win64 SEH face-lift.
- harmonize handlers with guidelines and themselves;
- fix some bugs in handlers;
- add missing handlers in chacha and ecp_nistz256 modules;

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-06 08:21:42 +01:00
Andy Polyakov
a30b0522cb x86 assembly pack: update performance results.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-12-19 16:18:25 +01:00
Andy Polyakov
32bbb62ea6 sha/asm/sha512-armv8.pl: fix big-endian support in __KERNEL__ case.
In non-__KERNEL__ context 32-bit-style __ARMEB__/__ARMEL__ macros were
set in arm_arch.h, which is shared between 32- and 64-bit builds. Since
it's not included in __KERNEL__ case, we have to adhere to official
64-bit pre-defines, __AARCH64EB__/__AARCH64EL__.

[If we are to share more code, it would need similar adjustment.]

Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-11-17 19:29:58 +01:00
Andy Polyakov
866e505e0d sha/asm/sha512-armv8.pl: add NEON version of SHA256.
This provides up to 30% better performance on some recent processors.

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-11-11 13:48:16 +01:00
Andy Polyakov
413b6a8259 sha/asm/sha512-armv8.pl: adapt for kernel use.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-10-24 14:21:07 +02:00
Andy Polyakov
ace05265d2 x86_64 assembly pack: add Goldmont performance results.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-10-24 13:01:13 +02:00
David Benjamin
609b0852e4 Remove trailing whitespace from some files.
The prevailing style seems to not have trailing whitespace, but a few
lines do. This is mostly in the perlasm files, but a few C files got
them after the reformat. This is the result of:

  find . -name '*.pl' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
  find . -name '*.c' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
  find . -name '*.h' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'

Then bn_prime.h was excluded since this is a generated file.

Note mkerr.pl has some changes in a heredoc for some help output, but
other lines there lack trailing whitespace too.

Reviewed-by: Kurt Roeckx <kurt@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
2016-10-10 23:36:21 +01:00
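
The sed expression in 609b0852e4 above deletes any run of spaces and tabs at the end of each line. A minimal Python sketch of the same transformation follows. The function name is hypothetical, and it operates on a string rather than editing files in place as `sed -i` does.

```python
import re

# Matches one or more spaces or tabs immediately before end of line,
# mirroring the sed pattern ( |<TAB>)*$ used in the commit.
TRAILING_WS = re.compile(r"[ \t]+$")


def strip_trailing_whitespace(text):
    """Remove trailing spaces and tabs from every line of text."""
    lines = text.split("\n")
    return "\n".join(TRAILING_WS.sub("", line) for line in lines)
```

Leading whitespace and line endings are left alone, so indentation made of tabs survives and only the end of each line changes.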
Andy Polyakov
947716c187 MIPS assembly pack: adapt it for MIPS[32|64]R6.
MIPS[32|64]R6 is binary and source incompatible with previous MIPS ISA
specifications. Fortunately it's still possible to resolve differences
in source code with standard pre-processor and switching to trap-free
version of addition and subtraction instructions.

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-09-02 13:33:17 +02:00
Andy Polyakov
05ef4d1980 ARMv8 assembly pack: add Samsung Mongoose results.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2016-08-16 12:47:49 +02:00
Andy Polyakov
7123aa81e9 sha/asm/sha1-x86_64.pl: fix crash in SHAEXT code on Windows.
RT#4530

Reviewed-by: Tim Hudson <tjh@openssl.org>
2016-08-11 13:39:57 +02:00
klemens
6025001707 spelling fixes, just comments and readme.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1413)
2016-08-05 19:07:30 -04:00
Rich Salz
b8a9af6881 Remove/rename some old files.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-06-01 11:29:57 -04:00
Andy Polyakov
cfe1d9929e x86_64 assembly pack: tolerate spaces in source directory name.
[as it stands, quoting $output is not required, but it is done just in case]

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-29 14:12:51 +02:00
Rich Salz
6aa36e8e5a Add OpenSSL copyright to .pl files
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-21 08:23:39 -04:00
Andy Polyakov
c6cb8e3ca4 Alpha assembly pack: make it work on Linux.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-04 08:51:08 +02:00
Andy Polyakov
f7dc4a3bd7 MIPS assembly pack: fix MIPS64 assembler warnings.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-04 08:48:53 +02:00