Commit graph

45 commits

Author SHA1 Message Date
Andy Polyakov
64d92d7498 x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results.
"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

AVX-512 results cover even Skylake-X :-)

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-21 14:07:32 +02:00
Andy Polyakov
54f8f9a1ed x86_64 assembly pack: fill some blanks in Ryzen results.
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 18:17:00 +02:00
Andy Polyakov
6cbfd94d08 x86_64 assembly pack: add some Ryzen performance results.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2017-03-22 10:58:01 +01:00
Andy Polyakov
399976c7ba sha/asm/*-x86_64.pl: add CFI annotations.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-15 15:43:05 +01:00
Andy Polyakov
384e6de4c7 x86_64 assembly pack: Win64 SEH face-lift.
- harmonize handlers with guidelines and themselves;
- fix some bugs in handlers;
- add missing handlers in chacha and ecp_nistz256 modules;

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-06 08:21:42 +01:00
Andy Polyakov
ace05265d2 x86_64 assembly pack: add Goldmont performance results.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-10-24 13:01:13 +02:00
David Benjamin
609b0852e4 Remove trailing whitespace from some files.
The prevailing style seems to not have trailing whitespace, but a few
lines do. This is mostly in the perlasm files, but a few C files got
them after the reformat. This is the result of:

  find . -name '*.pl' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
  find . -name '*.c' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
  find . -name '*.h' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'

Then bn_prime.h was excluded since this is a generated file.

Note mkerr.pl has some changes in a heredoc for some help output, but
other lines there lack trailing whitespace too.

Reviewed-by: Kurt Roeckx <kurt@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
2016-10-10 23:36:21 +01:00
klemens
6025001707 spelling fixes, just comments and readme.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1413)
2016-08-05 19:07:30 -04:00
Andy Polyakov
cfe1d9929e x86_64 assembly pack: tolerate spaces in source directory name.
[as it is now quoting $output is not required, but done just in case]

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-29 14:12:51 +02:00
Rich Salz
6aa36e8e5a Add OpenSSL copyright to .pl files
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-21 08:23:39 -04:00
Andy Polyakov
b974943234 x86_64 assembly pack: tune clang version detection even further.
RT#4171

Reviewed-by: Kurt Roeckx <kurt@openssl.org>
2015-12-13 22:18:18 +01:00
Andy Polyakov
76eba0d94b x86_64 assembly pack: tune clang version detection.
RT#4142

Reviewed-by: Richard Levitte <levitte@openssl.org>
2015-11-23 16:00:06 +01:00
Andy Polyakov
b7f5503fa6 Skylake performance results.
Reviewed-by: Matt Caswell <matt@openssl.org>
2015-09-26 19:50:11 +02:00
Andy Polyakov
b59f92e75d x86[_64] assembly pack: add Silvermont performance data.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2014-08-30 19:13:49 +02:00
Andy Polyakov
07b635cceb sha[1|512]-x86_64.pl: fix logical errors with $shaext=0. 2014-07-07 17:01:07 +02:00
Andy Polyakov
7eb9680ae1 sha512-x86_64.pl: fix typo.
PR: #3431
2014-07-05 23:59:57 +02:00
Andy Polyakov
29be3f6411 sha512-x86_64.pl: fix linking problem under Windows. 2014-07-01 17:11:22 +02:00
Andy Polyakov
a356e488ad x86_64 assembly pack: refine clang detection. 2014-06-28 17:23:21 +02:00
Andy Polyakov
7eb0488280 x86_64 assembly pack: addendum to last clang commit. 2014-06-24 08:37:05 +02:00
Andy Polyakov
ac171925ab x86_64 assembly pack: allow clang to compile AVX code. 2014-06-24 08:24:25 +02:00
Andy Polyakov
977f32e852 Facilitate back-porting of AESNI and SHA modules.
Fix SEH and stack handling in Win64 build.
2014-06-12 21:45:41 +02:00
Andy Polyakov
619b94667c Add support for Intel SHA extension. 2014-06-11 10:27:45 +02:00
Andy Polyakov
147cca8f53 sha/asm/sha512-x86_64.pl: fix compilation error on Solaris. 2014-02-26 09:30:03 +01:00
Andy Polyakov
006784378d crypto/sha/asm/sha*-x86_64.pl: comply with Win64 ABI. 2013-07-31 23:50:15 +02:00
Andy Polyakov
c7f690c243 sha512-x86_64.pl: upcoming-Atom-specific optimization. 2013-06-10 22:29:01 +02:00
Andy Polyakov
504bbcf3cd sha512-x86_64.pl: +16% optimization for Atom.
(and pending AVX2 changes).
2013-05-25 19:02:57 +02:00
Andy Polyakov
c4558efbf3 sha512-x86_64.pl: add AVX2 code path. 2013-02-14 15:39:42 +01:00
Andy Polyakov
46bf83f07a x86_64 assembly pack: make Windows build more robust.
PR: 2963 and a number of others
2013-01-22 22:27:28 +01:00
Andy Polyakov
f6ff1aa8e0 sha512-x86_64.pl: revert previous change and solve the problem through
perlasm/x86_64-xlate.pl instead.
2012-08-13 12:34:36 +00:00
Andy Polyakov
3a5485a9f8 sha512-x86_64.pl: minimum gas requirement for AMD XOP. 2012-08-13 11:01:44 +00:00
Andy Polyakov
6251989eb6 x86_64 assembly pack: make it possible to compile with Perl located on
path with spaces.

PR: 2835
2012-06-27 10:08:23 +00:00
Andy Polyakov
faee82c1bc sha512-x86_64.pl: fix typo. 2012-06-25 17:13:15 +00:00
Andy Polyakov
a8f3b8b519 sha512-x86_64.pl: add SIMD code paths. 2012-06-24 19:22:06 +00:00
Andy Polyakov
ad880dc469 sha512-x86_64.pl: fix typo. 2012-06-19 07:50:10 +00:00
Andy Polyakov
3a9b3852c6 sha256-586.pl: squeeze some more, most notably ~10% on Nehalem. 2012-06-12 14:38:01 +00:00
Andy Polyakov
83698d3191 sha512-x86_64.pl: >5% better performance. 2012-05-28 17:47:15 +00:00
Andy Polyakov
d2fd65f6f6 sha512-x86_64.pl: +15% better performance on Westmere and incidentally Atom.
Other Intel processors +5%, Opteron -2%.
2011-09-17 11:30:28 +00:00
Andy Polyakov
3efe51a407 Revert previous Linux-specific/centric commit#19629. If it really has to
be done, it's definitely not the way to do it. So far answer to the
question was to ./config -Wa,--noexecstack (adopted by RedHat).
2010-05-05 22:05:39 +00:00
Ben Laurie
0e3ef596e5 Non-executable stack in asm. 2010-05-05 15:50:13 +00:00
Andy Polyakov
be01f79d3d x86_64 assembler pack: add support for Win64 SEH. 2008-12-19 11:17:29 +00:00
Andy Polyakov
aa8f38e49b x86_64 assembler pack to comply with updated styling x86_64-xlate.pl rules. 2008-11-12 08:15:52 +00:00
Andy Polyakov
55eab3b74b Make x86_64 modules work under Win64/x64. 2007-08-23 12:01:58 +00:00
Andy Polyakov
c5f17d45c1 Further synchronizations with md32_common.h update, consistent naming
for low-level SHA block routines.
2006-10-17 16:13:18 +00:00
Andy Polyakov
4a5b8a5bee Commentary section update in sha512-x86_64.pl. 2005-07-25 13:29:42 +00:00
Andy Polyakov
2337eb5823 SHA-256/-512 x86_64 assembler module. 2005-07-24 12:28:04 +00:00