Andy Polyakov
64d92d7498
x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results.
...
"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.
AVX-512 results cover even Skylake-X :-)
Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-21 14:07:32 +02:00
Andy Polyakov
54f8f9a1ed
x86_64 assembly pack: fill some blanks in Ryzen results.
...
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 18:17:00 +02:00
Andy Polyakov
6cbfd94d08
x86_64 assembly pack: add some Ryzen performance results.
...
Reviewed-by: Tim Hudson <tjh@openssl.org>
2017-03-22 10:58:01 +01:00
Andy Polyakov
399976c7ba
sha/asm/*-x86_64.pl: add CFI annotations.
...
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-15 15:43:05 +01:00
Andy Polyakov
384e6de4c7
x86_64 assembly pack: Win64 SEH face-lift.
...
- harmonize handlers with guidelines and themselves;
- fix some bugs in handlers;
- add missing handlers in chacha and ecp_nistz256 modules;
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-06 08:21:42 +01:00
Andy Polyakov
ace05265d2
x86_64 assembly pack: add Goldmont performance results.
...
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-10-24 13:01:13 +02:00
David Benjamin
609b0852e4
Remove trailing whitespace from some files.
...
The prevailing style seems to not have trailing whitespace, but a few
lines do. This is mostly in the perlasm files, but a few C files got
them after the reformat. This is the result of:
find . -name '*.pl' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
find . -name '*.c' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
find . -name '*.h' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
Then bn_prime.h was excluded since this is a generated file.
Note mkerr.pl has some changes in a heredoc for some help output, but
other lines there lack trailing whitespace too.
Reviewed-by: Kurt Roeckx <kurt@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
2016-10-10 23:36:21 +01:00
klemens
6025001707
spelling fixes, just comments and readme.
...
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1413 )
2016-08-05 19:07:30 -04:00
Andy Polyakov
cfe1d9929e
x86_64 assembly pack: tolerate spaces in source directory name.
...
[as it is now quoting $output is not required, but done just in case]
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-29 14:12:51 +02:00
Rich Salz
6aa36e8e5a
Add OpenSSL copyright to .pl files
...
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-21 08:23:39 -04:00
Andy Polyakov
b974943234
x86_64 assembly pack: tune clang version detection even further.
...
RT#4171
Reviewed-by: Kurt Roeckx <kurt@openssl.org>
2015-12-13 22:18:18 +01:00
Andy Polyakov
76eba0d94b
x86_64 assembly pack: tune clang version detection.
...
RT#4142
Reviewed-by: Richard Levitte <levitte@openssl.org>
2015-11-23 16:00:06 +01:00
Andy Polyakov
b7f5503fa6
Skylake performance results.
...
Reviewed-by: Matt Caswell <matt@openssl.org>
2015-09-26 19:50:11 +02:00
Andy Polyakov
b59f92e75d
x86[_64] assembly pack: add Silvermont performance data.
...
Reviewed-by: Rich Salz <rsalz@openssl.org>
2014-08-30 19:13:49 +02:00
Andy Polyakov
07b635cceb
sha[1|512]-x86_64.pl: fix logical errors with $shaext=0.
2014-07-07 17:01:07 +02:00
Andy Polyakov
7eb9680ae1
sha512-x86_64.pl: fix typo.
...
PR: #3431
2014-07-05 23:59:57 +02:00
Andy Polyakov
29be3f6411
sha512-x86_64.pl: fix linking problem under Windows.
2014-07-01 17:11:22 +02:00
Andy Polyakov
a356e488ad
x86_64 assembly pack: refine clang detection.
2014-06-28 17:23:21 +02:00
Andy Polyakov
7eb0488280
x86_64 assembly pack: addendum to last clang commit.
2014-06-24 08:37:05 +02:00
Andy Polyakov
ac171925ab
x86_64 assembly pack: allow clang to compile AVX code.
2014-06-24 08:24:25 +02:00
Andy Polyakov
977f32e852
Facilitate back-porting of AESNI and SHA modules.
...
Fix SEH and stack handling in Win64 build.
2014-06-12 21:45:41 +02:00
Andy Polyakov
619b94667c
Add support for Intel SHA extension.
2014-06-11 10:27:45 +02:00
Andy Polyakov
147cca8f53
sha/asm/sha512-x86_64.pl: fix compilation error on Solaris.
2014-02-26 09:30:03 +01:00
Andy Polyakov
006784378d
crypto/sha/asm/sha*-x86_64.pl: comply with Win64 ABI.
2013-07-31 23:50:15 +02:00
Andy Polyakov
c7f690c243
sha512-x86_64.pl: upcoming-Atom-specific optimization.
2013-06-10 22:29:01 +02:00
Andy Polyakov
504bbcf3cd
sha512-x86_64.pl: +16% optimization for Atom.
...
(and pending AVX2 changes).
2013-05-25 19:02:57 +02:00
Andy Polyakov
c4558efbf3
sha512-x86_64.pl: add AVX2 code path.
2013-02-14 15:39:42 +01:00
Andy Polyakov
46bf83f07a
x86_64 assembly pack: make Windows build more robust.
...
PR: 2963 and a number of others
2013-01-22 22:27:28 +01:00
Andy Polyakov
f6ff1aa8e0
sha512-x86_64.pl: revert previous change and solve the problem through
...
perlasm/x86_64-xlate.pl instead.
2012-08-13 12:34:36 +00:00
Andy Polyakov
3a5485a9f8
sha512-x86_64.pl: minimum gas requirement for AMD XOP.
2012-08-13 11:01:44 +00:00
Andy Polyakov
6251989eb6
x86_64 assembly pack: make it possible to compile with Perl located on
...
path with spaces.
PR: 2835
2012-06-27 10:08:23 +00:00
Andy Polyakov
faee82c1bc
sha512-x86_64.pl: fix typo.
2012-06-25 17:13:15 +00:00
Andy Polyakov
a8f3b8b519
sha512-x86_64.pl: add SIMD code paths.
2012-06-24 19:22:06 +00:00
Andy Polyakov
ad880dc469
sha512-x86_64.pl: fix typo.
2012-06-19 07:50:10 +00:00
Andy Polyakov
3a9b3852c6
sha256-586.pl: squeeze some more, most notably ~10% on Nehalem.
2012-06-12 14:38:01 +00:00
Andy Polyakov
83698d3191
sha512-x86_64.pl: >5% better performance.
2012-05-28 17:47:15 +00:00
Andy Polyakov
d2fd65f6f6
sha512-x86_64.pl: +15% better performance on Westmere and incidentally Atom.
...
Other Intel processors +5%, Opteron -2%.
2011-09-17 11:30:28 +00:00
Andy Polyakov
3efe51a407
Revert previous Linux-specific/centric commit#19629. If it really has to
...
be done, it's definitely not the way to do it. So far answer to the
question was to ./config -Wa,--noexecstack (adopted by RedHat).
2010-05-05 22:05:39 +00:00
Ben Laurie
0e3ef596e5
Non-executable stack in asm.
2010-05-05 15:50:13 +00:00
Andy Polyakov
be01f79d3d
x86_64 assembler pack: add support for Win64 SEH.
2008-12-19 11:17:29 +00:00
Andy Polyakov
aa8f38e49b
x86_64 assembler pack to comply with updated styling x86_64-xlate.pl rules.
2008-11-12 08:15:52 +00:00
Andy Polyakov
55eab3b74b
Make x86_64 modules work under Win64/x64.
2007-08-23 12:01:58 +00:00
Andy Polyakov
c5f17d45c1
Further synchronizations with md32_common.h update, consistent naming
...
for low-level SHA block routines.
2006-10-17 16:13:18 +00:00
Andy Polyakov
4a5b8a5bee
Commentary section update in sha512-x86_64.pl.
2005-07-25 13:29:42 +00:00
Andy Polyakov
2337eb5823
SHA-256/-512 x86_64 assembler module.
2005-07-24 12:28:04 +00:00