Commit graph

85 commits

Author SHA1 Message Date
Andy Polyakov
4dfe4310c3 poly1305/asm/poly1305-x86_64.pl: add Knights Landing AVX512 result.
Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4855)
2017-12-23 16:06:25 +01:00
Andy Polyakov
a8f302e5ba poly1305/asm/poly1305-x86_64.pl: switch to pure AVX512F.
Convert AVX512F+VL+BW code path to pure AVX512F, so that it can be
executed even on Knights Landing. Trigger for modification was
observation that AVX512 code paths can negatively affect overall
Skylake-X system performance. Since we are likely to suppress
AVX512F capability flag [at least on Skylake-X], conversion serves
as kind of "investment protection".

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4758)
2017-11-25 22:06:10 +01:00
Andy Polyakov
7533162322 ARMv8 assembly pack: add Qualcomm Kryo results.
[skip ci]

Reviewed-by: Tim Hudson <tjh@openssl.org>
2017-11-13 11:13:00 +01:00
Josh Soref
46f4e1bec5 Many spelling fixes/typo's corrected.
Around 138 distinct errors found and fixed; thanks!

Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3459)
2017-11-11 19:03:10 -05:00
Andy Polyakov
64d92d7498 x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results.
"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

AVX-512 results cover even Skylake-X :-)

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-07-21 14:07:32 +02:00
Andy Polyakov
54f8f9a1ed x86_64 assembly pack: fill some blanks in Ryzen results.
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
2017-07-03 18:17:00 +02:00
David Benjamin
e195c8a256 Remove filename argument to x86 asm_init.
The assembler already knows the actual path to the generated file and,
in other perlasm architectures, is left to manage debug symbols itself.
Notably, in OpenSSL 1.1.x's new build system, which allows a separate
build directory, converting .pl to .s as the scripts currently do result
in the wrong paths.

This also avoids inconsistencies from some of the files using $0 and
some passing in the filename.

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3431)
2017-05-11 17:00:23 -04:00
Andy Polyakov
0a5d1a38f2 poly1305/asm/poly1305-x86_64.pl: add poly1305_blocks_vpmadd52_8x.
As hinted by its name new subroutine processes 8 input blocks in
parallel by loading data to 512-bit registers. It still needs more
work, as it needs to handle some specific input lengths better.
In this sense it's yet another intermediate step...

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-03-22 10:59:59 +01:00
Andy Polyakov
6cbfd94d08 x86_64 assembly pack: add some Ryzen performance results.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2017-03-22 10:58:01 +01:00
Andy Polyakov
c2b935904a poly1305/asm/poly1305-x86_64.pl: add poly1305_blocks_vpmadd52_4x.
As hinted by its name new subroutine processes 4 input blocks in
parallel. It still operates on 256-bit registers and is just
another step toward full-blown AVX512IFMA procedure.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-03-13 18:48:34 +01:00
Andy Polyakov
a25cef89fd poly1305/asm/poly1305-armv8.pl: ilp32-specific poly1305_init fix.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-03-13 18:46:11 +01:00
Andy Polyakov
e052083cc7 poly1305/asm/poly1305-x86_64.pl: minor AVX512 optimization.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-26 21:27:54 +01:00
Andy Polyakov
1c47e8836f poly1305/asm/poly1305-x86_64.pl: add CFI annotations.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-26 21:26:07 +01:00
Andy Polyakov
fd910ef959 poly1305/asm/poly1305-x86_64.pl: add VPMADD52 code path.
This is initial and minimal single-block implementation.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-25 18:36:41 +01:00
Andy Polyakov
73e8a5c826 poly1305/asm/poly1305-x86_64.pl: switch to vpermdd in table expansion.
Effectively it's minor size optimization, 5-6% per affected subroutine.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-25 18:36:37 +01:00
Andy Polyakov
c1e1fc500d poly1305/asm/poly1305-x86_64.pl: optimize AVX512 code path.
On pre-Skylake best optimization strategy was balancing port-specific
instructions, while on Skylake minimizing the sheer amount appears
more sensible.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-02-25 18:35:45 +01:00
Todd Short
52ad5b60e3 Add support for Poly1305 in EVP_PKEY
Add Poly1305 as a "signed" digest.

Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/2128)
2017-01-24 15:40:37 +01:00
Andy Polyakov
9872238eb6 poly1305/poly1305_base2_44.c: clarify shift boundary condition.
Reviewed-by: Matt Caswell <matt@openssl.org>
2017-01-21 22:33:38 +01:00
Andy Polyakov
a30b0522cb x86 assembly pack: update performance results.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-12-19 16:18:25 +01:00
Andy Polyakov
1ea01427c5 poly1305/asm/poly1305-x86_64.pl: allow nasm to assemble AVX512 code.
chacha/asm/chacha-x86_64.pl: refine nasm version detection logic.

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-12-15 17:57:50 +01:00
Andy Polyakov
abb8c44fba x86_64 assembly pack: add AVX512 ChaCha20 and Poly1305 code paths.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-12-12 10:58:04 +01:00
Andy Polyakov
f2d78649fb poly1305/poly1305_base2_44.c: add reference base 2^44 implementation.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-12-12 10:54:59 +01:00
Richard Levitte
10b0b5ecd9 Revert "Move algorithm specific ppccap code from crypto/ppccap.c"
Now that we can link specifically with static libraries, the immediate
need to split ppccap.c (and eventually other *cap.c files) is no more.

This reverts commit e3fb4d3d52.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-11-10 16:24:02 +01:00
Richard Levitte
e3fb4d3d52 Move algorithm specific ppccap code from crypto/ppccap.c
Having that code in one central object file turned out to cause
trouble when building test/modes_internal_test.

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1883)
2016-11-09 02:40:36 +01:00
Richard Levitte
aeac218372 Convert poly1305 selftest into internal test
Reviewed-by: Emilia Käsper <emilia@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1789)
2016-11-03 13:13:31 +01:00
Andy Polyakov
ace05265d2 x86_64 assembly pack: add Goldmont performance results.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-10-24 13:01:13 +02:00
Andy Polyakov
947716c187 MIPS assembly pack: adapt it for MIPS[32|64]R6.
MIPS[32|64]R6 is binary and source incompatible with previous MIPS ISA
specifications. Fortunately it's still possible to resolve differences
in source code with standard pre-processor and switching to trap-free
version of addition and subtraction instructions.

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-09-02 13:33:17 +02:00
Andy Polyakov
05ef4d1980 ARMv8 assembly pack: add Samsung Mongoose results.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2016-08-16 12:47:49 +02:00
klemens
6025001707 spelling fixes, just comments and readme.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/1413)
2016-08-05 19:07:30 -04:00
Andy Polyakov
2c12f22c33 SPARC assembly pack: enforce V8+ ABI constraints.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-07-01 14:25:38 +02:00
Matt Caswell
3ce2fdabe6 Convert memset calls to OPENSSL_cleanse
Ensure things really do get cleared when we intend them to.

Addresses an OCAP Audit issue.

Reviewed-by: Andy Polyakov <appro@openssl.org>
2016-06-30 15:51:57 +01:00
Andy Polyakov
cfe1d9929e x86_64 assembly pack: tolerate spaces in source directory name.
[as it is now quoting $output is not required, but done just in case]

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-29 14:12:51 +02:00
Andy Polyakov
8640f21093 poly1305/asm/poly1305-mips.pl: adhere to standard frame layout.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-28 22:17:59 +02:00
Andy Polyakov
ff823ee89b SPARC assembly pack: add missing .type directives.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-28 22:14:13 +02:00
Rich Salz
6aa36e8e5a Add OpenSSL copyright to .pl files
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-21 08:23:39 -04:00
Rich Salz
aa6bb1352b Copyright consolidation 05/10
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-17 15:38:09 -04:00
Rich Salz
49445f21da Use OPENSSL_hexchar2int
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-16 15:21:10 -04:00
Andy Polyakov
c6b77c16a6 MIPS64 assembly pack: add Poly1305 module.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-10 20:27:52 +02:00
FdaSilvaYY
dccd20d1b5 fix tab-space mixed indentation
No code change

Reviewed-by: Kurt Roeckx <kurt@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
2016-05-09 09:09:55 +01:00
Andy Polyakov
3992e8c023 poly1305/asm/poly1305-x86_64.pl: contain symbols within shared lib.
We don't need it, but external users might find it handy.

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-06 09:48:15 +02:00
Andy Polyakov
284116575d poly1305/asm/poly1305-x86_64.pl: make it cross-compile.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-05-06 09:46:39 +02:00
FdaSilvaYY
8483a003bf various spelling fixes
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/952)
2016-04-28 14:22:26 -04:00
Andy Polyakov
33ea23dc5c SPARCv9 assembly pack: fine-tune run-time switch.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2016-04-26 21:35:05 +02:00
Andy Polyakov
dc3c5067cd crypto/poly1305/asm: chase overflow bit on x86 and ARM platforms.
Even though no test could be found to trigger this, paper-n-pencil
estimate suggests that x86 and ARM inner loop lazy reductions can
loose a bit in H4>>*5+H0 step.

Reviewed-by: Emilia Käsper <emilia@openssl.org>
2016-04-25 22:56:09 +02:00
Richard Levitte
45c6e23c97 Remove --classic build entirely
The Unix build was the last to retain the classic build scheme.  The
new unified scheme has matured enough, even though some details may
need polishing.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-04-20 16:04:56 +02:00
Andy Polyakov
6ca3e6e779 poly1305/asm/poly1305-x86_64.pl: not all assemblers manage << in constants.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-04-20 09:51:27 +02:00
Rich Salz
e771eea6d8 Revert "various spelling fixes"
This reverts commit 620d540bd4.
It wasn't reviewed.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-04-04 16:11:43 -04:00
FdaSilvaYY
620d540bd4 various spelling fixes
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-04-04 15:06:32 -04:00
Andy Polyakov
4b8736a22e crypto/poly1305: don't break carry chains.
RT#4483

[poly1305-armv4.pl: remove redundant #ifdef __thumb2__]
[poly1305-ppc*.pl: presumably more accurate benchmark results]

Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-04-04 16:56:20 +02:00
Andy Polyakov
bbe9769ba6 poly1305/asm/poly1305-x86.pl: don't loose 59-th bit.
RT#4439

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
2016-03-29 09:55:43 +02:00