openssl

Author	SHA1	Message	Date
Andy Polyakov	2cf7fd698e	AArch64 assembly pack: authenticate return addresses. ARMv8.3 adds pointer authentication extension, which in this case allows to ensure that, when offloaded to stack, return address is same at return as at entry to the subroutine. The new instructions are nops on processors that don't implement the extension, so that the verification is backward compatible. Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8205) (cherry picked from commit `9a18aae5f2`)	2019-02-13 02:39:27 +01:00
Matt Caswell	1212818eb0	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7176)	2018-09-11 13:45:17 +01:00
Andy Polyakov	8977880603	poly1305/asm/poly1305-x86_64.pl: fix solaris64-x86_64-cc build. Reviewed-by: Paul Dale <paul.dale@oracle.com> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6676)	2018-07-10 12:01:56 +02:00
Andy Polyakov	0edb109f97	evp/e_chacha20_poly1305.c: further improve small-fragment TLS performance. Improvement coefficients vary with TLS fragment length and platform, on most Intel processors maximum improvement is ~50%, while on Ryzen - 80%. The "secret" is new dedicated ChaCha20_128 code path and vectorized xor helpers. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6638)	2018-07-06 16:33:19 +02:00
Matt Caswell	fd38836ba8	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6538)	2018-06-20 15:29:23 +01:00
Andy Polyakov	27635a4ecb	{chacha|poly1305}/asm/*-x64.pl: harmonize clang version detection. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6499)	2018-06-18 19:59:07 +02:00
Andy Polyakov	41013cd63c	PPC assembly pack: correct POWER9 results. As it turns out originally published results were skewed by "turbo" mode. VM apparently remains oblivious to dynamic frequency scaling, and reports that processor operates at "base" frequency at all times, while actual frequency gets increased under load. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6406)	2018-06-03 21:20:06 +02:00
Matt Caswell	83cf7abf8e	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6371)	2018-05-29 13:16:04 +01:00
Andy Polyakov	13f6857db1	PPC assembly pack: add POWER9 results. Reviewed-by: Rich Salz <rsalz@openssl.org>	2018-05-10 11:44:21 +02:00
Matt Caswell	6ec5fce25e	Update copyright year Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6145)	2018-05-01 13:34:30 +01:00
Rahul Chaudhry	5bb1cd2292	poly1305/asm/poly1305-armv4.pl: remove unintentional relocation. Branch to global symbol results in reference to PLT, and when compiling for THUMB-2 - in a R_ARM_THM_JUMP19 relocation. Some linkers don't support this relocation (ld.gold), while others can end up truncating the relocation to fit (ld.bfd). Convert this branch through PLT into a direct branch that the assembler can resolve locally. See https://github.com/android-ndk/ndk/issues/337 for background. The current workaround is to disable poly1305 optimization assembly, which is not optimal and can be reverted after this patch: `beab607d2b` CLA: trivial Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/5949)	2018-04-18 19:47:53 +02:00
Andy Polyakov	4dfe4310c3	poly1305/asm/poly1305-x86_64.pl: add Knights Landing AVX512 result. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4855)	2017-12-23 16:06:25 +01:00
Andy Polyakov	a8f302e5ba	poly1305/asm/poly1305-x86_64.pl: switch to pure AVX512F. Convert AVX512F+VL+BW code path to pure AVX512F, so that it can be executed even on Knights Landing. Trigger for modification was observation that AVX512 code paths can negatively affect overall Skylake-X system performance. Since we are likely to suppress AVX512F capability flag [at least on Skylake-X], conversion serves as kind of "investment protection". Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4758)	2017-11-25 22:06:10 +01:00
Andy Polyakov	7533162322	ARMv8 assembly pack: add Qualcomm Kryo results. [skip ci] Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-11-13 11:13:00 +01:00
Josh Soref	46f4e1bec5	Many spelling fixes/typos corrected. Around 138 distinct errors found and fixed; thanks! Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3459)	2017-11-11 19:03:10 -05:00
Andy Polyakov	64d92d7498	x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. "Optimize" is in quotes because it's rather a "salvage operation" for now. Idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. Former affects choice of AES-NI code path specific for Silvermont (Knights Landing is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. AVX-512 results cover even Skylake-X :-) Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:07:32 +02:00
Andy Polyakov	54f8f9a1ed	x86_64 assembly pack: fill some blanks in Ryzen results. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>	2017-07-03 18:17:00 +02:00
David Benjamin	e195c8a256	Remove filename argument to x86 asm_init. The assembler already knows the actual path to the generated file and, in other perlasm architectures, is left to manage debug symbols itself. Notably, in OpenSSL 1.1.x's new build system, which allows a separate build directory, converting .pl to .s as the scripts currently do results in the wrong paths. This also avoids inconsistencies from some of the files using $0 and some passing in the filename. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3431)	2017-05-11 17:00:23 -04:00
Andy Polyakov	0a5d1a38f2	poly1305/asm/poly1305-x86_64.pl: add poly1305_blocks_vpmadd52_8x. As hinted by its name new subroutine processes 8 input blocks in parallel by loading data to 512-bit registers. It still needs more work, as it needs to handle some specific input lengths better. In this sense it's yet another intermediate step... Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-03-22 10:59:59 +01:00
Andy Polyakov	6cbfd94d08	x86_64 assembly pack: add some Ryzen performance results. Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-03-22 10:58:01 +01:00
Andy Polyakov	c2b935904a	poly1305/asm/poly1305-x86_64.pl: add poly1305_blocks_vpmadd52_4x. As hinted by its name new subroutine processes 4 input blocks in parallel. It still operates on 256-bit registers and is just another step toward full-blown AVX512IFMA procedure. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-03-13 18:48:34 +01:00
Andy Polyakov	a25cef89fd	poly1305/asm/poly1305-armv8.pl: ilp32-specific poly1305_init fix. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-03-13 18:46:11 +01:00
Andy Polyakov	e052083cc7	poly1305/asm/poly1305-x86_64.pl: minor AVX512 optimization. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-26 21:27:54 +01:00
Andy Polyakov	1c47e8836f	poly1305/asm/poly1305-x86_64.pl: add CFI annotations. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-26 21:26:07 +01:00
Andy Polyakov	fd910ef959	poly1305/asm/poly1305-x86_64.pl: add VPMADD52 code path. This is initial and minimal single-block implementation. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-25 18:36:41 +01:00
Andy Polyakov	73e8a5c826	poly1305/asm/poly1305-x86_64.pl: switch to vpermdd in table expansion. Effectively it's minor size optimization, 5-6% per affected subroutine. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-25 18:36:37 +01:00
Andy Polyakov	c1e1fc500d	poly1305/asm/poly1305-x86_64.pl: optimize AVX512 code path. On pre-Skylake best optimization strategy was balancing port-specific instructions, while on Skylake minimizing the sheer amount appears more sensible. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-25 18:35:45 +01:00
Andy Polyakov	a30b0522cb	x86 assembly pack: update performance results. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-12-19 16:18:25 +01:00
Andy Polyakov	1ea01427c5	poly1305/asm/poly1305-x86_64.pl: allow nasm to assemble AVX512 code. chacha/asm/chacha-x86_64.pl: refine nasm version detection logic. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-12-15 17:57:50 +01:00
Andy Polyakov	abb8c44fba	x86_64 assembly pack: add AVX512 ChaCha20 and Poly1305 code paths. Reviewed-by: Rich Salz <rsalz@openssl.org>	2016-12-12 10:58:04 +01:00
Andy Polyakov	ace05265d2	x86_64 assembly pack: add Goldmont performance results. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-10-24 13:01:13 +02:00
Andy Polyakov	947716c187	MIPS assembly pack: adapt it for MIPS[32|64]R6. MIPS[32|64]R6 is binary and source incompatible with previous MIPS ISA specifications. Fortunately it's still possible to resolve differences in source code with standard pre-processor and switching to trap-free version of addition and subtraction instructions. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-09-02 13:33:17 +02:00
Andy Polyakov	05ef4d1980	ARMv8 assembly pack: add Samsung Mongoose results. Reviewed-by: Tim Hudson <tjh@openssl.org>	2016-08-16 12:47:49 +02:00
klemens	6025001707	spelling fixes, just comments and readme. Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/1413)	2016-08-05 19:07:30 -04:00
Andy Polyakov	2c12f22c33	SPARC assembly pack: enforce V8+ ABI constraints. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-07-01 14:25:38 +02:00
Andy Polyakov	cfe1d9929e	x86_64 assembly pack: tolerate spaces in source directory name. [as it is now quoting $output is not required, but done just in case] Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-29 14:12:51 +02:00
Andy Polyakov	8640f21093	poly1305/asm/poly1305-mips.pl: adhere to standard frame layout. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-28 22:17:59 +02:00
Andy Polyakov	ff823ee89b	SPARC assembly pack: add missing .type directives. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-28 22:14:13 +02:00
Rich Salz	6aa36e8e5a	Add OpenSSL copyright to .pl files Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-21 08:23:39 -04:00
Andy Polyakov	c6b77c16a6	MIPS64 assembly pack: add Poly1305 module. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-10 20:27:52 +02:00
Andy Polyakov	3992e8c023	poly1305/asm/poly1305-x86_64.pl: contain symbols within shared lib. We don't need it, but external users might find it handy. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-06 09:48:15 +02:00
Andy Polyakov	284116575d	poly1305/asm/poly1305-x86_64.pl: make it cross-compile. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-06 09:46:39 +02:00
Andy Polyakov	33ea23dc5c	SPARCv9 assembly pack: fine-tune run-time switch. Reviewed-by: Tim Hudson <tjh@openssl.org>	2016-04-26 21:35:05 +02:00
Andy Polyakov	dc3c5067cd	crypto/poly1305/asm: chase overflow bit on x86 and ARM platforms. Even though no test could be found to trigger this, paper-n-pencil estimate suggests that x86 and ARM inner loop lazy reductions can lose a bit in H4>>*5+H0 step. Reviewed-by: Emilia Käsper <emilia@openssl.org>	2016-04-25 22:56:09 +02:00
Andy Polyakov	6ca3e6e779	poly1305/asm/poly1305-x86_64.pl: not all assemblers manage << in constants. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-04-20 09:51:27 +02:00
Andy Polyakov	4b8736a22e	crypto/poly1305: don't break carry chains. RT#4483 [poly1305-armv4.pl: remove redundant #ifdef __thumb2__] [poly1305-ppc*.pl: presumably more accurate benchmark results] Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-04-04 16:56:20 +02:00
Andy Polyakov	bbe9769ba6	poly1305/asm/poly1305-x86.pl: don't lose 59-th bit. RT#4439 Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org>	2016-03-29 09:55:43 +02:00
Andy Polyakov	2460c7f133	poly1305/asm/poly1305-x86_64.pl: make it work with linux-x32. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-03-15 23:58:31 +01:00
Andy Polyakov	8d51db86f7	s390x assembly pack: 32-bit fixups. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-03-14 13:52:34 +01:00
Richard Levitte	a5aa63a456	Fix some assembler generating scripts for better unification. Some of these scripts would recognise an output parameter if it looks like a file path. That works in both the classic and new build schemes. Some of these scripts would only recognise it if it's a basename (i.e. no directory component). Those need to be corrected, as the output parameter in the new build scheme is more likely to contain a directory component than not. Reviewed-by: Andy Polyakov <appro@openssl.org>	2016-03-11 00:54:31 +01:00

63 commits