openssl

Author	SHA1	Message	Date
Matt Caswell	83cf7abf8e	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6371)	2018-05-29 13:16:04 +01:00
Andy Polyakov	13f6857db1	PPC assembly pack: add POWER9 results. Reviewed-by: Rich Salz <rsalz@openssl.org>	2018-05-10 11:44:21 +02:00
Matt Caswell	6ec5fce25e	Update copyright year Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6145)	2018-05-01 13:34:30 +01:00
Rahul Chaudhry	5bb1cd2292	poly1305/asm/poly1305-armv4.pl: remove unintentional relocation. Branch to global symbol results in reference to PLT, and when compiling for THUMB-2 - in a R_ARM_THM_JUMP19 relocation. Some linkers don't support this relocation (ld.gold), while others can end up truncating the relocation to fit (ld.bfd). Convert this branch through PLT into a direct branch that the assembler can resolve locally. See https://github.com/android-ndk/ndk/issues/337 for background. The current workaround is to disable poly1305 optimization assembly, which is not optimal and can be reverted after this patch: `beab607d2b` CLA: trivial Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/5949)	2018-04-18 19:47:53 +02:00
Andy Polyakov	4dfe4310c3	poly1305/asm/poly1305-x86_64.pl: add Knights Landing AVX512 result. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4855)	2017-12-23 16:06:25 +01:00
Andy Polyakov	a8f302e5ba	poly1305/asm/poly1305-x86_64.pl: switch to pure AVX512F. Convert AVX512F+VL+BW code path to pure AVX512F, so that it can be executed even on Knights Landing. Trigger for modification was observation that AVX512 code paths can negatively affect overall Skylake-X system performance. Since we are likely to suppress AVX512F capability flag [at least on Skylake-X], conversion serves as kind of "investment protection". Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4758)	2017-11-25 22:06:10 +01:00
Andy Polyakov	7533162322	ARMv8 assembly pack: add Qualcomm Kryo results. [skip ci] Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-11-13 11:13:00 +01:00
Josh Soref	46f4e1bec5	Many spelling fixes/typos corrected. Around 138 distinct errors found and fixed; thanks! Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3459)	2017-11-11 19:03:10 -05:00
Andy Polyakov	64d92d7498	x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. "Optimize" is in quotes because it's rather a "salvage operation" for now. Idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. Former affects choice of AES-NI code path specific for Silvermont (Knights Landing is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. AVX-512 results cover even Skylake-X :-) Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:07:32 +02:00
Andy Polyakov	54f8f9a1ed	x86_64 assembly pack: fill some blanks in Ryzen results. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>	2017-07-03 18:17:00 +02:00
David Benjamin	e195c8a256	Remove filename argument to x86 asm_init. The assembler already knows the actual path to the generated file and, in other perlasm architectures, is left to manage debug symbols itself. Notably, in OpenSSL 1.1.x's new build system, which allows a separate build directory, converting .pl to .s as the scripts currently do results in the wrong paths. This also avoids inconsistencies from some of the files using $0 and some passing in the filename. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3431)	2017-05-11 17:00:23 -04:00
Andy Polyakov	0a5d1a38f2	poly1305/asm/poly1305-x86_64.pl: add poly1305_blocks_vpmadd52_8x. As hinted by its name new subroutine processes 8 input blocks in parallel by loading data to 512-bit registers. It still needs more work, as it needs to handle some specific input lengths better. In this sense it's yet another intermediate step... Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-03-22 10:59:59 +01:00
Andy Polyakov	6cbfd94d08	x86_64 assembly pack: add some Ryzen performance results. Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-03-22 10:58:01 +01:00
Andy Polyakov	c2b935904a	poly1305/asm/poly1305-x86_64.pl: add poly1305_blocks_vpmadd52_4x. As hinted by its name new subroutine processes 4 input blocks in parallel. It still operates on 256-bit registers and is just another step toward full-blown AVX512IFMA procedure. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-03-13 18:48:34 +01:00
Andy Polyakov	a25cef89fd	poly1305/asm/poly1305-armv8.pl: ilp32-specific poly1305_init fix. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-03-13 18:46:11 +01:00
Andy Polyakov	e052083cc7	poly1305/asm/poly1305-x86_64.pl: minor AVX512 optimization. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-26 21:27:54 +01:00
Andy Polyakov	1c47e8836f	poly1305/asm/poly1305-x86_64.pl: add CFI annotations. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-26 21:26:07 +01:00
Andy Polyakov	fd910ef959	poly1305/asm/poly1305-x86_64.pl: add VPMADD52 code path. This is initial and minimal single-block implementation. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-25 18:36:41 +01:00
Andy Polyakov	73e8a5c826	poly1305/asm/poly1305-x86_64.pl: switch to vpermdd in table expansion. Effectively it's minor size optimization, 5-6% per affected subroutine. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-25 18:36:37 +01:00
Andy Polyakov	c1e1fc500d	poly1305/asm/poly1305-x86_64.pl: optimize AVX512 code path. On pre-Skylake best optimization strategy was balancing port-specific instructions, while on Skylake minimizing the sheer amount appears more sensible. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-25 18:35:45 +01:00
Andy Polyakov	a30b0522cb	x86 assembly pack: update performance results. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-12-19 16:18:25 +01:00
Andy Polyakov	1ea01427c5	poly1305/asm/poly1305-x86_64.pl: allow nasm to assemble AVX512 code. chacha/asm/chacha-x86_64.pl: refine nasm version detection logic. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-12-15 17:57:50 +01:00
Andy Polyakov	abb8c44fba	x86_64 assembly pack: add AVX512 ChaCha20 and Poly1305 code paths. Reviewed-by: Rich Salz <rsalz@openssl.org>	2016-12-12 10:58:04 +01:00
Andy Polyakov	ace05265d2	x86_64 assembly pack: add Goldmont performance results. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-10-24 13:01:13 +02:00
Andy Polyakov	947716c187	MIPS assembly pack: adapt it for MIPS[32|64]R6. MIPS[32|64]R6 is binary and source incompatible with previous MIPS ISA specifications. Fortunately it's still possible to resolve differences in source code with standard pre-processor and switching to trap-free version of addition and subtraction instructions. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-09-02 13:33:17 +02:00
Andy Polyakov	05ef4d1980	ARMv8 assembly pack: add Samsung Mongoose results. Reviewed-by: Tim Hudson <tjh@openssl.org>	2016-08-16 12:47:49 +02:00
klemens	6025001707	spelling fixes, just comments and readme. Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/1413)	2016-08-05 19:07:30 -04:00
Andy Polyakov	2c12f22c33	SPARC assembly pack: enforce V8+ ABI constraints. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-07-01 14:25:38 +02:00
Andy Polyakov	cfe1d9929e	x86_64 assembly pack: tolerate spaces in source directory name. [as it is now quoting $output is not required, but done just in case] Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-29 14:12:51 +02:00
Andy Polyakov	8640f21093	poly1305/asm/poly1305-mips.pl: adhere to standard frame layout. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-28 22:17:59 +02:00
Andy Polyakov	ff823ee89b	SPARC assembly pack: add missing .type directives. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-28 22:14:13 +02:00
Rich Salz	6aa36e8e5a	Add OpenSSL copyright to .pl files Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-21 08:23:39 -04:00
Andy Polyakov	c6b77c16a6	MIPS64 assembly pack: add Poly1305 module. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-10 20:27:52 +02:00
Andy Polyakov	3992e8c023	poly1305/asm/poly1305-x86_64.pl: contain symbols within shared lib. We don't need it, but external users might find it handy. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-06 09:48:15 +02:00
Andy Polyakov	284116575d	poly1305/asm/poly1305-x86_64.pl: make it cross-compile. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-06 09:46:39 +02:00
Andy Polyakov	33ea23dc5c	SPARCv9 assembly pack: fine-tune run-time switch. Reviewed-by: Tim Hudson <tjh@openssl.org>	2016-04-26 21:35:05 +02:00
Andy Polyakov	dc3c5067cd	crypto/poly1305/asm: chase overflow bit on x86 and ARM platforms. Even though no test could be found to trigger this, paper-n-pencil estimate suggests that x86 and ARM inner loop lazy reductions can lose a bit in H4>>*5+H0 step. Reviewed-by: Emilia Käsper <emilia@openssl.org>	2016-04-25 22:56:09 +02:00
Andy Polyakov	6ca3e6e779	poly1305/asm/poly1305-x86_64.pl: not all assemblers manage << in constants. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-04-20 09:51:27 +02:00
Andy Polyakov	4b8736a22e	crypto/poly1305: don't break carry chains. RT#4483 [poly1305-armv4.pl: remove redundant #ifdef __thumb2__] [poly1305-ppc*.pl: presumably more accurate benchmark results] Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-04-04 16:56:20 +02:00
Andy Polyakov	bbe9769ba6	poly1305/asm/poly1305-x86.pl: don't lose 59-th bit. RT#4439 Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org>	2016-03-29 09:55:43 +02:00
Andy Polyakov	2460c7f133	poly1305/asm/poly1305-x86_64.pl: make it work with linux-x32. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-03-15 23:58:31 +01:00
Andy Polyakov	8d51db86f7	s390x assembly pack: 32-bit fixups. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-03-14 13:52:34 +01:00
Richard Levitte	a5aa63a456	Fix some assembler generating scripts for better unification. Some of these scripts would recognise an output parameter if it looks like a file path. That works in both the classic and new build schemes. Some of these scripts would only recognise it if it's a basename (i.e. no directory component). Those need to be corrected, as the output parameter in the new build scheme is more likely to contain a directory component than not. Reviewed-by: Andy Polyakov <appro@openssl.org>	2016-03-11 00:54:31 +01:00
Richard Levitte	3aa3af68a5	Unified - adapt the generation of poly1305 assembler to use GENERATE. This gets rid of the BEGINRAW..ENDRAW sections in crypto/poly1305/build.info. This also moves the assembler generating perl scripts to take the output file name as last command line argument, where necessary. Reviewed-by: Andy Polyakov <appro@openssl.org>	2016-03-09 11:09:26 +01:00
Andy Polyakov	eb77e8886d	SPARCv9 assembly pack: unify build rules and argument handling. Make all scripts produce .S, make interpretation of $(CFLAGS) pre-processor's responsibility, start accepting $(PERLASM_SCHEME). [$(PERLASM_SCHEME) is redundant in this case, because there are no deviations between Solaris and Linux assemblers. This is purely to unify .pl->.S handling across all targets.] Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-03-08 15:51:06 +01:00
Andy Polyakov	1ea8ae5090	poly1305/asm/poly1305-*.pl: flip horizontal add and reduction. Formally only 32-bit AVX2 code path needs this, but I choose to harmonize all vector code paths. RT#4346 Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-03-02 13:11:38 +01:00
David Benjamin	bdbd3aea59	Consistently use arm_arch.h constants in armcap assembly code. Most of the assembly uses constants from arm_arch.h, but a few references to ARMV7_NEON don't. Consistently use the macros everywhere. Signed-off-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org>	2016-03-02 12:57:28 +01:00
Andy Polyakov	1457731221	poly1305/asm/poly1305-armv4.pl: replace ambiguous instruction. Different assembler versions disagree on how to interpret #-1 as argument to vmov.i64, as 0xffffffffffffffff or 0x00000000ffffffff. So replace it with something they can't disagree on. Reviewed-by: Rich Salz <rsalz@openssl.org>	2016-02-23 21:14:25 +01:00
Andy Polyakov	9e58d1192d	PPC assembly pack: add ChaCha20 and Poly1305 modules. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-02-13 17:21:47 +01:00
Andy Polyakov	f4e175e4af	C64x+ assembly pack: add ChaCha20 and Poly1305 modules. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-02-13 12:34:29 +01:00

56 commits