openssl

Author	SHA1	Message	Date
Lei Maohui	7b0fceed21	Fix build error for aarch64 big endian. Modified rev to rev64, because rev only takes integer registers. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90827 Otherwise, the following error will occur. Error: operand 1 must be an integer register -- `rev v31.16b,v31.16b' CLA: trivial Signed-off-by: Lei Maohui <leimaohui@cn.fujitsu.com> Reviewed-by: Shane Lontis <shane.lontis@oracle.com> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/9151)	2019-07-08 10:53:02 +02:00
Antoine Cœur	c2969ff6e7	Fix Typos CLA: trivial Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com> (Merged from https://github.com/openssl/openssl/pull/9288)	2019-07-02 14:22:29 +02:00
Andy Polyakov	6465321e40	ARM64 assembly pack: add ThunderX2 results. Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8776)	2019-04-17 21:08:13 +02:00
Andy Polyakov	1b1ff9b94d	sha/asm/keccak1600-ppc64.pl: up 10% performance improvement. Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8444)	2019-03-11 12:33:39 +01:00
Andy Polyakov	db42bb440e	ARM64 assembly pack: make it Windows-friendly. "Windows friendliness" means a) unified PIC-ification, unified across all platforms; b) unified commentary delimiter; c) explicit ldur/stur, as Visual Studio assembler can't automatically encode ldr/str as ldur/stur when needed. Reviewed-by: Paul Dale <paul.dale@oracle.com> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8256)	2019-02-16 17:01:15 +01:00
Andy Polyakov	3405db97e5	ARM assembly pack: make it Windows-friendly. "Windows friendliness" means a) flipping .thumb and .text directives, b) always generate Thumb-2 code when asked(*); c) Windows-specific references to external OPENSSL_armcap_P. (*) so far some modules were compiled as .code 32 even if Thumb-2 was targeted. It works at hardware level because processor can alternate between the modes with no overhead. But clang --target=arm-windows's builtin assembler just refuses to compile .code 32... Reviewed-by: Paul Dale <paul.dale@oracle.com> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8252)	2019-02-16 16:59:23 +01:00
Andy Polyakov	9a18aae5f2	AArch64 assembly pack: authenticate return addresses. ARMv8.3 adds pointer authentication extension, which in this case allows to ensure that, when offloaded to stack, return address is same at return as at entry to the subroutine. The new instructions are nops on processors that don't implement the extension, so that the verification is backward compatible. Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8205)	2019-02-12 19:00:42 +01:00
Richard Levitte	a598ed0dc4	Following the license change, modify the boilerplates in crypto/sha/ [skip ci] Reviewed-by: Matt Caswell <matt@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7816)	2018-12-06 15:23:03 +01:00
Richard Levitte	389c09fa09	License: change any non-boilerplate comment referring to "OpenSSL license" Make it just say "the License", which refers back to the standard boilerplate. Reviewed-by: Matt Caswell <matt@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7764)	2018-12-06 13:26:28 +01:00
Andy Polyakov	6b956fe77b	sha/asm/sha512p8-ppc.pl: optimize epilogue. Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7643)	2018-11-16 09:23:50 +01:00
Andy Polyakov	79d7fb990c	sha/asm/sha512p8-ppc.pl: fix typo in prologue. Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7643)	2018-11-16 09:23:50 +01:00
Andy Polyakov	9986bfefa4	sha/asm/keccak1600-armv8.pl: halve the size of hw-assisted subroutine. Yes, it's second halving, i.e. it's now 1/4 of original size, or more specifically inner loop. The challenge with Keccak is that you need more temporary registers than there are available. By reversing the order in which columns are assigned in Chi, it's possible to use three of A[][] registers as temporary prior to their assignment. Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7294)	2018-10-19 10:43:02 +02:00
Andy Polyakov	fc97c882f4	sha/asm/keccak1600-s390x.pl: resolve -march=z900 portability issue. Negative displacement in memory references was not originally specified, so that for maximum coverage one should abstain from it, just like with any other extension. [Unless it's guarded by run-time switch, but there is no switch in keccak1600-s390x.] Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7239)	2018-10-12 20:51:27 +02:00
Matt Caswell	1212818eb0	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7176)	2018-09-11 13:45:17 +01:00
Pauli	8794be2ed8	Remove development artifacts. The issue was discovered on the x86/64 when attempting to include libcrypto inside another shared library. A relocation of type R_X86_64_PC32 was generated which causes a linker error. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6595)	2018-07-02 07:21:26 +10:00
Andy Polyakov	1753d12374	PA-RISC assembly pack: make it work with GNU assembler for HP-UX. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6583)	2018-06-25 16:45:48 +02:00
Andy Polyakov	2e51557bc9	sha/asm/sha{256|512}-armv4.pl: harmonize thumb2 support with the rest. Reviewed-by: Richard Levitte <levitte@openssl.org>	2018-06-22 14:28:08 +02:00
Matt Caswell	fd38836ba8	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6538)	2018-06-20 15:29:23 +01:00
Andy Polyakov	b55e21b357	sha/asm/sha{1|256}-586.pl: harmonize clang version detection. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6499)	2018-06-18 19:59:03 +02:00
Andy Polyakov	f0c77d66b4	sha/asm/sha512p8-ppc.pl: fix build on Mac OS X. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6419)	2018-06-06 22:13:24 +02:00
Andy Polyakov	c4d9ef4cc5	sha/asm/sha512p8-ppc.pl: improve POWER9 performance by ~10%. Biggest part, ~7%, of improvement resulted from omitting constants' table index increment in each round. And minor part from rescheduling instructions. Apparently POWER9 (and POWER8) manage to dispatch instructions more efficiently if they are laid down as if they have no latency... Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6406)	2018-06-03 21:20:40 +02:00
Andy Polyakov	41013cd63c	PPC assembly pack: correct POWER9 results. As it turns out originally published results were skewed by "turbo" mode. VM apparently remains oblivious to dynamic frequency scaling, and reports that processor operates at "base" frequency at all times. While actual frequency gets increased under load. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6406)	2018-06-03 21:20:06 +02:00
Matt Caswell	83cf7abf8e	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6371)	2018-05-29 13:16:04 +01:00
Andy Polyakov	13f6857db1	PPC assembly pack: add POWER9 results. Reviewed-by: Rich Salz <rsalz@openssl.org>	2018-05-10 11:44:21 +02:00
Matt Caswell	6ec5fce25e	Update copyright year Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6145)	2018-05-01 13:34:30 +01:00
Andy Polyakov	e9afe7a143	sha/asm/keccak1600-armv4.pl: adapt for multi-platform. Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6042)	2018-04-23 17:27:53 +02:00
Andy Polyakov	0fe72aaaa9	sha/asm/keccak1600-x86_64.pl: make it work on Windows. Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6042)	2018-04-23 17:27:31 +02:00
Andy Polyakov	dd2d7b19f8	sha/asm/keccak1600-armv8.pl: halve the size of hw-assisted subroutine. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org>	2018-04-23 17:19:57 +02:00
Matt Caswell	b0edda11cb	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/5689)	2018-03-20 13:08:46 +00:00
Andy Polyakov	9d3cab4bdb	MIPS assembly pack: default heuristic detection to little-endian. Current endianness detection is somewhat opportunistic and can fail in cross-compile scenario. Since we are more likely to cross-compile for little-endian now, adjust the default accordingly. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/5613)	2018-03-19 14:31:30 +01:00
Richard Levitte	2bd3b626dd	Make a few more asm modules conform: last argument is output file Fixes #5310 Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/5315)	2018-03-08 19:31:41 +01:00
Andy Polyakov	b761ff4e77	sha/asm/keccak1600-armv8.pl: add hardware-assisted ARMv8.2 subroutines. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/5358)	2018-02-19 14:15:31 +01:00
Matt Caswell	6738bf1417	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org>	2018-02-13 13:59:25 +00:00
Andy Polyakov	af0fcf7b46	sha/asm/sha512-armv8.pl: add hardware-assisted SHA512 subroutine. Reviewed-by: Rich Salz <rsalz@openssl.org>	2018-02-12 14:05:05 +01:00
Andy Polyakov	24d06e8ca0	Add sha/asm/keccak1600-avx512vl.pl. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4948)	2017-12-22 12:38:40 +01:00
Andy Polyakov	7533162322	ARMv8 assembly pack: add Qualcomm Kryo results. [skip ci] Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-11-13 11:13:00 +01:00
Josh Soref	46f4e1bec5	Many spelling fixes/typo's corrected. Around 138 distinct errors found and fixed; thanks! Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3459)	2017-11-11 19:03:10 -05:00
Patrick Steuer	bc4e831ccd	s390x assembly pack: extend s390x capability vector. Extend the s390x capability vector to store the longer facility list available from z13 onwards. The bits indicating the vector extensions are set to zero, if the kernel does not enable the vector facility. Also add capability bits returned by the crypto instructions' query functions. Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4542)	2017-10-30 14:31:32 +01:00
Patrick Steuer	af1d638730	s390x assembly pack: remove capability double-checking. An instruction's QUERY function is executed at initialization, iff the required MSA level is installed. Therefore, it is sufficient to check the bits returned by the QUERY functions. The MSA level does not have to be checked at every function call. crypto/aes/asm/aes-s390x.pl: The AES key schedule must be computed if the required KM or KMC function codes are not available. Formally, the availability of a KMC function code does not imply the availability of the corresponding KM function code. Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4501)	2017-10-17 21:55:33 +02:00
Rich Salz	e3713c365c	Remove email addresses from source code. Names were not removed. Some comments were updated. Replace Andy's address with openssl.org Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Paul Dale <paul.dale@oracle.com> (Merged from https://github.com/openssl/openssl/pull/4516)	2017-10-13 10:06:59 -04:00
Andy Polyakov	236dd46339	sha/asm/keccak1600-armv8.pl: fix return value buglet and ... ... script data load. On related note an attempt was made to merge rotations with logical operations. I mean as we know, ARM ISA has merged rotate-n-logical instructions which can be used here. And they were used to improve keccak1600-armv4 performance. But not here. Even though this approach resulted in improvement on Cortex-A53 proportional to reduction of amount of instructions, ~8%, it didn't exactly work out on non-Cortex cores. Presumably because they break merged instructions to separate μ-ops, which results in higher operations count. X-Gene and Denver went ~20% slower and Apple A7 - 40%. The optimization was therefore dismissed. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-09-09 19:09:36 +02:00
Andy Polyakov	e0584e96c1	sha/asm/keccak1600-armv4.pl: optimize for Thumb-2. Reduce per-round instruction count in Thumb-2 case by 16%. This is achieved by folding ldr/str pairs to their double-word counterparts. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-08-16 20:25:20 +02:00
Andy Polyakov	3c1a60e56f	sha/asm/keccak1600-avx512.pl: fix buglet in SHA3_squeeze tail. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-08-12 12:23:31 +02:00
Andy Polyakov	d9ca12cbf6	sha/asm/keccak1600-armv4.pl: improve non-NEON performance by ~10%. This is achieved mostly by ~10% reduction of amount of instructions per round thanks to a) switch to KECCAK_2X variant; b) merge of almost 1/2 rotations with logical instructions. Performance is improved on all observed processors except on Cortex-A15. This is because it's capable of exploiting more parallelism and can execute original code for same amount of time. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/4057)	2017-08-02 23:22:28 +02:00
Xiaoyin Liu	bac5b39c96	Fix typo in sha1-thumb.pl Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4056)	2017-07-30 21:26:38 -04:00
Andy Polyakov	e3c79f0f19	sha/asm/keccak1600-avx512.pl: improve performance by 17%. Improvement is result of combination of data layout ideas from Keccak Code Package and initial version of this module. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-24 21:23:01 +02:00
Andy Polyakov	0d7903f83f	sha/asm/keccak1600-avx512.pl: absorb bug-fix and minor optimization. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:12:14 +02:00
Andy Polyakov	64d92d7498	x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. "Optimize" is in quotes because it's rather a "salvage operation" for now. Idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. Former affects choice of AES-NI code path specific for Silvermont (Knights Landing is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. AVX-512 results cover even Skylake-X :-) Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:07:32 +02:00
Andy Polyakov	d212b98b36	sha/asm/keccak1600-avx2.pl: optimized remodelled version. New register usage pattern allows to achieve slightly better performance. Not as much as I hoped for. Performance is believed to be limited by irreconcilable write-back conflicts, rather than lack of computational resources or data dependencies. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-15 23:04:38 +02:00
Andy Polyakov	91dbdc63bd	sha/asm/keccak1600-avx2.pl: remodel register usage. This gives much more freedom to rearrange instructions. This is unoptimized version, provided for reference. Basically you need to compare it to initial `29724d0e15` to figure out the key difference. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-15 23:04:18 +02:00

332 commits