openssl

Author	SHA1	Message	Date
Andy Polyakov	24d06e8ca0	Add sha/asm/keccak1600-avx512vl.pl. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4948)	2017-12-22 12:38:40 +01:00
Andy Polyakov	7533162322	ARMv8 assembly pack: add Qualcomm Kryo results. [skip ci] Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-11-13 11:13:00 +01:00
Josh Soref	46f4e1bec5	Many spelling fixes/typo's corrected. Around 138 distinct errors found and fixed; thanks! Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3459)	2017-11-11 19:03:10 -05:00
Patrick Steuer	bc4e831ccd	s390x assembly pack: extend s390x capability vector. Extend the s390x capability vector to store the longer facility list available from z13 onwards. The bits indicating the vector extensions are set to zero, if the kernel does not enable the vector facility. Also add capability bits returned by the crypto instructions' query functions. Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4542)	2017-10-30 14:31:32 +01:00
Patrick Steuer	af1d638730	s390x assembly pack: remove capability double-checking. An instruction's QUERY function is executed at initialization, iff the required MSA level is installed. Therefore, it is sufficient to check the bits returned by the QUERY functions. The MSA level does not have to be checked at every function call. crypto/aes/asm/aes-s390x.pl: The AES key schedule must be computed if the required KM or KMC function codes are not available. Formally, the availability of a KMC function code does not imply the availability of the corresponding KM function code. Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4501)	2017-10-17 21:55:33 +02:00
Rich Salz	e3713c365c	Remove email addresses from source code. Names were not removed. Some comments were updated. Replace Andy's address with openssl.org Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Paul Dale <paul.dale@oracle.com> (Merged from https://github.com/openssl/openssl/pull/4516)	2017-10-13 10:06:59 -04:00
Andy Polyakov	236dd46339	sha/asm/keccak1600-armv8.pl: fix return value buglet and ... ... script data load. On related note an attempt was made to merge rotations with logical operations. I mean as we know, ARM ISA has merged rotate-n-logical instructions which can be used here. And they were used to improve keccak1600-armv4 performance. But not here. Even though this approach resulted in improvement on Cortex-A53 proportional to reduction of amount of instructions, ~8%, it didn't exactly work out on non-Cortex cores. Presumably because they break merged instructions to separate μ-ops, which results in higher operations count. X-Gene and Denver went ~20% slower and Apple A7 - 40%. The optimization was therefore dismissed. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-09-09 19:09:36 +02:00
Andy Polyakov	e0584e96c1	sha/asm/keccak1600-armv4.pl: optimize for Thumb-2. Reduce per-round instruction count in Thumb-2 case by 16%. This is achieved by folding ldr/str pairs to their double-word counterparts. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-08-16 20:25:20 +02:00
Andy Polyakov	3c1a60e56f	sha/asm/keccak1600-avx512.pl: fix buglet in SHA3_squeeze tail. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-08-12 12:23:31 +02:00
Andy Polyakov	d9ca12cbf6	sha/asm/keccak1600-armv4.pl: improve non-NEON performance by ~10%. This is achieved mostly by ~10% reduction of amount of instructions per round thanks to a) switch to KECCAK_2X variant; b) merge of almost 1/2 rotations with logical instructions. Performance is improved on all observed processors except on Cortex-A15. This is because it's capable of exploiting more parallelism and can execute original code for same amount of time. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/4057)	2017-08-02 23:22:28 +02:00
Xiaoyin Liu	bac5b39c96	Fix typo in sha1-thumb.pl Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4056)	2017-07-30 21:26:38 -04:00
Andy Polyakov	e3c79f0f19	sha/asm/keccak1600-avx512.pl: improve performance by 17%. Improvement is result of combination of data layout ideas from Keccak Code Package and initial version of this module. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-24 21:23:01 +02:00
Andy Polyakov	0d7903f83f	sha/asm/keccak1600-avx512.pl: absorb bug-fix and minor optimization. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:12:14 +02:00
Andy Polyakov	64d92d7498	x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. "Optimize" is in quotes because it's rather a "salvage operation" for now. Idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. Former affects choice of AES-NI code path specific for Silvermont (Knights Landing is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. AVX-512 results cover even Skylake-X :-) Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:07:32 +02:00
Andy Polyakov	d212b98b36	sha/asm/keccak1600-avx2.pl: optimized remodelled version. New register usage pattern allows to achieve slightly better performance. Not as much as I hoped for. Performance is believed to be limited by irreconcilable write-back conflicts, rather than lack of computational resources or data dependencies. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-15 23:04:38 +02:00
Andy Polyakov	91dbdc63bd	sha/asm/keccak1600-avx2.pl: remodel register usage. This gives much more freedom to rearrange instructions. This is unoptimized version, provided for reference. Basically you need to compare it to initial `29724d0e15` to figure out the key difference. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-15 23:04:18 +02:00
Andy Polyakov	c7c7a8e601	Optimize sha/asm/keccak1600-avx2.pl. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-10 10:16:42 +02:00
Andy Polyakov	29724d0e15	Add sha/asm/keccak1600-avx2.pl. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-10 10:16:31 +02:00
Andy Polyakov	313fa47fea	Add sha/asm/keccak1600-avx512.pl. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3861)	2017-07-07 10:04:33 +02:00
Andy Polyakov	edbc681d22	sha/asm/keccak1600-x86_64.pl: close gap with Keccak Code Package. [Also typo and readability fixes. Ryzen result is added.] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>	2017-07-03 18:18:02 +02:00
Andy Polyakov	b547aba954	sha/asm/keccak1600-s390x.pl: typo and readability, minor size optimization. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>	2017-07-03 18:17:55 +02:00
Andy Polyakov	54f8f9a1ed	x86_64 assembly pack: fill some blanks in Ryzen results. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>	2017-07-03 18:17:00 +02:00
Andy Polyakov	7807267bed	Add sha/asm/keccak1600-s390x.pl. Reviewed-by: Richard Levitte <levitte@openssl.org>	2017-06-29 21:16:02 +02:00
Andy Polyakov	d6f0c94a65	sha/asm/keccak1600-x86_64.pl: add CFI directives. Reviewed-by: Richard Levitte <levitte@openssl.org>	2017-06-29 21:15:56 +02:00
Andy Polyakov	a1613840dd	sha/asm/keccak1600-x86_64.pl: optimize by re-ordering instructions. Reviewed-by: Richard Levitte <levitte@openssl.org>	2017-06-29 21:15:51 +02:00
Andy Polyakov	a078d9dfa9	sha/asm/keccak1600-x86_64.pl: remove redundant moves. Reviewed-by: Richard Levitte <levitte@openssl.org>	2017-06-29 21:15:45 +02:00
Andy Polyakov	64aef3f53d	Add sha/asm/keccak1600-x86_64.pl. Reviewed-by: Richard Levitte <levitte@openssl.org>	2017-06-29 21:15:09 +02:00
Andy Polyakov	a163e60d95	sha/asm/keccak1600-mmx.pl: optimize for Atom and add comparison data. Curiously enough out-of-order Silvermont benefited most from optimization, 33%. [Originally mentioned "anomaly" turned to be misreported frequency scaling problem. Correct results were collected under older kernel.] Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3739)	2017-06-24 09:42:14 +02:00
Andy Polyakov	415248e1e1	Add sha/asm/keccak1600-mmx.pl, x86 MMX module. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3739)	2017-06-24 09:42:08 +02:00
Andy Polyakov	b5cdec2fea	sha/asm/sha512p8-ppc.pl: add POWER8 performance data. [skip ci] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3705)	2017-06-21 16:26:59 +02:00
Andy Polyakov	53ddf7dd05	Add Keccak-1600 modules for PPC64 and POWER8. [skip ci] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3705)	2017-06-21 16:24:36 +02:00
Andy Polyakov	1d23bbccd3	Add sha/asm/keccak1600-c64x.pl [skip ci] Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/3708)	2017-06-21 15:21:47 +02:00
Andy Polyakov	5eb2dd88b3	Add sha/asm/keccak1600-armv8.pl. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-06-15 21:53:30 +02:00
Andy Polyakov	6dad1efef7	sha/asm/keccak1600-armv4.pl: switch to more efficient bit interleaving algorithm. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-06-08 20:21:31 +02:00
Andy Polyakov	367c552790	sha/asm/keccak1600-armv4.pl: add NEON code path. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-06-06 19:54:29 +02:00
Andy Polyakov	56676f877d	sha/asm/keccak1600-armv4.pl: add SHA3_absorb and SHA3_squeeze. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-06-06 19:54:24 +02:00
Andy Polyakov	5371810714	sha/asm/keccak1600-armv4.pl: optimization based on profiler feedback. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-06-06 19:54:19 +02:00
Andy Polyakov	aabfd32910	Add sha/asm/keccak1600-armv4.pl. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-06-06 19:54:12 +02:00
David Benjamin	e195c8a256	Remove filename argument to x86 asm_init. The assembler already knows the actual path to the generated file and, in other perlasm architectures, is left to manage debug symbols itself. Notably, in OpenSSL 1.1.x's new build system, which allows a separate build directory, converting .pl to .s as the scripts currently do result in the wrong paths. This also avoids inconsistencies from some of the files using $0 and some passing in the filename. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3431)	2017-05-11 17:00:23 -04:00
FdaSilvaYY	69687aa829	More typo fixes Fix some comments too [skip ci] Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3069)	2017-03-29 07:14:29 +02:00
Andy Polyakov	6cbfd94d08	x86_64 assembly pack: add some Ryzen performance results. Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-03-22 10:58:01 +01:00
Emilia Kasper	b53338cbf8	Clean up references to FIPS This removes the fips configure option. This option is broken as the required FIPS code is not available. FIPS_mode() and FIPS_mode_set() are retained for compatibility, but FIPS_mode() always returns 0, and FIPS_mode_set() can only be used to turn FIPS mode off. Reviewed-by: Stephen Henson <steve@openssl.org>	2017-02-28 15:26:25 +01:00
Andy Polyakov	399976c7ba	sha/asm/*-x86_64.pl: add CFI annotations. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-15 15:43:05 +01:00
Adam Langley	1f9e00a6fc	sha/asm/sha1-x86_64.pl: add CFI annotations. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/2590)	2017-02-11 21:33:33 +01:00
Andy Polyakov	384e6de4c7	x86_64 assembly pack: Win64 SEH face-lift. - harmonize handlers with guidelines and themselves; - fix some bugs in handlers; - add missing handlers in chacha and ecp_nistz256 modules; Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-06 08:21:42 +01:00
Andy Polyakov	a30b0522cb	x86 assembly pack: update performance results. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-12-19 16:18:25 +01:00
Andy Polyakov	32bbb62ea6	sha/asm/sha512-armv8.pl: fix big-endian support in __KERNEL__ case. In non-__KERNEL__ context 32-bit-style __ARMEB__/__ARMEL__ macros were set in arm_arch.h, which is shared between 32- and 64-bit builds. Since it's not included in __KERNEL__ case, we have to adhere to official 64-bit pre-defines, __AARCH64EB__/__AARCH64EL__. [If we are to share more code, it would need similar adjustment.] Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-11-17 19:29:58 +01:00
Andy Polyakov	866e505e0d	sha/asm/sha512-armv8.pl: add NEON version of SHA256. This provides up to 30% better performance on some of recent processors. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-11-11 13:48:16 +01:00
Andy Polyakov	413b6a8259	sha/asm/sha512-armv8.pl: adapt for kernel use. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-10-24 14:21:07 +02:00
Andy Polyakov	ace05265d2	x86_64 assembly pack: add Goldmont performance results. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-10-24 13:01:13 +02:00

298 commits