openssl

Author	SHA1	Message	Date
Andy Polyakov	9a708bf982	{arm64|x86_64}cpuid.pl: add special 16-byte case to OPENSSL_memcmp. OPENSSL_memcmp is a must in GCM decrypt and general-purpose loop takes quite a portion of execution time for short inputs, more than GHASH for few-byte inputs according to profiler. Special 16-byte case takes it off top five list in profiler output. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6312)	2018-06-03 21:15:18 +02:00
Bryan Donlan	082193ef2b	Fix issues in ia32 RDRAND asm leading to reduced entropy. This patch fixes two issues in the ia32 RDRAND assembly code that result in a (possibly significant) loss of entropy. The first, less significant, issue is that, by returning success as 0 from OPENSSL_ia32_rdrand() and OPENSSL_ia32_rdseed(), a subtle bias was introduced. Specifically, because the assembly routine copied the remaining number of retries over the result when RDRAND/RDSEED returned 'successful but zero', a bias towards values 1-8 (primarily 8) was introduced. The second, more worrying issue was that, due to a mixup in registers, when a buffer that was not size 0 or 1 mod 8 was passed to OPENSSL_ia32_rdrand_bytes or OPENSSL_ia32_rdseed_bytes, the last (n mod 8) bytes were all the same value. This issue impacts only the 64-bit variant of the assembly. This change fixes both issues by first eliminating the only use of OPENSSL_ia32_rdrand, replacing it with OPENSSL_ia32_rdrand_bytes, and fixes the register mixup in OPENSSL_ia32_rdrand_bytes. It also adds a sanity test for OPENSSL_ia32_rdrand_bytes and OPENSSL_ia32_rdseed_bytes to help catch problems of this nature in the future. Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/5342)	2018-03-08 10:27:49 -05:00
Andy Polyakov	7933762870	crypto/x86_64cpuid.pl: suppress AVX512F flag on Skylake-X. It was observed that AVX512 code paths can negatively affect overall Skylake-X system performance. But we are talking specifically about 512-bit code, while AVX512VL, 256-bit variant of AVX512F instructions, is supposed to fly as smooth as AVX2. Which is why it remains unmasked. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4838)	2017-12-08 12:57:09 +01:00
Andy Polyakov	88ac224cda	crypto/x86_64cpuid.pl: fix AVX512 capability masking. Originally it was thought that it's possible to use AVX512VL+BW instructions with XMM and YMM registers without kernel enabling ZMM support, but it turned out to be a wrong assumption. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-11-23 21:05:44 +01:00
Andy Polyakov	d6ee8f3dc4	OPENSSL_ia32cap: reserve for new extensions. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-11-08 21:45:16 +01:00
David Benjamin	d67e755418	Fix comment typo. Reviewed-by: Ben Kaduk <kaduk@mit.edu> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4023)	2017-07-26 23:10:52 -04:00
Andy Polyakov	d84df59440	crypto/x86_64cpuid.pl: fix typo in Knights Landing detection. Thanks to David Benjamin for spotting this! Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4009)	2017-07-25 21:27:47 +02:00
Andy Polyakov	64d92d7498	x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. "Optimize" is in quotes because it's rather a "salvage operation" for now. Idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. Former affects choice of AES-NI code path specific for Silvermont (Knights Landing is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. AVX-512 results cover even Skylake-X :-) Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:07:32 +02:00
Andy Polyakov	1aed5e1ac2	crypto/x86*cpuid.pl: move extended feature detection. Extended feature flags were not pulled on AMD processors, as a result a number of extensions were effectively masked on Ryzen. Original fix for x86_64cpuid.pl addressed this problem, but messed up processor vendor detection. This fix moves extended feature detection past basic feature detection where it belongs. 32-bit counterpart is harmonized too. Reviewed-by: Rich Salz <rsalz@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org>	2017-03-13 18:42:10 +01:00
Andy Polyakov	f8418d87e1	crypto/x86_64cpuid.pl: move extended feature detection upwards. Extended feature flags were not pulled on AMD processors, as a result a number of extensions were effectively masked on Ryzen. It should have been reported for Excavator since it implements AVX2 extension, but apparently nobody noticed or cared... Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-03-07 11:17:32 +01:00
Andy Polyakov	5e32cfb2b6	crypto/x86_64cpuid.pl: add CFI annotations. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-26 21:26:27 +01:00
Andy Polyakov	66bee01c82	crypto/x86_64cpuid.pl: detect if kernel preserves %zmm registers. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-03 12:21:50 +01:00
Andy Polyakov	9c940446f6	crypto/x86[_64]cpuid.pl: add OPENSSL_ia32_rd[rand|seed]_bytes. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-07-15 13:20:52 +02:00
Andy Polyakov	cfe1d9929e	x86_64 assembly pack: tolerate spaces in source directory name. [as it is now quoting $output is not required, but done just in case] Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-29 14:12:51 +02:00
Andy Polyakov	e33826f01b	Add assembly CRYPTO_memcmp. GH: #102 Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-05-19 22:33:00 +02:00
Rich Salz	e0a651945c	Copyright consolidation: perl files. Add copyright to most .pl files. This does NOT cover any .pl file that has other copyright in it. Most of those are Andy's but some are public domain. Fix typos in some existing files. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-04-20 09:45:40 -04:00
Andy Polyakov	f4d456408d	x86[_64]cpuid.pl: add low-level RDSEED.	2014-02-14 17:24:12 +01:00
Andy Polyakov	46bf83f07a	x86_64 assembly pack: make Windows build more robust. PR: 2963 and a number of others	2013-01-22 22:27:28 +01:00
Andy Polyakov	c5cd28bd64	Extend OPENSSL_ia32cap_P with extra word to accommodate AVX2 capability.	2012-11-17 19:04:15 +00:00
Andy Polyakov	6251989eb6	x86_64 assembly pack: make it possible to compile with Perl located on path with spaces. PR: 2835	2012-06-27 10:08:23 +00:00
Andy Polyakov	ff6f9f96fd	cryptlib.c, etc.: fix linker warnings in 64-bit Darwin build.	2011-11-12 13:10:00 +00:00
Andy Polyakov	4d01f2761d	x86_64cpuid.pl: fix typo.	2011-06-04 13:08:25 +00:00
Andy Polyakov	301799b803	x86[_64]cpuid.pl: add function accessing rdrand instruction.	2011-06-04 12:20:45 +00:00
Andy Polyakov	4bb90087d7	x86[_64]cpuid.pl: harmonize usage of reserved bits #20 and #30 .	2011-05-27 15:32:43 +00:00
Andy Polyakov	2bc3ad28b3	x86_64cpuid.pl: get AVX masking right.	2011-05-26 13:16:26 +00:00
Andy Polyakov	ddc20d4da9	x86_64cpuid.pl: allow shared build to work without -Bsymbolic. PR: 2466	2011-05-18 16:24:19 +00:00
Andy Polyakov	b906422149	x86[_64]cpuid.pl: handle new extensions.	2011-05-16 20:35:11 +00:00
Andy Polyakov	5fabb88a78	Multiple assembler packs: add experimental memory bus instrumentation.	2011-04-17 12:46:00 +00:00
Andy Polyakov	3efe51a407	Revert previous Linux-specific/centric commit#19629. If it really has to be done, it's definitely not the way to do it. So far answer to the question was to ./config -Wa,--noexecstack (adopted by RedHat).	2010-05-05 22:05:39 +00:00
Ben Laurie	0e3ef596e5	Non-executable stack in asm.	2010-05-05 15:50:13 +00:00
Andy Polyakov	1fd79f66ea	x86_64cpuid.pl: ml64 is allergic to db on label line.	2010-04-14 19:24:48 +00:00
Andy Polyakov	7676eebf42	OPENSSL_cleanse to accept zero length parameter [matching C implementation].	2010-01-24 14:54:24 +00:00
Andy Polyakov	761393bba7	x86[_64]cpuid.pl: further refine shared cache detection.	2009-05-14 18:17:26 +00:00
Andy Polyakov	5cd91b5055	x86_64cpuid.pl: refine shared cache detection logic.	2009-05-12 21:01:13 +00:00
Andy Polyakov	aa8f38e49b	x86_64 assembler pack to comply with updated styling x86_64-xlate.pl rules.	2008-11-12 08:15:52 +00:00
Andy Polyakov	89778b7f3f	x86_64cpuid.pl cosmetics: harmonize $dir treatment with other modules.	2008-07-15 19:52:20 +00:00
Dr. Stephen Henson	a9e96d724d	Use default value for $dir if it is empty.	2008-02-25 13:14:06 +00:00
Andy Polyakov	abe7f8b457	Make all x86_64 modules independent on current working directory.	2008-01-13 17:42:04 +00:00
Andy Polyakov	55eab3b74b	Make x86_64 modules work under Win64/x64.	2007-08-23 12:01:58 +00:00
Andy Polyakov	3df2eff4bd	x86*cpuid update.	2007-07-21 14:46:27 +00:00
Andy Polyakov	5d86336746	Flush output in x86_64cpuid.pl.	2007-06-21 11:39:35 +00:00
Andy Polyakov	b2dba9bf1f	Profiling revealed that OPENSSL_cleanse consumes more CPU time than sha1_block_data_order when hashing short messages. Move OPENSSL_cleanse to "cpuid" assembler module and gain 2x.	2007-05-14 21:35:25 +00:00
Andy Polyakov	932cc129ee	x86_64 assembler updates.	2007-05-14 15:57:19 +00:00
Andy Polyakov	9babf3929b	RC4_set_key for x86_64 and Core2 optimization. PR: 1447	2007-04-02 09:50:14 +00:00
Andy Polyakov	e442c36252	Solaris x86_64 /usr/ccs/bin/as support.	2005-06-20 14:56:48 +00:00
Andy Polyakov	5f1841cdca	Rename amd64 modules to x86_64 and update RC4 implementation.	2005-05-03 15:42:05 +00:00

46 commits
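
Note on commit 082193ef2b: the tail-byte symptom it describes (the last n mod 8 bytes all carrying the same value) can be illustrated with a minimal Python sketch. This is not the OpenSSL assembly. The names `fill_bytes`, `rand64`, `os_rand64`, and the `buggy` flag are hypothetical, and the buggy branch only models the described symptom, not the actual register mixup.

```python
import os
from typing import Callable


def fill_bytes(n: int, rand64: Callable[[], int], buggy: bool = False) -> bytes:
    """Fill n bytes from a source of 64-bit random words (illustrative only)."""
    out = bytearray()
    # Whole 8-byte words are stored directly.
    for _ in range(n // 8):
        out += rand64().to_bytes(8, "little")
    # The remaining n mod 8 bytes are peeled off one more word, byte by byte.
    tail = n % 8
    if tail:
        word = rand64()
        for _ in range(tail):
            out.append(word & 0xFF)
            if not buggy:
                word >>= 8  # advance to the next random byte
            # In the buggy model the shift never takes effect,
            # so the same low byte is stored again.
    return bytes(out)


def os_rand64() -> int:
    """Stand-in for a hardware source; assumption: os.urandom is good enough here."""
    return int.from_bytes(os.urandom(8), "little")
```

With a tail of exactly one byte the two variants agree, which matches the commit's remark that only buffer sizes other than 0 or 1 mod 8 were affected. A sanity test in the spirit of the one the commit adds could check that tail bytes are not all identical across repeated calls.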