"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.
AVX-512 results cover even Skylake-X :-)
Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!
Reviewed-by: Rich Salz <rsalz@openssl.org>
Original text:
Clarify use of |$end0| in stitched x86-64 AES-GCM code.
There was some uncertainty about what the code is doing with |$end0|
and whether it was necessary for |$len| to be a multiple of 16 or 96.
Hopefully these added comments make it clear that the code is correct
except for the caveat regarding low memory addresses.
Change-Id: Iea546a59dc7aeb400f50ac5d2d7b9cb88ace9027
Reviewed-on: https://boringssl-review.googlesource.com/7194
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3700)
There was some uncertainty about what the code is doing with |$end0|
and whether it was necessary for |$len| to be a multiple of 16 or 96.
Hopefully these added comments make it clear that the code is correct
except for the caveat regarding low memory addresses.
Change-Id: Iea546a59dc7aeb400f50ac5d2d7b9cb88ace9027
Reviewed-on: https://boringssl-review.googlesource.com/7194
Reviewed-by: Adam Langley <agl@google.com>
Signed-off-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>