5 commits

Author SHA1 Message Date
Andy Polyakov
dd2d7b19f8 sha/asm/keccak1600-armv8.pl: halve the size of hw-assisted subroutine.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
2018-04-23 17:19:57 +02:00
Andy Polyakov
b761ff4e77 sha/asm/keccak1600-armv8.pl: add hardware-assisted ARMv8.2 subroutines.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5358)
2018-02-19 14:15:31 +01:00
Andy Polyakov
7533162322 ARMv8 assembly pack: add Qualcomm Kryo results.
[skip ci]

Reviewed-by: Tim Hudson <tjh@openssl.org>
2017-11-13 11:13:00 +01:00
Andy Polyakov
236dd46339 sha/asm/keccak1600-armv8.pl: fix return value buglet and ...
... script data load.

On a related note, an attempt was made to merge rotations with logical
operations. The ARM ISA has merged rotate-and-logical instructions that
can be used here, and they were used to improve keccak1600-armv4
performance. But not here. Even though this approach resulted in an
improvement on Cortex-A53 proportional to the reduction in the number
of instructions, ~8%, it didn't exactly work out on non-Cortex cores,
presumably because they break merged instructions into separate μ-ops,
which results in a higher *operations* count. X-Gene and Denver went
~20% slower and Apple A7 40% slower. The optimization was therefore
dismissed.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-09-09 19:09:36 +02:00
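
For readers unfamiliar with the "merged rotate-n-logical" idea discussed in the 236dd46339 message, the following is a minimal C sketch, not code from keccak1600-armv8.pl. The function names are hypothetical and chosen for illustration only. It contrasts a rotate followed by a separate XOR with a single expression that an AArch64 compiler may (but is not guaranteed to) turn into one `eor Xd, Xn, Xm, ror #n` instruction.

```c
#include <stdint.h>

/* Rotate a 64-bit lane right by n bits; n is reduced modulo 64. */
static uint64_t ror64(uint64_t x, unsigned n)
{
    n &= 63;
    return (x >> n) | (x << ((64 - n) & 63));
}

/*
 * Separate form: conceptually two instructions,
 *     ror  t, b, #n
 *     eor  r, a, t
 */
static uint64_t xor_rotated_separate(uint64_t a, uint64_t b, unsigned n)
{
    uint64_t t = ror64(b, n);
    return a ^ t;
}

/*
 * Merged form: conceptually one instruction,
 *     eor  r, a, b, ror #n
 * Whether a core executes this as one operation or splits it into
 * separate micro-ops is a property of the microarchitecture.
 */
static uint64_t xor_rotated_merged(uint64_t a, uint64_t b, unsigned n)
{
    return a ^ ror64(b, n);
}
```

Both functions compute the same value; the difference the commit message describes is purely in instruction count versus micro-op count, which is why the outcome varied between cores.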
Andy Polyakov
5eb2dd88b3 Add sha/asm/keccak1600-armv8.pl.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2017-06-15 21:53:30 +02:00