registers. As the AES table is already 1K aligned we can use it everywhere and speedup table address calculation by 10%.

Performance numbers:

    decryption            16B        64B       256B      1024B      8192B
    ---------------------------------------------------------------------
    aes-256-cbc      5636.84k   6443.26k   6689.02k   6752.94k   6766.59k bef.
    aes-256-cbc      6200.31k   7195.71k   7504.30k   7585.11k   7599.45k aft.
    ---------------------------------------------------------------------
    aes-128-cbc      7313.85k   8653.67k   9079.55k   9188.35k   9205.08k bef.
    aes-128-cbc      7925.38k   9557.99k  10092.37k  10232.15k  10272.77k aft.

    encryption            16B        64B       256B      1024B      8192B
    ---------------------------------------------------------------------
    aes-256 cbc      6009.65k   6592.70k   6766.59k   6806.87k   6815.74k bef.
    aes-256 cbc      6643.93k   7388.69k   7605.33k   7657.81k   7675.90k aft.
    ---------------------------------------------------------------------
    aes-128 cbc      7862.09k   8892.48k   9214.04k   9291.78k   9311.57k bef.
    aes-128 cbc      8639.29k   9881.17k  10265.86k  10363.56k  10392.92k aft.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8206) |
||
---|---|---|
.. | ||
aes | ||
aria | ||
asn1 | ||
async | ||
bf | ||
bio | ||
blake2 | ||
bn | ||
buffer | ||
camellia | ||
cast | ||
chacha | ||
cmac | ||
cms | ||
comp | ||
conf | ||
ct | ||
des | ||
dh | ||
dsa | ||
dso | ||
ec | ||
engine | ||
err | ||
ess | ||
evp | ||
gmac | ||
hmac | ||
idea | ||
include/internal | ||
kdf | ||
kmac | ||
lhash | ||
md2 | ||
md4 | ||
md5 | ||
mdc2 | ||
modes | ||
objects | ||
ocsp | ||
pem | ||
perlasm | ||
pkcs7 | ||
pkcs12 | ||
poly1305 | ||
property | ||
rand | ||
rc2 | ||
rc4 | ||
rc5 | ||
ripemd | ||
rsa | ||
seed | ||
sha | ||
siphash | ||
sm2 | ||
sm3 | ||
sm4 | ||
srp | ||
stack | ||
store | ||
ts | ||
txt_db | ||
ui | ||
whrlpool | ||
x509 | ||
x509v3 | ||
alphacpuid.pl | ||
arm64cpuid.pl | ||
arm_arch.h | ||
armcap.c | ||
armv4cpuid.pl | ||
build.info | ||
c64xpluscpuid.pl | ||
context.c | ||
cpt_err.c | ||
cryptlib.c | ||
ctype.c | ||
cversion.c | ||
dllmain.c | ||
ebcdic.c | ||
ex_data.c | ||
getenv.c | ||
ia64cpuid.S | ||
init.c | ||
LPdir_nyi.c | ||
LPdir_unix.c | ||
LPdir_vms.c | ||
LPdir_win.c | ||
LPdir_win32.c | ||
LPdir_wince.c | ||
mem.c | ||
mem_clr.c | ||
mem_dbg.c | ||
mem_sec.c | ||
mips_arch.h | ||
o_dir.c | ||
o_fips.c | ||
o_fopen.c | ||
o_init.c | ||
o_str.c | ||
o_time.c | ||
pariscid.pl | ||
ppc_arch.h | ||
ppccap.c | ||
ppccpuid.pl | ||
README.sparse_array | ||
s390x_arch.h | ||
s390xcap.c | ||
s390xcpuid.pl | ||
sparc_arch.h | ||
sparccpuid.S | ||
sparcv9cap.c | ||
sparse_array.c | ||
threads_none.c | ||
threads_pthread.c | ||
threads_win.c | ||
uid.c | ||
vms_rms.h | ||
x86_64cpuid.pl | ||
x86cpuid.pl |
The sparse_array.c file contains an implementation of a sparse array that
attempts to be both space and time efficient.

The sparse array is represented using a tree structure.  Each node in the
tree contains a block of pointers to either the user supplied leaf values or
to another node.

There are a number of parameters used to define the block size:

    OPENSSL_SA_BLOCK_BITS   Specifies the number of bits covered by each block
    SA_BLOCK_MAX            Specifies the number of pointers in each block
    SA_BLOCK_MASK           Specifies a bit mask to perform modulo block size
    SA_BLOCK_MAX_LEVELS     Indicates the maximum possible height of the tree

These constants are inter-related:

    SA_BLOCK_MAX        = 2 ^ OPENSSL_SA_BLOCK_BITS
    SA_BLOCK_MASK       = SA_BLOCK_MAX - 1
    SA_BLOCK_MAX_LEVELS = number of bits in size_t, rounded up to the next
                          multiple of OPENSSL_SA_BLOCK_BITS, divided by
                          OPENSSL_SA_BLOCK_BITS

OPENSSL_SA_BLOCK_BITS can be defined at compile time and this overrides the
built in setting.

As a space and performance optimisation, the height of the tree is usually
less than the maximum possible height.  Only sufficient height is allocated to
accommodate the largest index added to the data structure.

The largest index used to add a value to the array determines the tree height:

        +----------------------+---------------------+
        | Largest Added Index  |   Height of Tree    |
        +----------------------+---------------------+
        | SA_BLOCK_MAX     - 1 |          1          |
        | SA_BLOCK_MAX ^ 2 - 1 |          2          |
        | SA_BLOCK_MAX ^ 3 - 1 |          3          |
        | ...                  |         ...         |
        | size_t max           | SA_BLOCK_MAX_LEVELS |
        +----------------------+---------------------+

The tree height is dynamically increased as needed based on additions.

An empty tree is represented by a NULL root pointer.
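
The relationships between these constants, and the height table above, can be
sketched in C as follows.  This is a minimal illustration under stated
assumptions, not the code in sparse_array.c: the fallback value of 4 for
OPENSSL_SA_BLOCK_BITS is chosen only for the example, and
sa_height_for_index() is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Illustrative fallback only; a compile time definition overrides it. */
#ifndef OPENSSL_SA_BLOCK_BITS
# define OPENSSL_SA_BLOCK_BITS 4
#endif

/* Number of pointers in each block: 2 ^ OPENSSL_SA_BLOCK_BITS. */
#define SA_BLOCK_MAX  ((size_t)1 << OPENSSL_SA_BLOCK_BITS)

/* Bit mask that reduces an index modulo the block size. */
#define SA_BLOCK_MASK (SA_BLOCK_MAX - 1)

/* Bits in size_t divided by the block bits, rounded up. */
#define SA_BLOCK_MAX_LEVELS \
    ((int)((sizeof(size_t) * CHAR_BIT + OPENSSL_SA_BLOCK_BITS - 1) \
           / OPENSSL_SA_BLOCK_BITS))

/*
 * Hypothetical helper: the tree height needed to hold the given index.
 * Each level consumes OPENSSL_SA_BLOCK_BITS bits of the index.
 */
static int sa_height_for_index(size_t index)
{
    int height = 1;

    while ((index >>= OPENSSL_SA_BLOCK_BITS) != 0)
        height++;
    assert(height <= SA_BLOCK_MAX_LEVELS);
    return height;
}
```

With OPENSSL_SA_BLOCK_BITS set to 4, indices 0 to 15 fit in a tree of height 1
and index 16 is the first to need height 2.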
Inserting a value at index 0 results in the allocation of a top level node
full of null pointers except for the single pointer to the user's data
(N = SA_BLOCK_MAX for brevity):

        +----+
        |Root|
        |Node|
        +-+--+
          |
          |
          |
          v
        +-+-+---+---+---+---+
        | 0 | 1 | 2 |...|N-1|
        |   |nil|nil|...|nil|
        +-+-+---+---+---+---+
          |
          |
          |
          v
        +-+--+
        |User|
        |Data|
        +----+
    Index 0

Inserting at element 2N+1 creates a new root node and pushes down the old root
node.  It then creates a second node at the second level to hold the pointer
to the user's new data:

        +----+
        |Root|
        |Node|
        +-+--+
          |
          |
          |
          v
        +-+-+---+---+---+---+
        | 0 | 1 | 2 |...|N-1|
        |   |nil|   |...|nil|
        +-+-+---+-+-+---+---+
          |       |
          |       +------------------+
          |                          |
          v                          v
        +-+-+---+---+---+---+      +-+-+---+---+---+---+
        | 0 | 1 | 2 |...|N-1|      | 0 | 1 | 2 |...|N-1|
        |   |nil|nil|...|nil|      |nil|   |nil|...|nil|
        +-+-+---+---+---+---+      +---+-+-+---+---+---+
          |                              |
          |                              |
          |                              |
          v                              v
        +-+--+                         +-+--+
        |User|                         |User|
        |Data|                         |Data|
        +----+                         +----+
    Index 0                       Index 2N+1

The nodes themselves are allocated in a sparse manner.  Only nodes which exist
along a path from the root of the tree to an added leaf will be allocated.
The complexity is hidden and nodes are allocated on an as needed basis.
Because the data is expected to be sparse this doesn't result in a large waste
of space.

Values can be removed from the sparse array by setting their index position to
NULL.  The data structure does not attempt to reclaim nodes or reduce the
height of the tree on removal.  For example, now setting index 0 to NULL would
result in:

        +----+
        |Root|
        |Node|
        +-+--+
          |
          |
          |
          v
        +-+-+---+---+---+---+
        | 0 | 1 | 2 |...|N-1|
        |   |nil|   |...|nil|
        +-+-+---+-+-+---+---+
          |       |
          |       +------------------+
          |                          |
          v                          v
        +-+-+---+---+---+---+      +-+-+---+---+---+---+
        | 0 | 1 | 2 |...|N-1|      | 0 | 1 | 2 |...|N-1|
        |nil|nil|nil|...|nil|      |nil|   |nil|...|nil|
        +---+---+---+---+---+      +---+-+-+---+---+---+
                                         |
                                         |
                                         |
                                         v
                                       +-+--+
                                       |User|
                                       |Data|
                                       +----+
                                  Index 2N+1

Accesses to elements in the sparse array take O(log n) time where n is the
largest index.  The base of the logarithm is SA_BLOCK_MAX, so for moderately
small indices (e.g. NIDs), single level (constant time) access is achievable.
Space usage is O(minimum(m, n log(n))) where m is the number of elements in
the array.

Note: sparse arrays only include pointers to types.  Thus,
SPARSE_ARRAY_OF(char) can be used to store a string.
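
The insertion, removal and lookup behaviour described above can be sketched in
C as follows.  This is a simplified illustration, not the contents of
sparse_array.c: the type sketch_sa, the functions sketch_sa_set() and
sketch_sa_get(), and the block size of 4 bits are all assumptions made for the
example.  Freeing the tree is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_BITS 4                          /* illustrative block size */
#define BLOCK_MAX  ((size_t)1 << BLOCK_BITS)
#define BLOCK_MASK (BLOCK_MAX - 1)

/* Hypothetical sparse array: a root block plus the current tree height. */
typedef struct {
    int levels;    /* 0 while the tree is empty */
    void **root;   /* block of BLOCK_MAX pointers, NULL while empty */
} sketch_sa;

/* Tree height needed to reach index n. */
static int height_for(size_t n)
{
    int height = 1;

    while ((n >>= BLOCK_BITS) != 0)
        height++;
    return height;
}

/* Store val at index n.  Returns 1 on success, 0 if allocation fails. */
static int sketch_sa_set(sketch_sa *sa, size_t n, void *val)
{
    int needed = height_for(n), level;
    void **p;

    /* Grow upwards: every new root keeps the old root in slot 0. */
    while (sa->levels < needed) {
        if ((p = calloc(BLOCK_MAX, sizeof(*p))) == NULL)
            return 0;
        p[0] = sa->root;
        sa->root = p;
        sa->levels++;
    }

    /* Walk down, allocating only the blocks on the path to n. */
    p = sa->root;
    for (level = sa->levels - 1; level > 0; level--) {
        size_t slot = (n >> (BLOCK_BITS * level)) & BLOCK_MASK;

        if (p[slot] == NULL
                && (p[slot] = calloc(BLOCK_MAX, sizeof(*p))) == NULL)
            return 0;
        p = p[slot];
    }

    /* Storing NULL removes the value; no blocks are reclaimed. */
    p[n & BLOCK_MASK] = val;
    return 1;
}

/* Return the value stored at index n, or NULL if there is none. */
static void *sketch_sa_get(const sketch_sa *sa, size_t n)
{
    void **p = sa->root;
    int level;

    assert(sa != NULL);
    if (height_for(n) > sa->levels)
        return NULL;              /* index lies outside the current tree */
    for (level = sa->levels - 1; p != NULL && level > 0; level--)
        p = p[(n >> (BLOCK_BITS * level)) & BLOCK_MASK];
    return p == NULL ? NULL : p[n & BLOCK_MASK];
}
```

Starting from a zero-initialised sketch_sa, setting index 0 yields a tree of
height 1.  Setting index 2 * BLOCK_MAX + 1 then raises the height to 2, and
the lookup follows slot 2 of the root and slot 1 of the second-level block,
matching the diagrams.  Each lookup visits one block per level, which is
where the logarithmic access cost comes from.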