2019-01-24 02:15:54 +00:00
|
|
|
The sparse_array.c file contains an implementation of a sparse array that
|
|
|
|
attempts to be both space and time efficient.
|
|
|
|
|
|
|
|
The sparse array is represented using a tree structure. Each node in the
|
|
|
|
tree contains a block of pointers to either the user supplied leaf values or
|
|
|
|
to another node.
|
|
|
|
|
|
|
|
There are a number of parameters used to define the block size:
|
|
|
|
|
|
|
|
OPENSSL_SA_BLOCK_BITS Specifies the number of bits covered by each block
|
|
|
|
SA_BLOCK_MAX Specifies the number of pointers in each block
|
|
|
|
SA_BLOCK_MASK Specifies a bit mask to perform modulo block size
|
|
|
|
SA_BLOCK_MAX_LEVELS Indicates the maximum possible height of the tree
|
|
|
|
|
|
|
|
These constants are inter-related:
|
|
|
|
SA_BLOCK_MAX = 2 ^ OPENSSL_SA_BLOCK_BITS
|
|
|
|
SA_BLOCK_MASK = SA_BLOCK_MAX - 1
|
|
|
|
SA_BLOCK_MAX_LEVELS = number of bits in size_t divided by
|
|
|
|
OPENSSL_SA_BLOCK_BITS rounded up to the next multiple
|
|
|
|
of OPENSSL_SA_BLOCK_BITS
|
|
|
|
|
|
|
|
OPENSSL_SA_BLOCK_BITS can be defined at compile time and this overrides the
|
|
|
|
built in setting.
|
|
|
|
|
|
|
|
As a space and performance optimisation, the height of the tree is usually
|
|
|
|
less than the maximum possible height. Only sufficient height is allocated to
|
|
|
|
accommodate the largest index added to the data structure.
|
|
|
|
|
|
|
|
The largest index used to add a value to the array determines the tree height:
|
|
|
|
|
|
|
|
+----------------------+---------------------+
|
|
|
|
| Largest Added Index | Height of Tree |
|
|
|
|
+----------------------+---------------------+
|
|
|
|
| SA_BLOCK_MAX - 1 | 1 |
|
|
|
|
| SA_BLOCK_MAX ^ 2 - 1 | 2 |
|
|
|
|
| SA_BLOCK_MAX ^ 3 - 1 | 3 |
|
|
|
|
| ... | ... |
|
|
|
|
| size_t max | SA_BLOCK_MAX_LEVELS |
|
|
|
|
+----------------------+---------------------+
|
|
|
|
|
|
|
|
The tree height is dynamically increased as needed based on additions.
|
|
|
|
|
|
|
|
An empty tree is represented by a NULL root pointer. Inserting a value at
|
|
|
|
index 0 results in the allocation of a top level node full of null pointers
|
|
|
|
except for the single pointer to the user's data (N = SA_BLOCK_MAX for
|
2019-07-02 08:04:04 +00:00
|
|
|
brevity):
|
2019-01-24 02:15:54 +00:00
|
|
|
|
|
|
|
+----+
|
|
|
|
|Root|
|
|
|
|
|Node|
|
|
|
|
+-+--+
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v
|
|
|
|
+-+-+---+---+---+---+
|
|
|
|
| 0 | 1 | 2 |...|N-1|
|
|
|
|
| |nil|nil|...|nil|
|
|
|
|
+-+-+---+---+---+---+
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v
|
|
|
|
+-+--+
|
|
|
|
|User|
|
|
|
|
|Data|
|
|
|
|
+----+
|
|
|
|
Index 0
|
|
|
|
|
|
|
|
|
|
|
|
Inserting at element 2N+1 creates a new root node and pushes down the old root
|
|
|
|
node. It then creates a second second level node to hold the pointer to the
|
|
|
|
user's new data:
|
|
|
|
|
|
|
|
+----+
|
|
|
|
|Root|
|
|
|
|
|Node|
|
|
|
|
+-+--+
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v
|
|
|
|
+-+-+---+---+---+---+
|
|
|
|
| 0 | 1 | 2 |...|N-1|
|
|
|
|
| |nil| |...|nil|
|
|
|
|
+-+-+---+-+-+---+---+
|
|
|
|
| |
|
|
|
|
| +------------------+
|
|
|
|
| |
|
|
|
|
v v
|
|
|
|
+-+-+---+---+---+---+ +-+-+---+---+---+---+
|
|
|
|
| 0 | 1 | 2 |...|N-1| | 0 | 1 | 2 |...|N-1|
|
|
|
|
|nil| |nil|...|nil| |nil| |nil|...|nil|
|
|
|
|
+-+-+---+---+---+---+ +---+-+-+---+---+---+
|
|
|
|
| |
|
|
|
|
| |
|
|
|
|
| |
|
|
|
|
v v
|
|
|
|
+-+--+ +-+--+
|
|
|
|
|User| |User|
|
|
|
|
|Data| |Data|
|
|
|
|
+----+ +----+
|
|
|
|
Index 0 Index 2N+1
|
|
|
|
|
|
|
|
|
|
|
|
The nodes themselves are allocated in a sparse manner. Only nodes which exist
|
|
|
|
along a path from the root of the tree to an added leaf will be allocated.
|
|
|
|
The complexity is hidden and nodes are allocated on an as needed basis.
|
|
|
|
Because the data is expected to be sparse this doesn't result in a large waste
|
|
|
|
of space.
|
|
|
|
|
|
|
|
Values can be removed from the sparse array by setting their index position to
|
|
|
|
NULL. The data structure does not attempt to reclaim nodes or reduce the
|
|
|
|
height of the tree on removal. For example, now setting index 0 to NULL would
|
|
|
|
result in:
|
|
|
|
|
|
|
|
+----+
|
|
|
|
|Root|
|
|
|
|
|Node|
|
|
|
|
+-+--+
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v
|
|
|
|
+-+-+---+---+---+---+
|
|
|
|
| 0 | 1 | 2 |...|N-1|
|
|
|
|
| |nil| |...|nil|
|
|
|
|
+-+-+---+-+-+---+---+
|
|
|
|
| |
|
|
|
|
| +------------------+
|
|
|
|
| |
|
|
|
|
v v
|
|
|
|
+-+-+---+---+---+---+ +-+-+---+---+---+---+
|
|
|
|
| 0 | 1 | 2 |...|N-1| | 0 | 1 | 2 |...|N-1|
|
|
|
|
|nil|nil|nil|...|nil| |nil| |nil|...|nil|
|
|
|
|
+---+---+---+---+---+ +---+-+-+---+---+---+
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v
|
|
|
|
+-+--+
|
|
|
|
|User|
|
|
|
|
|Data|
|
|
|
|
+----+
|
|
|
|
Index 2N+1
|
|
|
|
|
|
|
|
|
|
|
|
Accesses to elements in the sparse array take O(log n) time where n is the
|
|
|
|
largest element. The base of the logarithm is SA_BLOCK_MAX, so for moderately
|
|
|
|
small indices (e.g. NIDs), single level (constant time) access is achievable.
|
|
|
|
Space usage is O(minimum(m, n log(n)) where m is the number of elements in the
|
|
|
|
array.
|
|
|
|
|
|
|
|
Note: sparse arrays only include pointers to types. Thus, SPARSE_ARRAY_OF(char)
|
|
|
|
can be used to store a string.
|