Separate size and capacity of LGDO ArrayLike objects #109
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
See: #107
Added an Abstract Base Class for LGDOContainers, which are ArrayLike objects and can be placed inside of tables. LGDOContainers are expected to implement methods for manipulating the capacity and size of the containers. Capacity controls the number of rows in memory, while size controls the length of the container; capacity may be larger than size, resulting in "padding" at the end. The advantage of this approach is that it enables changes in size without reallocating memory (always for shrinking, most of the time for growing). The cost is the extra memory used by this padding (although resizes require allocating a new array and then copying data over, requiring a spike in memory usage, so this cost is not so big). Methods and options have been added for "trimming" the capacity to match the size of the array in case the user prefers to not have this padding for some reason.
As a result of this change, the returned values for
core.read
andstore.read
have (sometimes) changed. Before, when reading an LGDO object in place, they would return the number of rows read. Now, they resize the array toobj_buf_start + n_rows_read
. The reason for the old behavior was that we did not want to shrink arrays when not enough rows were in the file to fill a Table in place, since the reallocation could cause problems. Since shrinking no longer requires reallocation, this is no longer problematic. A major benefit is that the read functions are both simpler to use since they have consistent returns. It should be noted that this change will break code in other LEGEND packages.When resizing, if the capacity is smaller than the new size, the behavior is to reserve more capacity up to the next power of 2 larger than the new size. When reading a huge number of files, this will reduce the time spent reallocating/copying arrays from O(n) to O(log(n)); it is not clear to me how important this is, but someone can try running tests of reading a large number of files to check.
Detailed changes:
reserve_capacity
,get_capacity
, andtrim_capacity
defined. In some cases, this involved implementing other methods likeappend
andinsert
.nda
is now a property, which returns a slice (of lengthlen(obj)
) of the internal data buffer (which has length equal to capacity). This will preserve the old behavior as far as the user is concerneddtype
was turned into a property in Array, AoESA and VoV.n_dims
was for VoVcore.read
resizes the returned array to sizeobj_buf_start + n_rows_read
. Note that this resize could probably be done in the_serializer.read
functions in the future (which still returnn_rows_read
)store.read
has been simplified, and now just callscore.read
aftergimme_files
. This reduces redundant codelh5.iterator
no longer returnsn_rows_read
n_rows_read
. Most of them were filling it into_
; for the ones that actually had asserts onn_rows_read
, these were replaced with asserts onlen(obj)
loc
, which was a member variable that was more or less used to enable the size to be different from capacity before; since this is now a core feature, this is unnecessary. It does not seem to play any role in our other code packagesLGDOContainer
methods for consistency