Allow label based indexing in Rows (incl. test updates) #268
Enables access to the items in a Dataset Row by index or by column header. For example, `data[0]['first_name'] == data[0][0]` if `'first_name'` is the label of the first column as specified in the Dataset's headers. (Ref. issues #22, #158, #265.)

Implemented by adding a Row attribute `_dset` that stores a reference to the Dataset that "owns" the Row, thus allowing each Row access to the parent Dataset's headers. Constructors, `insert` methods and item getters/setters have been updated accordingly. In addition, Dataset has a new attribute `_lblidx` that indicates whether label based indexing is possible (i.e. a header with unique labels exists). `_lblidx` is maintained via the updated `headers` property.

To allow label based access within a Row the Dataset's
`__getitem__` now returns a Row rather than a tuple, with the Row basically behaving like a list externally. This has the potential to cause some backwards compatibility issues if client code relied on Dataset items being returned as plain tuples. To minimize this impact the PR adds `__add__`, `__eq__`, and `__ne__` methods for Rows. Tests have been updated by applying the `Row.tuple` property for comparisons with tuple literals (the PR would fail existing tests otherwise). Independent of the label based indexing, I'd suggest that returning Dataset items as Rows instead of plain tuples may be preferable in any case, since it makes it possible to add functionality in the future.

Other changes/additions:
- `copy` method for Datasets that updates `_dset` references in the new object's Rows and uses `copy.deepcopy` instead of `copy.copy`. This should also fix a bug in the current version where copies (in `filter` and `stack`) are shallow and the new object's `_data` attribute points to the same list as the original object (`filter` and `stack` updated accordingly).
- … `_dset` points to the new object and that the new object is not a shallow copy (`filter`, `stack`, `stack_col`, `subset`, `sorted`, and `transpose`)
- … `filter`)
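
For readers who want a concrete picture of the mechanism described in this PR, here is a minimal, self-contained sketch. It is not the code in the diff: the toy `Row` and `Dataset` classes and the `_index` helper are hypothetical, and only the names `_dset`, `_lblidx`, `_data`, `headers`, `Row.tuple` and `copy` are taken from the description. It assumes Python 3, where `__ne__` is derived from `__eq__` automatically.

```python
import copy


class Row:
    """Toy row that resolves string keys through its parent's headers."""

    def __init__(self, values, dset=None):
        self._row = list(values)
        self._dset = dset  # reference to the owning Dataset (may be None)

    def _index(self, key):
        # Hypothetical helper: translate a column label into a position.
        if isinstance(key, str):
            if self._dset is None or not self._dset._lblidx:
                raise KeyError(key)
            try:
                return self._dset.headers.index(key)
            except ValueError:
                raise KeyError(key) from None
        return key

    def __getitem__(self, key):
        return self._row[self._index(key)]

    def __setitem__(self, key, value):
        self._row[self._index(key)] = value

    @property
    def tuple(self):
        return tuple(self._row)

    def __eq__(self, other):
        # Compare equal to plain tuples/lists so older client code keeps working.
        if isinstance(other, Row):
            return self._row == other._row
        if isinstance(other, (list, tuple)):
            return self._row == list(other)
        return NotImplemented


class Dataset:
    """Toy dataset whose items are Rows that point back to it."""

    def __init__(self, *rows, headers=None):
        self._data = [Row(r, dset=self) for r in rows]
        self.headers = headers  # goes through the property setter below

    @property
    def headers(self):
        return self._headers

    @headers.setter
    def headers(self, value):
        self._headers = list(value) if value is not None else None
        # Label indexing is only possible when headers exist and are unique.
        self._lblidx = (
            self._headers is not None
            and len(set(self._headers)) == len(self._headers)
        )

    def __getitem__(self, i):
        return self._data[i]  # a Row, not a plain tuple

    def copy(self):
        # Deep-copy the values and re-point each new Row at the new Dataset.
        new = Dataset(headers=self.headers)
        new._data = [Row(copy.deepcopy(r._row), dset=new) for r in self._data]
        return new


data = Dataset(("Ada", "Lovelace"), ("Alan", "Turing"),
               headers=["first_name", "last_name"])
assert data[0]["first_name"] == data[0][0]

dup = data.copy()
dup[0]["first_name"] = "Grace"  # must not leak into the original
assert data[0]["first_name"] == "Ada"
assert dup[0]._dset is dup
```

In this sketch, setting duplicate headers turns `_lblidx` off, so a string key raises `KeyError` rather than silently picking the first matching column. Whether the PR handles that case the same way is not stated in the description.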