v0.4.2
Runtime Changes
Notes
This update reduces runtime by 50% on average.
Profiler
- Add support for HistogramOptions
- Add multiprocessing support
- Reduced runtime for shuffling indices
- Vectorized precision function
- Improved unique set & vocab merging
- By default histogram only runs 'auto' bin edge detection
Data
- Add length attribute to the data class: `data.length()` or `len(data)`
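A minimal sketch of how a class can expose both forms named above. `Data` here is a hypothetical stand-in written for illustration, not the library's actual data class; only the two call forms, `data.length()` and `len(data)`, come from these notes.

```python
class Data:
    """Hypothetical stand-in for the data class (illustration only)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def length(self):
        # Number of rows currently held by the object.
        return len(self._rows)

    def __len__(self):
        # Lets the built-in len() return the same value as length().
        return self.length()


data = Data([{"a": 1}, {"a": 2}, {"a": 3}])
print(data.length(), len(data))
```

Delegating `__len__` to `length()` keeps the two entry points from drifting apart.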
Report
- Added optional `omit_keys` to the report options function, which removes keys from the final report
- Added `row_has_null_count` (global): one or more nulls in the row
- Added `row_is_null_count` (global): the entire row is null
- Rename `total_samples` (global) -> `row_count`
- Rename label `BACKGROUND` -> `UNKNOWN` (column)
- Removed `covariance` (global)
- Removed `data_classification` (global)
- Removed `data_label_probability` (column)
- Removed `median` (column)
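To illustrate what removing keys from a final report can look like, here is a rough sketch. The helper `omit_report_keys`, the dotted-path key format, and the `global` / `columns` report layout are all assumptions made for this example; the notes above only state that `omit_keys` removes keys from the final report.

```python
import copy


def omit_report_keys(report, omit_keys):
    """Return a copy of `report` without the dotted key paths in `omit_keys`.

    Hypothetical helper: the dotted-path format is an assumption.
    """
    result = copy.deepcopy(report)
    for path in omit_keys:
        *parents, leaf = path.split(".")
        node = result
        # Walk down to the dict that holds the key to remove.
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict):
            node.pop(leaf, None)  # Missing keys are ignored.
    return result


# Illustrative report layout (assumed, not the library's actual schema).
report = {
    "global": {"row_count": 3, "row_has_null_count": 1, "row_is_null_count": 0},
    "columns": {"a": {"null_count": 1}},
}
trimmed = omit_report_keys(report, ["global.row_is_null_count", "columns"])
print(trimmed)
```

Working on a deep copy leaves the full report available in case a different set of keys is requested later.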
Bug fixes
- Accurate null count and total_samples on profile updates
- Each column now receives the same sampled indices, enabling `row_is_null_count`
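The last fix matters because row-level null counts only make sense when every column is inspected at the same row positions. The sketch below shows the idea under stated assumptions: the function `row_null_counts`, the column-dict input, and the use of `None` as the null marker are hypothetical and not taken from the library.

```python
import random


def row_null_counts(columns, sample_size, seed=0):
    """Count sampled rows with any null and rows that are entirely null.

    `columns` maps column name -> list of values, with None as null.
    Hypothetical helper for illustration only.
    """
    n_rows = len(next(iter(columns.values())))
    # Draw the sample once and reuse it for every column, so index i
    # refers to the same row in each column.
    indices = sorted(random.Random(seed).sample(range(n_rows), sample_size))

    has_null = 0
    is_null = 0
    for i in indices:
        nulls = [col[i] is None for col in columns.values()]
        has_null += any(nulls)  # one or more nulls in the row
        is_null += all(nulls)   # the entire row is null
    return {
        "row_count": len(indices),
        "row_has_null_count": has_null,
        "row_is_null_count": is_null,
    }


columns = {"a": [1, None, None, 4], "b": ["x", "y", None, None]}
counts = row_null_counts(columns, sample_size=4)
print(counts)
```

If each column drew its own sample instead, the null flags at a given position could belong to different rows, and neither count would be meaningful.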