
v0.2.1: Vector features, Norwegian and Lithuanian tokenizers, many bugfixes.

@w4nderlust w4nderlust released this 13 Oct 00:36

Improvements

Added Filter Bank features for audio.
Added two more parameters skip_save_test_predictions and skip_save_test_statistics to train and experiment CLI commands and API.
Updated to spaCy 2.2 with support for Norwegian and Lithuanian tokenizers.
Reorganized dependencies: the defaults are now bare-bones and there are several extra ones.
Added fc_layers to H3 embed encoder.
Added get_preprocessing_params in preprocessing.
Refactored image features preprocessing to use multiprocessing.
Refactored preprocessing with strategy pattern.
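
As a rough illustration of what a strategy-pattern layout for preprocessing can look like, here is a minimal sketch. All class and function names below (PreprocessingStrategy, LowercaseText, ScaleNumbers, preprocess_column) are hypothetical and are not taken from the Ludwig codebase; the sketch only shows the general idea of selecting one interchangeable preprocessing object per feature type.

```python
from abc import ABC, abstractmethod


class PreprocessingStrategy(ABC):
    """Hypothetical common interface; not Ludwig's actual class."""

    @abstractmethod
    def preprocess(self, values):
        ...


class LowercaseText(PreprocessingStrategy):
    def preprocess(self, values):
        # Normalize text values to lowercase.
        return [v.lower() for v in values]


class ScaleNumbers(PreprocessingStrategy):
    def preprocess(self, values):
        # Scale by the maximum; assumes at least one positive value.
        top = max(values)
        return [v / top for v in values]


# One strategy object per feature type (the type names are illustrative).
STRATEGIES = {
    "text": LowercaseText(),
    "numerical": ScaleNumbers(),
}


def preprocess_column(feature_type, values):
    # The caller picks a strategy by key and never branches on the type itself.
    return STRATEGIES[feature_type].preprocess(values)
```

Supporting a new feature type in this layout means adding one class and one dictionary entry, without touching preprocess_column.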

Bugfixes

Fix #452: Removed dependency on gmpy.
Fix #465: Adds capability to set the vocabulary from a Glove file.
Fix #480: Adds a health check to ludwig serve.
Fix #481: Added some examples of visualization commands.
Fix #491: Improved skip parameters, now no directories are created if not needed.
Fix #492: Adds skip saving unprocessed output to api.py.
Fix #493: Added the vocabulary file and the UNK and PAD symbols as parameters to the sequence feature's create_vocabulary call made during the calculation of metadata.
Fix #500: Fixed learning_curves() when the training statistics file does not contain validation.
Fix #509: Fixes in_memory issues in image features.
Fix #525: Added an is_on_master() check before creating the save_path directory.
Fix #510: Fixed version of pydantic.
Fix #532: Improved speed of add_sequence_feature_column().

Potentially breaking changes

Fix #520: Renamed the field parameter in visualization to output_feature_name for clarity and improved documentation. Please make sure to rename it in your function calls if you were passing this parameter by name (the order stays the same, so positional calls are unaffected).
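
To see which calls are affected by the rename, consider the minimal sketch below. The functions plot_old and plot_new are hypothetical stand-ins, not Ludwig's real visualization signatures; only the parameter names field and output_feature_name come from the note above.

```python
# Hypothetical stand-ins for a visualization function before and after the rename.
# These are NOT Ludwig's real signatures; they only illustrate the effect.
def plot_old(stats, field):
    return f"plotting {field}"


def plot_new(stats, output_feature_name):
    return f"plotting {output_feature_name}"


stats = {}

# Positional calls keep working because the argument order is unchanged.
positional_result = plot_new(stats, "class")

# Keyword calls must switch to the new name.
keyword_result = plot_new(stats, output_feature_name="class")

# The old keyword is rejected by the renamed function with a TypeError.
try:
    plot_new(stats, field="class")
    old_keyword_accepted = True
except TypeError:
    old_keyword_accepted = False
```

In practice, searching your code for field= in visualization calls should be enough to find the places that need updating.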

Contributors

@sriki18 @carlogrisetti @areeves87 @naresh-bhandari @revolunet @patrickvonplaten @Athanaziz @dsblank @tgaddair @Mechachleopteryx @AlexeyGy @yu-iskw