
v0.4: Distributed processing and training with Ray and Dask, Distributed hyperopt with RayTune, TabNet, Remote FS, MLflow for monitoring and serving, new Datasets

Released by @w4nderlust on 15 Jun

Changelog

Additions

  • Integrated Ray Tune into hyperopt (#1001)
  • Added Ames Housing Kaggle dataset (#1098)
  • Added functionality to obtain subtrees in the SST dataset (#1108)
  • Added comparator combiner (#1113)
  • Additional Text Classification Datasets (#1121)
  • Added Ray remote backend and Dask distributed preprocessing (#1090); see the sketch after this list
  • Added TabNet combiner and needed modules (#1062)
  • Added Higgs Boson dataset (#1157)
  • Added GitHub workflow to push to Docker Hub (#1160)
  • Added more tagging schemes for Docker images (#1161)
  • Added Docker build matrix (#1162)
  • Added category feature > 1 dim to TabNet (#1150)
  • Added timeseries datasets (#1149)
  • Added TabNet datasets (#1153)
  • Forest Cover Type, Adult Census Income and Rossmann Store Sales datasets (#1165)
  • Added KDD Cup 2009 datasets (#1167)
  • Added Ray GPU image (#1170)
  • Added support for cloud object storage (S3, GCS, ADLS, etc.) (#1164)
  • Perform inference with Dask when using the Ray backend (#1128)
  • Added schema validation to config files (#1186)
  • Added MLflow experiment tracking support (#1191)
  • Added export to MLflow pyfunc model format (#1192)
  • Added MLP-Mixer image encoder (#1178)
  • Added TransformerCombiner (#1177)
  • Added TFRecord support as a preprocessing cache format (#1194)
  • Added Higgs Boson TabNet examples (#1209)
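
As a quick illustration of the new Ray backend together with the TabNet combiner, here is a minimal sketch. The feature names, the dataset path, and the `backend="ray"` string value are assumptions chosen for the example rather than taken from this release's documentation.

```python
from ludwig.api import LudwigModel

# Minimal sketch: a tabular model using the new TabNet combiner,
# trained through the Ray backend with Dask-based preprocessing.
# Feature names and the dataset path are hypothetical placeholders.
config = {
    "input_features": [
        {"name": "age", "type": "numerical"},
        {"name": "workclass", "type": "category"},
    ],
    "output_features": [
        {"name": "income", "type": "binary"},
    ],
    "combiner": {"type": "tabnet"},
    "training": {"epochs": 10},
}

# Passing backend="ray" (assumed string value) routes preprocessing,
# training, and inference through the distributed Ray/Dask code path;
# omitting it keeps the previous local behavior.
model = LudwigModel(config, backend="ray")
results = model.train(dataset="s3://my-bucket/adult.parquet")
```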

Improvements

  • Abstracted Horovod params into the Backend API (#1080)
  • Added allowed_origins to serving to support cross-origin requests (#1091)
  • Added callbacks to hook into the training loop programmatically (#1094); see the sketch after this list
  • Added scheduler support to Ray Tune hyperopt and fixed GPU usage (#1088)
  • Ray Tune: enforced that epochs equals max_t and early stopping is disabled (#1109)
  • Added register_trainable logic to RayTuneExecutor (#1117)
  • Replaced Travis CI with GitHub Actions (#1120)
  • Split distributed tests into separate test suite (#1126)
  • Removed unused regularizer parameter from training defaults
  • Restricted the Docker build GitHub Action to ludwig-ai repos only (#1166)
  • Harmonized return object for categorical, sequence generator, and sequence tagger (#1171)
  • Support sourcing images from either a file path or in-memory ndarrays (#1174)
  • Refactored hyperopt results into object structure for easier programmatic usage (#1184)
  • Refactored all contrib classes to use the Callback interface (#1187)
  • Improved performance of Dask preprocessing by adding parallelism (#1193)
  • Improved TabNetCombiner and Concat combiner (#1177)
  • Added additional backend configuration options (#1195)
  • Made should_shuffle configurable in Trainer (#1198)
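
The new training-loop callbacks (#1094) can be sketched roughly as below. The `ludwig.callbacks` import path, the hook name, and its signature are assumptions based on the Callback interface introduced in this release, not a verified transcription of the API.

```python
from ludwig.api import LudwigModel
from ludwig.callbacks import Callback  # assumed import path


class EpochLogger(Callback):
    """Toy callback that reports when each training epoch completes."""

    # Hook name and signature are assumptions; the base Callback class
    # provides no-op methods for the points it exposes in the loop.
    def on_epoch_end(self, trainer, progress_tracker, save_path):
        print(f"finished epoch {progress_tracker.epoch}")


config = {
    "input_features": [{"name": "review", "type": "text"}],
    "output_features": [{"name": "sentiment", "type": "category"}],
}

# Callbacks are passed at model construction and invoked during training.
model = LudwigModel(config, callbacks=[EpochLogger()])
```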

Bugfixes

  • Fix SST parentheses issue
  • Fixed serve.py by adding a try/except around form parsing (#1111)
  • Fix #1104: add lengths to text encoder output with updated unit test (#1105)
  • Fixed SST-2 subtree logic to match the GLUE SST-2 dataset (#1112)
  • Fix #1078: Avoid recreating cache when using image preproc (#1114)
  • Fixed check for whether Dask exists in figure_data_format_dataset
  • Fixed bug in EthosBinary dataset class and model directory copying logic in RayTuneReportCallback (#1129)
  • Fix #1070: error when saving model with image feature (#1119)
  • Fixed IterableBatcher incompatibility with ParquetDataset and remote model serialization (#1138)
  • Fix: passing backend and TF config parameters to model load path in experiment
  • Fix: improved TabNet numerical stability + refactoring
  • Fix #1147: passing bn_epsilon to AttentiveTransformer initialization in TabNet
  • Fix #1093: loss value mismatch (#1103)
  • Fixed CacheManager to correctly handle test_set and validation_set (#1189)
  • Fixed TabNet sparsity loss issue (#1199)

Breaking changes

Most models trained with v0.3.3 should keep working in v0.4.
The main changes in v0.4 are additive options, so what worked previously should not break now.
The one exception is that the validity of the model configuration is now checked much more strictly.
This helps catch errors earlier, but configurations that happened to work in the past despite containing errors may no longer be accepted.
The validation messages should point at the problematic parts of the configuration, so such errors should be easy to fix (see the sketch below).
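
For example, a configuration that omits a required section is now expected to be rejected up front. This is a rough sketch; the exact exception type raised by the validator is an assumption.

```python
from ludwig.api import LudwigModel

# A configuration missing the required "output_features" section.
# With v0.4's schema validation this should fail fast at model
# construction instead of surfacing as an obscure error later.
bad_config = {
    "input_features": [{"name": "review", "type": "text"}],
    # "output_features" intentionally omitted
}

try:
    LudwigModel(bad_config)
except Exception as err:  # exact exception type (e.g. a jsonschema error) may vary
    print(f"invalid configuration: {err}")
```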

Contributors

@tgaddair @jimthompson5802 @ANarayan @kaushikb11 @mejackreed @ronaldyang @zhisbug @nimz @kanishk16