Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to export as sklearn RF #587

Merged
merged 12 commits into from
Oct 12, 2024
Merged

Conversation

hcho3
Copy link
Collaborator

@hcho3 hcho3 commented Oct 4, 2024

This PR adds a function to export Treelite models as scikit-learn random forests.
It makes use of the field accessor API.

Usage:

# tl_model is an treelite.Model object

# Returns either RandomForestClassifier / RandomForestRegressor
clf = treelite.sklearn.export_model(tl_model)  

Limitations

  • Export to GradientBoosting is not yet supported.
  • Treelite models with categorical splits are not yet supported.
  • Test coverage currently ensures that predict() and predict_proba() return correct result. Other estimator metadata may be missing or incorrect.
  • Most of export logic is written in the Python layer so may be slow for very large models.

@hcho3 hcho3 marked this pull request as draft October 4, 2024 07:13
Copy link

codecov bot commented Oct 4, 2024

Codecov Report

Attention: Patch coverage is 90.30303% with 16 lines in your changes missing coverage. Please review.

Project coverage is 84.87%. Comparing base (386f50c) to head (36cf0bf).
Report is 1 commits behind head on mainline.

Files with missing lines Patch % Lines
python/treelite/sklearn/exporter.py 85.58% 16 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           mainline     #587      +/-   ##
============================================
+ Coverage     84.80%   84.87%   +0.06%     
============================================
  Files            75       77       +2     
  Lines          6587     6743     +156     
  Branches        528      531       +3     
============================================
+ Hits           5586     5723     +137     
- Misses         1001     1020      +19     
Flag Coverage Δ
?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@hcho3 hcho3 changed the title [WIP] Add ability to export as sklearn RF Add ability to export as sklearn RF Oct 4, 2024
@hcho3 hcho3 marked this pull request as ready for review October 4, 2024 08:49
Copy link
Contributor

@wphicks wphicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! One question for my own benefit, but it should not block merge.

src/model_query.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@betatim betatim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me. Left a few comments which are mostly for my education.

It could be good to add some tests to trigger the various "this model can't be exported to scikit-learn" conditions. That way it is tested and we make codecov happy.

Could also be worth having one test that uses a round trip from scikit-learn to scikit-learn to check that all the fitted attributes (the ones ending in _) are set to the right value (or not set if not supported)

python/treelite/sklearn/exporter.py Outdated Show resolved Hide resolved
python/treelite/sklearn/exporter.py Show resolved Hide resolved
python/treelite/sklearn/exporter.py Show resolved Hide resolved
python/treelite/sklearn/exporter.py Outdated Show resolved Hide resolved
@hcho3 hcho3 mentioned this pull request Oct 12, 2024
@hcho3
Copy link
Collaborator Author

hcho3 commented Oct 12, 2024

Created #588 to remind myself to remove the pin cpplint==1.6.1 later.

@hcho3 hcho3 merged commit 98f3ff8 into dmlc:mainline Oct 12, 2024
19 checks passed
@hcho3 hcho3 deleted the sklearn_export branch October 12, 2024 06:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants