feat!: combine LanceFragment.merge_columns and LanceDataset.add_columns in one implementation #3021

Open. Wants to merge 2 commits into base: main.

westonpace
Contributor

These two methods were very similar but took different arguments. This combines the implementation of the two methods.

In addition:

  • The Python implementation of Updater was removed because:
    • It is no longer used (merge_columns now uses Fragment::add_columns)
    • It was internal and could become a maintenance burden in the future
  • The Python method LanceFragment.add_columns was removed because it has been deprecated for six months and could be confusing.

In a few weeks it might be good to deprecate LanceFragment.merge_columns and add LanceFragment.add_columns back with a new signature that mirrors LanceDataset.add_columns.

BREAKING CHANGE: The Rust method LanceDataset.add_columns now has an optional batch_size parameter.
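
As a rough illustration of the refactor described above (this is not the actual Lance code), two entry points that take different arguments can delegate to one shared routine. Everything in this sketch is hypothetical: the names `_add_columns_impl`, `fragment_merge_columns`, and `dataset_add_columns`, the dict-of-lists stand-in for a fragment, and the way the optional `batch_size` is handled are assumptions made only for the example.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a record batch / fragment: column name -> values.
Batch = Dict[str, list]
Transform = Callable[[Batch], Batch]


def _num_rows(batch: Batch) -> int:
    """Number of rows in a dict-of-lists batch (0 if it has no columns)."""
    return len(next(iter(batch.values()))) if batch else 0


def _add_columns_impl(
    fragment: Batch,
    transform: Transform,
    read_columns: Optional[List[str]] = None,
    batch_size: Optional[int] = None,
) -> Batch:
    """Shared core: feed the fragment to `transform` in batches, append the results."""
    n = _num_rows(fragment)
    step = batch_size or n or 1  # no batch_size means one batch per fragment
    names = read_columns if read_columns is not None else list(fragment)
    new_cols: Batch = {}
    for start in range(0, n, step):
        batch = {name: fragment[name][start:start + step] for name in names}
        expected = min(step, n - start)
        for name, values in transform(batch).items():
            if len(values) != expected:
                raise ValueError(f"column {name!r} has the wrong number of rows")
            new_cols.setdefault(name, []).extend(values)
    return {**fragment, **new_cols}


def fragment_merge_columns(fragment, value_func, columns=None, batch_size=None):
    """Fragment-level entry point: same core, fragment-style argument names."""
    return _add_columns_impl(fragment, value_func, columns, batch_size)


def dataset_add_columns(fragments, transform, read_columns=None, batch_size=None):
    """Dataset-level entry point: applies the same core to every fragment."""
    return [_add_columns_impl(f, transform, read_columns, batch_size) for f in fragments]
```

With a single core, a behavior such as batching only has to be implemented and tested once, and both wrappers pick it up.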

@codecov-commenter

Codecov Report

Attention: Patch coverage is 62.29508% with 23 lines in your changes missing coverage. Please review.

Project coverage is 78.15%. Comparing base (f9024ce) to head (7b5f836).

Files with missing lines                      Patch %   Lines
rust/lance/src/dataset/fragment.rs            5.55%     17 Missing ⚠️
rust/lance/src/dataset/schema_evolution.rs    87.17%    3 Missing and 2 partials ⚠️
rust/lance/src/dataset/updater.rs             50.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3021      +/-   ##
==========================================
- Coverage   78.19%   78.15%   -0.05%     
==========================================
  Files         239      239              
  Lines       76782    76825      +43     
  Branches    76782    76825      +43     
==========================================
+ Hits        60043    60045       +2     
- Misses      13669    13693      +24     
- Partials     3070     3087      +17     
Flag Coverage Δ
unittests 78.15% <62.29%> (-0.05%) ⬇️


@wjones127 wjones127 self-requested a review October 18, 2024 16:31
Contributor

@wjones127 wjones127 left a comment

Nice work. This looks great!

@wjones127
Contributor

> In a few weeks it might be good to deprecate LanceFragment.merge_columns and add LanceFragment.add_columns back with a new signature that mirrors LanceDataset.add_columns.

Agreed. The naming inconsistency makes me sad :(
