bug fix when all columns match but no rows match #277
When I compare the two dataframes with join columns Dataframe 1:
Dataframe 2:
I believe |
I might not be following right, but I think since none of the join columns match ( |
Whoops, I must've been late and I didn't understand it correctly. LGTM
* refactor SparkCompare
* tweaking SparkCompare and adding back Legacy
* conditional import
* cleaning up tests and using pytest-spark for legacy
* adding docs
* caching and some typo fixes
* adding in doc and pandas 2 changes
* adding pandas to testing matrix
* drop 3.8
* drop 3.8
* refactoring ^
* rebase fix for #277
* fixing legacy uncode column names
* unicode fix for legacy
* unicode test for new spark logic
* typo fix
* changes from PR review
* fixes capitalone#276
* bump version
Fixes #276
@SimonBFrank would you mind pulling down this branch and seeing if it fixes your issue? Long story short, there are no results, so the subsequent dictionary of values that gets displayed cannot be generated because of `None` values.

This is more of a hack, since we will be deprecating this legacy Spark implementation (#275) in favor of a version that is much more readable and logically similar to the Pandas one. The fix just catches the `TypeError` for this situation and outputs `{}` for `columns_with_any_diffs` and `columns_fully_matching`.
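For readers who want to see the shape of the workaround, here is a minimal sketch of the idea, not the actual patch. It assumes the per-column match counts come back as `None` when no rows matched on the join columns; the function name `split_columns_by_match` and its parameters are hypothetical and exist only for illustration.

```python
def split_columns_by_match(match_counts, total_rows):
    """Hypothetical helper, not the library's API.

    match_counts maps column name -> number of matching rows, or None
    when the aggregation ran over zero joined rows (assumed behavior).
    Returns (columns_with_any_diffs, columns_fully_matching).
    """
    try:
        # Arithmetic on None raises TypeError, which is the failure
        # mode described in the PR text above.
        mismatches = {col: total_rows - n for col, n in match_counts.items()}
        columns_with_any_diffs = {c: m for c, m in mismatches.items() if m > 0}
        columns_fully_matching = {
            c: match_counts[c] for c, m in mismatches.items() if m == 0
        }
    except TypeError:
        # No rows matched, so there is nothing to report for either group.
        columns_with_any_diffs = {}
        columns_fully_matching = {}
    return columns_with_any_diffs, columns_fully_matching


# Normal case: column "a" matches on all 3 rows, "b" on only 1.
normal = split_columns_by_match({"a": 3, "b": 1}, 3)

# Edge case from this PR: columns line up but no rows joined.
empty = split_columns_by_match({"a": None, "b": None}, 0)
```

Catching the exception keeps the change small, which fits a code path that is about to be deprecated; a longer-lived implementation would more likely check for the empty result explicitly before doing any arithmetic.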