S3-SQS source does not populate partition columns in the dataframe #2
Comments
@DipeshV seems like a bug.
Hi Abhishek, I am currently adding the partitions manually, which makes my code a bit messy and means it cannot be reused as is when adding new integrations. Thanks,
@DipeshV Yeah, I'll raise a PR for the fix today.
@DipeshV I've created a pull request. Can you build a jar from the new branch and try it out?
@DipeshV Did you get a chance to try out the new code? Does it solve your use case? |
@abhishekd0907 - I haven't checked the new code yet, since I had already added the partitions manually from input_file_name().
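For reference, a minimal sketch of that manual workaround, assuming the S3 keys follow a Hive-style layout such as .../event_date=2021-01-01/part-0000.json. The column name `event_date` and the regex are illustrative assumptions, not part of the connector's API:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{input_file_name, regexp_extract}

// `sourceDf` stands for the streaming DataFrame returned by the s3-sqs source.
// Recover the partition value from the source file path, since the source
// does not populate partition columns itself.
def addPartitionColumn(sourceDf: DataFrame): DataFrame =
  sourceDf.withColumn(
    "event_date",
    regexp_extract(input_file_name(), "event_date=([^/]+)", 1)
  )
```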
Hi,
We are using this "s3-sqs" connector with Spark Structured Streaming and Delta Lake to process incoming data in partitioned S3 buckets.
The problem we are facing with the "s3-sqs" source is that it reads the files directly and returns a dataframe/dataset without the partition columns.
Hence, when we merge the source and target dataframes, all the partition columns end up as HIVE_DEFAULT_PARTITION.
Do you have any solution/workaround to add the partition columns as part of the dataframe?
Thanks and regards,
Dipesh Vora
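Picking up on the question above, a hedged sketch of how a recovered partition column could feed the Delta merge, so the target no longer falls back to HIVE_DEFAULT_PARTITION. The target path, the join keys, and the foreachBatch wiring are assumptions for illustration, not the connector's or this repo's API; the batch is expected to already carry the "event_date" column from the sketch earlier in the thread:

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.DataFrame

// Upsert one micro-batch into the Delta target. Because the batch already
// includes the recovered "event_date" column, the merge writes real partition
// values instead of HIVE_DEFAULT_PARTITION.
def upsertBatch(batch: DataFrame, batchId: Long): Unit =
  DeltaTable.forPath(batch.sparkSession, "s3://my-bucket/target-table") // hypothetical path
    .as("t")
    .merge(batch.as("s"), "t.id = s.id AND t.event_date = s.event_date") // hypothetical keys
    .whenMatched.updateAll()
    .whenNotMatched.insertAll()
    .execute()

// Wiring it up, where `withPartition` is the streaming DataFrame produced by
// addPartitionColumn in the earlier sketch:
// withPartition.writeStream.foreachBatch(upsertBatch _).start()
```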