Coordinator behaviour when tasks.max > 1 and producer uses round-robin strategy due to null keys #290

Open
anmol opened this issue Sep 11, 2024 · 0 comments

Comments

anmol commented Sep 11, 2024

Hi,

I am implementing a CDC pipeline from Oracle, where some tables have no explicit primary keys. We specify the id columns in the sink connector config based on our knowledge of the data (there is no constraint on the source, though), and the sink connector works fine.

However, my concern is that the lack of a primary key on the source means null keys in Kafka. Because the producer spreads null-key records across partitions, mutations on the same source record (multiple updates) are not guaranteed any relative ordering in Kafka.
If we then set tasks.max > 1 in the sink connector properties, the updates to the same record may be processed by different tasks (workers) and in a different order.
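
To illustrate the concern, here is a minimal simulation sketch in plain Python (no Kafka client involved). Everything in it is an assumption for illustration: the partition count, the `assign_partitions` and `partitions_holding` helpers, and the row with id 42 are hypothetical. It also simplifies the producer: null keys are rotated round-robin and non-null keys are hashed with CRC32, whereas a real Kafka producer hashes keys with murmur2 and newer clients batch null-key records with a sticky partitioner. The point is only that null-key updates to one row can end up in several partitions, while keyed updates stay in one.

```python
from itertools import cycle
from zlib import crc32

NUM_PARTITIONS = 3  # assumed partition count, for illustration only


def assign_partitions(records, num_partitions=NUM_PARTITIONS):
    """Return {partition: [(key, value), ...]} for a list of (key, value) records.

    Simplified model: null keys rotate round-robin, non-null keys are hashed.
    """
    rotation = cycle(range(num_partitions))
    partitions = {p: [] for p in range(num_partitions)}
    for key, value in records:
        if key is None:
            p = next(rotation)  # null key: next partition in the rotation
        else:
            p = crc32(key.encode("utf-8")) % num_partitions  # same key, same partition
        partitions[p].append((key, value))
    return partitions


def partitions_holding(partitions, row_id):
    """Return the sorted partitions that contain at least one update for row_id."""
    return sorted(
        p for p, recs in partitions.items()
        if any(value["id"] == row_id for _, value in recs)
    )


# Three successive updates to the same source row.
updates = [{"id": 42, "version": v} for v in (1, 2, 3)]

null_keyed = assign_partitions([(None, u) for u in updates])
keyed = assign_partitions([(str(u["id"]), u) for u in updates])

print(partitions_holding(null_keyed, 42))  # one update per partition
print(partitions_holding(keyed, 42))       # all updates in a single partition
```

With null keys, each partition holds one version of the row, so with tasks.max > 1 different tasks could each pick up a different version, and nothing in Kafka fixes the order in which those tasks see them.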

Could this lead to inconsistent behaviour during commit, for example the update order being changed because the coordinator commits the second update in the first batch and the first update in a subsequent commit?

cc @bryanck

Thanks in advance.
