materialize-{snowflake,databricks}: make deletions idempotent #2015
Description:
The Snowflake and Databricks materializations rely on idempotent merge queries. Previously, a repeated merge query with a root document field holding the "delete" sentinel value would re-add that row to the table after it had first been deleted. This happened in rare cases where the merge query ran a second time before the staged files were deleted and before the runtime acknowledgement was sent.

This fixes that scenario by not inserting rows in merge queries if the root document field is "delete".

I was able to manually test this by hacking up a version of both connectors that always fails on the first non-recovery commit it tries to run and never cleans up any files. For both, I reproduced the insertion of a row with the "delete" document when the merge query with the deletion event was retried. With this new code, the "delete" row is not inserted in those situations.

In reality, these conditions are rare but possible, and they result in either inconsistent data in the destination or, worse, a completely broken materialization, since loading a document column that contains the string "delete" for an update will not work.

Workflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)
Notes for reviewers:
(anything that might help someone review this PR)
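
As a reviewing aid, the retry scenario from the description can be modeled in a few lines of Python. This is only a sketch under stated assumptions, not the connectors' actual SQL: the table is a plain dict keyed by document key, the `merge` function and its `skip_deleted_inserts` flag are hypothetical names, and the sentinel is represented as the bare string `"delete"`.

```python
# Assumed representation of the deletion sentinel in the root document field.
DELETE_SENTINEL = "delete"


def merge(table, staged, skip_deleted_inserts):
    """Apply staged (key, doc) rows to `table` (dict of key -> doc) in place.

    Hypothetical model of a merge query:
      - matched key + sentinel doc  -> delete the row
      - matched key + regular doc   -> update the row
      - unmatched key               -> insert the row, unless the doc is the
                                       sentinel and skip_deleted_inserts is set
    """
    for key, doc in staged:
        if key in table:
            if doc == DELETE_SENTINEL:
                del table[key]
            else:
                table[key] = doc
        elif doc != DELETE_SENTINEL or not skip_deleted_inserts:
            table[key] = doc
    return table


# Staged files that were not cleaned up, so the same merge runs twice.
staged = [("a", DELETE_SENTINEL), ("b", '{"v": 2}')]

# Old behavior: the retry re-inserts key "a" with the sentinel as its document.
old = merge(merge({"a": '{"v": 1}'}, staged, False), staged, False)

# New behavior: the retry leaves the table exactly as the first run did.
new = merge(merge({"a": '{"v": 1}'}, staged, True), staged, True)
```

In this model the second run of the old merge finds no matching row for key `a` and falls through to the insert branch, which is the re-add described above. Guarding the insert branch on the sentinel makes a second run a no-op.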