indexer-alt: sequential pipeline #20053
Open
+505
−15
Description
Introduce a new kind of pipeline for indexing that needs to commit data in checkpoint order. This will be used for indexing data that would previously have gone into `objects` or `objects_snapshot`, where rows are modified in place and so can't be committed out of order.

Sequential pipelines are split into two parts:
- `processor`, which is shared with the existing concurrent pipeline and is responsible for turning checkpoint data into values to be sent to the database.
- `committer`, which is responsible for batching up prefixes of updates and sending them to the DB when they are complete (no gaps between the last write and what has been buffered).

The key design constraints of the sequential pipeline are as follows:
- `MIN_BATCH_ROWS`: The threshold for eagerly writing to the DB.
- `MAX_BATCH_CHECKPOINTS`: The maximum number of checkpoints that will be batched together in a single transaction.

Test plan
This change is primarily tested by the `sum_obj_types` pipeline introduced in the next change.

Stack
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.
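
Illustrative sketch of the committer rule from the Description: buffer updates by checkpoint, and only write a prefix with no gap after the last write. This is not the code in this PR. The `Committer` type, its methods, the `force` flag (standing in for whatever triggers a non-eager write), the `String` row type, and the constant values are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

// Hypothetical values for illustration; the PR description does not state the real ones.
const MIN_BATCH_ROWS: usize = 2;
const MAX_BATCH_CHECKPOINTS: usize = 2;

/// Buffers processed checkpoints and releases them strictly in checkpoint order.
struct Committer {
    /// Next checkpoint sequence number that has not been written yet.
    next_checkpoint: u64,
    /// Rows received from the processor, keyed by checkpoint (may contain gaps).
    pending: BTreeMap<u64, Vec<String>>,
}

impl Committer {
    fn new(first_checkpoint: u64) -> Self {
        Committer { next_checkpoint: first_checkpoint, pending: BTreeMap::new() }
    }

    /// Accept the rows for one checkpoint; checkpoints may arrive out of order.
    fn receive(&mut self, checkpoint: u64, rows: Vec<String>) {
        self.pending.insert(checkpoint, rows);
    }

    /// Return the rows of the next gap-free prefix, or `None` if nothing should
    /// be written yet. Without `force`, a prefix is only released once it holds
    /// at least MIN_BATCH_ROWS rows (the "eager" write).
    fn take_batch(&mut self, force: bool) -> Option<Vec<String>> {
        // Measure the contiguous prefix starting at the last write.
        let mut checkpoints = 0usize;
        let mut rows = 0usize;
        let mut expected = self.next_checkpoint;
        for (&cp, batch) in self.pending.range(self.next_checkpoint..) {
            if cp != expected || checkpoints == MAX_BATCH_CHECKPOINTS {
                break; // a gap, or the per-transaction checkpoint limit
            }
            checkpoints += 1;
            rows += batch.len();
            expected += 1;
        }
        if checkpoints == 0 || (!force && rows < MIN_BATCH_ROWS) {
            return None;
        }
        // Drain the prefix in order and advance past it.
        let mut out = Vec::with_capacity(rows);
        for _ in 0..checkpoints {
            let batch = self
                .pending
                .remove(&self.next_checkpoint)
                .expect("prefix was just measured");
            out.extend(batch);
            self.next_checkpoint += 1;
        }
        Some(out)
    }
}
```

In this sketch a real committer would write the returned rows and record the new position in a single DB transaction. Keying the buffer by checkpoint in a `BTreeMap` keeps it sorted, so finding the gap-free prefix is a single forward scan.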