[Access] Add registerDB pruning module #6068
Comments
I fixed the example a bit for clarity: [0x01/key/path/1/99999] [keep, > 99989] |
The raw implementation steps
New CLI flags
The Access and Observer nodes should have new command arguments:
RegisterDBPruner module
The module should be initialized with a set of configuration
flow-go/storage/pebble/registers.go Lines 17 to 21 in eeac479
API
The module should be a
WithOptions: Configures the
Height Update: It first updates the
Row Deletion: The method iterates over the database, identifying the first key whose height is less than or equal to the pruneHeight and deleting rows with the same key prefix with lower heights, ensuring that all outdated data is removed efficiently.
Batch Operations: Deletions are batched to minimize I/O operations and improve performance. The method commits the batch after processing all relevant rows.
|
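A minimal sketch of the Row Deletion rule described above, written as a pure function so it is independent of the storage layer. It assumes the rows for a single register are visited newest first; `classifyHeights` and the `decision` type are hypothetical names for illustration, not part of flow-go.

```go
package main

// decision labels whether the pruner would keep or remove a row.
// Hypothetical type, used only for this illustration.
type decision string

const (
	keep   decision = "keep"
	remove decision = "remove"
)

// classifyHeights takes the heights stored for a single register, sorted in
// descending order (newest first, an assumption of this sketch), and labels
// each row. Rows above pruneHeight are kept. The newest row at or below
// pruneHeight is kept as well, because it holds the register value that is
// valid at pruneHeight. Every older row is removed.
func classifyHeights(heightsDesc []uint64, pruneHeight uint64) []decision {
	out := make([]decision, len(heightsDesc))
	foundFloor := false // true once the newest row <= pruneHeight was seen
	for i, h := range heightsDesc {
		switch {
		case h > pruneHeight:
			out[i] = keep
		case !foundFloor:
			out[i] = keep
			foundFloor = true
		default:
			out[i] = remove
		}
	}
	return out
}
```

For the register `0x01/key/path/1` from the example in the issue description, `classifyHeights([]uint64{99999, 99990, 99988, 85000}, 99989)` labels the rows keep, keep, keep, remove.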
Hey @peterargue! Do we understand correctly that |
@peterargue Also, could you please check the implementation steps, that @UlyanaAndrukhiv provided for this issue above? Thanks in advance! |
The implementation would probably look like collecting some amount of deletes into a batch, committing the batch, then pausing. So we would need to tune |
Let's not make then
The pruner is single threaded, so we probably don't need to use atomics. If there turn out to be cases when we need to expose values to outside processes, we can use atomic.
pruning should happen on some regular interval (10m, 1h, etc).
if prune interval is a const, we won't need an option for it.
this method makes sense, but we don't need to expose it. it should be called by the loop.
it should update
yes
yes, and we will need to tune the batch size along with
yes |
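The "collect deletes into a batch, commit, then pause" loop discussed in this comment could be sketched as follows. This is only an illustration under stated assumptions: `commitFunc` is a hypothetical stand-in for committing a batch to the underlying db, and `batchSize` and `throttleDelay` are placeholder parameter names for the values that would need tuning.

```go
package main

import "time"

// commitFunc is a hypothetical stand-in for committing one batch of deletes
// to the underlying db. It is not a flow-go or pebble API.
type commitFunc func(keys [][]byte) error

// pruneInBatches groups keys into batches of at most batchSize, commits each
// batch, and sleeps for throttleDelay between commits to limit the load on
// the rest of the node. It returns the number of batches committed.
func pruneInBatches(keys [][]byte, batchSize int, throttleDelay time.Duration, commit commitFunc) (int, error) {
	if batchSize <= 0 {
		batchSize = 1 // guard against a zero or negative batch size
	}
	batches := 0
	for start := 0; start < len(keys); start += batchSize {
		end := start + batchSize
		if end > len(keys) {
			end = len(keys)
		}
		if err := commit(keys[start:end]); err != nil {
			return batches, err
		}
		batches++
		if end < len(keys) {
			time.Sleep(throttleDelay) // pause before the next batch
		}
	}
	return batches, nil
}
```

Because the loop runs on a single goroutine, the counters are plain variables rather than atomics, in line with the comment above.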
How about this breakdown?
Steps 2-4 would all include writing comprehensive unit tests. I think 1 must be done first, but could be a collaboration. |
I think steps 1-2 should be done first by one person, but the other steps should work as you describe. I will try it out and parallelize the work as described. Thank you for your advice! |
Problem Definition
This is the second step to enabling pruning on the registers db, and depends on #6066.
Now that db queries will only include responses for unpruned heights, we can layer on the pruning module. This module will ensure that unneeded pruned data is removed from the db, freeing up disk space.
Proposed Solution
Configuration:
pruneThreshold ⇒ number of blocks below the latestHeight that should be kept
pruneInterval ⇒ number of pruned blocks in the db, above which pruning should be triggered
Steps:
1. Periodically scan through all rows in the db. Pruning is triggered when pruneHeight - firstHeight > pruneInterval, where pruneHeight = latestHeight - pruneThreshold is the lowest height to keep
2. Update the firstHeight entry in the db to be the prune height. This ensures that any entries below will not be used on subsequent startups, avoiding consistency problems if the node were to crash or restart in the middle of a pruning cycle
3. For each register prefix, find the first key whose height is less than or equal to the prune height. This is the earliest entry to keep
4. Delete all rows with the same key prefix with lower heights
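The height arithmetic in the steps above can be sketched as a small helper. It assumes the condition pruneHeight - firstHeight > pruneInterval is what triggers a pruning cycle, as the definition of pruneInterval suggests. The `pruneConfig` struct and `nextPruneHeight` function are hypothetical names, not flow-go types.

```go
package main

// pruneConfig mirrors the two settings from the Configuration list.
// The struct and its field layout are illustrative only.
type pruneConfig struct {
	pruneThreshold uint64 // number of blocks below latestHeight to keep
	pruneInterval  uint64 // number of prunable blocks that triggers a cycle
}

// nextPruneHeight returns the lowest height to keep and whether a pruning
// cycle should run. The early returns avoid unsigned underflow when the
// chain is shorter than the threshold or nothing new can be pruned.
func nextPruneHeight(cfg pruneConfig, firstHeight, latestHeight uint64) (uint64, bool) {
	if latestHeight < cfg.pruneThreshold {
		return 0, false
	}
	pruneHeight := latestHeight - cfg.pruneThreshold
	if pruneHeight <= firstHeight {
		return pruneHeight, false
	}
	return pruneHeight, pruneHeight-firstHeight > cfg.pruneInterval
}
```

With pruneThreshold set to 10 and latestHeight at 99999, the helper returns a prune height of 99989. Whether a cycle runs then depends on how far firstHeight lags behind that value.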
💡 Note: we will want to throttle how quickly the pruner cycles through keys to manage resource consumption.
For example
If latestHeight is 99999 and pruneThreshold is 10, then we will remove entries for all heights less than 99989
[0x01/key/path/1/99999] [keep, ≥ 99989]
[0x01/key/path/1/99990] [keep, ≥ 99989]
[0x01/key/path/1/99988] [keep, first row < 99989]
[0x01/key/path/1/85000] [remove]
[0x01/key/path/2/99989] [keep, ≥ 99989]
[0x01/key/path/2/99988] [remove]
[0x01/key/path/3/99988] [keep, first row < 99989]
[0x01/key/path/3/98001] [remove]
[0x02/key/path/0/99900] [keep, first row < 99989]
Considerations
Definition of Done
pruneInterval and pruneThrottleDelay, which sets a delay the pruner should use while iterating the db to reduce load on the overall system.
Tasks