[Access] Add registerDB pruning module #6397
…ub.com:The-K-R-O-K/flow-go into UlyanaAndrukhiv/6068-registerDB-pruning-module
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #6397 +/- ##
==========================================
+ Coverage 41.20% 42.65% +1.44%
==========================================
Files 2052 1651 -401
Lines 182191 149190 -33001
==========================================
- Hits 75075 63639 -11436
+ Misses 100824 80169 -20655
+ Partials 6292 5382 -910
Flags with carried forward coverage won't be shown.
node.Logger,
builder.RegisterDB,
pstorage.WithPrunerMetrics(builder.RegisterDBPrunerMetrics),
//pstorage.WithPruneThreshold(builder.registerDBPruneThreshold),
WithPruneThreshold is temporarily commented out and will be re-enabled once PR #6345 is merged.
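For readers unfamiliar with the pattern in the snippet above, here is a minimal sketch of how functional options such as WithPrunerMetrics and WithPruneThreshold are commonly wired in Go. Only the two option names come from the diff; the Pruner struct, the PrunerMetrics interface, the option signatures, and the default threshold are assumptions made for illustration and may not match the PR's code.

```go
package main

import "fmt"

// PrunerMetrics is a hypothetical stand-in for the metrics collector.
type PrunerMetrics interface {
	NumberOfRowsPruned(rows uint64)
}

// noopMetrics discards every measurement.
type noopMetrics struct{}

func (noopMetrics) NumberOfRowsPruned(uint64) {}

// Pruner holds only the fields needed to illustrate the option pattern.
type Pruner struct {
	metrics        PrunerMetrics
	pruneThreshold uint64
}

// PrunerOption mutates a Pruner while it is being constructed.
type PrunerOption func(*Pruner)

// WithPrunerMetrics sets the metrics collector (signature assumed).
func WithPrunerMetrics(m PrunerMetrics) PrunerOption {
	return func(p *Pruner) { p.metrics = m }
}

// WithPruneThreshold sets the threshold (signature assumed).
func WithPruneThreshold(threshold uint64) PrunerOption {
	return func(p *Pruner) { p.pruneThreshold = threshold }
}

// defaultPruneThreshold is an assumed value, not taken from the PR.
const defaultPruneThreshold uint64 = 100_000

// NewPruner applies defaults first, then each option in order.
func NewPruner(opts ...PrunerOption) *Pruner {
	p := &Pruner{metrics: noopMetrics{}, pruneThreshold: defaultPruneThreshold}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func main() {
	// With the threshold option left out, the default stays in effect.
	p := NewPruner(WithPrunerMetrics(noopMetrics{}))
	fmt.Println(p.pruneThreshold)
}
```

Under this pattern, commenting out one option simply leaves the corresponding default in place, which is why the call site still compiles.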
nice work! I haven't finished reviewing everything, but here are my comments so far.
…ling according to comments
…drukhiv/6068-registerDB-pruning-module
storage/pebble/registers_pruner.go
Outdated
// Keep the first entry found for this registerID that is <= pruneHeight
keepKeyFound = true
[this register's height + 1, last pruned height].
This pruning logic also depends on the key iteration direction. Do we iterate a register in increasing height order or decreasing order? We should make this assumption explicit.
This change is needed to implement the requirement from issue #6068:
For each register prefix, find the first key whose height is less than or equal to the pruning height. This is the earliest entry to keep.
For example, here’s how we currently iterate if pruneHeight is 99989:
[0x01/key/owner1/99990] [keep, > 99989]
[0x01/key/owner1/99988] [first key to keep < 99989]
[0x01/key/owner1/85000] [remove]
- ...
[0x01/key/owner2/99989] [first key to keep == 99989]
[0x01/key/owner2/99988] [remove]
- ...
[0x01/key/owner3/99988] [first key to keep < 99989]
[0x01/key/owner3/98001] [remove]
I simplified the logic a bit, renamed some variables, and added comments in this commit for more clarity.
Or maybe I misunderstood your concerns about the logic?
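The rule in the example above can be sketched in isolation as follows. This is an illustration and not the code from registers_pruner.go: the entry type and the keysToPrune function are hypothetical, and the sketch assumes entries arrive grouped by register and, within each register, in decreasing height order, as in the listing.

```go
package main

import "fmt"

// entry is a simplified register key: a register ID plus the height
// at which the value was written.
type entry struct {
	RegisterID string
	Height     uint64
}

// keysToPrune returns the entries that can be deleted for pruneHeight.
// It assumes entries are grouped by RegisterID and sorted by descending
// height within each group.
func keysToPrune(entries []entry, pruneHeight uint64) []entry {
	var remove []entry
	var currentID string
	keepKeyFound := false
	for i, e := range entries {
		// Reset the per-register state whenever a new register starts.
		if i == 0 || e.RegisterID != currentID {
			currentID = e.RegisterID
			keepKeyFound = false
		}
		if e.Height > pruneHeight {
			continue // above the prune height: always kept
		}
		if !keepKeyFound {
			// First entry at or below pruneHeight: it still holds the
			// register's value as of pruneHeight, so it is kept.
			keepKeyFound = true
			continue
		}
		remove = append(remove, e)
	}
	return remove
}

func main() {
	entries := []entry{
		{"0x01/key/owner1", 99990},
		{"0x01/key/owner1", 99988},
		{"0x01/key/owner1", 85000},
		{"0x01/key/owner2", 99989},
		{"0x01/key/owner2", 99988},
		{"0x01/key/owner3", 99988},
		{"0x01/key/owner3", 98001},
	}
	// Prints the three entries marked [remove] in the example.
	for _, e := range keysToPrune(entries, 99989) {
		fmt.Printf("%s/%d\n", e.RegisterID, e.Height)
	}
}
```

If the iteration order were ascending instead, the "first entry at or below pruneHeight" would be the oldest one, and the same loop would delete the wrong entries, which is why the direction assumption is worth stating explicitly.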
module/metrics.go
Outdated
NumberOfRowsPruned(rows uint64)

// ElementVisited records the element that were visited during the pruning operation.
ElementVisited()
I'm thinking about how we monitor the pruning.
We are probably interested in:
- The last pruned height. It's a useful metric because it tells us that script queries below this height will fail.
- Whether it is pruning right now and, if so, the progress percentage (e.g. 5%, 50%).
We are probably not interested in:
- How many registers are actually pruned. If we are interested, we can estimate it from how often the pruning progress changes, since pruning is done by batch delete and each batch has the same size.
NumberOfRowsPruned tells us it's pruning, but it doesn't tell us the progress.
ElementVisited also tells us it's pruning, but the actual number is not very meaningful.
I think we could consider measuring just one metric, LatestPrunedHeightWithProgressPercentage, which could be just a uint64 value.
So if the metric shows 8923910015, it means pruning is in progress: the last pruned height is 89239100 and the current pruning iteration is 15% complete. Once the current pruning iteration is completed, the metric becomes 8924910000, which means we pruned from 89239100 to 89249100 (100%).
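One way to read the proposed encoding is height * 100 + percent, with the last two decimal digits reserved for progress. The sketch below is an assumption about that scheme and not code from the PR; encodeProgress and decodeProgress are hypothetical names. Because two digits cannot hold 100, the sketch clamps in-flight progress to 99 and treats completion as advancing the height with the digits reset to 00, which matches the 8924910000 example.

```go
package main

import "fmt"

// encodeProgress packs the last pruned height and the progress of the
// current pruning iteration into a single value: height*100 + percent.
// Percent is clamped to 99; a finished iteration is reported by
// advancing the height with percent 0.
func encodeProgress(lastPrunedHeight, percent uint64) uint64 {
	if percent > 99 {
		percent = 99
	}
	return lastPrunedHeight*100 + percent
}

// decodeProgress splits the packed value back into its two parts.
func decodeProgress(v uint64) (lastPrunedHeight, percent uint64) {
	return v / 100, v % 100
}

func main() {
	fmt.Println(encodeProgress(89239100, 15)) // 8923910015

	height, percent := decodeProgress(8924910000)
	fmt.Println(height, percent) // 89249100 0
}
```

A dashboard can recover both parts with integer division and modulo, at the cost of the raw value being less readable than two separate gauges.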
Now the question is, how do we know the pruning progress is 15%?
We can estimate it from the first few hex chars of the register ID key. Since we iterate over all keys in a fixed order, and assuming the keys are evenly distributed, the first few hex chars effectively divide all registers into buckets, and we can calculate the percentage from the bucket we are currently in.
With only a single metric, we reduce the impact on key iteration.
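A rough sketch of the bucket estimate described above, assuming keys are visited in ascending byte order and are uniformly distributed. The estimateProgress function is hypothetical. It expects the register-ID portion of the key, so any fixed prefix byte shared by all keys would need to be skipped by the caller first, and it uses the first two bytes (four hex chars) as the bucket index.

```go
package main

import "fmt"

// estimateProgress maps the first two bytes of a key onto 0..99.
// Missing bytes are treated as zero. The estimate is only meaningful
// if keys are iterated in ascending order and spread evenly.
func estimateProgress(registerKey []byte) uint64 {
	var bucket uint64
	for i := 0; i < 2; i++ {
		bucket <<= 8
		if i < len(registerKey) {
			bucket |= uint64(registerKey[i])
		}
	}
	// There are 65536 buckets; scale the bucket index to a percentage.
	return bucket * 100 / 65536
}

func main() {
	fmt.Println(estimateProgress([]byte{0x00, 0x00})) // 0
	fmt.Println(estimateProgress([]byte{0x80, 0x00})) // 50
	fmt.Println(estimateProgress([]byte{0xff, 0xff})) // 99
}
```

If real register IDs cluster under a few owners, the estimate will jump unevenly, so it is best treated as a coarse indicator.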
storage/pebble/testutil.go
Outdated
for k := range data {
keys = append(keys, k)
}
sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
It should not matter in which order we store the keys in pebble. Pebble is supposed to store keys in a sorted order for iteration.
And we also don't require the caller to sort registers before storing them.
sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
We should sort the keys, as this test data is retrieved from a map where the key is the height, and the entries are out of order. Before storing them in the DB one by one using Registers::Store, they should be sorted from lowest to highest height.
Right, Store must be called in order from lower to higher heights.
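The point settled in this thread is that Go map iteration order is unspecified, so heights read from a map must be sorted before being stored one by one. The sketch below illustrates that with a hypothetical orderedStore that rejects non-increasing heights; it stands in for the real Store, whose exact ordering check is not shown in this conversation.

```go
package main

import (
	"fmt"
	"sort"
)

// orderedStore is a hypothetical stand-in that only accepts heights
// in strictly increasing order.
type orderedStore struct {
	latest  uint64
	started bool
}

// Store records the height or returns an error if it is out of order.
func (s *orderedStore) Store(height uint64) error {
	if s.started && height <= s.latest {
		return fmt.Errorf("height %d is not above latest height %d", height, s.latest)
	}
	s.latest, s.started = height, true
	return nil
}

// sortedHeights returns the map's keys from lowest to highest height.
func sortedHeights(data map[uint64][]string) []uint64 {
	heights := make([]uint64, 0, len(data))
	for h := range data {
		heights = append(heights, h)
	}
	sort.Slice(heights, func(i, j int) bool { return heights[i] < heights[j] })
	return heights
}

func main() {
	data := map[uint64][]string{30: {"c"}, 10: {"a"}, 20: {"b"}}
	store := &orderedStore{}
	for _, h := range sortedHeights(data) {
		if err := store.Store(h); err != nil {
			fmt.Println("store failed:", err)
			return
		}
	}
	fmt.Println("latest stored height:", store.latest)
}
```

Sorting here is about the call order that Store expects, not about how Pebble lays out keys on disk, which is consistent with both reviewers' remarks.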
Closes #6068
In this PR:
- Added a pruner module for registerDB, which ensures that unneeded data is removed from the db, freeing up disk space.
- Integrated the pruner into Access and Observer nodes.