
[Badger] Add universal database operations #6465

Status: Open
zhangchiqing wants to merge 4 commits into master from leo/db-ops

Conversation

@zhangchiqing (Member) commented Sep 14, 2024

This PR implements the low-level database operations that can be shared by both the Badger and Pebble implementations.

Since the Badger operations will be refactored to use batch updates instead of transactions, the Badger and Pebble database operations become very similar, which made it possible to unify them.

A separate PR that refactors the approvals demonstrates how the unified database operations can be used there (see #6466).

This allows us to remove both the Badger-specific and Pebble-specific database modules and keep only one universal version, which is easier to maintain.

@codecov-commenter commented Sep 14, 2024

Codecov Report

Attention: Patch coverage is 28.60262% with 327 lines in your changes missing coverage. Please review.

Project coverage is 41.16%. Comparing base (681dd9a) to head (17972e0).

Files with missing lines Patch % Lines
storage/operation/badgerimpl/writer.go 0.00% 48 Missing ⚠️
storage/operation/pebbleimpl/writer.go 0.00% 42 Missing ⚠️
storage/operation/reads.go 67.74% 27 Missing and 13 partials ⚠️
storage/operation/badgerimpl/iterator.go 0.00% 33 Missing ⚠️
storage/operation/pebbleimpl/iterator.go 0.00% 33 Missing ⚠️
storage/operation/dbtest/helper.go 0.00% 31 Missing ⚠️
utils/unittest/unittest.go 0.00% 27 Missing ⚠️
storage/operation/badgerimpl/reader.go 0.00% 20 Missing ⚠️
storage/operation/codec.go 27.27% 16 Missing ⚠️
storage/operation/pebbleimpl/reader.go 0.00% 14 Missing ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6465      +/-   ##
==========================================
- Coverage   41.19%   41.16%   -0.04%     
==========================================
  Files        2052     2064      +12     
  Lines      182191   182649     +458     
==========================================
+ Hits        75062    75183     +121     
- Misses     100839   101156     +317     
- Partials     6290     6310      +20     
Flag Coverage Δ
unittests 41.16% <28.60%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown.


@@ -0,0 +1,189 @@
package operation_test
zhangchiqing (Member Author) commented:

Test cases are taken from similar ones in storage/badger/operation/common_test.go

@@ -0,0 +1,278 @@
package operation_test
zhangchiqing (Member Author) commented:

Test cases are taken from similar ones in storage/badger/operation/common_test.go

Comment on lines 74 to 78
keyCopy := make([]byte, len(key))
copy(keyCopy, key)
zhangchiqing (Member Author) commented:

Copying the key is for safety: otherwise the caller might mistakenly retain the key slice, which causes problems because this iteration can later overwrite the underlying buffer. Making a copy of the key prevents the caller from making that mistake.
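A minimal sketch of the failure mode this guards against, assuming Badger v2's iterator API (the helper name is illustrative, not code from this PR):

package example

import "github.com/dgraph-io/badger/v2"

// collectKeys shows why the copy matters: Badger reuses the buffer returned by
// Item().Key() as the iterator advances, so retaining that slice directly would
// leave every collected element aliasing the same, later-overwritten memory.
func collectKeys(it *badger.Iterator) [][]byte {
	var keys [][]byte
	for it.Rewind(); it.Valid(); it.Next() {
		key := it.Item().Key()
		keyCopy := make([]byte, len(key)) // same pattern as in reads.go
		copy(keyCopy, key)
		keys = append(keys, keyCopy)
	}
	return keys
}

Badger also exposes item.KeyCopy(nil) for the same purpose.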

tx := db.NewTransaction(false)
iter := tx.NewIterator(options)

lowerBound, upperBound := storage.StartEndPrefixToLowerUpperBound(startPrefix, endPrefix)
zhangchiqing (Member Author) commented:

Note: we apply the same approach as Pebble to compute the exclusive upperBound here. This allows us to get rid of the global max value stored in the database (see storage/badger/operation/max.go).
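For illustration, the usual prefix-successor technique for deriving an exclusive upper bound looks roughly like this (a sketch only; the actual logic lives in storage.StartEndPrefixToLowerUpperBound and may differ in details):

// The lower bound is simply startPrefix (inclusive). The upper bound is the
// smallest key strictly greater than every key having endPrefix as a prefix:
// increment the last byte that is not 0xff and truncate everything after it.
// If every byte is 0xff there is no finite exclusive upper bound; returning
// nil here would mean "iterate to the end of the keyspace".
func prefixUpperBound(endPrefix []byte) []byte {
	out := make([]byte, len(endPrefix))
	copy(out, endPrefix)
	for i := len(out) - 1; i >= 0; i-- {
		if out[i] != 0xff {
			out[i]++
			return out[:i+1]
		}
	}
	return nil
}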

Comment on lines +22 to +36
keys := [][]byte{
// before start -> not included in range
{0x09, 0xff},
// within the start prefix -> included in range
{0x10, 0x00},
{0x10, 0xff},
// between start and end -> included in range
{0x15, 0x00},
{0x1A, 0xff},
// within the end prefix -> included in range
{0x20, 0x00},
{0x20, 0xff},
// after end -> not included in range
{0x21, 0x00},
}
zhangchiqing (Member Author) commented:

The same test cases are used here to exercise the boundary conditions.

import (
// #nosec
zhangchiqing (Member Author) commented:

Fixes a linter error.

}

func (b *ReaderBatchWriter) Commit() error {
err := b.batch.Commit(pebble.Sync)
@zhangchiqing (Member Author) commented Sep 16, 2024:

With the database operations being universal, it's easy to write benchmarks comparing the performance of the Badger and Pebble implementations.

The read performance of Badger and Pebble is very similar.

BenchmarkRetrieve/BadgerStorage-10               1217998               948.1 ns/op
BenchmarkRetrieve/PebbleStorage-10               1699320               725.4 ns/op

However, for writes, the benchmark shows Pebble is about 100x slower than Badger.

Initially I ran:

cd storage/operation

go test -bench=BenchmarkUpsert
goos: darwin
goarch: arm64
pkg: github.com/onflow/flow-go/storage/operation
BenchmarkUpsert/BadgerStorage-10                   31804             35173 ns/op
BenchmarkUpsert/PebbleStorage-10                     270           4267359 ns/op
PASS
ok      github.com/onflow/flow-go/storage/operation     4.886s

I tracked it down to this Commit call by adding the following timing code.

    n1 := time.Now()
    err := b.batch.Commit(pebble.Sync)
    n2 := time.Now()
    fmt.Println("pebbleimpl.Writer.go: Commit() time:", n2.Sub(n1).Nanoseconds())

Then I did the same for Badger to record the duration of its Commit call and ran the TestReadWrite test case, which writes a single entity to the database. The results show Pebble is ~25x slower than Badger for writes (read performance is similar). I'm not sure why.

cd storage/operation
gt -run=TestReadWrite
badgerimpl.Writer.go: Commit() time: 147250
badgerimpl.Writer.go: Commit() time: 71458
badgerimpl.Writer.go: Commit() time: 72708
pebbleimpl.Writer.go: Commit() time: 3684875
pebbleimpl.Writer.go: Commit() time: 3006416
pebbleimpl.Writer.go: Commit() time: 3001083
badgerimpl.Writer.go: Commit() time: 6500
pebbleimpl.Writer.go: Commit() time: 584
PASS
ok      github.com/onflow/flow-go/storage/operation     1.136s
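For reference, the Badger-side instrumentation was analogous; a sketch, assuming badgerimpl commits by flushing a *badger.WriteBatch (the field name is taken from the Pebble snippet above and may differ from the actual writer.go):

    n1 := time.Now()
    err := b.batch.Flush() // assuming a *badger.WriteBatch underneath
    n2 := time.Now()
    fmt.Println("badgerimpl.Writer.go: Commit() time:", n2.Sub(n1).Nanoseconds())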

Member commented:

Thanks for making the benchmarks and finding this regression @zhangchiqing 🙇

100x slower writes is a big difference! 😬

I think it would be worthwhile to spend some time better understanding this finding and investigating how it would impact Flow (for example, the consensus hot-path). My gut feeling is that this benchmark in isolation is overstating the actual impact we would see. If it isn't, and switching to Pebble is going to make all our writes 2 orders of magnitude slower, I'd rather know that now.

  • Are there Pebble configs we can change to bring the write latency down?
  • Does this result hold if aspects of the environment are changed (e.g., maybe it's an artifact of the test database being very small? Just guessing)?
  • Does this result hold if aspects of the load are changed (e.g., committing a batch containing many writes)?

I spent a short time during the review trying a few different iterations on BenchmarkUpsert:

  • Setting WithKeepL0InMemory to false (previously was true).
  • Setting WithValueThreshold to 1 byte (previously was 1kb).
    • Badger is different from Pebble in that it will sometimes store values directly in the LSM tree, and other times in a value log file, based on the value's size. All the values we were storing were below that 1kb threshold.
  • Using larger 10kb values

Unfortunately, all of these had similar results where Pebble was still ~100x slower...
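For context, the option tweaks described above can be applied roughly like this (a sketch against Badger v2's options API; the function name and directory are illustrative):

package example

import "github.com/dgraph-io/badger/v2"

func openTunedBadger(dir string) (*badger.DB, error) {
	opts := badger.DefaultOptions(dir).
		WithKeepL0InMemory(false). // previously true in our setup
		WithValueThreshold(1)      // push (almost) all values into the value log instead of the LSM tree
	return badger.Open(opts)
}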

zhangchiqing (Member Author) commented:

Yeah, this might take some time to investigate. I will do it in a separate issue. But it should not block this PR from going in, since it also lets us move from Badger transactions to Badger batch updates.

Regarding the regression, Peter suspects it might have to do with key compression and decompression, since he did some profiling in the past and saw a lot of CPU cycles spent there. So it might be worth trying to disable compression in Pebble and see if it makes a difference.
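If someone picks this up, disabling Pebble's per-level block compression would look roughly like this (a sketch assuming the Pebble v1 Options API, where each level's LevelOptions has a Compression field; the function name and directory are illustrative):

package example

import "github.com/cockroachdb/pebble"

func openUncompressedPebble(dir string) (*pebble.DB, error) {
	opts := &pebble.Options{}
	for i := range opts.Levels {
		opts.Levels[i].Compression = pebble.NoCompression // default is Snappy
	}
	return pebble.Open(dir, opts)
}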

(Resolved review threads: storage/batch.go, utils/unittest/unittest.go, storage/operations.go, storage/operation/reads_test.go)

func TestTraverse(t *testing.T) {
dbtest.RunWithStorages(t, func(t *testing.T, r storage.Reader, withWriter dbtest.WithWriter) {
keys := [][]byte{
Member commented:

This would probably be easier to work with as a map from key -> value. If we define the key type as [2]byte, and then do key[:] when inserting it, that would work.

Feel free to skip this suggestion.
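A sketch of that suggestion (values are illustrative): keep the fixture as a map keyed by [2]byte and slice with key[:] at the point of insertion.

keyVals := map[[2]byte]uint64{
	{0x41, 0x00}: 1,
	{0x41, 0x01}: 2,
	{0x42, 0x00}: 3,
}
keys := make([][]byte, 0, len(keyVals))
for key := range keyVals {
	k := key                  // copy the array before slicing it
	keys = append(keys, k[:]) // [2]byte -> []byte when inserting
}
// keys can then be written via the test's writer helper, and the traversal
// result asserted against keyVals.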

(Resolved review threads: storage/operation/writes_test.go)
@jordanschalm (Member) left a comment:

This looks great. I like the way you structured testing so we can easily test both database backends together 💯

Besides the suggestions above, could you:

  • Make sure exported types have at least a basic godoc (copying the interface documentation to structs that implement it works). I added suggestions about this in a few places, but not all.
  • Make sure all public functions that return errors document any expected error types, or that "no errors are expected during normal operation".
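For example, the error-documentation convention could look like this on a hypothetical, simplified operation (toy code, not from this PR):

package example

import "errors"

// ErrNotFound mirrors storage.ErrNotFound: returned when no value is stored
// under the requested key.
var ErrNotFound = errors.New("key not found")

// get returns the value stored under key in the given store.
//
// Expected errors during normal operation:
//   - ErrNotFound if no value is stored under the given key
func get(store map[string][]byte, key string) ([]byte, error) {
	val, ok := store[key]
	if !ok {
		return nil, ErrNotFound
	}
	return val, nil
}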

@zhangchiqing force-pushed the leo/db-ops branch 4 times, most recently from 284750a to 93fcaa1 (November 1, 2024, 21:45)
@jordanschalm (Member) left a comment:

🎸

}

// Writer is an interface for batch writing to a storage backend.
// It cannot be used concurrently for writing.
Member commented:

Suggested change
// It cannot be used concurrently for writing.
// One Writer instance cannot be used concurrently by multiple goroutines.
// However, many goroutines can write concurrently, each using their own Writer instance.
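To illustrate the suggested wording (a sketch only; newBatchWriter and the Set method are hypothetical stand-ins for whatever the final API exposes):

var wg sync.WaitGroup
for i := 0; i < 4; i++ {
	wg.Add(1)
	go func(i byte) {
		defer wg.Done()
		w := newBatchWriter(db) // hypothetical: each goroutine creates its own Writer
		defer w.Commit()
		_ = w.Set([]byte{i}, []byte("value")) // hypothetical Writer method
	}(byte(i))
}
wg.Wait()
// Sharing a single Writer instance between the goroutines above would be incorrect.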
