feat(store): correct corrupted files on write #3859

Open
wants to merge 6 commits into
base: main
from

Conversation

walldiss
Member

This PR adds corruption detection to the store. Corrupted files are detected by checking whether their length matches the expected file size. There is no content inspection for file verification yet, but it can be added by computing a datahash. Added support for:

  • ODSQ4 file
  • ODS file

TODO: store level tests are blocked by #3847

codecov-commenter commented Oct 17, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 48 lines in your changes missing coverage. Please review.

Project coverage is 45.07%. Comparing base (2469e7a) to head (5eed744).
Report is 344 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| store/store.go | 41.46% | 22 Missing and 2 partials ⚠️ |
| store/file/ods.go | 65.38% | 6 Missing and 3 partials ⚠️ |
| store/file/q4.go | 52.63% | 6 Missing and 3 partials ⚠️ |
| store/file/ods_q4.go | 40.00% | 4 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3859      +/-   ##
==========================================
+ Coverage   44.83%   45.07%   +0.23%     
==========================================
  Files         265      314      +49     
  Lines       14620    21895    +7275     
==========================================
+ Hits         6555     9869    +3314     
- Misses       7313    10991    +3678     
- Partials      752     1035     +283     

store/store.go Outdated
Comment on lines 186 to 196
// Validate the size of the file to ensure it's not corrupted
err = file.CheckODSQ4Size(pathODS, pathQ4, square)
if err != nil {
	err = s.removeODSQ4(height, roots.Hash())
	if err != nil {
		return false, fmt.Errorf("removing corrupted ODSQ4 file: %w", err)
	}
	err = file.CreateODSQ4(pathODS, pathQ4, roots, square)
	if err != nil {
		return false, fmt.Errorf("recreating ODSQ4 file: %w", err)
	}
Member

Readability nit: extract into a validateAnRecoverX method

Member Author

I think it is much easier to understand what is going on here without hiding the details in a smaller func. The function calls are self-explanatory here, and validateAnRecoverX is not going to be reused anywhere.

Member

@Wondertan Wondertan Oct 18, 2024

The recovery details involve layers of nesting and edge-case logic unrelated to the createX method. It would be more readable for a novice to read only what create does and jump to the recovery logic only if needed.

Member

Essentially, recovery logic is a separate operation and thus needs a separate method.

store/file/ods.go Outdated
}

shares := filledSharesAmount(eds)
expectedSize := ods.hdr.OffsetWithRoots() + shares*ods.hdr.ShareSize()
Member

Shouldn't we verify the header as well instead of trusting it? What if the header is also corrupted?

Member Author

It will fail to read the header in the check and return an error (the same result as for a corrupted file).

Member

@Wondertan Wondertan Oct 17, 2024

Corrupted doesn't necessarily mean malformed. It can be read, but it has a wrong value.

Member Author

@walldiss walldiss Oct 17, 2024

If the header has corrupted values, it is very unlikely that it will exactly match the expected file size, so it will fail the check. Keep in mind this is only a size check, not a full content inspection. For a full content integrity check, it would be necessary to verify the header values as well as the content of the roots and data.
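
To make the gap between a size check and a content check concrete, here is a small sketch. It uses a plain SHA-256 digest purely as a stand-in; it is not the datahash mentioned in the PR description, and `contentMatches` is a hypothetical helper.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// contentMatches reports whether the on-disk bytes hash to the same digest as
// the bytes the writer intended to store. SHA-256 is a stand-in here; it is
// not the datahash mentioned in the PR description.
func contentMatches(onDisk, intended []byte) bool {
	got := sha256.Sum256(onDisk)
	want := sha256.Sum256(intended)
	return got == want
}

func main() {
	intended := []byte("header|roots|shares")
	onDisk := append([]byte(nil), intended...)
	onDisk[3] ^= 0xFF // flip bits in one byte; the length is unchanged

	fmt.Println(len(onDisk) == len(intended))     // true: a size check passes
	fmt.Println(contentMatches(onDisk, intended)) // false: the digest differs
}
```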

Member

> If the header has corrupted values it is very unlikely it will match exactly expected file size

This is what concerns me: we don't know whether it will match or not. I don't think trusting the header is valid when we don't trust the file. We could simply get the file size from the EDS, couldn't we?

Member Author

Share size is trickier than length, but I think I have a workaround: we can use the size of the first share, so we don't have to trust the header on disk at all. Will update the code.

Labels
area:storage kind:feat Attached to feature PRs
3 participants