Combine domain payloads and provide on-the-fly migration #3664

fxamacker commented Oct 31, 2024

Closes #3584

Description

This PR:

  1. Combines all domain (non-atree) payloads into one account (non-atree) payload per account.
  2. Combines all domain (atree) storage maps into one account (atree) storage map per account.
  3. Uses on-the-fly (OTF) migration when accounts are modified (write ops), so this won't change idle or read-only accounts.

NOTE: Deploying this requires an HCU. The full impact (e.g. eliminating 425+ million mtrie nodes) won't be seen until all accounts (including idle accounts with no write ops) are eventually migrated.

Context

Currently, accounts store data on-chain under pre-defined domains, such as "storage". Each domain requires a domain payload (an 8-byte non-atree payload) and a domain storage map payload (an atree payload). In addition, each payload requires ~2 mtrie nodes (~2x96 bytes of overhead).
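To make the overhead concrete, here is a rough back-of-the-envelope sketch in Go. It uses only the approximate figures quoted in this description (~2 mtrie nodes per payload, ~96 bytes per node, 8-byte domain payloads, 150 million domain payloads); the constant and function names are illustrative and are not part of Cadence or flow-go.

```go
package main

import "fmt"

// Approximate figures quoted in the PR description; treat them as estimates.
const (
	nodesPerPayload   = 2  // "~2 mtrie nodes" per payload
	bytesPerNode      = 96 // "~96 byte" overhead per mtrie node
	domainPayloadSize = 8  // size in bytes of a non-atree domain payload
)

// overheadBytes returns the approximate mtrie overhead, in bytes,
// carried by n payloads under the assumptions above.
func overheadBytes(n uint64) uint64 {
	return n * nodesPerPayload * bytesPerNode
}

func main() {
	const domainPayloads = 150_000_000 // count reported for Sept. 4

	fmt.Printf("payload data:   %d bytes\n", uint64(domainPayloads)*domainPayloadSize)
	fmt.Printf("mtrie overhead: %d bytes\n", overheadBytes(domainPayloads))
}
```

Under these assumptions, each 8-byte domain payload carries roughly 192 bytes of trie overhead, which is why removing the payloads themselves matters more than shrinking them.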

New domains were added in Cadence 1.0, and the domain payload count increased to 150 million on Sept. 4 (up from 80 million pre-spork). Nearly 25% of all on-chain payloads are 8-byte domain payloads. Each account on mainnet has an average of ~4 domain payloads and ~4 domain storage maps.

Solution

This PR creates 1 account (non-atree) payload and 1 account (atree) storage map per account, eliminating all domain (non-atree) payloads and domain storage maps for that given account.
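The idea can be illustrated with a deliberately simplified Go model. This is a sketch under stated assumptions, not the code in this PR: the `ledger` type, its fields, and the `migrate`, `read`, and `write` methods are all hypothetical, the domain names other than "storage" are placeholders, and real atree storage maps are replaced by plain Go maps. It only shows the shape of the change: per-domain registers are folded into a single per-account record the first time an account is written, while reads leave legacy accounts untouched.

```go
package main

import "fmt"

// registerKey identifies a legacy per-domain register (hypothetical model).
type registerKey struct {
	address string
	domain  string
}

// ledger is a toy stand-in for on-chain state; it is not Cadence's API.
type ledger struct {
	domainRegs  map[registerKey][]byte       // legacy: one register per (account, domain)
	accountRegs map[string]map[string][]byte // new: one combined record per account
}

func newLedger() *ledger {
	return &ledger{
		domainRegs:  make(map[registerKey][]byte),
		accountRegs: make(map[string]map[string][]byte),
	}
}

// migrate folds every legacy domain register of address into one account record.
func (l *ledger) migrate(address string) {
	if _, done := l.accountRegs[address]; done {
		return
	}
	combined := make(map[string][]byte)
	for k, v := range l.domainRegs {
		if k.address == address {
			combined[k.domain] = v
			delete(l.domainRegs, k) // deleting during range is safe in Go
		}
	}
	l.accountRegs[address] = combined
}

// read never migrates: idle and read-only accounts keep their legacy layout.
func (l *ledger) read(address, domain string) ([]byte, bool) {
	if acct, ok := l.accountRegs[address]; ok {
		v, ok := acct[domain]
		return v, ok
	}
	v, ok := l.domainRegs[registerKey{address: address, domain: domain}]
	return v, ok
}

// write migrates the account on the fly before applying the change.
func (l *ledger) write(address, domain string, value []byte) {
	l.migrate(address)
	l.accountRegs[address][domain] = value
}

// registerCount is the total number of registers in either layout.
func (l *ledger) registerCount() int {
	return len(l.domainRegs) + len(l.accountRegs)
}

func main() {
	l := newLedger()
	for _, d := range []string{"storage", "public", "private"} { // placeholder domains
		l.domainRegs[registerKey{address: "0x1", domain: d}] = []byte(d)
	}
	fmt.Println("registers before write:", l.registerCount())
	l.read("0x1", "storage")
	fmt.Println("registers after read:  ", l.registerCount())
	l.write("0x1", "storage", []byte("updated"))
	fmt.Println("registers after write: ", l.registerCount())
}
```

In this toy model the three legacy registers of account `0x1` collapse into one record on the first write, and the read in between changes nothing.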

Based on preliminary estimates using Sept. 17, 2024 mainnet state, this approach can:

  • eliminate mtrie nodes: -425 million (-28.5%)
  • reduce payload count: -174 million (also -28.5%)
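As a rough sanity check on these figures, the short Go sketch below derives the totals that the two percentages imply. The inputs are the estimates quoted above; the derived totals are only what the arithmetic implies and are not separately measured numbers. The function name `impliedTotal` is illustrative.

```go
package main

import "fmt"

// impliedTotal returns the whole that a reduction of `removed` items
// represents when that reduction is `fraction` of the whole.
func impliedTotal(removed, fraction float64) float64 {
	return removed / fraction
}

func main() {
	const (
		nodesRemoved    = 425.0 // million mtrie nodes (estimate from the PR)
		payloadsRemoved = 174.0 // million payloads (estimate from the PR)
		fraction        = 0.285 // 28.5% reduction in both cases
	)

	fmt.Printf("implied total mtrie nodes: ~%.0f million\n", impliedTotal(nodesRemoved, fraction))
	fmt.Printf("implied total payloads:    ~%.0f million\n", impliedTotal(payloadsRemoved, fraction))
	fmt.Printf("nodes removed per payload: ~%.2f\n", nodesRemoved/payloadsRemoved)
}
```

The ratio of removed nodes to removed payloads comes out at a little over 2, which is in line with the "~2 mtrie nodes per payload" figure given in the Context section.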

This PR also includes on-the-fly migration, so accounts with write activity see the improvement without requiring downtime. As a result, the full benefit won't be visible until all accounts (including idle accounts) are eventually migrated (e.g. using a full migration or other means).


  • Targeted PR against master branch
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work
  • Code follows the standards mentioned here
  • Updated relevant documentation
  • Re-reviewed Files changed in the GitHub PR explorer
  • Added appropriate labels


Cadence Benchstat comparison

This branch was compared with the base branch onflow:master at commit a56e521.
The command `for i in {1..N}; do go test ./... -run=XXX -bench=. -benchmem -shuffle=on; done` was used.
Bench tests were run a total of 7 times on each branch.



fxamacker commented Nov 5, 2024

To test the impact of this PR, I created a full migration program yesterday and ran it on recent mainnet data (Nov 1, 2024 state).

The full migration program, run on a test VM, produced the expected results: it reduced the number of payloads and the number of mtrie nodes by 28.7% each. 🎉

The plan is to deploy this PR as an HCU and rely on its on-the-fly migration, so we won't see the full impact until all accounts are eventually migrated (including idle and read-only accounts).

I also ran the storage health check and diff-states after the migration on Nov 4, 2024; no problems were detected.

fxamacker commented

@turbolent and Cadence team, many thanks for code review session today!

Next steps (as discussed):

  • Create a feature branch and merge this PR into the feature branch (after approval)
  • Open new PRs to:
    • replace the domain key string with an integer in the account storage map (great idea @turbolent!)
    • refactor to simplify code in storage (no problems spotted but code is complex)
    • look into maybe using a feature flag (to eliminate the need for HCU)
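As a minimal sketch of the first follow-up idea (integer domain keys instead of strings), the Go snippet below shows one way such a mapping could look. It is an assumption for illustration only: the `domainID` type, the constants, their numbering, and the domain names other than "storage" are hypothetical and do not reflect what the follow-up PR will define.

```go
package main

import "fmt"

// domainID is a hypothetical compact key for a storage domain.
type domainID uint8

// Illustrative numbering only; a real mapping would need fixed, documented values.
const (
	domainStorage domainID = iota
	domainPublic
	domainPrivate
)

// domainNames maps each hypothetical ID back to its string form.
var domainNames = [...]string{
	domainStorage: "storage",
	domainPublic:  "public",
	domainPrivate: "private",
}

// String returns the domain name, or a placeholder for unknown IDs.
func (d domainID) String() string {
	if int(d) < len(domainNames) {
		return domainNames[d]
	}
	return fmt.Sprintf("domainID(%d)", uint8(d))
}

// parseDomain converts a domain name to its ID; ok is false if the name is unknown.
func parseDomain(name string) (id domainID, ok bool) {
	for i, n := range domainNames {
		if n == name {
			return domainID(i), true
		}
	}
	return 0, false
}

func main() {
	id, ok := parseDomain("storage")
	fmt.Println(id, uint8(id), ok)
}
```

A one-byte key like this is smaller and cheaper to compare than a string key, at the cost of needing a stable, agreed-upon numbering.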

I will also clean up and share the private gist I presented in the meeting.
