
[config change] use MessageCommitMode when executing future head block messages #2705

Merged 39 commits from fix-run-mode into master on Oct 9, 2024

Conversation

@magicxyyz (Contributor) commented Sep 26, 2024

Fixes NIT-2812
Pulls: OffchainLabs/go-ethereum#362
Includes: #2712

This PR:

  • Fixes the use of MessageRunMode values so that MessageCommitMode is used when, and only when, the message is part of a soon-to-be head block. Previously, newly sequenced / synced messages were executed in MessageReplayMode, so newly activated or set-cached Stylus programs were not added to the long term cache (only to the LRU cache); see the sketch after this list.

  • Improves repopulation of the long term cache after a node restart: if a program is marked onchain as cached and its wasm is found in the LRU cache, it is also added to the long term cache. This can happen, e.g., when an ephemeral call to a cached program precedes its onchain execution.

  • Adds tests for the Stylus long term cache and for repopulating the long term cache from the LRU cache.

  • Adds metrics for the Stylus long term cache (merged from Diego's draft: Stylus cache improvements #2712).

  • Adds a config option to disable collection of Stylus metrics on the Go side (also from Diego's draft).
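For illustration, a minimal Rust sketch of the intended split between the two caches. All type, field, and function names here are placeholders (only the long term / LRU distinction and the long_term_tag == 1 convention discussed later in this thread come from the PR); this is a hedged sketch, not the actual InitCache code.

```rust
use std::collections::HashMap;
use lru::LruCache;

// Placeholder types: the real cache entry holds a compiled module, an engine,
// and a size estimate, but the exact shapes are not shown in this PR.
type CacheKey = [u8; 32];

#[derive(Clone)]
struct CacheItem {
    module: Vec<u8>,
    entry_size_estimate_bytes: usize,
}

struct CacheState {
    long_term: HashMap<CacheKey, CacheItem>,
    lru: LruCache<CacheKey, CacheItem>,
}

impl CacheState {
    // Commit-mode executions (long_term_tag == 1, i.e. the message is part of
    // a soon-to-be head block) keep the program in the long term cache;
    // replay-mode and ephemeral executions only populate the LRU cache.
    fn insert_item(&mut self, key: CacheKey, item: CacheItem, long_term_tag: u32) {
        if long_term_tag == 1 {
            self.long_term.insert(key, item);
        } else {
            self.lru.put(key, item);
        }
    }
}
```

In this scheme, executing a message in commit mode is what keeps a program in the long term cache, which is exactly the behavior the previous run-mode handling missed.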

@cla-bot (bot) added the s label (automatically added by the CLA bot if the creator of a PR is registered as having signed the CLA) Sep 26, 2024
@diegoximenes (Contributor) left a comment


LGTM overall; requesting changes because a test for this fix is missing.

system_tests/state_fuzz_test.go (resolved)
execution/gethexec/executionengine.go (resolved)
@diegoximenes (Contributor) commented:
I created a draft PR that exposes Stylus long term cache metrics, which can be helpful when implementing tests in this PR.

I didn't implement tests in my PR since they require long term caching to work properly, which is not the case on the master branch 😬

If you want to use what I developed, you can merge my PR into your branch and implement the tests on top of it.


// See if the item is in the long term cache
if let Some(item) = cache.long_term.get(&key) {
    return Some(item.data());
}

// See if the item is in the LRU cache, promoting if so
if let Some(item) = cache.lru.get(&key) {
    let data = item.data();
    if let Some(item) = cache.lru.peek(&key).cloned() {
Contributor:


This codepath clones the data twice: once here in the "get", and again when returning item.data().
Cloning entry_size_estimate_bytes is OK, but we don't want to clone module and engine unnecessarily.
This is where Rust gets you :)
There are probably solutions that would avoid cloning the result of the peek, but I think the simplest would be to avoid the clone in item.data(), since item itself is discarded right after.

Contributor Author (magicxyyz):


fixed, thanks!

Contributor Author (magicxyyz):


Tsahi pointed out that there's one more unnecessary clone, so this isn't fixed yet; working on it :)

Contributor Author (magicxyyz):


I passed item.module and item.engine to the returned Option without cloning; let me know if that checks out :)
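For readers following along, a hedged sketch of what returning module and engine by move (instead of cloning them in item.data()) might look like. The CacheItem / CacheState shapes and the promote_and_get name are placeholders inferred from this thread, and keeping one clone for the long term map is an assumption, not necessarily what the PR does.

```rust
use std::collections::HashMap;
use lru::LruCache;

type CacheKey = [u8; 32];

#[derive(Clone)]
struct CacheItem {
    module: Vec<u8>, // stand-in for the compiled module
    engine: Vec<u8>, // stand-in for the engine
}

struct CacheState {
    long_term: HashMap<CacheKey, CacheItem>,
    lru: LruCache<CacheKey, CacheItem>,
}

impl CacheState {
    // Promote an LRU entry into the long term cache and hand its module and
    // engine back to the caller by move: the peeked entry is cloned once, a
    // copy is kept in the map, and no extra clone happens on return.
    fn promote_and_get(&mut self, key: CacheKey) -> Option<(Vec<u8>, Vec<u8>)> {
        let item = self.lru.peek(&key).cloned()?;
        self.long_term.insert(key, item.clone());
        Some((item.module, item.engine))
    }
}
```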

@magicxyyz magicxyyz changed the title use MessageCommitMode when executing future head block messages [config change] use MessageCommitMode when executing future head block messages Oct 3, 2024
@magicxyyz magicxyyz marked this pull request as draft October 4, 2024 10:45
@magicxyyz magicxyyz marked this pull request as ready for review October 4, 2024 11:29
@diegoximenes (Contributor) left a comment


Nice :)

arbos/programs/native.go (outdated, resolved)
system_tests/common_test.go (outdated, resolved)
arbitrator/stylus/src/lib.rs (outdated, resolved)
system_tests/program_test.go (outdated, resolved)
system_tests/program_test.go (resolved)
system_tests/program_test.go (resolved)
@diegoximenes previously approved these changes Oct 8, 2024
@tsahee (Contributor) left a comment


Seems good. I still need to review program_test.go.

I will create an issue to fix the remaining problem in caching.

}
cache.long_term_counters.misses += 1;
Contributor:


I think this should only be increased if long_term_tag is 1, because that would mean "this should be in the long term cache".

Contributor Author (magicxyyz):


Good point, that will filter out the API-call noise and we should be able to observe how many misses there are when a node starts up. Changed :)
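For reference, a minimal sketch of the change described here, assuming the same long_term_tag == 1 convention; the counter struct and function names are illustrative placeholders, not the actual stylus code.

```rust
// Placeholder counter struct; the real metrics live inside the stylus cache.
struct LongTermCounters {
    misses: u64,
}

// Only count a long term cache miss when the lookup was supposed to hit the
// long term cache (long_term_tag == 1), so ephemeral API calls do not add
// noise to the metric.
fn record_long_term_miss(counters: &mut LongTermCounters, long_term_tag: u32) {
    if long_term_tag == 1 {
        counters.misses += 1;
    }
}
```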

pub extern "C" fn stylus_get_lru_cache_metrics() -> LruCacheMetrics {
InitCache::get_lru_metrics()
pub extern "C" fn stylus_get_cache_metrics() -> CacheMetrics {
InitCache::get_metrics()
Contributor:


@diegoximenes
This is a bug in the previous PR as well.
You're allocating memory here in Rust and returning the pointer to Go.
Go discards it, because it's a garbage-collected language, and the memory is never released.
The solution is to allocate CacheMetrics in Go, pass a pointer, and let Rust update the data in the struct. That way Rust doesn't allocate anything new and no memory is lost.
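For context, one hedged way the suggested out-parameter pattern could look. stylus_fill_cache_metrics and the metrics fields shown here are hypothetical placeholders (only CacheMetrics and InitCache::get_metrics appear in the snippet above); this is a sketch of the pattern, not the actual fix.

```rust
// Placeholder metrics struct; the real CacheMetrics is defined in the stylus
// crate and its exact fields are not shown in this thread.
#[repr(C)]
pub struct CacheMetrics {
    pub long_term_hits: u64,
    pub long_term_misses: u64,
}

// Stand-in for InitCache::get_metrics().
fn current_metrics() -> CacheMetrics {
    CacheMetrics { long_term_hits: 0, long_term_misses: 0 }
}

// Out-parameter pattern: Go allocates the CacheMetrics value (e.g. via cgo),
// passes a pointer, and Rust only writes into that caller-owned memory, so no
// Rust-side allocation is leaked across the FFI boundary.
#[no_mangle]
pub unsafe extern "C" fn stylus_fill_cache_metrics(out: *mut CacheMetrics) {
    if out.is_null() {
        return;
    }
    unsafe { *out = current_metrics() };
}
```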

Contributor:


I'll open a separate issue.

@tsahee tsahee enabled auto-merge October 9, 2024 20:57
@tsahee tsahee merged commit 32c3f4b into master Oct 9, 2024
16 checks passed
@tsahee tsahee deleted the fix-run-mode branch October 9, 2024 21:31
Labels: design-approved, s

4 participants