[Move/Example] Run tests in parallel #18802

amnn · 2024-07-25T17:23:27Z

Description

Run the Move example tests in parallel, so that they don't time out and ruin everyone's day.

Test plan

sui-framework-test$ cargo nextest run -- run_examples_move_unit_tests

Locally, the tests used to run in 180 seconds; after this change, they take about 30 seconds.

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.


vercel · 2024-07-25T17:23:31Z


Name	Status	Preview	Comments	Updated (UTC)
sui-docs	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jul 25, 2024 5:26pm

3 Skipped Deployments

Name	Status	Updated (UTC)
multisig-toolkit	⬜️ Ignored (Inspect)	Jul 25, 2024 5:26pm
sui-kiosk	⬜️ Ignored (Inspect)	Jul 25, 2024 5:26pm
sui-typescript-docs	⬜️ Ignored (Inspect)	Jul 25, 2024 5:26pm

amnn · 2024-07-25T17:27:00Z

crates/sui-framework-tests/src/unit_tests.rs

+
+    futures::future::join_all(move_packages.into_iter().map(|p| {
+        tokio::task::spawn(async move {
+            check_package_builds(&p);


I played around with splitting this up, but it only shaved about a second off the run time, so I thought it would be clearer to do it as one step.

I don't see any async here. Why use this over spawning threads?

In fact, this is an anti-pattern: executing synchronous work in an async thread pool.

Indeed, there is nothing async here. I ended up implementing it this way because it seemed like the easiest way to take advantage of a multi-threaded pool (to limit the number of threads being spawned). If you can point me to a better way of doing that, I'll do that instead.
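For reference, limiting the number of threads does not strictly need a runtime. The following is a minimal std-only sketch, not the code from this PR: `run_bounded` and `check_package` are hypothetical names, and `check_package` is only a placeholder for the real per-package build check.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for the real per-package check; here it only
/// reports whether the package name is non-empty.
fn check_package(name: &str) -> bool {
    !name.is_empty()
}

/// Runs `check` over every item using at most `max_threads` OS threads.
/// Each worker claims the next unprocessed index from a shared counter,
/// so no more than `max_threads` checks run at once.
/// Results are returned in input order.
fn run_bounded<T, R, F>(items: &[T], max_threads: usize, check: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<R>>> =
        Mutex::new((0..items.len()).map(|_| None).collect());
    // Never start more workers than there are items, and always at least one.
    let workers = max_threads.clamp(1, items.len().max(1));

    // Scoped threads may borrow `items`, `check`, and `results` directly;
    // the scope joins every worker before it returns.
    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= items.len() {
                    break;
                }
                let r = check(&items[i]);
                results.lock().unwrap()[i] = Some(r);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|r| r.expect("every index is claimed exactly once"))
        .collect()
}
```

Under these assumptions, `run_bounded(&packages, 8, |p| check_package(p))` would run at most eight checks at a time. A panic in any check propagates out of `thread::scope`, which is usually what a test wants.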

https://docs.rs/rayon/latest/rayon/iter/index.html

Does this work?

I'm just dogscience-ing my way through life here, so I can be convinced that this is the right thing to do. My impression was that rayon was primarily intended to introduce parallelism into functional stream-based programs. It looks like it has a thread pool library, and we could use that, but I can't really make out what the moral difference is between abusing rayon for its thread pool implementation vs abusing tokio.

I'll admit that in this exact instance it's not a huge issue to abuse tokio like this (because nothing else is sharing this particular runtime), but it is a very bad anti-pattern, and if it was done in production code it could cause bad things to happen.

You can essentially break computation down into two camps: I/O-bound (async) and CPU-bound (sync) work. For I/O-bound tasks, the idea is that lots of independent tasks can operate concurrently on a single thread (or multiple threads), relinquishing control of execution back to the scheduler when the task needs to wait for some I/O operation to happen. CPU-bound tasks generally monopolize the thread they are scheduled on until the entire task is complete.

rayon is specifically designed to handle and schedule CPU-bound work, while tokio actually has two thread pools: a blocking pool for scheduling CPU-bound work and an async pool (where tasks go when spawned with `tokio::spawn`) for async/I/O-bound tasks.

If you schedule blocking/cpu bound work on an async pool then other async tasks could get stuck behind them waiting for a very long time till they're able to make any forward progress.
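The effect can be modelled without any runtime. The sketch below is only an analogy under stated assumptions, not tokio's scheduler: a single worker thread stands in for one executor thread, the queue is filled before the worker starts, and `queue_delays` and `Job` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// A unit of work that runs to completion on the worker thread.
type Job = Box<dyn FnOnce() + Send>;

/// Queues `jobs` in order, then drains them on a single worker thread.
/// Returns, for each job, how long it sat in the queue before it started.
fn queue_delays(jobs: Vec<Job>) -> Vec<Duration> {
    let (tx, rx) = mpsc::channel::<(Instant, Job)>();
    for job in jobs {
        tx.send((Instant::now(), job)).unwrap();
    }
    drop(tx); // closing the channel lets the worker finish once it is drained

    thread::spawn(move || {
        rx.into_iter()
            .map(|(queued_at, job)| {
                let waited = queued_at.elapsed();
                job(); // a blocking job holds the thread until it returns
                waited
            })
            .collect::<Vec<Duration>>()
    })
    .join()
    .unwrap()
}
```

If the first job sleeps for 50 ms and the second does nothing, the second job's reported wait is at least 50 ms, even though its own work is instantaneous. That is the "stuck behind" behaviour in miniature.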

All of this aside, did you take a look at using datatest-stable, as we do for a number of other file-based tests? Each Move example would then be its own test, with little to no extra effort from those working on or adding examples.
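The file-based idea boils down to discovering test cases from the directory tree instead of listing them in code. The following std-only sketch illustrates just that discovery step; it is not datatest-stable's API, `find_move_packages` is a hypothetical helper, and it assumes a package is any directory that contains a `Move.toml` manifest.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every directory under `root` that contains a
/// `Move.toml` manifest. Each returned path is one candidate test case.
/// Symlinked directories are followed, so this assumes the tree has no
/// symlink cycles.
fn find_move_packages(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        if dir.join("Move.toml").is_file() {
            found.push(dir.clone());
        }
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            }
        }
    }

    // Sort so the order of test cases does not depend on the filesystem.
    found.sort();
    Ok(found)
}
```

With discovery separated like this, adding a new example directory adds a new test case without touching the test code, which is the property being suggested here.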

I think datatest-stable would be a great fit here. I will put up a separate PR for that, but will keep this one around in case the timeouts do cause a problem: I haven't had to set up a fresh data test before, so I don't know how long it will take me, and I don't want people to be blocked if it takes a little while.

The reason I wasn't too hot on rayon in this case was that

- although the test is not async, it's also not CPU bound -- it's I/O bound but written to use blocking I/O,
- rayon really seems geared towards running a computation to produce a certain result, while these tests are purely running for their effects.

#18813 -- thanks for the suggestion @bmwill !

> although the test is not async, it's also not CPU bound -- it's I/O bound but written to use blocking I/O,

Yes, I would generally put "blocking I/O" in the same camp as CPU-bound tasks, as they block the current thread until the operation is complete rather than handing it off to something else that could make progress.

## Description

Use `datatest-stable` to find all the Move examples we might want to build and test, instead of stashing this away in a rust test.

## Test plan

```
sui$ cargo nextest run -p sui-framework-tests --test move_tests
```

+ CI

Closes #18802

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] REST API:

[Move/Example] Run tests in parallel

1bd2379


amnn requested review from bmwill and a team July 25, 2024 17:23

amnn self-assigned this Jul 25, 2024

vercel bot deployed to Preview – sui-docs July 25, 2024 17:26 View deployment


amnn mentioned this pull request Jul 26, 2024

[Move/Examples] Switch to datatest #18813

Merged

8 tasks

amnn closed this in #18813 Jul 26, 2024

amnn closed this in 3dd9ddd Jul 26, 2024

amnn deleted the amnn/fast-move-test branch August 4, 2024 13:12


amnn Jul 25, 2024

bmwill Jul 25, 2024

bmwill Jul 25, 2024

amnn Jul 25, 2024

tnowacki Jul 25, 2024

amnn Jul 25, 2024

bmwill Jul 26, 2024

amnn Jul 26, 2024

amnn Jul 26, 2024

bmwill Jul 26, 2024
