-
Notifications
You must be signed in to change notification settings - Fork 286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: separate out PR tests and run fuzzing and race in nightlies #2673
Conversation
In order for this to work we will need to disable the required tests as they no longer exist @evan-forbes (and require the new tests) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It breaks up our test suite into 4 parallel test jobs.
Why? In my experience, running 4 parallel jobs isn't faster than running 1 sequential job. See #1439. IMO 1 sequential job is easier to monitor than 4 parallel jobs.
It doesn't replicate the work by removing test-short and test-coverage, combining them into a single job.
It looks like make test-short
got dropped from this PR and I think we should retain it so that CI can inform us if someone adds a long running test (i.e. 10+ mins) and forgets to add it to the list of tests skipped in short mode.
Ref: #2361
This PR also moves our race tests to the nightlies
Why? make test-race
doesn't appear to be a bottleneck. It takes ~8m to run and test takes ~17m to run. Not blocking on it but it seems like we can run make test-race
on every PR b/c make test
takes longer and they run in parallel.
Ref: https://github.com/celestiaorg/celestia-app/actions/runs/6499346269
and adds fuzzing to the nightlies
👍
I think the advantage is more that you only have to rerun the test split that failed instead of rerunning everything (I think you're right that it doesn't allocate more resources and most tests run in parallel anyway). There actually might be a way where we can cache the results so rerunning doesn't rerun the tests that already passed (I actually don't know how this works)
I don't think there's a need to run
Yes I guess the majority of the time of tests is taken is integration tests. Maybe I can just add this back. The reason for removing it is because race detection for a single threaded application is useless. There is only one place in the entire repo that we spin up a go routine and that is the txsim. At a high level, I'm just trying to reduce the time our c.i. takes. It would be nice to get everything down to about 5mins. Open to other suggestions |
Ahh thanks for pointing out the re-run thing for when test fails. I agree we have a problem with needing to re-run the test task frequently. IMO the problem is caused by frequent test flakes. Proposals that come to mind:
Part of my hesitation is that running the tests in parallel on CI is more difficult to monitor and debug when a test fails. Additionally parallel tests in CI are less representative of how tests are run by contributors during development. So I'm concerned of edge cases where
Sorry I don't follow what you mean by integration tests that needed to be skipped... I'm repeating myself but I do think there's a reason to run Example
Makes sense. Down to move |
cc @MSevey fwiw the only thing maintainers can do is to require merge/squash/rebase, otherwise its identical to write permissions |
@rootulp has admin rights currently from doing the binary signing work. So he can resolve 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong preference here, so I will defer to the author and second reviewer
we should definitely take the time to fix our tests and refactor them so that the tests that actually need to use comet can but in a robust way and the tests that don't need it don't use it
ensuring that all of the tests always pass is clearly a difficult thing to maintain, so having the ability to quickly rerun only portions of the tests is also useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LMK if you want me to disable required CI checks in order to merge this. I'm personally not a fan of split test files (for reasons mentioned above) but I won't block on it if other maintainers prefer it.
Before we merge this, it may make sense to verify that all the new CI jobs pass on a fork
go-version: ${{ env.GO_VERSION }} | ||
|
||
- name: Generate coverage.txt | ||
run: make test-coverage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] if we're removing usage of make test-coverage
should we remove the command from the Makefile so that we don't have to maintain it?
@@ -26,3 +27,26 @@ jobs: | |||
|
|||
- name: Run e2e tests | |||
run: make test-e2e | |||
|
|||
race-tests: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit][not blocking] so maintainers can easily link the Makefile command to the CI task. Without consistent naming, it's easy to get lost translating between CI job and Makefile command
race-tests: | |
test-race: |
- name: Run tests in race mode | ||
run: make test-race | ||
|
||
fuzz_tests: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[new issue] it may be helpful to add a Makefile command that runs go test -fuzz
with similar params to this CI job so that devs can attempt to reproduce an error locally if one is surfaced on CI
Closing this in favour of #2787 which just focuses on fuzzing |
This PR attempts to improve our CI. It breaks up our test suite into 4 parallel test jobs. It doesn't replicate the work by removing
test-short
andtest-coverage
, combining them into a single job.This PR also moves our race tests to the nightlies and adds fuzzing to the nightlies