Add parallel processing mode #1
base: develop
I don't like the accumulation of the output; having no interactivity until the command stops feels highly problematic. Libraries that combine a WaitGroup with a limit on concurrent workers already exist, such as https://github.com/remeh/sizedwaitgroup, so I don't think we need to re-implement this.
Finally, I have some concerns about ordering. The current behavior has strict ordering: you see the ranges being scanned in the order of the defined range. Now the ordering is no longer specified and anything could happen.
So, I want to see:
- Output interactivity (at least at the level of a merged blocks bundle)
- Respected ordering of jobs
- No overly high memory usage
For concurrency and ordered output, we have a library, dhammer, that contains such a structure. It's something that could be used; see the example at https://github.com/streamingfast/dhammer/blob/develop/example_nailer_test.go (I just converted it to use Go generics, so it requires Go 1.18+).
I'm fine with another concurrency solution; there is no need to use dhammer.Nailer, as long as we keep ordering.
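To make the three requirements above concrete, here is a minimal sketch using only the standard library. It is not the dhammer.Nailer API and not the code in this PR; `blockRange`, `runOrdered`, `check`, and `emit` are hypothetical names chosen for illustration. Each job gets its own one-slot result channel, and those channels are queued in submission order, so the consumer can print each result as soon as it and all earlier jobs are done.

```go
package main

import "fmt"

// blockRange is a hypothetical job: one range of blocks to check.
type blockRange struct{ Start, Stop uint64 }

// runOrdered runs check on every range with at most `workers` jobs in
// flight, and calls emit with the results in the order of `ranges`.
func runOrdered(ranges []blockRange, workers int, check func(blockRange) string, emit func(string)) {
	if workers < 1 {
		workers = 1
	}
	// Queue of per-job result channels, in submission order. The queue
	// holds workers-1 entries and the consumer holds one more, so no more
	// than `workers` jobs are running or buffered at any time.
	pending := make(chan chan string, workers-1)

	go func() {
		defer close(pending)
		for _, r := range ranges {
			out := make(chan string, 1)
			pending <- out // blocks while the window is full
			go func(r blockRange) { out <- check(r) }(r)
		}
	}()

	// Results are emitted as soon as the oldest outstanding job finishes.
	for out := range pending {
		emit(<-out)
	}
}

func main() {
	ranges := []blockRange{{0, 99}, {100, 199}, {200, 299}}
	check := func(r blockRange) string {
		return fmt.Sprintf("range %d-%d checked", r.Start, r.Stop)
	}
	runOrdered(ranges, 2, check, func(s string) { fmt.Println(s) })
}
```

The trade-off is that a slow early range delays the output of later ranges that finished first, but memory stays bounded by the worker count rather than by the total number of ranges.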
@maoueh With the addition of the latest commit, I think I have resolved all the outstanding objections except for the output ordering. I am looking at the dhammer example, which has ONLY 1 output per thread, and it is not readily clear how a scenario with unlimited outputs per thread can be reordered without buffering. Please advise.
@cfrankb Yeah, you are right that buffering is still required; we cannot maintain ordering without buffering. So, I suggest we do the following. We disable parallel jobs when print details is set to print full blocks; it makes no sense there. The CLI PR would then refuse to run in parallel mode when the user is requesting to print full blocks (I think actually that we should remove the capability to print full block(s) from the check command altogether, it doesn't belong there). This means that actual output to write to
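The refusal described above could look roughly like the following guard, run before any worker is started. This is only a sketch: `PrintDetails`, `validateParallel`, and the error text are hypothetical, and the constant names are borrowed from the 'PrintStats' and 'PrintFull' modes mentioned later in this thread rather than taken from the actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// PrintDetails is a hypothetical enum for the level of output requested.
type PrintDetails int

const (
	PrintNoDetails PrintDetails = iota
	PrintStats                  // at most one line per block
	PrintFull                   // full block content, unbounded size
)

// validateParallel rejects the combination of parallel workers and
// full-block printing, since ordered full-block output would require
// buffering whole blocks in memory.
func validateParallel(workers int, details PrintDetails) error {
	if workers > 1 && details == PrintFull {
		return errors.New("printing full blocks is not supported in parallel mode")
	}
	return nil
}

func main() {
	if err := validateParallel(4, PrintFull); err != nil {
		fmt.Println("refusing to run:", err)
	}
}
```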
8eba728 to e6a77f3
e6a77f3 to a74e623
@maoueh I addressed the ordering in the latest commit. Please advise on additional tweaks needed.
I tried the current code in
I know that the 'PrintFull' was discussed as something that shouldn't be possible in parallel, but I feel that the 'PrintStats', containing at most one line per block, is not something too costly to buffer for reordering.
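As a rough illustration of why buffering stats lines is cheap, here is a sketch of a reorder buffer. It assumes each job reports its lines together with its position in the requested order; `jobOutput` and `reorder` are hypothetical names, not identifiers from this PR. Only the outputs of jobs that finished ahead of the next expected one are held in memory.

```go
package main

import "fmt"

// jobOutput is a hypothetical result of one job: its position in the
// requested order and its stats lines (at most one line per block).
type jobOutput struct {
	Index int
	Lines []string
}

// reorder reads outputs in completion order and emits their lines in job
// order, holding back only the jobs that finished early.
func reorder(in <-chan jobOutput, emit func(string)) {
	pending := map[int][]string{}
	next := 0 // index of the next job whose lines may be emitted
	for out := range in {
		pending[out.Index] = out.Lines
		// Flush every consecutive job that is now ready.
		for {
			lines, ok := pending[next]
			if !ok {
				break
			}
			for _, l := range lines {
				emit(l)
			}
			delete(pending, next)
			next++
		}
	}
}

func main() {
	in := make(chan jobOutput, 3)
	in <- jobOutput{Index: 1, Lines: []string{"stats for job 1"}}
	in <- jobOutput{Index: 2, Lines: []string{"stats for job 2"}}
	in <- jobOutput{Index: 0, Lines: []string{"stats for job 0"}}
	close(in)
	reorder(in, func(s string) { fmt.Println(s) })
}
```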
How to reproduce in firehose-ethereum
It should get stuck somewhere after the first few jobs. (In my case, the merged-blocks are remote, which may affect the bug; it looks like a race condition.)