Add a Configuration to Change the Number of Reduce Tasks of the Analyze Process #593
When running the DFSIO test, I found that the analysis step took a very long time, much longer than the test itself. The analysis job is set to use only one reduce task, which is not reasonable. This change adds a config that makes the number of reduce tasks for the analysis step configurable. I ran a test on an 8-core server, and the analysis step took only 1/8 of the time it took before.
I would like to ask whether the community would consider merging this change. I only changed the code for the DFSIO test, but other test cases have the same problem. If these changes look feasible, I'm glad to update the other test cases as well.
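To illustrate the idea for reviewers, here is a minimal, self-contained sketch of how a reduce-task count could be read from configuration with a fallback to the old single-reducer behavior. It is not the code in this pull request: the property name `dfsio.analyze.reduce.tasks` and the class `AnalyzeReducerConfig` are hypothetical, and plain `java.util.Properties` stands in for the real job configuration.

```java
import java.util.Properties;

public class AnalyzeReducerConfig {
    // Hypothetical property name, chosen for illustration only.
    static final String KEY = "dfsio.analyze.reduce.tasks";
    // Falls back to the previous behavior: a single reduce task.
    static final int DEFAULT_REDUCERS = 1;

    // Returns the configured reducer count, or the default when the
    // value is missing, not a number, or not positive.
    static int resolveReduceTasks(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_REDUCERS;
        }
        try {
            int n = Integer.parseInt(raw.trim());
            return n > 0 ? n : DEFAULT_REDUCERS;
        } catch (NumberFormatException e) {
            return DEFAULT_REDUCERS;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEY, "8");
        int reducers = resolveReduceTasks(props);
        // In a real Hadoop job, this value would be handed to the job,
        // for example via Job.setNumReduceTasks(reducers).
        System.out.println("analyze reduce tasks = " + reducers);
    }
}
```

Defaulting to one reducer keeps existing setups unchanged unless the option is set explicitly.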