Thunk cost estimation, chunk caching, benchmark updates #210

Merged May 29, 2021 (21 commits)

@jpsamaroo (Member) commented Apr 8, 2021

This PR adds runtime estimation of thunk costs, tracked per function signature, so that the scheduler can get as close to maximum utilization as possible without impacting total runtime. It also adds chunk data caches to workers; the scheduler frees cached chunks once they are no longer needed.
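
As a rough illustration of per-signature cost estimation, here is a minimal Python sketch. It is not the code in this PR: the `CostCache` class, its method names, and the exponential moving average (with the assumed weight `alpha`) are all hypothetical choices made for the example.

```python
import time


class CostCache:
    """Hypothetical per-signature runtime estimator (illustrative only)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # assumed weight given to the newest measurement
        self.estimates = {}     # signature -> estimated runtime in seconds

    @staticmethod
    def signature(func, args):
        # Key on the function name plus the concrete types of its arguments.
        return (func.__name__, tuple(type(a).__name__ for a in args))

    def record(self, sig, elapsed):
        # First measurement is taken as-is; later ones are blended in.
        old = self.estimates.get(sig)
        if old is None:
            self.estimates[sig] = elapsed
        else:
            self.estimates[sig] = self.alpha * elapsed + (1 - self.alpha) * old

    def estimate(self, sig, default=0.0):
        # Unknown signatures fall back to a caller-supplied default.
        return self.estimates.get(sig, default)

    def run(self, func, *args):
        # Time one execution and fold the measurement into the cache.
        sig = self.signature(func, args)
        start = time.perf_counter()
        result = func(*args)
        self.record(sig, time.perf_counter() - start)
        return result
```

Keying on argument types as well as the function means that, for example, a call on a small scalar and a call on a large array can carry separate estimates.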

Depends on JuliaData/MemPool.jl#49. (Not doing this for now; wrong API for performance reasons.)

Closes #205

Todo:

  • Parameterize function cost cache on argument types
  • Cache Chunk arguments per-process
  • Also key chunk cache on processor
  • Track Chunk usage and evict from caches ASAP
  • Test caching behavior
  • Scale thunk cost by number of running thunks on same processor (something for later, possibly)
  • Add lots more benchmarks from https://blog.dask.org/2017/07/03/scaling (see "Add Dask scheduler performance benchmarks" #220)
  • Confirm we benchmark better than master
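
The caching items in the list above (cache per process, key on processor, track usage, evict as soon as possible) could be sketched roughly as follows. This is a hedged Python illustration, not the PR's implementation: `ChunkCache`, `expect_uses`, `fetch`, `release`, and the `loader` callback are hypothetical names, and simple use-counting is an assumed eviction policy.

```python
class ChunkCache:
    """Hypothetical worker-side chunk cache (illustrative only)."""

    def __init__(self):
        self.data = {}      # (processor, chunk_id) -> cached value
        self.pending = {}   # chunk_id -> number of thunks still needing it

    def expect_uses(self, chunk_id, n):
        # The scheduler announces how many upcoming thunks will read this chunk.
        self.pending[chunk_id] = self.pending.get(chunk_id, 0) + n

    def fetch(self, processor, chunk_id, loader):
        # Load at most once per (processor, chunk) pair; reuse afterwards.
        key = (processor, chunk_id)
        if key not in self.data:
            self.data[key] = loader(chunk_id)  # e.g. a transfer from another worker
        return self.data[key]

    def release(self, chunk_id):
        # One use finished; evict every cached copy once nobody needs the chunk.
        self.pending[chunk_id] -= 1
        if self.pending[chunk_id] <= 0:
            del self.pending[chunk_id]
            for key in [k for k in self.data if k[1] == chunk_id]:
                del self.data[key]
```

Keying on the processor as well as the chunk lets two processors on the same worker hold separate copies, which matters if a copy lives in device-specific memory.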

@jpsamaroo changed the title from "Allow changing default network for transfers" to "Alternate network support for UCX and various scheduler optimizations" Apr 16, 2021
@jpsamaroo force-pushed the jps/ucx branch 2 times, most recently from 78be8b7 to 2f22449, April 24, 2021 17:24
@jpsamaroo changed the title from "Alternate network support for UCX and various scheduler optimizations" to "Thunk cost estimation, chunk caching, benchmark updates" May 10, 2021
Adds a linked list-based cache of available processors (O(N)->O(1) best case)
Adds round-robin scheduling option to SchedulerOptions
Concretizes some Ref types in ComputeState
Measure and cache task cost in scheduler (per-function)
Use estimated task cost to indicate expected pressure
Batch up per-processor task launches into one remote_do
Record load average for future usage
Reorganization of Sch.jl
Fix init_proc capacity detection
Start only a single render server in live mode
Add Context copy ctor
Allow rendering to fail, not hang
Disable rendering by default
Reduce bench samples from 5 to 3
Summarize bench results with minimum
Add option to automatically run visualize script post-benchmarks
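
Several of the scheduler commits above (cached task cost, estimated cost as expected pressure, one batched launch per processor) fit together roughly as in the Python sketch below. It is an assumption-laden illustration rather than the code in `Sch.jl`: `assign_and_batch` and its parameters are hypothetical, and the greedy least-pressure choice is one possible policy.

```python
from collections import defaultdict


def assign_and_batch(tasks, processors, estimate):
    """Greedy placement sketch (illustrative only).

    tasks:      list of (task_id, signature) pairs
    processors: list of processor names
    estimate:   callable mapping a signature to an estimated cost in seconds
    """
    pressure = {p: 0.0 for p in processors}   # expected queued work per processor
    batches = defaultdict(list)               # processor -> task ids to launch together
    for task_id, sig in tasks:
        # Pick the processor with the least expected pressure (first one wins ties).
        target = min(processors, key=lambda p: pressure[p])
        pressure[target] += estimate(sig)
        batches[target].append(task_id)
    return dict(batches), pressure
```

Each entry in the returned batches would then be sent as a single message to its processor, which is the role a single `remote_do` call plays in the commit description.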
@jpsamaroo closed this May 29, 2021
@jpsamaroo reopened this May 29, 2021
@jpsamaroo marked this pull request as ready for review May 29, 2021 12:22
Successfully merging this pull request may close these issues: Auto-detect thunk utilization