Idea: making non-IO schedulers efficient #9

Open
rrnewton opened this issue Nov 12, 2017 · 1 comment

Comments

@rrnewton
Member

This was a thought experiment I dumped into Slack, but Slack makes it very easy to lose things, so I'm archiving it here:

July 10th

What's getting me excited about this is closing the gap between a practical (real, physically parallel) scheduler and this idealized one -- changing the proof as little as possible, while changing the abstractions we implement parallel scheduling with.

I.e. the normal way we implement runPar is to give an interpretation to our logical threads as real IO threads.

That's gross, because we don't have good reasoning tools for IO. (Although Colin Gordon and Niki have made a start on this issue.)

My understanding is that it's only an early start -- we don't have a formalized memory model... we don't have something like an Iris program logic for reasoning about IO programs concurrently interacting with the heap.

So ... no IO. Forget IO for implementing schedulers.

Instead, I think we can build parallel schedulers using a combination of:
(1) laziness
(2) futures
(3) linear types

We've got this fork of GHC with linear types (our other POPL submission) and it allows the magic of purely functional interfaces to efficient, in-place mutable data structures.

I'm attracted by the idea that we can combine Liquid + Linear Haskell trivially. Correct me if I'm wrong, but the linear-or-non-linear status of a binding shouldn't matter one bit for refinements, right? It's orthogonal information....

So here's a basic sketch of the idea:
(1) @vikraman's current scheduler steps one (random) thread from the threadpool at each timestep; this new scheduler would do something similar, but by evaluating a random thunk in a sequence of thunks to WHNF.
(2) The thunks represent schedulable computations, and they return a result value, coroutine style: Get iv k, Put iv val k, Done. Gets and Puts request modifications of the heap and supply continuations. Pure computation is "instantaneous", and each thread computes atomically until its next ivar interaction. (We can trivially add Yield as well, if desired.)
(3) In the initial version, all updates to the heap are done by the scheduler. The scheduler is a pure function, but it uses a linear API to the heap. Heap updates can be turned into actual imperative updates to a concurrent map (IvarID -> Value).
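To make the sketch above concrete, here is a minimal, hypothetical rendering of the coroutine-style threads and a single-threaded cartoon of the scheduler step. IVarId, Val, and the Data.Map heap are illustrative stand-ins -- the real version would be a pure scheduler over a linear heap API, not this toy.

```haskell
-- Hypothetical sketch; names and types are illustrative, not the repo's API.
import qualified Data.Map as M

type IVarId = Int
type Val    = Int

-- Forcing a thread to WHNF yields its next ivar interaction.
data Step
  = Get IVarId (Val -> Step)  -- read an ivar, supplying a continuation
  | Put IVarId Val Step       -- write an ivar, then continue
  | Done                      -- thread finished

type Heap = M.Map IVarId Val

-- One scheduler timestep: run the chosen thread until its next ivar
-- interaction and apply the requested heap update.
stepThread :: Heap -> Step -> (Heap, Maybe Step)
stepThread heap Done         = (heap, Nothing)
stepThread heap (Put iv v k) = (M.insert iv v heap, Just k)
stepThread heap (Get iv k)   =
  case M.lookup iv heap of
    Just v  -> (heap, Just (k v))
    Nothing -> (heap, Just (Get iv k))  -- blocked: retry later

-- Drive a single thread to completion (sequential cartoon of the pool).
run :: Heap -> Maybe Step -> Heap
run h Nothing  = h
run h (Just s) = uncurry run (stepThread h s)

main :: IO ()
main = print (M.toList (run M.empty
        (Just (Put 0 3 (Get 0 (\v -> Put 1 (v + 1) Done))))))
-- prints [(0,3),(1,4)]
```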

Ok, so yes, the mutable data with linear API would be part of the TCB. But the scheduler would not be!

July 14

Casting it as a deforestation problem:
Let's say execute actually runs an action such as Put iv 3 in the scheduler. For example, execute (Put iv v k) = writeMap heap iv v.

The scheduler normally reads and forces a (random) worker thread's result, (act, pool') = deque pool, and then calls execute act.
But it's a normal program optimization to "push" execute inside the child thread, deforesting Put iv 3 with subsequent optimizations.

map execute [result1, result2, ...] == [execute result1, execute result2, ...]
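As a toy instance of that identity, here is the rewrite with execute as a pure heap update rather than an IO writeMap -- names here are illustrative assumptions, not the repo's actual API:

```haskell
-- Illustrative sketch of the "push execute inward" rewrite.
import qualified Data.Map as M

data Action = Put Int Int            -- Put ivar value

type Heap = M.Map Int Int

-- Interpret one requested action as a pure heap update.
execute :: Action -> Heap -> Heap
execute (Put iv v) = M.insert iv v

-- Scheduler-side interpretation: collect thread results, then execute them.
schedulerSide :: [Action] -> Heap
schedulerSide acts = foldr execute M.empty acts

-- After the rewrite each worker carries its own update; map fusion
-- (map f (map g xs) = map (f . g) xs) justifies pushing execute into
-- the producer of the actions.
workerSide :: [Heap -> Heap] -> Heap
workerSide fs = foldr ($) M.empty fs

main :: IO ()
main = print ( schedulerSide [Put 1 10, Put 2 20]
            == workerSide (map execute [Put 1 10, Put 2 20]) )
-- prints True
```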

The steps in order would be something like:
(1) make the proof go through for the cartoon scheduler in LH
(2) linear-types: deforest the entropy source (list of random numbers)
(3) linear-types: use a concurrent map for the heap
(4) trickiness: deforest the spine of the concurrent map to make it behave like references
(5) perform some interesting program manipulation/deforestation to make the worker threads do the heap updates themselves, rather than the scheduler being a (sequential) bottleneck.

That covers the IO-free version I described.
For @ranjitjhala's proposed version, step (1) at least would probably be the same,
and subsequent steps would include developing some kind of RGRef, plus some kind of limited ST monad ....
Clarification of (2):
Current entropy: [Int] (or an explicit stream)

BUT if you rewrite that as an explicit stream / iterator
RNG -> (Int, RNG)
And you make that linear, it becomes efficient:
RNG -o (Unrestricted Int, RNG)
("Unrestricted" = Bang, ! modality in linear logic)
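To make the non-linear half of that rewrite concrete, here is the explicit-stream version in plain Haskell. The mixing function is a toy LCG stand-in, not a real RNG, and the linear signature is shown only as a comment since it needs the linear-types fork:

```haskell
-- Illustrative sketch; the mixing step is a toy LCG, not a real RNG.
newtype RNG = RNG Int

-- Explicit generator-threading step, replacing the lazy [Int] stream.
next :: RNG -> (Int, RNG)
next (RNG s) = (s', RNG s')
  where s' = s * 6364136223846793005 + 1442695040888963407  -- toy LCG

-- The old list interface is recoverable from the step function:
entropy :: RNG -> [Int]
entropy g = let (x, g') = next g in x : entropy g'

-- Under the linear-types fork the step would instead be
--   next :: RNG -o (Unrestricted Int, RNG)
-- so the consumed generator can be updated in place.

main :: IO ()
main = print (take 3 (entropy (RNG 42)))
```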

@ranjitjhala
Collaborator

ranjitjhala commented Nov 12, 2017 via email
