Possible projects
As the outcome of our survey, we have to select a project to spend the remaining four months on. This page lists the proposals we are considering.
All the languages we have investigated seem to implement some kind of fusion/deforestation, and the two that run on the GPU (Accelerate and Nikola) also avoid uncoalesced memory accesses.
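For reference, here is a minimal sketch of the map/map fusion law that this kind of deforestation exploits, written with plain Haskell lists; the function names are ours, and nothing here is Accelerate's or Nikola's actual API:

```haskell
-- Unfused: allocates an intermediate list (on a GPU this would mean an
-- intermediate array in global memory plus an extra kernel launch).
unfused :: [Int] -> [Int]
unfused = map (* 2) . map (+ 1)

-- Fused: one traversal, no intermediate structure.
fused :: [Int] -> [Int]
fused = map ((* 2) . (+ 1))

main :: IO ()
main = print (unfused [1 .. 5] == fused [1 .. 5])  -- True
```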
The paper "Financial Software on GPUs: Between Haskell and Fortran" presents a technique for limiting branch divergence.
The implementation of the technique is not described in much detail in the paper, and we find it problematic for several reasons (a rough sketch of the idea follows the list):
- We would have to predict which branches to take before invoking the kernels. We are not sure this can be done on the GPU, and we thus might have to compute the (arbitrarily complex) conditions on the CPU.
- What if there are several branches in the same kernel? It seems like we would have to split each kernel up into several kernels, and the kernel invocation overhead is also pretty high, so it might not be worthwhile.
- If we shuffle the index space, the memory operations might not be performed in an order where we can guarantee coalesced access.
- The benchmarks in the paper mentioned above show that the technique can both improve and degrade performance.
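To make the discussion concrete, here is a minimal CPU-side sketch of the index-space splitting idea, under the assumption that the branch condition can be evaluated ahead of the kernel launch (the first concern above); the names and the list-based stand-in for a kernel are ours, not the paper's:

```haskell
import Data.List (partition)

-- The two sides of a divergent branch, standing in for kernel bodies.
thenBody, elseBody :: Int -> Int
thenBody x = x * x
elseBody x = negate x

-- Instead of one "kernel" with an if inside (divergent within a warp),
-- split the index space by the condition and run each side uniformly.
runSplit :: (Int -> Bool) -> [Int] -> [(Int, Int)]
runSplit cond ixs =
  let (ts, es) = partition cond ixs
  in [ (i, thenBody i) | i <- ts ] ++ [ (i, elseBody i) | i <- es ]
  -- Note: results come back in shuffled order; a scatter by index is
  -- needed to restore it, which may break memory-access coalescing.

main :: IO ()
main = print (runSplit even [0 .. 9])
```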
The paper also explains strength reduction. We do not see this as a project in itself, but rather as an optimization technique to keep in mind during implementation.
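For illustration (this is our own minimal example, not the paper's): strength reduction replaces a repeated expensive operation with a cheaper incremental one, here trading a multiplication per element for an addition per element:

```haskell
-- Index offsets computed with one multiply per element.
offsets :: Int -> Int -> [Int]
offsets stride n = map (* stride) [0 .. n - 1]

-- Strength-reduced: the same offsets as a running sum,
-- one add per element.
offsetsReduced :: Int -> Int -> [Int]
offsetsReduced stride n = take n (iterate (+ stride) 0)

main :: IO ()
main = print (offsets 4 8 == offsetsReduced 4 8)  -- True
```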
Further techniques for reducing branch divergence are described in the paper "Reducing branch divergence in GPU programs" (Tianyi David Han & Tarek S. Abdelrahman).
We have previously talked about the possibility of a GPU backend for Feldspar, but as we excluded Feldspar from our survey, we do not really know how feasible this would be.
Perform the flattening transformation in Accelerate or Nikola.
We know that the flattening transformation would be possible, but we do not really know whether it is desirable to have in these languages.
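For reference, here is a minimal sketch of what the flattening transformation does, using plain Haskell lists as a stand-in for nested arrays; the flat representation (data vector plus segment lengths) is the standard one, but the helper names are ours:

```haskell
-- A nested (irregular) structure, which a flat data-parallel GPU
-- language cannot execute directly.
nested :: [[Int]]
nested = [[1, 2], [3], [4, 5, 6]]

-- Flat representation: data vector + segment descriptor.
flatten :: [[Int]] -> ([Int], [Int])
flatten xss = (concat xss, map length xss)

unflatten :: ([Int], [Int]) -> [[Int]]
unflatten (xs, ls) = go xs ls
  where
    go _  []       = []
    go ys (l : ks) = let (seg, rest) = splitAt l ys in seg : go rest ks

-- The nested 'map (map f)' becomes a single flat 'map f'; the segment
-- descriptor is untouched.
main :: IO ()
main =
  let (xs, ls) = flatten nested
  in print (unflatten (map (+ 1) xs, ls) == map (map (+ 1)) nested)
```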