dybber edited this page Oct 30, 2012 · 23 revisions

As the outcome of our survey, we have to select a project to spend the remaining 4 months on. This page lists the candidate proposals.

All the languages we have investigated seem to implement some kind of fusion/deforestation, and those that run on the GPU (Accelerate and Nikola) seem to avoid uncoalesced memory accesses.

Limiting branch divergence by index space permutation

The paper "Financial Software on GPUs: Between Haskell and Fortran" presents a technique for limiting branch divergence by permuting the index space.

The implementation of the technique is not described in much detail in the paper. We find the technique problematic for several reasons:

  • We would have to predict which branches to take before invoking the kernels. We are not sure this can be done on the GPU, and we thus might have to compute the (arbitrarily complex) conditions on the CPU.
  • What if there are several branches in the same kernel? It seems we would have to split each kernel into several kernels, and since the kernel invocation overhead is fairly high, this might not be worthwhile.
  • If we shuffle the index space, the memory operations might no longer be performed in an order for which we can guarantee coalesced access.
  • The benchmarks in the paper mentioned above show that the technique can both raise and lower performance.
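
To make the idea and the coalescing concern concrete, here is a minimal CPU-side sketch of index space permutation. It is not the paper's implementation: the warp size of 4, the helper names `divergent_warps` and `permute_by_branch`, and the odd/even condition are all assumptions chosen for illustration.

```python
# Sketch only: a CPU-side model of warps, not GPU code.
WARP_SIZE = 4  # assumed small for readability; NVIDIA warps have 32 threads


def divergent_warps(order, takes_branch, warp_size=WARP_SIZE):
    """Count warps whose threads disagree on the branch condition."""
    count = 0
    for start in range(0, len(order), warp_size):
        warp = order[start:start + warp_size]
        outcomes = {takes_branch(i) for i in warp}
        if len(outcomes) > 1:
            count += 1
    return count


def permute_by_branch(indices, takes_branch):
    """Reorder indices so that those taking the same branch are adjacent.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(indices, key=takes_branch)


def is_odd(i):
    """Hypothetical branch condition: odd indices take the 'then' branch."""
    return i % 2 == 1


indices = list(range(16))
permuted = permute_by_branch(indices, is_odd)
before = divergent_warps(indices, is_odd)   # every warp mixes odd and even
after = divergent_warps(permuted, is_odd)   # each warp is uniform
```

In this toy case the permutation removes all divergence, but the permuted order visits elements with stride 2, so neighbouring threads no longer touch neighbouring addresses. Note also that the condition has to be evaluated for every index before the kernel is launched, which is the first concern in the list above.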

Strength reduction

The paper also explains strength reduction. We cannot see this as a project in itself, but rather as an optimization technique that should be kept in mind during implementation.
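
As a reminder of what the optimization looks like, here is the textbook form of strength reduction, where a multiplication inside a loop is replaced by a running addition. This is a generic sketch, not the specific transformation from the paper; the function names and the address computation are hypothetical.

```python
def addresses_multiply(base, stride, n):
    """Compute base + i * stride with one multiplication per iteration."""
    return [base + i * stride for i in range(n)]


def addresses_reduced(base, stride, n):
    """Same result, with the multiplication replaced by a running addition."""
    result = []
    addr = base
    for _ in range(n):
        result.append(addr)
        addr += stride  # cheaper operation carried across iterations
    return result
```

The reduced version introduces a dependency between iterations, so on a GPU it applies most naturally to the sequential loop inside a single thread.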

Limiting branch divergence by iteration delaying

Described in the paper "Reducing branch divergence in GPU programs" (Tianyi David Han & Tarek S. Abdelrahman).
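
As we understand it, iteration delaying targets a divergent branch inside a loop: in each round the warp executes only one branch direction, and threads whose current iteration takes the other direction wait until a later round. The sketch below models this on the CPU by counting branch passes. The majority-vote policy, the tie-breaking rule, and the function names are our assumptions, and the model presumes that each thread's iterations can be delayed without changing the result.

```python
def passes_without_delaying(work):
    """work[t] lists the branch outcome (True/False) of each iteration of thread t.

    Threads run in lockstep; a round where they disagree costs two passes.
    """
    passes = 0
    rounds = max(len(w) for w in work)
    for r in range(rounds):
        outcomes = {w[r] for w in work if r < len(w)}
        passes += len(outcomes)
    return passes


def passes_with_delaying(work):
    """Each round executes one direction only; other threads delay their iteration."""
    pos = [0] * len(work)  # next iteration of each thread
    passes = 0
    while any(p < len(w) for p, w in zip(pos, work)):
        pending = [w[p] for p, w in zip(pos, work) if p < len(w)]
        # Assumed policy: majority vote, ties go to the True direction.
        chosen = pending.count(True) >= pending.count(False)
        for t, w in enumerate(work):
            if pos[t] < len(w) and w[pos[t]] == chosen:
                pos[t] += 1  # this thread executes its iteration now
        passes += 1
    return passes


# Hypothetical warp of four threads with two iterations each.
work = [[True, False], [False, True], [True, False], [False, True]]
plain = passes_without_delaying(work)
delayed = passes_with_delaying(work)
```

In this example the delayed schedule needs three passes instead of four. The model ignores the cost of the vote itself and of idle threads, so it says nothing about whether the technique pays off in practice.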

GPU backend for Feldspar

We have previously talked about the possibility of a GPU backend for Feldspar, but as we excluded Feldspar from our survey, we do not really know how feasible this would be.

Flattening transformation

Implement the flattening transformation in Accelerate or Nikola.

We know that the flattening transformation would be possible, but we don't really know if it is desirable to have in these languages.
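
For reference, the data-representation side of flattening can be sketched as follows: a nested array becomes a flat value array plus a segment descriptor, and a nested operation becomes a segmented operation over the flat data. This is only an illustration in plain Python; it does not use the Accelerate or Nikola APIs, and the function names are hypothetical. A full flattening transformation also has to rewrite the nested parallel functions themselves.

```python
def flatten(nested):
    """Turn a list of lists into (flat values, segment lengths)."""
    values = [x for segment in nested for x in segment]
    lengths = [len(segment) for segment in nested]
    return values, lengths


def segmented_sum(values, lengths):
    """Sum each segment of the flat array, one result per original inner list."""
    sums = []
    start = 0
    for n in lengths:
        sums.append(sum(values[start:start + n]))
        start += n
    return sums


# An irregular nested array, including an empty inner list.
values, lengths = flatten([[1, 2, 3], [], [4, 5]])
row_sums = segmented_sum(values, lengths)
```

The sequential loop in `segmented_sum` stands in for what would be a parallel segmented reduction on the GPU.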

Survey "VectorMARK" in progress
