Large memory allocation due to sparse tensors #30
With the operations we're doing throughout the simulation, I couldn't find a sparse tensor format that didn't need to be converted to dense for some of the operations, so everything became much slower. Since the spikes need very low precision, they occupy a small fraction of total memory usage (compared to the weights), so I decided the dense format was worth it. After the simulation completes, the idea has been to leave it to the user to save the results as sparse tensors. If you want to do some post-processing of the spikes, it is handy to have them in dense form before saving; however, if the main usage is just to save the results immediately, returning them as a sparse tensor is probably better. What do you think?
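As a minimal sketch of what "delegate to the user" could look like (the tensor shape, firing rate, and file name here are hypothetical, not from the library's API):

```python
import torch

# Hypothetical dense spike record: (n_neurons, n_steps) of 0/1 values in uint8.
spikes = (torch.rand(100, 10_000) < 0.01).to(torch.uint8)

# Convert to sparse COO before saving; only the nonzero (spike) entries are stored.
sparse_spikes = spikes.to_sparse()
torch.save(sparse_spikes, "spikes.pt")

# Later: load and densify again if post-processing needs dense operations.
loaded = torch.load("spikes.pt").to_dense()
```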
I can try benchmarking the sparse branch now to see how performance is affected. At the beginning of the project I compared this way of storing spikes to just writing them into a dense tensor and found that the latter was significantly faster, but if memory is a problem (even when using torch.uint8), then it might be worth it.
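A rough back-of-the-envelope comparison of the two formats, assuming a uint8 dense record and COO sparse storage (the sizes and rate here are illustrative only):

```python
import torch

n_neurons, n_steps, rate = 100, 1_000_000, 0.01

# Dense uint8 record: one byte per (neuron, step) entry, regardless of activity.
dense = (torch.rand(n_neurons, n_steps) < rate).to(torch.uint8)
dense_bytes = dense.numel() * dense.element_size()

# Sparse COO stores one int64 index per dimension per spike plus the value,
# so it only wins when the firing rate is low enough.
sparse = dense.to_sparse().coalesce()
sparse_bytes = (sparse.indices().numel() * sparse.indices().element_size()
                + sparse.values().numel() * sparse.values().element_size())

print(f"dense: {dense_bytes / 1e6:.1f} MB, sparse: {sparse_bytes / 1e6:.1f} MB")
```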
Thank you for running the benchmarks! I think we have to have the option of sparse iteration at least; for example, running 100 neurons for 1e8 timesteps breaks on an NVIDIA GeForce RTX 3090 (at one byte per entry that is already ~10 GB of spike history), and for small timestep sizes that is not that long a simulation. We could introduce a parameter for this. I think you are right that the roll is slow, but I can't think of any faster way of doing it; we could potentially see if we can implement a faster way of rolling.
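One common alternative to rolling is a circular buffer: keep a moving write pointer and overwrite the oldest column in place instead of shifting the whole history each step. This is a hedged sketch with placeholder inputs, not the model's actual update code:

```python
import torch

n_neurons, window = 100, 50
history = torch.zeros(n_neurons, window, dtype=torch.uint8)
ptr = 0  # slot holding the oldest step, i.e. where the next step is written

for step in range(1_000):
    new_spikes = (torch.rand(n_neurons) < 0.01).to(torch.uint8)  # placeholder input
    history[:, ptr] = new_spikes  # O(n_neurons) write instead of an O(n*window) roll
    ptr = (ptr + 1) % window      # advance the pointer instead of moving data
```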
Maybe we could also consider saving to a file during the simulation: for every N steps, we save the progress to a file and resume from that point. This way we could limit memory usage, and it would be faster than sparsifying at every step.
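Something along these lines, perhaps: accumulate spikes in a small dense buffer, write each full chunk to its own file, then reuse the buffer. The loop body, chunk size, and file naming are assumptions for illustration (and a final partial chunk would still need handling):

```python
import torch

n_neurons, n_steps, chunk = 100, 1_000_000, 10_000
buffer = torch.zeros(n_neurons, chunk, dtype=torch.uint8)

for step in range(n_steps):
    buffer[:, step % chunk] = (torch.rand(n_neurons) < 0.01).to(torch.uint8)
    if (step + 1) % chunk == 0:
        # Sparsify once per chunk rather than every step, then reuse the buffer;
        # every column is overwritten before the next save.
        torch.save(buffer.to_sparse(), f"spikes_{step + 1 - chunk}_{step + 1}.pt")
```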
Spikeometric/spikeometric/models/base_model.py, line 223 (commit 31fe94e):
Any reason why we should not work on sparse tensors?