
Is there a way to know how much memory is required for a task? #423

Open
feiyang-k opened this issue Sep 1, 2023 · 3 comments
Labels
question Further information is requested

Comments

@feiyang-k

Hi,

JAX seems to reserve all the GPU memory at import, so we cannot see from the NVIDIA panels how much memory the ott package actually uses. Right now, if a problem runs into memory issues, the only thing we can do is reduce the problem size until the error disappears. Is there a more direct way to know how much memory is required for a target task?

Thanks!

michalk8 (Collaborator) commented Sep 1, 2023

Hi @feiyang-k, please check https://jax.readthedocs.io/en/latest/gpu_memory_allocation.html for how to change the pre-allocation JAX does. In short, you can do:

import os

# Must be set before importing anything from JAX, otherwise the
# pre-allocation has already happened.
os.environ['XLA_PYTHON_CLIENT_PREALLOCATE'] = 'false'

import jax
import jax.numpy as jnp
...
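
If you also want to measure how much memory a computation actually uses, one option, once pre-allocation is disabled, is to read the per-device memory statistics (a sketch, assuming a GPU backend and a recent jaxlib; memory_stats() can return None on unsupported platforms and the exact keys may vary):

import jax
import jax.numpy as jnp

dev = jax.devices()[0]

x = jnp.ones((4096, 4096))
y = (x @ x).block_until_ready()  # force execution before reading the stats

stats = dev.memory_stats()  # None if the platform reports no statistics
if stats is not None:
    print(stats.get('bytes_in_use'), stats.get('peak_bytes_in_use'))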

michalk8 added the question label Sep 1, 2023
@feiyang-k (Author)

Thanks @michalk8! I tried it, and it works exactly as I hoped!

By the way, I'm using it in a Jupyter notebook, and GPU memory recycling does not seem to be fully working: each time an OT problem is computed, the GPU memory is not released.

More interestingly, if a computation completes successfully, its memory can be reused for the next problem. But if a problem runs into an error, the allocated memory seems "dead": the next OT problem only gets the remaining memory, which can be much smaller. Thus, I need to restart the Jupyter notebook kernel every time I hit any error with ott. Is this a known issue?

Also, it seems I'm never able to interrupt a cell running OT problems; it simply does not respond. I need to restart the Jupyter notebook kernel whenever a task looks like it will never finish in a reasonable time. Is this expected?

Thanks again!

michalk8 (Collaborator) commented Sep 8, 2023

By the way, I'm using it in a Jupyter notebook, and GPU memory recycling does not seem to be fully working: each time an OT problem is computed, the GPU memory is not released.

According to the docs, XLA_PYTHON_CLIENT_PREALLOCATE='false' makes JAX allocate on demand and re-use the memory (rather than release it), while XLA_PYTHON_CLIENT_ALLOCATOR='platform' de-allocates memory that is no longer needed, but is much slower.
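
A minimal sketch of the second option (same caveat as above: the variable must be set before JAX is imported):

import os

# 'platform' allocates on demand and de-allocates buffers once they are
# no longer needed, at a significant speed cost.
os.environ['XLA_PYTHON_CLIENT_ALLOCATOR'] = 'platform'

import jax
import jax.numpy as jnp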

But if a problem runs into an error, the allocated memory seems "dead": the next OT problem only gets the remaining memory, which can be much smaller.

I will go and investigate this behavior.

Also, it seems I'm never able to interrupt a cell running OT problems; it simply does not respond. I need to restart the Jupyter notebook kernel whenever a task looks like it will never finish in a reasonable time. Is this expected?

I'm not 100% sure, but I would say yes: the code runs on the device, and the interrupt can only be handled once execution is handed back to the host (I will check whether this statement is true). Maybe adding a printing callback (see this tutorial) will allow easier interruption of an execution.
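
For illustration, a sketch of such a callback (the loop here is a stand-in, not ott's actual solver; jax.debug.print inserts a host callback into the compiled computation, which may also make interrupts more responsive):

import jax
import jax.numpy as jnp

@jax.jit
def toy_iterative_solver(x):
    # Stand-in for an iterative OT solve running entirely on device.
    def body(i, x):
        # Host callback: prints from inside the jitted loop, handing
        # control back to the host once per iteration.
        jax.debug.print("iteration {i}", i=i)
        return x * 0.99

    return jax.lax.fori_loop(0, 100, body, x)

toy_iterative_solver(jnp.ones((8,)))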
