Memory leak when copying data from one worker to another #547

Open
RomeoV opened this issue Jul 11, 2024 · 2 comments · May be fixed by #558

RomeoV commented Jul 11, 2024

Repeatedly allocating a matrix on one worker and copying it to another worker leaks memory on my computer until the Julia session is killed.

julia> using Distributed
julia> addprocs(8)
julia> using Dagger
julia> for _ in 1:5
       @time fetch(let
         foo = Dagger.@spawn scope=Dagger.scope(worker=1) rand(10000, 10000);
         Dagger.@spawn scope=Dagger.scope(worker=2) copy(foo)
       end);
       end

The foo matrix and its copy should be garbage collected after each iteration, but I don't think they are. Even if nothing were collected, each matrix is 0.8 GB, so five iterations with one matrix on each of the two workers come to 5 * 2 * 0.8 GB = 8 GB, which should not exhaust my RAM (I have at least 16 GB free).
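The estimate above can be checked with a few lines of Julia. This is only a sketch: it assumes Float64 elements (the default for `rand`) and decimal gigabytes, and `free_gb_after_gc` is a hypothetical helper that is not part of Dagger. It forces a full collection on every process and then reports the host's free memory, which gives a rough way to see whether memory is still retained between iterations.

```julia
using Distributed

# Worst-case footprint if nothing is ever collected.
# Assumption: Float64 elements, which is what `rand(10000, 10000)` produces.
n = 10_000
bytes_per_matrix = n * n * sizeof(Float64)
iterations = 5
copies_per_iteration = 2   # one matrix on worker 1, one copy on worker 2
worst_case_gb = iterations * copies_per_iteration * bytes_per_matrix / 1e9

# Hypothetical helper (not a Dagger API): run a full GC on every process,
# then return the free memory of the local host in decimal GB.
function free_gb_after_gc()
    for p in procs()
        remotecall_wait(() -> GC.gc(true), p)
    end
    return Sys.free_memory() / 1e9
end
```

Calling `free_gb_after_gc()` before the loop and again after each iteration would show whether free memory keeps shrinking by more than the estimate allows. `Sys.free_memory()` is system-wide, so the numbers are only meaningful when all workers run on the same machine and little else is running.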

The session is a clean temporary project with Dagger v0.18.12 and Julia 1.10.4.

RomeoV changed the title from "Memory leak?" to "Memory leak when copying data from one worker to another" on Jul 14, 2024
@jpsamaroo
Member

Sorry for the slow reply - this is probably a known memory leak, also reported offline by @mofeing in a similar case. I'll investigate and see if I can resolve it.

jpsamaroo self-assigned this on Jul 23, 2024
jpsamaroo linked a pull request on Jul 24, 2024 that will close this issue
@jpsamaroo
Member

Using the example above, I've found the initial source of retained memory, and am fixing it in #558 (branch is very WIP, expect it to not work right now). I'll close this issue once that PR is merged, which should fully address this.
