File gets truncated upon unpickling #657

Open
JanPokorny opened this issue May 13, 2024 · 3 comments

Comments

@JanPokorny

It seems that whenever a file has been opened for writing in the session, the (already closed) file object f gets unpickled in a way that re-opens the file in write mode, effectively truncating it to zero length.

Is this expected behavior?

❯ python -c "
  with open('file.txt', 'w') as f:
    f.write('hello world')
  import dill
  dill.dump_module('session.pkl')
  "

❯ cat file.txt
hello world⏎

❯ python -c "
  import dill
  dill.load_module('session.pkl')
  "

❯ cat file.txt

❯
@akiss-ic

akiss-ic commented Aug 19, 2024

This issue was problematic for me even with the file that contained the session data. If I did something like

import dill

x = 1
y = 2
with open("foo.pckl", "wb") as file:
	dill.dump_session(file)

and then

import dill

with open("foo.pckl", "rb") as file:
	dill.load_session(file)

in another interpreter session/notebook, I would see it start to load, then the pickle file would shrink to zero bytes and an UnpicklingError: pickle data was truncated would be raised, because the pickle file was changed out from under the loader (not to mention that all the data I had dumped to the pickle file was now gone!). It seems that if any file object opened with write permissions is dumped and then loaded, even if it was closed before dumping, loading re-opens the file and deletes all the data it contains. This seems incredibly dangerous.

The way I got around this was to never instantiate a file object but rather pass the path object or path string to dill.dump_session() directly. This obviously works in this narrow case, but not in the general case.

@JanPokorny
Author

This could probably be solved by unpickling files that were open for writing in append mode ("a") instead. That seems like the more logical and expected behavior.
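
To illustrate the difference this proposal relies on, here is a minimal standard-library sketch (it does not use or reproduce dill's code). Re-opening an existing file with mode "w" empties it at open time, while mode "a" leaves the contents in place. The file name file.txt mirrors the report above; the temporary directory is an assumption made only to keep the example self-contained.

```python
import os
import tempfile

# Work in a throwaway directory so the example touches no real files.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "file.txt")

with open(path, "w") as f:
    f.write("hello world")

# Re-opening in append mode keeps the existing contents.
open(path, "a").close()
size_after_append = os.path.getsize(path)

# Re-opening in write mode truncates the file as soon as it is opened,
# even though nothing is written afterwards.
open(path, "w").close()
size_after_write = os.path.getsize(path)

print(size_after_append, size_after_write)  # 11 0
```

So a handle rebuilt with "a" would still be writable, but the act of rebuilding it would not destroy data.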

@akiss-ic

Agreed, or to be extra safe, open them in read-only mode and allow the caller to re-open in a write mode if needed. It isn't intuitive to me that a file object should be opened at all when unpickled if it was closed prior to pickling.
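
As a rough sketch of the read-only idea, the following uses only the standard-library pickle module and its reducer_override hook (Python 3.8+). It is not how dill handles files. The class name ReadOnlyFilePickler is hypothetical, and the sketch only covers text-mode handles. It also does not preserve the closed state: the restored handle comes back open, just in a mode that cannot truncate.

```python
import io
import os
import pickle
import tempfile


class ReadOnlyFilePickler(pickle.Pickler):
    """Hypothetical pickler: text file handles are restored read-only."""

    def reducer_override(self, obj):
        if isinstance(obj, io.TextIOWrapper):
            # Rebuild the handle as open(name, "r"); "r" never truncates.
            return open, (obj.name, "r")
        # Fall back to the default pickling behavior for everything else.
        return NotImplemented


# Set up a file that was opened for writing and is now closed.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "file.txt")
with open(path, "w") as f:
    f.write("hello world")

# Pickle the closed handle with the custom pickler.
buf = io.BytesIO()
ReadOnlyFilePickler(buf).dump(f)

# Unpickling re-opens the file, but only for reading.
restored = pickle.loads(buf.getvalue())
restored_mode = restored.mode
restored.close()
size_after_load = os.path.getsize(path)

print(restored_mode, size_after_load)  # r 11
```

A caller who really needs to write can then re-open the path explicitly, which makes any truncation a deliberate step rather than a side effect of loading.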
