Avoid serialization blues by computing + caching the hash. #429
@alxmrs, should we go ahead and merge this?
Sure thing! I'll do the honors :)
Brings in support for https://github.com/pangeo-forge/pangeo-forge-recipes/pull/429, but will stop working with any prior versions
Thanks @alxmrs! What an elegant solution to this problem.
Hi @cisaacstern and @alxmrs, fyi recipes that use
# we exclude the format function and combine dims from ``root`` because they determine the
# index:filepath pairs yielded by iterating over ``.items()``. if these pairs are generated in
# a different way in the future, we ultimately don't care.
root = {
    "fsspec_open_kwargs": pattern.fsspec_open_kwargs,
    "query_string_secrets": pattern.query_string_secrets,
    "file_type": pattern.file_type,
    "nitems_per_file": {
        op.name: op.nitems_per_file  # type: ignore
        for op in pattern.combine_dims
        if op.name in pattern.concat_dims
    },
}
Thanks for flagging this, @derekocallaghan. And also for becoming so engaged these last few months! The project is benefiting so much from your contributions. This issue of not being able to serialize
Could I ask that you summarize this problem in a new, standalone issue here on
pangeo-forge-recipes/pangeo_forge_recipes/serialization.py, lines 10 to 25 in fe5d1f7
that'd be great. My guess is that we can resolve this by simply adding one or more conditional blocks to that function. Open to all other suggestions, though. However we fix it, having a dedicated issue will be helpful to frame the discussion. Thanks so much!
Thanks @cisaacstern, I'll create a new standalone issue for the
Thanks @derekocallaghan!
`process_chunk` and `process_input` functions #427 by computing the hash at recipe construction time instead of pipeline runtime. This avoids the need for serializing functions.
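
For readers skimming the thread, the idea of computing the hash at construction time and caching it can be sketched roughly as below. This is only an illustration, not the code from this PR: `ExampleRecipe`, its fields, and the choice to leave the function out of the digest are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ExampleRecipe:
    """Hypothetical stand-in for a recipe; not the pangeo-forge-recipes class."""

    target_chunks: dict
    # User-supplied function: awkward to serialize reliably.
    process_input: Optional[Callable] = None
    # Cached digest, filled in once when the object is constructed.
    _hash: bytes = field(init=False, repr=False, compare=False, default=b"")

    def __post_init__(self):
        # Assumption for this sketch: only the plainly serializable
        # configuration goes into the digest; the function is skipped.
        payload = json.dumps({"target_chunks": self.target_chunks}, sort_keys=True)
        self._hash = hashlib.sha256(payload.encode("utf-8")).digest()

    def sha256(self) -> bytes:
        # Return the cached value; nothing is serialized at pipeline runtime.
        return self._hash
```

With this shape, anything that needs the hash later (for example a worker in a distributed pipeline) reads the stored bytes and never has to serialize `process_input` itself.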