[Do not merge] Ensure that new model instance has correct name or path in hf checkpointer #1612

Draft: wants to merge 8 commits into base: main
12 changes: 6 additions & 6 deletions llmfoundry/callbacks/hf_checkpointer.py
@@ -600,12 +600,6 @@ def tensor_hook(
new_model_instance.load_state_dict(state_dict, assign=True)
del state_dict

- # Transform the model and tokenizer before saving
- new_model_instance, original_tokenizer = self.transform_model_and_tokenizer(
- new_model_instance,
- original_tokenizer,
- )
-
# Ensure that the pretrained model name is correctly set on the saved HF checkpoint.
if self.pretrained_model_name is not None:
new_model_instance.name_or_path = self.pretrained_model_name
@@ -616,6 +610,12 @@ def tensor_hook(
k
].base_model_name_or_path = self.pretrained_model_name

+ # Transform the model and tokenizer before saving
+ new_model_instance, original_tokenizer = self.transform_model_and_tokenizer(
+ new_model_instance,
+ original_tokenizer,
+ )
+
log.debug('Saving Hugging Face checkpoint to disk')

if upload_to_save_folder:
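The diff only reorders two steps, so its observable effect may be easier to see in isolation. Below is a minimal sketch using hypothetical stand-ins (`FakeModel`, `transform`, `save_old_order`, `save_new_order`, and the example names are assumptions for illustration, not llmfoundry or transformers code). It shows that a transform hook sees the updated `name_or_path` only when the name is set before the hook runs, which is the order this change produces.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeModel:
    # Hypothetical stand-in for a Hugging Face model; only the attribute
    # touched by the diff is modeled here.
    name_or_path: str = ''


def transform(model: FakeModel, seen: List[str]) -> FakeModel:
    # Hypothetical transform hook: records the name it observes and
    # returns the model unchanged.
    seen.append(model.name_or_path)
    return model


def save_old_order(pretrained_model_name: Optional[str]) -> List[str]:
    # Order before the diff: transform first, then set the name.
    seen: List[str] = []
    model = FakeModel(name_or_path='/tmp/local-ckpt')
    model = transform(model, seen)
    if pretrained_model_name is not None:
        model.name_or_path = pretrained_model_name
    return seen


def save_new_order(pretrained_model_name: Optional[str]) -> List[str]:
    # Order after the diff: set the name first, then transform.
    seen: List[str] = []
    model = FakeModel(name_or_path='/tmp/local-ckpt')
    if pretrained_model_name is not None:
        model.name_or_path = pretrained_model_name
    model = transform(model, seen)
    return seen


if __name__ == '__main__':
    print(save_old_order('org/base-model'))
    print(save_new_order('org/base-model'))
```

Whether the real `transform_model_and_tokenizer` reads or preserves `name_or_path` depends on the subclass that overrides it; the sketch only illustrates what the hook can observe under each ordering.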