How to convert torchtune .pt output files? #1201

troy256 · 2024-07-19T18:10:02Z

I did a fine-tuning run of llama3-8b using a Parquet dataset, and it produced a couple of checkpoint .pt files. Can these be converted into something like GGUF for use in other inference platforms?

Answered by ebsmothers

ebsmothers · 2024-07-25T14:43:33Z

ebsmothers
Jul 25, 2024
Collaborator

Hi @troy256, thanks for creating the discussion, and sorry for the delayed response here. Are you using the Meta checkpoint format or the HF checkpoint format in your fine-tuning run (i.e. which checkpointer is used in your config)? (The default for our Llama3-8B configs should be the Meta checkpointer, so if you didn't change anything it's probably that one.)

Currently it should be possible to convert HF-style checkpoints to GGUF, but this is something that's not yet well-tested (I am hoping we can add some kind of test for this soon). At least for HF-style checkpoints you can try following the process in this discussion. If you're using the Meta checkpoint you can try this as well, but I suspect it will be more challenging. I will look into changing the default checkpoint format for our Llama3-8B fine-tuning configs from Meta to HF (most of our other configs use HF format anyways).

Either way if you do try to follow the process outlined in the above discussion please let me know any errors you run into and we can help debug.

1 reply

troy256 Jul 25, 2024
Author

Looking at the output from the fine-tuning run, I am using the Meta checkpoint format:

_component_: torchtune.utils.FullModelMetaCheckpointer
checkpoint_dir: /data/llama3-8b-instruct/original
checkpoint_files:
- consolidated.00.pth
model_type: LLAMA3
output_dir: /data/llama3-8b-instruct

We are using Ollama for inference, so I wanted to import my fine-tuned llama3 model into it somehow. I know that if it's in GGUF format this should be easy to do. I did look into the llama.cpp project, but they replaced their convert.py script with a set of three specialized conversion scripts, and at first glance none seemed to be directly applicable. I am new to this, so apologies for my ignorance. Any guidance at all would be appreciated, and thanks for the reply.

ebsmothers · 2024-07-25T22:28:41Z

ebsmothers
Jul 25, 2024
Collaborator

@troy256 personally I would expect convert_hf_to_gguf.py to be a decent starting point (though I haven't tried it myself). But first you would need to convert from the Meta checkpoint format into the HF checkpoint format. There are basically two ways you can do this:

(1) Re-run your fine-tuning job, but replace the Meta checkpointer with the HF checkpointer. This isn't ideal since you have to run the fine-tuning job again, but the actual changes should be pretty straightforward. You just need to make sure you have the HF versions of the checkpoints downloaded; you should be able to run something like this to do it:

tune download meta-llama/Meta-Llama-3-8B-Instruct --output-dir /tmp/Meta-Llama-3-8B-Instruct \
--ignore-patterns "original/consolidated.00.pth"

Then change the checkpointer in your fine-tuning config from the Meta checkpointer to the HF checkpointer, so that it looks something like this:

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /tmp/Meta-Llama-3-8B-Instruct/
  checkpoint_files: [
    model-00001-of-00004.safetensors,
    model-00002-of-00004.safetensors,
    model-00003-of-00004.safetensors,
    model-00004-of-00004.safetensors
  ]
  recipe_checkpoint: null
  output_dir: /tmp/Meta-Llama-3-8B-Instruct/
  model_type: LLAMA3
resume_from_checkpoint: False
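
The shard names in checkpoint_files follow a fixed model-XXXXX-of-YYYYY.safetensors pattern, so the list can be generated instead of typed by hand. Here is a minimal sketch, assuming the download produced four shards as in the config above. The helper name hf_shard_names is made up for illustration and is not part of torchtune:

```python
def hf_shard_names(num_shards: int) -> list[str]:
    """Return HF-style shard names such as model-00001-of-00004.safetensors."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # Both the shard index and the shard count are zero-padded to five digits.
    return [
        f"model-{i:05d}-of-{num_shards:05d}.safetensors"
        for i in range(1, num_shards + 1)
    ]


if __name__ == "__main__":
    # Print one entry per line, in the order used by the YAML list.
    for name in hf_shard_names(4):
        print(name)
```

It is still worth comparing the result against the files actually present in checkpoint_dir, since the shard count depends on the model you downloaded.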

(2) Alternatively, you can manually remap the checkpoints you have saved already into HF format. This would basically be a two-hop conversion: you'd need to convert from Meta format to torchtune format, then from torchtune format to HF format. So it'd probably start like this:

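from torchtune.models.convert_weights import tune_to_hf, meta_to_tune
from torchtune.utils._checkpointing._checkpointer_utils import safe_torch_load

# Placeholder: point this at the .pt file written by your fine-tuning run
finetuned_checkpoint_path = "/path/to/your/finetuned_checkpoint.pt"

# Load the Meta-format weights, then remap the keys to torchtune's naming
meta_state_dict = safe_torch_load(finetuned_checkpoint_path)
tune_state_dict = meta_to_tune(meta_state_dict)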

To convert to HF format you can even use the checkpointer directly. You should run the tune download command from (1) first to make things easier. Then:

from torchtune.utils._checkpointing import FullModelHFCheckpointer

checkpointer = FullModelHFCheckpointer(
    checkpoint_dir='/tmp/Meta-Llama-3-8B-Instruct/',
    # ... the remaining args should match the checkpointer config from (1)
)

n = 1  # placeholder: however many epochs you fine-tuned
checkpointer.save_checkpoint(
    state_dict={"model": tune_state_dict},
    epoch=n,
)

This should save HF safetensors files to the checkpointer's output_dir containing the remapped weights of your fine-tuned checkpoint.
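
Whichever option you use, it can help to look at what actually landed in the checkpointer's output_dir before converting. Here is a minimal sketch using only the standard library. The function name and the set of file extensions it looks for are assumptions chosen for illustration, not anything torchtune or llama.cpp defines:

```python
from collections import defaultdict
from pathlib import Path

# Assumed set of extensions that weight files might use; adjust as needed.
WEIGHT_SUFFIXES = (".safetensors", ".bin", ".pt", ".pth")


def summarize_checkpoint_dir(path):
    """Report whether config.json exists and list weight files by extension."""
    root = Path(path)
    weights = defaultdict(list)
    for entry in sorted(root.iterdir()):
        if entry.is_file() and entry.suffix in WEIGHT_SUFFIXES:
            weights[entry.suffix].append(entry.name)
    return {
        "has_config": (root / "config.json").is_file(),
        "weights": dict(weights),
    }
```

Calling summarize_checkpoint_dir("/tmp/Meta-Llama-3-8B-Instruct/") returns a small dict you can print. If has_config is False, or the weight files are not in the format you expected, that is worth sorting out before running the converter.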

Once you've done one of these two you should be able to try out convert_hf_to_gguf.py for the actual conversion. This will rely on the HF config being available, but you should already have that from your tune download invocation. But again, if you see something weird here do let me know and we can help walk through it.
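
The conversion step itself can be scripted too. Here is a minimal sketch, assuming a local clone of llama.cpp and that its convert_hf_to_gguf.py takes the model directory as a positional argument along with --outfile and --outtype. The paths are placeholders and the helper name is made up, so check the flags against the script's --help for the version you have:

```python
import shlex
import sys
from pathlib import Path


def build_gguf_command(llama_cpp_dir, hf_model_dir, outfile, outtype="f16"):
    """Build the argv list for llama.cpp's convert_hf_to_gguf.py (not executed here)."""
    script = Path(llama_cpp_dir) / "convert_hf_to_gguf.py"
    return [
        sys.executable,      # run the script with the current Python interpreter
        str(script),
        str(hf_model_dir),   # directory holding config.json and the HF weights
        "--outfile", str(outfile),
        "--outtype", outtype,
    ]


if __name__ == "__main__":
    cmd = build_gguf_command(
        "/path/to/llama.cpp",              # placeholder: your llama.cpp clone
        "/tmp/Meta-Llama-3-8B-Instruct/",  # HF-format checkpoint directory
        "/tmp/llama3-8b-finetuned.gguf",   # placeholder output file
    )
    # Print the command for review; pass cmd to subprocess.run(cmd, check=True) to run it.
    print(shlex.join(cmd))
```

Printing the command first makes it easy to confirm that the model directory is the one holding your fine-tuned weights rather than only the originally downloaded ones.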

1 reply

troy256 Jul 26, 2024
Author

Thanks to your help, I was able to use option (1) to generate HF-compatible checkpoints from fine-tuning and then use convert_hf_to_gguf.py to convert them to GGUF format.

I am experiencing a different issue when testing which I will start a separate discussion on.
