
Get flux working with MPS on torch 2.4.1, with GGUF support #7113

Open — wants to merge 1 commit into main
Conversation

Vargol (Contributor) commented Oct 13, 2024

Summary

Builds on top of my last pull request, #7063, to get a subset of Flux working with Torch 2.4.1 so new installs can use it. With these changes, fp16 Flux and GGUF-quantised Flux will work.
bitsandbytes-quantised models will still not work.

Related Issues / Discussions

#7060
https://discord.com/channels/1020123559063990373/1294664228757569587

QA Instructions

It will need testing on non-macOS setups. It has been tested locally on macOS using Flux Q8 and Q5.1 models,
and the changes are all wrapped in "if mps" checks, so they shouldn't have broken anything.

Merge Plan

It's pretty straightforward to merge. Note that I ran ruff against the directories these files are in, but it made changes
to other code and other files, so I'm assuming ruff doesn't normally run against these. I have copied my code from the ruff-formatted file to use in the repo.

github-actions bot added the labels python (PRs that change python files) and backend (PRs that change backend files) on Oct 13, 2024
psychedelicious (Collaborator) commented:

Thank you!

brandonrising (Collaborator) commented:

What kind of Mac do you have? I'm unable to run this without OOMing on an M3 Max with 32 GB of RAM, so I'm having trouble testing.

Vargol (Contributor, Author) commented Oct 16, 2024

24 GB M3 iMac. Have you set PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0?
The T5 encoder is using a stupid amount of memory on macOS for some reason.
I've hacked GGUF loading for T5, and that seems to bring it down to something sensible when using the fp16 T5, but the properly quantised models aren't working, for reasons I do not understand.
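For a sense of why the T5 encoder dominates memory here, some back-of-the-envelope math. This is a rough sketch assuming the ~4.7 billion parameter T5-XXL encoder that Flux uses; the exact parameter count may differ slightly, and it ignores activations and framework overhead:

```python
# Rough weight-memory estimate for the T5-XXL text encoder (assumed ~4.7e9 params).
params = 4.7e9
fp32_gb = params * 4 / 1e9  # 4 bytes per float32 weight
bf16_gb = params * 2 / 1e9  # 2 bytes per bfloat16 weight
print(f"fp32: ~{fp32_gb:.1f} GB, bf16: ~{bf16_gb:.1f} GB")
# fp32: ~18.8 GB, bf16: ~9.4 GB
```

On a 24 GB machine, an fp32-loaded T5 alone nearly exhausts unified memory, which is consistent with the OOM behaviour described above.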

Vargol (Contributor, Author) commented Oct 16, 2024

If you're feeling experimental, try editing

invokeai/backend/model_manager/load/model_loaders/flux.py

to force it to load the T5 model as bfloat16 by adding torch_dtype=torch.bfloat16 to the from_pretrained call; see the comment in the snippet below:

```python
@ModelLoaderRegistry.register(base=BaseModelType.Any, type=ModelType.T5Encoder, format=ModelFormat.T5Encoder)
class T5EncoderCheckpointModel(ModelLoader):
    """Class to load main models."""

    def _load_model(
        self,
        config: AnyModelConfig,
        submodel_type: Optional[SubModelType] = None,
    ) -> AnyModel:
        if not isinstance(config, T5EncoderConfig):
            raise ValueError("Only T5EncoderConfig models are currently supported here.")

        match submodel_type:
            case SubModelType.Tokenizer2:
                return T5Tokenizer.from_pretrained(Path(config.path) / "tokenizer_2", max_length=512)
            case SubModelType.TextEncoder2:
                return T5EncoderModel.from_pretrained(Path(config.path) / "text_encoder_2", torch_dtype=torch.bfloat16)   # <------ HERE
        raise ValueError(
            f"Only Tokenizer and TextEncoder submodels are currently supported. Received: {submodel_type.value if submodel_type else 'None'}"
        )
```

I suspected it was loading as float32, so I thought I'd try and force its hand.
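The effect of torch_dtype=torch.bfloat16 can be sketched without downloading the model: from_pretrained casts the loaded weights to the requested dtype, which is the same as what an explicit .to(dtype=...) does on a module. A minimal stand-in (a plain nn.Linear, not the InvokeAI loader):

```python
import torch

# PyTorch modules allocate parameters as float32 by default.
linear = torch.nn.Linear(4, 4)
assert next(linear.parameters()).dtype == torch.float32

# Casting to bfloat16 halves parameter memory, which is what
# torch_dtype=torch.bfloat16 achieves at load time in from_pretrained.
linear = linear.to(dtype=torch.bfloat16)
assert next(linear.parameters()).dtype == torch.bfloat16
```

A quick way to confirm what dtype a loaded encoder actually ended up with is to inspect next(model.parameters()).dtype after loading.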

brandonrising (Collaborator) commented:

So I was able to get it working on my Mac by setting the dtype you suggested on the T5 encoder, and also adding the env variable PYTORCH_ENABLE_MPS_FALLBACK=1, since some of the dequantize methods apparently don't work on MPS. Really surprised that Macs are able to do this — great find!
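Putting the two environment variables from this thread together, a launch might look like the following. This is a sketch: the invokeai-web entry point is assumed to be how you start the server, and both variables must be set before PyTorch initialises:

```shell
# Disable the MPS allocator's high-watermark cap (0.0 = no limit),
# as suggested earlier in this thread for the memory-hungry T5 encoder.
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0

# Fall back to CPU for ops the MPS backend doesn't implement
# (e.g. some of the GGUF dequantize paths mentioned above).
export PYTORCH_ENABLE_MPS_FALLBACK=1

# Then launch InvokeAI as usual, e.g.:
# invokeai-web
```

Note that PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 removes a safety limit, so heavy swapping is possible if the models genuinely exceed unified memory.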

3 participants