
Get flux working with MPS on torch 2.4.1, with GGUF support #7113

Open — wants to merge 1 commit into main
Conversation

Vargol (Contributor) commented Oct 13, 2024

Summary

Builds on top of my last pull request, #7063, to get a subset of Flux working with Torch 2.4.1 so new installs can use it. With these changes, fp16 Flux and GGUF-quantised Flux will work.
bitsandbytes-quantised models will still not work.

Related Issues / Discussions

#7060
https://discord.com/channels/1020123559063990373/1294664228757569587

QA Instructions

It will need testing on non-macOS setups. It has been tested locally on macOS using Flux Q8 and Q5.1 models,
and the changes are all wrapped in "if mps" checks, so they shouldn't have broken anything.

Merge Plan

It's pretty straightforward to merge. Note that I ran ruff against the directories these files are in, but it made changes
to other code and other files, so I'm assuming ruff doesn't normally run against these. I have copied my code from the ruff-formatted file to use in the repo.

github-actions bot added the labels python (PRs that change python files) and backend (PRs that change backend files) on Oct 13, 2024
psychedelicious (Collaborator) commented:

Thank you!

brandonrising (Collaborator) commented:

What kind of Mac do you have? I'm unable to run this without OOMing on an M3 Max with 32 GB of RAM, so I'm having trouble testing.

Vargol (Contributor, Author) commented Oct 16, 2024

24 GB M3 iMac. Have you set PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0?
The T5 encoder is using a stupid amount of memory on macOS for some reason.
I've hacked GGUF loading for T5, and that seems to bring it down to something sensible when using the fp16 T5, but the properly quantised models aren't working, for reasons I do not understand.
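For a sense of why the T5 encoder dominates memory here, some back-of-the-envelope math. This is a rough sketch assuming the ~4.7 billion parameter T5-XXL encoder that Flux uses; the exact parameter count may differ slightly, and it ignores activations and framework overhead:

```python
# Rough weight-memory estimate for the T5-XXL text encoder (assumed ~4.7e9 params).
params = 4.7e9
fp32_gb = params * 4 / 1e9  # 4 bytes per float32 weight
bf16_gb = params * 2 / 1e9  # 2 bytes per bfloat16 weight
print(f"fp32: ~{fp32_gb:.1f} GB, bf16: ~{bf16_gb:.1f} GB")
# fp32: ~18.8 GB, bf16: ~9.4 GB
```

On a 24 GB machine, an fp32-loaded T5 alone nearly exhausts unified memory, which is consistent with the OOM behaviour described above.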

Vargol (Contributor, Author) commented Oct 16, 2024

If you're feeling experimental, try editing

invokeai/backend/model_manager/load/model_loaders/flux.py

to force it to load the T5 model as bfloat16 by adding torch_dtype=torch.bfloat16 to the from_pretrained call; see the comment in the snippet below:

```python
@ModelLoaderRegistry.register(base=BaseModelType.Any, type=ModelType.T5Encoder, format=ModelFormat.T5Encoder)
class T5EncoderCheckpointModel(ModelLoader):
    """Class to load main models."""

    def _load_model(
        self,
        config: AnyModelConfig,
        submodel_type: Optional[SubModelType] = None,
    ) -> AnyModel:
        if not isinstance(config, T5EncoderConfig):
            raise ValueError("Only T5EncoderConfig models are currently supported here.")

        match submodel_type:
            case SubModelType.Tokenizer2:
                return T5Tokenizer.from_pretrained(Path(config.path) / "tokenizer_2", max_length=512)
            case SubModelType.TextEncoder2:
                return T5EncoderModel.from_pretrained(Path(config.path) / "text_encoder_2", torch_dtype=torch.bfloat16)   # <------ HERE
        raise ValueError(
            f"Only Tokenizer and TextEncoder submodels are currently supported. Received: {submodel_type.value if submodel_type else 'None'}"
        )
```

I suspected it was loading as float32, so I thought I'd try and force its hand.
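The effect of torch_dtype=torch.bfloat16 can be sketched without downloading the model: from_pretrained casts the loaded weights to the requested dtype, which is the same as what an explicit .to(dtype=...) does on a module. A minimal stand-in (a plain nn.Linear, not the InvokeAI loader):

```python
import torch

# PyTorch modules allocate parameters as float32 by default.
linear = torch.nn.Linear(4, 4)
assert next(linear.parameters()).dtype == torch.float32

# Casting to bfloat16 halves parameter memory, which is what
# torch_dtype=torch.bfloat16 achieves at load time in from_pretrained.
linear = linear.to(dtype=torch.bfloat16)
assert next(linear.parameters()).dtype == torch.bfloat16
```

A quick way to confirm what dtype a loaded encoder actually ended up with is to inspect next(model.parameters()).dtype after loading.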

brandonrising (Collaborator) commented:

So I was able to get it working on my Mac by setting the dtype you suggested on the T5 encoder, and also adding the env variable PYTORCH_ENABLE_MPS_FALLBACK=1, since some of the dequantize methods apparently don't work on MPS. Really surprised that Macs are able to do this — great find!
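Putting the two environment variables from this thread together, a launch might look like the following. This is a sketch: the invokeai-web entry point is assumed to be how you start the server, and both variables must be set before PyTorch initialises:

```shell
# Disable the MPS allocator's high-watermark cap (0.0 = no limit),
# as suggested earlier in this thread for the memory-hungry T5 encoder.
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0

# Fall back to CPU for ops the MPS backend doesn't implement
# (e.g. some of the GGUF dequantize paths mentioned above).
export PYTORCH_ENABLE_MPS_FALLBACK=1

# Then launch InvokeAI as usual, e.g.:
# invokeai-web
```

Note that PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 removes a safety limit, so heavy swapping is possible if the models genuinely exceed unified memory.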

3 participants