Code With Strange Logic in CogVideoX's Dynamic CFG #9641
Labels: bug (Something isn't working)
Comments
immortalCO changed the title from "Strange Behavior in CogVideoX's Dynamic CFG" to "Code With Strange Logic in CogVideoX's Dynamic CFG" on Oct 11, 2024
Hey, thanks for reporting! We've come across this issue as well. This comes from maintaining 1:1 implementations with the original CogVideo code base. See this and this. I think @yiyixuxu was looking into this. I think what you mention is correct and creates the intended cosine guidance schedule. cc @zRzRzRzRzRzRzR as well for verifying this.
Describe the bug
As shown at pipeline_cogvideox_image2video.py L778, pipeline_cogvideox_video2video.py L778, and pipeline_cogvideox.py L697, the dynamic CFG is calculated in this way:
However:
- `num_inference_steps` is the number of denoising steps, which defaults to 50.
- `t.item()` is the denoising timestep, which ranges from 1 to 999.
- `(num_inference_steps - t.item()) / num_inference_steps` therefore does not run from 1 to 0; it goes negative very quickly. After `** 5.0`, the `math.cos` term fluctuates severely.

I wonder: is this really the desired behavior of the CogVideoX pipeline? Shouldn't it be one of the following:
Both implementations will make the dynamic CFG like a cosine annealing.
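As a minimal sketch of what "like a cosine annealing" requires, the cosine argument has to be driven by a ratio that stays inside [0, 1]. The two normalizations below are assumptions for illustration, not necessarily the exact alternatives proposed: one uses the denoising loop index, the other rescales the scheduler timestep by an assumed training horizon of 1000. The helper names are hypothetical, and the `(1 - cos(pi * x ** 5)) / 2` shape is inferred from the fragments quoted in this issue. Which direction the ramp should run is not settled here; the sketch only shows that either ratio keeps the cosine argument within [0, pi].

```python
import math


def cosine_ramp(x: float, power: float = 5.0) -> float:
    """Map x in [0, 1] to [0, 1] via (1 - cos(pi * x**power)) / 2."""
    if not 0.0 <= x <= 1.0:
        raise ValueError(f"x must be in [0, 1], got {x}")
    return (1.0 - math.cos(math.pi * x ** power)) / 2.0


def ratio_from_step(i: int, num_inference_steps: int) -> float:
    """Assumed option A: use the loop index i = 0 .. num_inference_steps - 1."""
    return (num_inference_steps - i) / num_inference_steps


def ratio_from_timestep(t: float, num_train_timesteps: int = 1000) -> float:
    """Assumed option B: rescale the timestep by the training horizon."""
    return (num_train_timesteps - t) / num_train_timesteps


# Print a few ramp values for a 50-step run using the loop index.
for i in range(0, 50, 10):
    print(i, cosine_ramp(ratio_from_step(i, 50)))
```

With either ratio the ramp is a smooth, monotonic function of the step instead of an oscillation, because `x ** power` never leaves [0, 1].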
Also, I think `1 + guidance_scale * (...)` here should be `1 + (guidance_scale - 1) * (...)`; otherwise its value ranges over 1 ~ 1 + CFG instead of 1 ~ CFG.

Please check it and fix it if it is really a bug, thank you very much.
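The range argument can be checked with a small sketch. It assumes the parenthesized term `(...)` is a ramp that takes values in [0, 1]; the function name and the `subtract_one` flag are hypothetical and exist only to compare the two forms side by side.

```python
def scaled_cfg(ramp: float, guidance_scale: float, subtract_one: bool) -> float:
    """Scale a ramp value in [0, 1] into a guidance value.

    subtract_one=False: 1 + guidance_scale * ramp        -> range [1, 1 + guidance_scale]
    subtract_one=True:  1 + (guidance_scale - 1) * ramp  -> range [1, guidance_scale]
    """
    amplitude = guidance_scale - 1 if subtract_one else guidance_scale
    return 1 + amplitude * ramp


# Compare both forms at the ends of the ramp for guidance_scale=4.
for ramp in (0.0, 1.0):
    print(ramp, scaled_cfg(ramp, 4.0, False), scaled_cfg(ramp, 4.0, True))
```

Both forms start at 1 when the ramp is 0; they differ only in the maximum, which is `1 + guidance_scale` for the first form and `guidance_scale` for the second.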
Reproduction
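A minimal standalone sketch of the behavior described above, needing no model weights. It assumes the expression has the form `1 + guidance_scale * (1 - cos(pi * ratio ** 5.0)) / 2`, pieced together from the fragments quoted in this issue, and it uses hypothetical evenly spaced timesteps rather than a real scheduler, so it does not attempt to reproduce the logged numbers exactly.

```python
import math


def dynamic_cfg_as_described(t: float, num_inference_steps: int, guidance_scale: float) -> float:
    """Evaluate the dynamic CFG expression with the raw timestep t (assumed form)."""
    ratio = (num_inference_steps - t) / num_inference_steps
    return 1 + guidance_scale * (1 - math.cos(math.pi * ratio ** 5.0)) / 2


num_inference_steps = 50
guidance_scale = 4.0

# Hypothetical timesteps 999, 979, ..., 19; a real scheduler may space them differently.
timesteps = [999 - 20 * k for k in range(num_inference_steps)]

for step, t in enumerate(timesteps, start=1):
    ratio = (num_inference_steps - t) / num_inference_steps
    cfg = dynamic_cfg_as_described(t, num_inference_steps, guidance_scale)
    print(f"Denoising {step}/{num_inference_steps}: t = {t}, ratio = {ratio:.2f}, cfg = {cfg}")
```

The printed ratio is negative for every timestep above `num_inference_steps`, so raising it to the fifth power produces a very large cosine argument and the guidance value jumps around from step to step.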
Logs
    # In the following setting, `guidance_scale=4` is passed.
    10/11/2024 01:36:51 - INFO - root - Denoising 1/50: cfg = 1.645743587275726
    10/11/2024 01:36:55 - INFO - root - Denoising 2/50: cfg = 1.7717514333823159
    10/11/2024 01:36:59 - INFO - root - Denoising 3/50: cfg = 3.9871759160877414
    10/11/2024 01:37:04 - INFO - root - Denoising 4/50: cfg = 3.7101792115724193
    10/11/2024 01:37:08 - INFO - root - Denoising 5/50: cfg = 1.8940487645793973
    10/11/2024 01:37:13 - INFO - root - Denoising 6/50: cfg = 2.635970965321337
    10/11/2024 01:37:17 - INFO - root - Denoising 7/50: cfg = 1.0187988588703782
    10/11/2024 01:37:22 - INFO - root - Denoising 8/50: cfg = 2.5852000899340863
    10/11/2024 01:37:26 - INFO - root - Denoising 9/50: cfg = 1.3089873683577653
    10/11/2024 01:37:31 - INFO - root - Denoising 10/50: cfg = 3.9915635934173324
    10/11/2024 01:37:35 - INFO - root - Denoising 11/50: cfg = 1.0023944806862168
    10/11/2024 01:37:40 - INFO - root - Denoising 12/50: cfg = 1.935990650841663
    10/11/2024 01:37:44 - INFO - root - Denoising 13/50: cfg = 3.9884025377098555
System Info
This bug is in the code itself and is independent of the system.
Who can help?
@DN6 @a-r-r-o-w @zRzRzRzRzRzRzR