Configuration to reproduce results from the paper #14

Open
SimonGiebenhain opened this issue Apr 7, 2024 · 5 comments
@SimonGiebenhain

Dear authors,

thank you for releasing the code so quickly, and for the easy-to-use repositories.

After having run the method on a few identities, I have several questions:

  1. Are the released hyperparameters the ones you suggest using? I noticed that the paper specifies a coarse resolution of 512x512, while the code uses 256x256.
  2. After completing the "MeshHead" stage and loading the "latest" checkpoint, the loaded MLPs immediately produce NaN values, e.g. the rotation field of the Gaussian point cloud holds NaNs. Have you ever experienced something similar? I could run the training by manually specifying that the checkpoint be loaded from meshhead_epoch_20. What do you suggest I do?
  3. How long does the "GaussianHead" stage roughly take? Which checkpoint should I use for evaluation?
  4. I am noticing a significant performance drop for strong expressions in the self-reenactment task. For this I simply adapted your cross-reenactment code and pointed the dataroot at the tracking result of an unseen sequence of the same person. Is this expected, or did I do something wrong?

Kind regards,
Simon

@YuelangX
Owner

YuelangX commented Apr 7, 2024

Hello Simon!

  1. Yes, the released hyperparameters are the suggested ones, and the code should reproduce all the results in the paper. The code uses 256x256 for training the mesh head model, and 512x512 (coarse) / 2048x2048 (fine) for training the Gaussian head model.

  2. I've never experienced NaN values. Typically, for a dataset containing around 3000 frames, I train the mesh head model for only 5-10 epochs. In fact, I found experimentally that training for 5 epochs is enough; additional training has almost no impact on the final results.

  3. I suggest training the Gaussian head model for 600,000 iterations (over 2 days on an RTX 4090) or more. The number of iterations required for convergence differs between identities, so to be safe you can train longer, until the loss no longer decreases. For evaluation, the latest checkpoint is suggested.

  4. It is expected. The avatar cannot recover strong expressions that never appear in the training set. In other words, for any expression to be recoverable, its BFM expression coefficients should be approximable as an interpolation of the BFM expression coefficients in the training set.
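A hedged way to sanity-check point 4 (my own diagnostic sketch, not part of the repo; `reconstruction_residual`, the toy data, and the coefficient dimensions are all hypothetical) is to measure how well a test frame's expression coefficients can be expressed in terms of the training set's:

```python
import numpy as np

# Hypothetical diagnostic: how well can a test frame's BFM expression
# coefficients be reconstructed as a linear combination of the training
# coefficients? A large residual means the expression lies outside the
# span of the training set, so the avatar is unlikely to recover it.
# (Linear span is a looser, necessary condition than the convex
# interpolation described above.)
def reconstruction_residual(train_exprs, test_expr):
    # train_exprs: (N, D) training coefficients; test_expr: (D,)
    weights, *_ = np.linalg.lstsq(train_exprs.T, test_expr, rcond=None)
    return float(np.linalg.norm(train_exprs.T @ weights - test_expr))

rng = np.random.default_rng(0)
train = rng.normal(size=(30, 64))         # toy stand-in for tracked coefficients
inside = 0.3 * train[0] + 0.7 * train[1]  # interpolation of two training frames
outside = 10.0 * rng.normal(size=64)      # exaggerated unseen expression

print(reconstruction_residual(train, inside))   # ~0: representable
print(reconstruction_residual(train, outside))  # large: likely to fail
```

Frames whose residual is large relative to typical training frames are the ones where a quality drop would be expected.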

@SimonGiebenhain
Author

Hello Yuelang,

thanks for your answer. The information helped a lot.

By now I am getting great results with your method.

However, I haven't succeeded in getting good self-reenactment results (cross-reenactment works great).
Can you provide instructions on how to proceed with self-reenactment?
Currently I am doing the following:

  1. Run the preprocessing script a second time. This time exclude all sequences except for the free sequence.
  2. Run the tracking on that folder (I guess I should ideally share the id code)
  3. Use the reenactment script, where dataset:dataroot points to the tracked FREE folder

Unfortunately, the results look terrible (the color of the head can change dramatically and sometimes turns almost black).

It would be very helpful if you could provide instructions.

Thanks in advance

@YuelangX
Owner

Hello Simon!
Preprocessing the training data and the evaluation data separately leads to different expression coefficient distributions and two different id coefficients, which greatly affects the results. I suggest putting the FREE sequence at the end of the training data, preprocessing them together, and then moving the FREE sequence out. This should give you decent results.
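The suggested workflow can be sketched as follows (the directory names are hypothetical and created here only for illustration; the repo's actual preprocessing and tracking commands are omitted):

```shell
set -e
# Hypothetical layout; adjust the names to your dataset.
mkdir -p data/id1_train data/id1_eval/FREE
# 1. Put the FREE sequence alongside the training sequences...
mv data/id1_eval/FREE data/id1_train/FREE
# 2. ...run the repo's preprocessing and tracking on data/id1_train as one
#    dataset, so the id code and the expression coefficient distribution are
#    shared between training and evaluation (commands omitted)...
# 3. ...then move the tracked FREE sequence back out, for evaluation only.
mv data/id1_train/FREE data/id1_eval/FREE
```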

@MuQoe

MuQoe commented Apr 17, 2024

I saw that in train_gaussian the trainer is written as 0 to 1000. Do I need to change it to 0 to 600000?

@jeb0813

jeb0813 commented Apr 18, 2024

Hi @MuQoe. You can convert iterations to epochs according to the size of the dataset.
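For example (the frame count and batch size below are illustrative, not taken from the repo's config):

```python
# Sketch of the conversion jeb0813 describes: with ~3000 frames and batch
# size 1, one epoch is ~3000 iterations, so 600,000 iterations is roughly
# 200 epochs.
def iters_to_epochs(total_iters, num_frames, batch_size=1):
    iters_per_epoch = num_frames // batch_size
    return total_iters / iters_per_epoch

print(iters_to_epochs(600_000, 3_000))  # → 200.0
```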
