Some possible bugs and questions #36

Open
zhoutianyang2002 opened this issue Jul 12, 2024 · 4 comments
@zhoutianyang2002

Hi!
Thank you for your excellent work!
While reading your code to learn how to implement a 3DGS experiment, I found some possible bugs:

  1. In GaussianHeadModule.py, we want to normalize the quaternions here. However, the default dim of F.normalize is 1, not 2, and the rotation tensor has shape torch.Size([B,N,4]), so maybe we need dim=2 to normalize each quaternion (a standalone check is sketched after this list).
delta_rotation = delta_attributes[:, :, 3:7] # torch.Size([B,N,4])
rotation = self.rotation.unsqueeze(0).repeat(B, 1, 1) + delta_rotation * self.attributes_scale
rotation = torch.nn.functional.normalize(rotation) 
# maybe should change it to: rotation = torch.nn.functional.normalize(rotation, dim=2)
  2. In the paper, in formula (7), the Gaussian scale attribute does not need to change. However, in GaussianHeadModule.py, we change the Gaussian scale attribute according to the data term S. Is that a mistake? Which version is correct, the paper or the code?
if 'pose' in data:
    ...
     scales = scales * S 
  3. In MeshHeadModule.py, the output of geo_mlp already goes through a tanh activation. However, when calculating the vertex deformation, tanh is applied again. Is this a repetition?
self.geo_mlp = MLP(cfg.geo_mlp, last_op=nn.Tanh()) # (-1,1)
def deform(self, data):
    ...
    pred = self.geometry(geo_input) # (1,132,424)
    sdf, deform = pred[:, :1, :], pred[:, 1:4, :]
    query_pts = (query_pts + torch.tanh(deform).permute(0, 2, 1) / self.grid_res) # (1,424,3)+(1,424,3)
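For point 1, here is a standalone check (my own illustration, not code from the repo) showing why the default dim matters for a (B, N, 4) tensor:

import torch
import torch.nn.functional as F

rotation = torch.randn(2, 5, 4)  # (B, N, 4) quaternions, random just for this demo

# F.normalize defaults to dim=1, i.e. it normalizes across the N axis,
# mixing different Gaussians instead of normalizing each 4-vector.
wrong = F.normalize(rotation)
print(wrong.norm(dim=2))  # per-quaternion norms are not 1

# Normalizing along the last axis yields unit quaternions.
right = F.normalize(rotation, dim=2)  # dim=-1 is equivalent here
print(right.norm(dim=2))  # all ones (up to float precision)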

Besides, may I ask two questions about the code?

  1. In CameraModule.py, why did you comment out this line of code? In other words, why do we not need to change face_vertices_camera?
# Possibly converting from the OpenGL coordinate system (x right, y up, z out) to the OpenCV coordinate system (x right, y down, z in)? Or the other way around? Both should be right-handed?
face_vertices_image[:, :, :, 1] = -face_vertices_image[:, :, :, 1]
# face_vertices_camera[:, :, :, 1:] = -face_vertices_camera[:, :, :, 1:]  # the original author's commented-out line
face_normals[:, :, 1:] = -face_normals[:, :, 1:]
  2. What is the difference between visible and mask? Why do we not need to use visible and mask in reenactment?

Sorry to bother you. Thank you very much!

@YuelangX
Owner

  1. Thanks. It's a mistake.

  2. It is assumed S=1 in the paper. The code is right.

  3. The second torch.tanh() should be deleted.

  4. Just because face_vertices_camera[:, :, :, 1] is not used later.

  5. I only calculate loss for the pixels where visible > 0. Mask is used to supervise the mesh geometry.
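In code, the fixes for points 1 and 3 (plus an illustration of point 5) would look roughly like this; a sketch only, not a committed patch:

# 1. GaussianHeadModule.py: normalize each quaternion along the last axis
rotation = torch.nn.functional.normalize(rotation, dim=2)

# 3. MeshHeadModule.py: geo_mlp already ends with nn.Tanh(), so the extra tanh is dropped
query_pts = query_pts + deform.permute(0, 2, 1) / self.grid_res

# 5. Illustration only: compute the photometric loss on visible pixels;
# 'render', 'image' and 'visible' are hypothetical names with compatible shapes
loss_rgb = (render - image).abs()[visible > 0].mean()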

@zhoutianyang2002
Author


Thank you very much for your reply! May I ask another question? Since we already calculate the vertex deformation with pose_deform_mlp, using the pose as input, why do we need to transform the vertices from canonical space to pose space? In other words, what is the difference between the offset calculated by pose_deform_mlp and the pose transformation in the code below? Thank you very much!

# in MeshHeadModule.py
if 'pose' in data: 
    R = so3_exponential_map(data['pose'][:, :3]) # (1,3,3)
    T = data['pose'][:, None, 3:] # (1,1,3)
    S = data['scale'][:, :, None] # (1,1,1)
    verts_batch = torch.bmm(verts_batch * S, R.permute(0, 2, 1)) + T

@YuelangX
Owner

YuelangX commented Jul 14, 2024

pose_deform_mlp predicts the offsets of the non-face points in canonical space.
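To make the distinction concrete, the two steps compose rather than overlap: the MLP offset is applied while the mesh is still in canonical space, and the similarity transform then moves the deformed mesh into pose space. Roughly (illustrative names, not the exact repo code):

# 1) learned, pose-conditioned offsets for the non-face points, in canonical space
verts_canonical = verts_canonical + pose_deform_mlp(pose_feature)  # hypothetical call

# 2) global similarity transform from canonical space to pose space
R = so3_exponential_map(data['pose'][:, :3])   # (B, 3, 3)
T = data['pose'][:, None, 3:]                  # (B, 1, 3)
S = data['scale'][:, :, None]                  # (B, 1, 1)
verts_posed = torch.bmm(verts_canonical * S, R.permute(0, 2, 1)) + T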

@zhoutianyang2002
Author


I understand now. Thank you very much! Best wishes!
