
Convention of the World Frame for Rerun Visualization #9

Closed
Nik-V9 opened this issue Jul 15, 2024 · 5 comments
Nik-V9 commented Jul 15, 2024

Hi, thanks for releasing the code!

I find that the world-frame coordinate convention changes across different sets of input images.

In DUSt3R, using an RDF convention (X right, Y down, Z forward) seemed to always give a consistent visualization, with the point cloud oriented correctly in the viewer at initialization. This is exactly what the Rerun visualizer in Mini-DUSt3R does: https://github.com/pablovela5620/mini-dust3r/blob/b3f2ec7c829f4ae2ba46f603a19fd2f9107f47cb/mini_dust3r/api/inference.py#L36

However, in MASt3R, to get the point cloud oriented correctly in the viewer at initialization, I observe that the convention has to change for different sets of input images. Meanwhile, the local camera convention is unchanged (still RDF), and I don't have this issue when I transform everything relative to the first frame (the point cloud and other visualizations come out oriented correctly).

I was wondering whether this behavior is related to the changed global BA procedure between DUSt3R and MASt3R. Is it caused by the avg-angle canonical-view transforms performed in sparse_ga?

yocabon (Contributor) commented Jul 15, 2024

Hi,
it's probably because poses are not initialized to a good guess in the sparse GA, unlike the global alignment used in DUSt3R. All poses start from identity, so the scene may end up in an arbitrary orientation. As you say, it's fine if you transform everything relative to the first frame.
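Not part of the original reply, but a quick NumPy sketch of why re-expressing everything in cam0 space fixes the arbitrary orientation: any global rigid transform G applied to all camera-to-world poses cancels out of the cam0-relative poses, since inv(G @ c2w0) @ (G @ c2wi) = inv(c2w0) @ c2wi. The helper names here are illustrative, not from the MASt3R codebase.

```python
import numpy as np

def random_pose(rng):
    # Random rigid 4x4 camera-to-world matrix (QR gives an orthonormal basis).
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    q *= np.linalg.det(q)  # force a proper rotation (det = +1)
    pose = np.eye(4)
    pose[:3, :3] = q
    pose[:3, 3] = rng.standard_normal(3)
    return pose

rng = np.random.default_rng(0)
cams2world = [random_pose(rng) for _ in range(4)]
G = random_pose(rng)  # an arbitrary global orientation of the world frame

# cam0-relative poses, with and without the global transform applied
rel = [np.linalg.inv(cams2world[0]) @ c for c in cams2world]
rel_g = [np.linalg.inv(G @ cams2world[0]) @ (G @ c) for c in cams2world]

assert np.allclose(rel, rel_g)         # G cancels out of the relative poses
assert np.allclose(rel[0], np.eye(4))  # and cam0 itself lands at identity
```

So whatever orientation the sparse GA converges to, the cam0-relative poses are the same, which is why the visualization becomes consistent.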

Nik-V9 (Author) commented Jul 15, 2024

Ah, I see. Thanks for the clarification! The relative transform works fine; I'll just use that instead.

Nik-V9 closed this as completed Jul 15, 2024
yt2639 commented Jul 24, 2024

Hi @yocabon and @Nik-V9, could you elaborate on how to "transform everything relative to the first frame"?

I noticed that the demo has this line:

scene.apply_transform(np.linalg.inv(cams2world[0] @ OPENGL @ rot))

I assume you are talking about this? If so, how exactly do we do it? I did look at trimesh's documentation and code, but it's too complicated for me to parse what exactly this line does.

I am now trying to export poses estimated by MASt3R to a NeRFStudio-compatible format. If I directly export the poses following the discussion in this issue in DUSt3R, the poses are completely off (I also created a new issue in DUSt3R). So I am wondering whether I should transform the poses of the other images to be relative to the first one before saving/exporting the poses to NeRFStudio.

I appreciate very much your help. Thanks!

yocabon (Contributor) commented Jul 24, 2024

"Transform everything relative to the first frame" means expressing all poses in cam0 space instead of world space:

import numpy as np

cams2world = scene.get_im_poses().cpu().numpy()  # (N, 4, 4) cam_i-to-world matrices
world_to_c0 = np.linalg.inv(cams2world[0])  # world to cam0

cams_to_c0 = []
for i in range(len(cams2world)):
    ci_to_world = cams2world[i]  # cam_i to world
    ci_to_c0 = world_to_c0 @ ci_to_world  # cam_i to cam0
    cams_to_c0.append(ci_to_c0)
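Not part of the original reply, but for the NeRFStudio export question above: a hedged sketch assuming DUSt3R/MASt3R poses use the OpenCV/RDF camera convention (X right, Y down, Z forward), while NeRFStudio's transforms.json expects OpenGL-style camera-to-world matrices (X right, Y up, Z back). The usual conversion negates the y and z camera axes; `opencv_to_opengl` is an illustrative helper name, not a MASt3R function.

```python
import numpy as np

def opencv_to_opengl(c2w):
    """Flip the y and z camera axes of an OpenCV-convention
    camera-to-world matrix to get the OpenGL convention."""
    c2w = np.array(c2w, dtype=np.float64, copy=True)
    c2w[:3, 1:3] *= -1.0  # negate the y and z basis vectors
    return c2w

# Example: the RDF identity pose becomes an RUB pose with flipped y/z columns.
c2w_gl = opencv_to_opengl(np.eye(4))
assert np.allclose(c2w_gl[:3, 1], [0, -1, 0])  # y now points up
assert np.allclose(c2w_gl[:3, 2], [0, 0, -1])  # z now points backward
```

Applying this to each matrix in `cams_to_c0` before writing them out should give poses in a convention NeRFStudio can consume, though double-check against the NeRFStudio data-format docs for your version.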

yt2639 commented Jul 26, 2024

Thanks a lot for your explanation!
