Object pose of HOPE-Video #1

Open
Jing-lun opened this issue Sep 17, 2021 · 7 comments

Jing-lun commented Sep 17, 2021

Hi @swtyree @Uio96 @sbirchfield,

Thanks for sharing the HOPE CAD models and dataset!

My question: when I project an object's pose back to the world frame from different scenes, the resulting world-frame poses are not the same, which suggests the poses contain some error. Is this error acceptable?

Thanks.

Jing-lun reopened this Sep 17, 2021

swtyree (Owner) commented Sep 17, 2021

Hi @Jing-lun, thanks for spotting this. I think the issue is an error in camera extrinsics for HOPE-Video where the translation units appear to be in meters, while object poses are in cm. I'll confirm this and update the files later today.

swtyree self-assigned this Sep 17, 2021
swtyree added the bug (Something isn't working) label Sep 17, 2021

Jing-lun (Author) commented

Hi @swtyree, thanks for the prompt reply, and please let me know what you find!

I tested again, and even after making the units consistent, the 3D poses still do not match.

I tested the pose of the Mac&Cheese model in the first and last views of scene_0000; my calculation is below.

import numpy as np

# camera_extrinsic1 and pose1 are from hope_video/scene_0000/0000.json
camera_extrinsic1 = np.asarray([
    [-0.9886373, -0.14978693, 0.012654976, 79.977846],
    [-0.1278811, 0.7938205, -0.59455484, -32.258067],
    [0.07901077, -0.5894174, -0.8039555, 23.390512],
    [0.0, 0.0, 0.0, 1.0],
])
pose1 = np.asarray([
    [-0.20787001630983457, -0.9763291480689646, 0.059761308091495956, 0.13647988469971395],
    [-0.7948878577485222, 0.13300297015968482, -0.5919996280969145, -21.892505447000136],
    [0.5700380023040957, -0.17056251735382824, -0.8037195436940463, 55.94770750870245],
    [0.0, 0.0, 0.0, 1.0],
])

# camera_extrinsic2 and pose2 are from hope_video/scene_0000/0364.json
camera_extrinsic2 = np.asarray([
    [-0.8754454, -0.45876563, -0.15208338, 58.14929],
    [-0.2247508, 0.6649921, -0.7122307, -19.330604],
    [0.4278812, -0.58933824, -0.6852723, 6.31446],
    [0.0, 0.0, 0.0, 1.0],
])
pose2 = np.asarray([
    [0.11922886974591602, -0.9869201213152595, -0.10850370274050081, -7.430878036322042],
    [-0.709816306693385, -0.008316097101108606, -0.7043377604400789, -12.936461380072812],
    [0.6942227429692126, 0.1609950503421291, -0.7015235871501129, 62.88667563186634],
    [0.0, 0.0, 0.0, 1.0],
])

# Tow = Toc * Tcw
world1 = pose1.dot(camera_extrinsic1)
world2 = pose2.dot(camera_extrinsic2)
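
To put a number on how far `world1` and `world2` disagree, rather than comparing the matrices by eye, a minimal sketch could report the translation distance and the rotation angle between two 4x4 poses. The helper name `pose_difference` is hypothetical and not part of the HOPE tooling:

```python
import numpy as np

def pose_difference(pose_a, pose_b):
    """Return (translation distance, rotation angle in degrees) between two 4x4 poses.

    Hypothetical helper for illustration; not part of the HOPE dataset tooling.
    """
    pose_a = np.asarray(pose_a, dtype=float)
    pose_b = np.asarray(pose_b, dtype=float)
    # Distance between the two translation vectors (same units as the inputs).
    trans_dist = np.linalg.norm(pose_a[:3, 3] - pose_b[:3, 3])
    # Relative rotation; its angle follows from trace(R) = 1 + 2*cos(angle).
    r_rel = pose_a[:3, :3].T @ pose_b[:3, :3]
    cos_angle = np.clip((np.trace(r_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(trans_dist), float(np.degrees(np.arccos(cos_angle)))
```

Calling `pose_difference(world1, world2)` would give both numbers for the two views; for a static object they should be close to zero, up to annotation noise.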

swtyree (Owner) commented Sep 17, 2021

Okay, thanks for the update. Have you confirmed that this issue is only with HOPE-Video and not HOPE-Image?

Jing-lun (Author) commented

> Okay, thanks for the update. Have you confirmed that this issue is only with HOPE-Video and not HOPE-Image?

Well, all the objects in the HOPE-Image folder stay still, with no translation or rotation (I think the only thing that differs in HOPE-Image is the lighting condition), so I cannot use the same method to check whether the object poses in the world frame agree.

swtyree (Owner) commented Sep 18, 2021

I think I figured out the issues:

  1. As we already established, the translation in the camera extrinsic matrix is in m, while object poses are in cm.
  2. The extrinsic matrix is actually world-to-camera, rather than camera-to-world as you expected (and as I also expected until I dug into it). At preview.py#L112, the extrinsic matrix is used to transform the scene reconstruction point cloud from world coordinates to camera coordinates.

To project a pose from camera to world coordinates, use this for now:

extrinsics_w2c[:3,-1] *= 100  # correct translation units from m to cm
pose_world = np.linalg.inv(extrinsics_w2c) @ pose_camera

I'll update the documentation in the README, and I may upload a new version with more explicit key names in the JSON files, but that will have to wait until later.

Thanks again for reaching out with the issue!

Jing-lun (Author) commented

Thanks a lot @swtyree! Now the poses match!

swtyree (Owner) commented Sep 18, 2021

Thanks! I'm going to reopen the issue until I can get a new version of the annotations uploaded to Google Drive.

swtyree reopened this Sep 18, 2021