Generated training data for other methods #26

Open
ZhiyaoZhou opened this issue Sep 21, 2023 · 3 comments

@ZhiyaoZhou

The segmentation data (e.g. segmentation_0007 mapped.npz) seems to be all zero matrices, so I cannot use it to train other models such as Total3DUnderstanding. I would really appreciate it if anyone could solve this problem~👀🙏🙏

@xheon
Owner

xheon commented Sep 21, 2023

Hi, can you post the complete name of the sample?

The segmentation data is very sparse, so it is very likely that it contains a lot of zeros.
To verify its content, you can check that some elements are non-zero:

import numpy as np

sem_data = np.load(sample_path)["data"]  # load the voxel grid from the .npz sample
semantic_ids, semantic_counts = np.unique(sem_data, return_counts=True)  # non-zero ids should appear here

Additionally, you can visualize the pointcloud:

import torch

occupied_voxels = torch.from_numpy(sem_data).squeeze().nonzero()  # indices of all non-zero voxels
vis.write_pointcloud(occupied_voxels, None, tmp_output_path / "tmp.ply")  # `vis` is the repository's visualization helper
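
If the repository's `vis` helper is not set up, a minimal alternative (my sketch, assuming trimesh is installed and `occupied_voxels` comes from the snippet above) is:

import trimesh

# Export the occupied voxel indices as a point cloud for quick inspection.
trimesh.PointCloud(occupied_voxels.numpy().astype(float)).export("tmp.ply")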

Regarding your second point:
The data formats between our method and others are very different.
Our method uses a sparse voxel representation of the scene (256^3) and learns objects and structures (floor, wall, etc.) together.
Methods like Total3DUnderstanding use individual bounding boxes for each object plus one for the room layout and decompose the scene into them.

I hope that helps. Let me know if you have further questions.

@ZhiyaoZhou
Author

ZhiyaoZhou commented Sep 22, 2023

Thank you for your reply! I ran the code as you suggested and it output some non-zero values (shown below), which proves that the segmentation data (e.g. segmentation_0007 mapped.npz) is not all zero matrices.
[screenshot of the non-zero semantic IDs and counts]
As you mentioned, the training data formats for Total3D and Panoptic are very different. For training, Total3D uses SUNRGBD, which contains bounding box coordinates and annotations for every object in the room as well as the room itself, and that is very different from the data generated from Front3D.
The question I have been struggling with is: I want to test the performance of a model whose architecture is similar to Total3D and which needs the SUNRGBD dataset for training, and I am still confused about the performance results of other methods (such as Total3D) listed in the paper. If I want to use the data generated by Panoptic-reconstruction to train a Total3D-style model and evaluate its performance in PRQ etc. (as reported in the Panoptic3D paper), how should I do that?
I visualized the output generated from the default data; the input picture rgb_0007.png and output/sample_0007/points_surface_instances.ply are shown below:
input picture rgb_0007.png:
[image: rgb_0007]
output/sample_0007/points_surface_instances.ply:
[screenshot of the reconstructed point cloud]
Thanks for the amazing work on Panoptic3D; the point cloud reconstructs the picture perfectly. Should I use the point cloud coordinates of each object and stuff class to build a SUNRGBD-like dataset and train the model? Any advice would be very much appreciated~👀🙏

@xheon
Owner

xheon commented Oct 19, 2023

Hi, sorry for the delay.

To get data in a Total3D-like format, i.e. per-object bounding boxes, it is easier to parse the per-object information directly from the original 3D-Front .json file and transform the object bounding boxes into the per-frame camera space of the Panoptic-3D views:

A rough outline of the steps (a code sketch under stated assumptions follows the list):

  • For each scene:
    • load the original scene_data = <scene>.json
    • for each room_data = scene_data["scene"]["room"][room_idx]:
      • objects_data = room_data["children"]
      • for each object:
        • get "pos", "rot" (quaternion) and "scale" --> object transformation
        • load the 3D-Future mesh ( / "raw_model.obj") and get its axis-aligned bounding box, i.e. min & max vertex position
        • transform the mesh AABB with the object transformation into "world space"
        • transform the world-space object into "camera space" for the specific Panoptic-3D frame
        • for that frame, get the 2D segmentation mask (needs a consistent mapping between 2D instance IDs and 3D object IDs)
        • extract the Total3D-like object parameterization (ori_cls, ori_reg, etc.)
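
A minimal sketch of those steps, not a definitive implementation: the furniture uid-to-jid lookup, the quaternion component order [x, y, z, w], the 3D-Future folder layout, and the source of the world-to-camera matrix are assumptions on my side and may need adapting to your local data.

import json
import numpy as np
import trimesh

scene_path = "<scene>.json"           # original 3D-Front scene file (assumed path)
future_root = "3D-FUTURE-model"       # root folder of the 3D-Future meshes (assumed path)

with open(scene_path) as f:
    scene_data = json.load(f)

# Map furniture "uid" -> 3D-Future model id ("jid"); layout assumed from the 3D-Front release.
furniture = {item["uid"]: item.get("jid") for item in scene_data.get("furniture", [])}

def quat_to_mat(q):
    # Rotation matrix from a quaternion, assumed stored as [x, y, z, w].
    x, y, z, w = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w),     2 * (x * z + y * w)],
        [2 * (x * y + z * w),     1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w),     2 * (y * z + x * w),     1 - 2 * (x * x + y * y)],
    ])

world_boxes = []
for room_data in scene_data["scene"]["room"]:
    for obj in room_data.get("children", []):
        jid = furniture.get(obj.get("ref"))
        if jid is None:
            continue  # skip non-furniture children (walls, floor, ...)

        # Object transformation from the JSON.
        pos = np.asarray(obj["pos"])
        rot = quat_to_mat(obj["rot"])
        scale = np.asarray(obj["scale"])

        # Axis-aligned bounding box of the raw 3D-Future mesh.
        mesh = trimesh.load(f"{future_root}/{jid}/raw_model.obj", force="mesh")
        box_min, box_max = mesh.bounds

        # Transform the 8 AABB corners (scale, then rotate, then translate) into world space.
        corners = np.array([[x, y, z]
                            for x in (box_min[0], box_max[0])
                            for y in (box_min[1], box_max[1])
                            for z in (box_min[2], box_max[2])])
        world_corners = (rot @ (corners * scale).T).T + pos
        world_boxes.append(world_corners)

# For a specific Panoptic-3D frame, apply that frame's 4x4 world-to-camera matrix to the
# homogeneous corners to get camera-space boxes, then derive the Total3D-style
# parameterization (ori_cls, ori_reg, ...) from those.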

You can have a look here at this pickle: sample data

To visualize it you can extract the per-object points:

import pickle
# import pickle5 as pickle  # fallback for older Python versions
import trimesh

sample_path = "30054695-4a17-4698-a482-06047262a526_0007.pkl"

with open(sample_path, "rb") as f:
    data = pickle.load(f)

# Export one point cloud per object (use .items() instead of enumerate() if "gt_points" is a dict).
for idx, points in enumerate(data["boxes"]["gt_points"]):
    trimesh.PointCloud(points).export(f"object_{idx:02d}.ply")
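
If the unpacking above does not match your copy of the pickle (its exact layout is an assumption here), a quick inspection of the available fields helps:

# Print the top-level and per-box keys to check the sample layout.
print(data.keys())
print(data["boxes"].keys())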
