
Feasibility of Training by Yourself #25

Open
DavidYaonanZhu opened this issue Jun 28, 2023 · 4 comments

DavidYaonanZhu commented Jun 28, 2023

Hi @zubair-irshad, thanks for the excellent work!

We are currently working on robotics grasping and are particularly interested in your SOTA shape reconstruction.

  1. I have one Nvidia RTX 3090 card (24 GB memory); would it be feasible to train your model on my PC?
    Since the dataset is 800 GB, I expect training will take a long time.

  2. Alternatively, do you provide any pre-trained model that can be used out of the box?

  3. Do you have documentation on how to use your model in real time with an arbitrary depth camera connected to a PC?

Looking forward to your reply.
Thanks in advance.

@DavidYaonanZhu (Author)

Hi @zubair-irshad, can you briefly describe how to use the model with live-streamed data from an RGB-D camera?

@DavidYaonanZhu (Author)

Tested with a custom image; the network needs fine-tuning.

[two result screenshots attached]

@zubair-irshad (Owner)

Hi @DavidYaonanZhu,

Please find my answers below:

  1. Yes, absolutely. Our model can be trained in around a day and a half with 13 GB of GPU memory. Since you have a larger GPU, you could also increase the batch size to train faster.

  2. It looks like you have already tried our pretrained model. All the details are in our readme and our Google Colab; please feel free to check those out as well.

  3. Please check my comment here for answers to both of your questions about (a) fine-tuning the model and (b) running in real time on a camera's RGB-D input stream. In short, running in real time from a camera is possible, and it is what our work promises, i.e., around 40 frames per second at inference. However, we have not released support for integrating our model with camera hardware. Feel free to open a PR for this, and take a look at the OAK-D/RealSense CenterSnap implementation I linked in my comment. I have also linked a way to get better results on other cameras without model fine-tuning. You could always fine-tune if you have additional data, though that might be hard to obtain, especially 3D data.
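
For readers who want to prototype the camera integration before official support lands, a minimal capture-loop sketch with an Intel RealSense camera via pyrealsense2 might look like the following. The `model.infer(rgb, depth)` call is a hypothetical placeholder for the repo's actual inference entry point (see the readme/Colab for the real API), and the resolution, depth scale, and intrinsics would need to match what the model expects:

```python
import numpy as np
import pyrealsense2 as rs

# Configure RGB and depth streams at 640x480, 30 FPS.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)  # warp depth into the color frame

try:
    while True:
        frames = align.process(pipeline.wait_for_frames())
        color_frame = frames.get_color_frame()
        depth_frame = frames.get_depth_frame()
        if not color_frame or not depth_frame:
            continue
        rgb = np.asanyarray(color_frame.get_data())    # HxWx3 uint8 (BGR)
        depth = np.asanyarray(depth_frame.get_data())  # HxW uint16, depth units
        # predictions = model.infer(rgb, depth)  # hypothetical entry point
finally:
    pipeline.stop()
```

Note the raw depth comes back in the sensor's native depth units; the scale to meters can be queried from the device's depth sensor via `first_depth_sensor().get_depth_scale()`.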

@DavidYaonanZhu (Author)

Thanks for the great reply!

I will try it.
