
Speed Up #10

Open
yenchenlin opened this issue May 2, 2020 · 3 comments

yenchenlin commented May 2, 2020

May I know the exact config (hyperparameters) used to get the average speed of 14.2 it/s (on an RTX 2080 Ti, PyTorch 1.4.0, CUDA 9.2) reported here?

I couldn't reach it by simply following the modifications in #6.
(cc @kwea123, did you test it further?)

Thanks in advance!

kwea123 commented May 2, 2020

As I commented in that thread, I could only get up to 8.5 it/s (without his caching strategy), and I didn't test further. He mentioned that caching gave him +4 it/s, so maybe that is the trick? In my implementation, though, data loading is not the bottleneck at all (it takes about 2e-4 s to fetch a batch), so I didn't try it.
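For illustration, here is a minimal sketch of that kind of caching strategy, assuming a typical NeRF training setup: all rays for the training images are precomputed once into a flat tensor, so fetching a batch reduces to an index lookup. The `get_rays` helper, the tensor shapes, and the batch size are assumptions for the sketch, not code from either repository.

```python
import torch

def build_ray_cache(images, poses, H, W, focal, get_rays):
    """Precompute (origin, direction, rgb) for every pixel of every
    training image; afterwards a batch fetch is just a gather."""
    all_rays, all_rgbs = [], []
    for img, pose in zip(images, poses):
        # get_rays is a stand-in for the usual NeRF ray helper:
        # assumed to return (rays_o, rays_d), each of shape (H, W, 3).
        rays_o, rays_d = get_rays(H, W, focal, pose)
        all_rays.append(torch.cat([rays_o, rays_d], dim=-1).reshape(-1, 6))
        all_rgbs.append(img.reshape(-1, 3))
    # Shapes: (N*H*W, 6) and (N*H*W, 3).
    return torch.cat(all_rays, dim=0), torch.cat(all_rgbs, dim=0)

def sample_batch(rays, rgbs, batch_size=1024, device="cuda"):
    """Draw a random ray batch from the cache (an O(batch_size) gather)."""
    idx = torch.randint(0, rays.shape[0], (batch_size,))
    return rays[idx].to(device), rgbs[idx].to(device)
```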

krrish94 (Owner) commented May 5, 2020

Hi @yenchenlin

Missed out on this issue. I haven't been able to follow up on that thread, since I haven't had a chance to run additional experiments. As for the setup, it was PyTorch 1.4, CUDA 10.1, Python 3.6, running on a GPU cluster, specifically on a node with a V100 GPU.

For the config itself, I took care to replicate the exact lego config file (an 8-layer network, 128 fine samples per ray, and the like). For the speed comparison, I use the time taken until the optimizer parameter update, not the times reported by the tqdm loop, which include tensorboard logging, etc.
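For illustration, a minimal sketch of that measurement convention, timing one iteration only up to and including `optimizer.step()` (`model`, `render_fn`, and the input tensors are placeholders, not the repository's actual API):

```python
import time
import torch
import torch.nn.functional as F

def timed_step(model, optimizer, render_fn, rays, target_rgbs):
    """Time one training step up to and including optimizer.step(),
    excluding the logging/tensorboard work that tqdm times include."""
    torch.cuda.synchronize()            # flush any pending GPU work first
    start = time.perf_counter()

    optimizer.zero_grad()
    rgb_pred = render_fn(model, rays)   # hypothetical volume-rendering call
    loss = F.mse_loss(rgb_pred, target_rgbs)
    loss.backward()
    optimizer.step()

    torch.cuda.synchronize()            # wait for the step to actually finish
    elapsed = time.perf_counter() - start
    return elapsed                      # it/s is then 1.0 / elapsed
```

Because CUDA kernels launch asynchronously, the `torch.cuda.synchronize()` calls are what make the measured interval reflect the actual GPU work rather than just kernel-launch time.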

yenchenlin (Author) commented May 9, 2020

Is there a specific config file and command to reproduce that? Thanks a lot!
