Training Error #37

Open
shilpaullas97 opened this issue Jun 18, 2024 · 1 comment


@shilpaullas97

Hi @zhyever,

I recently tried out the PatchFusion model using this repo.
I am now trying to run the training script to train a model myself.

Following the training steps in [https://github.com/zhyever/PatchFusion/blob/main/docs/user_training.md], I was able to run the coarse and fine model training for the depth_anything_vitb model.
However, I get the error below when running the training for the fusion model.

rank0: File "PatchFusion/estimator/trainer/trainer.py", line 326, in run
rank0: self.train_epoch(epoch_idx)
rank0: File "PatchFusion/estimator/trainer/trainer.py", line 250, in train_epoch
rank0: self.optimizer_wrapper.update_params(total_loss)
rank0: File "lib/python3.8/site-packages/mmengine/optim/optimizer/optimizer_wrapper.py", line 196, in update_params
rank0: self.backward(loss)
rank0: File "lib/python3.8/site-packages/mmengine/optim/optimizer/optimizer_wrapper.py", line 220, in backward
rank0: loss.backward(**kwargs)
rank0: File "lib/python3.8/site-packages/torch/_tensor.py", line 525, in backward
rank0: torch.autograd.backward(
rank0: File "lib/python3.8/site-packages/torch/autograd/__init__.py", line 260, in backward
rank0: grad_tensors_ = _make_grads(tensors, grad_tensors_, is_grads_batched=False)
rank0: File "lib/python3.8/site-packages/torch/autograd/__init__.py", line 133, in _make_grads
rank0: raise RuntimeError(
rank0: RuntimeError: grad can be implicitly created only for scalar outputs

Could you please share some input on this?
Does anything in the script need to be modified?
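
For context, PyTorch raises this RuntimeError whenever `backward()` is called without an explicit gradient on a tensor that has more than one element. The snippet below is only a minimal reproduction under assumed shapes: `pred`, `target`, and `loss_map` are hypothetical stand-ins, not PatchFusion's actual tensors or loss, and it does not identify which part of the fusion training produces the non-scalar value.

```python
import torch

# Hypothetical stand-ins for a prediction and its target (not PatchFusion's tensors).
pred = torch.rand(2, 1, 4, 4, requires_grad=True)
target = torch.rand(2, 1, 4, 4)

# An unreduced, per-element loss: shape (2, 1, 4, 4), so it is not a scalar.
loss_map = (pred - target).abs()

# Calling backward() without a gradient argument on a multi-element tensor fails.
try:
    loss_map.backward()
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

# Reducing to a single value makes the implicit gradient well defined.
scalar_loss = loss_map.mean()
scalar_loss.backward()
print(scalar_loss.shape, pred.grad.shape)
```

If the loss reaching `update_params` in the training loop turns out to be unreduced like `loss_map` here, a reduction such as `.mean()` or `.sum()` is one possible fix, though whether that matches the intended loss definition is for the maintainers to confirm.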

One more question related to this: is there an ONNX/TensorRT version, or any other deployment version, of the PatchFusion model?

@zhyever
Owner

zhyever commented Jul 1, 2024

Sorry for the late reply; I just got back from a conference. Would you mind checking the shape of the loss that is passed to backward? Which version of torch are you using?
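
One way to gather both pieces of information is sketched below. `describe_loss` is a hypothetical helper, not part of PatchFusion or MMEngine; the assumption is that it would be called on `total_loss` just before `self.optimizer_wrapper.update_params(total_loss)` in the trainer, as named in the traceback.

```python
import torch


def describe_loss(loss):
    """Summarise a loss value for debugging (hypothetical helper)."""
    if isinstance(loss, torch.Tensor):
        return {
            "type": "Tensor",
            "shape": tuple(loss.shape),
            "numel": loss.numel(),
            # backward() without arguments only works for one-element tensors.
            "implicit_grad_ok": loss.numel() == 1,
        }
    # Anything else (dict, list, float) cannot be passed to backward() directly.
    return {"type": type(loss).__name__, "implicit_grad_ok": False}


print("torch version:", torch.__version__)

# Example calls with made-up values; in the trainer this would be total_loss.
print(describe_loss(torch.zeros(2, 3)))
print(describe_loss(torch.tensor(1.0)))
```

Posting the printed shape and the torch version in the thread should be enough to answer both questions.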

I will work on deployment models for PatchRefiner, which is our follow-up work to PatchFusion.
