RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[16, 9, 336, 336] to have 3 channels, but got 9 channels instead #5
Comments
Got this runtime error too: "RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[2, 9, 336, 336] to have 3 channels, but got 9 channels instead". Has this been solved?
@BubvieyKevin Thank you, let's wait until the code is ready again.
Thank you for identifying some issues with our code. We have also noticed the same problems and are currently working on resolving them.
Thanks to all the authors for this great work. How is progress on addressing this issue?
Thanks for reporting this problem. We have fixed it in the latest version of the code.
Thanks a lot. We will test it again.
I just tested the code again and still got this error,
I am not able to train either. However, I still do not quite understand the code. Does anyone know about this part?
Change https://github.com/thunlp/LLaVA-UHD/blob/main/llava_uhd/train/llava-uhd/train.py#L766 to: if all(x is not None and x.shape == images[0].shape for x in images) and False:
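To make the effect of that edit concrete, here is a minimal sketch of the condition in isolation. It assumes that the surrounding code follows the usual collator pattern of stacking same-shaped images into one tensor when the condition is true and keeping a plain list otherwise. That branch structure is an assumption, not something confirmed in this thread. `FakeImage` and `takes_stack_branch` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeImage:
    """Hypothetical stand-in for a tensor; only .shape matters here."""
    shape: tuple


def takes_stack_branch(images, patched):
    """Return True if the (assumed) stacking branch would be taken."""
    cond = all(x is not None and x.shape == images[0].shape for x in images)
    if patched:
        # The edit suggested in the thread: appending `and False`
        # makes the whole condition False regardless of the shapes.
        cond = cond and False
    return cond


same = [FakeImage((9, 336, 336)), FakeImage((9, 336, 336))]
mixed = [FakeImage((9, 336, 336)), FakeImage((3, 336, 336))]

print(takes_stack_branch(same, patched=False))   # same shapes, original condition
print(takes_stack_branch(same, patched=True))    # same shapes, patched condition
print(takes_stack_branch(mixed, patched=False))  # differing shapes
```

In other words, the edit only forces the non-stacking path. It does not change the channel count of any individual image, which may be why a later comment reports that the error persists.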
Does this change fix the training? And how are the training results when replicating LLaVA-UHD?
No, this does not fix the bug. I still hit the same error.
Hi, guys @piantic @zyddnys @lucasjinreal @YFCYFC @gordonhu608 @guozonghao96 I've released another implementation of LLaVA-UHD here, which I believe is more stable and elegant. The code of the new repo originates from this repo, but its overall quality is improved, and the training program is tested to be able to normally run without bugs. When I reviewed this old repo and tried to fix this You are very welcome to use it, and I look forward to your feedback.
Our repository has been fully improved, and almost all bugs have been eliminated. For details, please refer to the main branch and the LLaVA-UHD v1 branch. This issue is now closed. If there are any new problems, feel free to open a new issue.
First of all, thank you for publishing a good paper.
As you mentioned in the issue, the benchmark performance is overall good.
Unfortunately, the weights are not public now, so I am trying to train the model myself.
I was able to run the pretraining stage, so that part is okay.
But there are some issues in the fine-tuning stage.
Runtime errors keep occurring during this stage.
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[16, 9, 336, 336] to have 3 channels, but got 9 channels instead
I checked the loss, and it stayed at 0.0.
{'loss': 0.0, 'learning_rate': 1.6279069767441862e-06, 'epoch': 0.0}
I suspected your slice_logic and noticed that the output was unusual.
But other issues say it's normal, so I don't think this is the problem.
Could you please give me some advice on this?
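For readers trying to interpret the error message, the shapes in it can be unpacked with a little arithmetic. The sketch below is only a diagnostic aid, not a fix: `describe_mismatch` is a hypothetical helper. The patch-grid calculation assumes the convolution stride equals the kernel size. Reading the 9 channels as three 3-channel images packed along the channel axis is one plausible interpretation, not something confirmed in this thread.

```python
def describe_mismatch(weight_shape, input_shape):
    """Summarize a conv weight/input shape pair taken from the error text.

    weight_shape: (out_channels, in_channels, kernel_h, kernel_w)
    input_shape:  (batch, channels, height, width)
    """
    out_ch, in_ch, k_h, k_w = weight_shape
    batch, got_ch, height, width = input_shape
    divisible = got_ch % in_ch == 0
    return {
        "expected_channels": in_ch,
        "got_channels": got_ch,
        # If divisible, the input may be several images packed channel-wise
        # (an assumption about the cause, not a confirmed diagnosis).
        "implied_images_per_sample": got_ch // in_ch if divisible else None,
        # Assumes stride == kernel size (non-overlapping patches).
        "patches_per_side": (height // k_h, width // k_w),
    }


info = describe_mismatch((1024, 3, 14, 14), (16, 9, 336, 336))
for key, value in info.items():
    print(f"{key}: {value}")
```

With the shapes from the report, the layer expects 3 input channels but receives 9, which is exactly three times the expected count. Under the stride assumption, a 336 x 336 image gives a 24 x 24 patch grid.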