The max-average pooling 'trick' has no theoretical basis that I know of for giving better results. If it does improve results, that may be because it increases the number of large crops taken from the image.
A better solution would be to add the cropping scheme from BigSleep, which biases sampling toward larger crops and significantly improves the coherence of the results.
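To make "a bias for larger crops" concrete, here is a minimal sketch of one way to do it: draw the crop size as a fraction of the image's shorter side from a normal distribution centred near the top of the allowed range, then clamp it. The function name `sample_crop_box` and the default values for `mean`, `std`, `lo`, and `hi` are assumptions for illustration only, not taken from BigSleep's code or from this repository.

```python
import random


def sample_crop_box(width, height, mean=0.8, std=0.3, lo=0.5, hi=0.95, rng=random):
    """Return (left, top, size) for a square crop biased toward large sizes.

    All parameter defaults are illustrative assumptions, not BigSleep's values.
    """
    side = min(width, height)
    # Draw the crop fraction from a normal distribution, then clamp it to
    # [lo, hi]. With the mean near the top of the range, most crops are large.
    frac = min(max(rng.gauss(mean, std), lo), hi)
    size = max(1, int(side * frac))
    # Place the crop uniformly at random so it stays fully inside the image.
    left = rng.randint(0, width - size)
    top = rng.randint(0, height - size)
    return left, top, size


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(sample_crop_box(512, 384, rng=rng))
```

Each returned box can then be cut out of the image tensor and resized to CLIP's input resolution, as with the existing cutouts. Compared with sampling the crop size uniformly, the clamped normal puts more of the probability mass on crops that cover most of the image.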
Do you think we should remove the current augmentation approach altogether and just use BigSleep's crops? Or do you want to preserve the non-cropping augmentations and remove only the cropping ones? Either way, I strongly support you making a branch featuring these changes.
Ideally, we should run automated tests to find the best augmentations and hyperparameters. I think we can and should change the cropping scheme to BigSleep's. I've made a branch where I removed the average-max pooling and added the BigSleep crops: https://github.com/EleutherAI/vqgan-clip/tree/Develop-1