PyTorch ResNet50 Validation Accuracy #3
Comments
There are a lot of factors at play for a given result: PyTorch version, CUDA, PIL, etc. Even changing the image scaling between bicubic and bilinear can have a notable impact. I default to bicubic, but bilinear works better for some models, likely depending on what they were originally trained with. I have noticed changes in accuracy for many models between measurements I made over a year ago and now (same weights).

My ResNet50 numbers with PyTorch 1.0.1.post2 and CUDA 10: Prec@1 75.868, Prec@5 92.872
My old ResNet50 numbers with PyTorch 0.2.0.post1 and CUDA 9.x(?): Prec@1 76.130, Prec@5 92.862

A table with some of my old measurements is here: https://github.com/rwightman/pytorch-dpn-pretrained
ResNet50 on PyTorch 1.0.1.post2 and CUDA 10 with bilinear instead of bicubic: Prec@1 76.138, Prec@5 92.864... that matches your numbers, @ankmathur96.
Interesting! I should mention that I am using PIL version 5.3.0.post0. I believe bilinear is the default in PyTorch transforms (https://github.com/pytorch/vision/blob/master/torchvision/transforms/transforms.py#L182), and it seems this repository is using that default (https://github.com/cgnorthcutt/benchmarking-keras-pytorch/blob/master/imagenet_pytorch_get_predictions.py#L95). The difference when using bicubic is interesting, though.

I've also seen variation with different CUDA versions and other setup differences similar to what you're describing. For example, I've seen a full percentage point drop when using OpenCV's implementation of bilinear resizing compared to PIL's. I was unaware, though, that there could be a drop of that size from setup differences in this more constrained setting (PyTorch/CUDA/PIL). I found this especially worth highlighting since this repo's evaluation seems to be off by enough that densenet169 performs worse than ResNet-50 in my setup.

Edit: It's worth noting that many of these differences caused by subtle changes in preprocessing implementations can be eliminated (if need be, for a production use case) by fine-tuning with a low learning rate for several epochs.
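For concreteness, here is a minimal sketch of the preprocessing choice under discussion, assuming the torchvision of that era, where `Resize` takes a PIL interpolation constant; swapping `Image.BILINEAR` for `Image.BICUBIC` is the single change being compared:

```python
# Standard ImageNet eval preprocessing with the interpolation made explicit.
# torchvision's Resize defaults to PIL.Image.BILINEAR; the accuracy shifts
# discussed above come from changing this one argument.
from PIL import Image
from torchvision import transforms

eval_transform = transforms.Compose([
    transforms.Resize(256, interpolation=Image.BILINEAR),   # torchvision default
    # transforms.Resize(256, interpolation=Image.BICUBIC),  # the alternative tested above
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],        # standard ImageNet stats
                         std=[0.229, 0.224, 0.225]),
])
```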
@ankmathur96 Yeah, I noticed during my past benchmarking that most of the resnet/densenet models in torchvision were better with the default bilinear, but a number of the ported models (Inception variants, DPN, etc.) were doing better with bicubic. Fine-tuning can definitely help with these sorts of issues if/when it matters. It's also worth noting that many of the default pretrained weights can fairly easily be surpassed by around 1% or more using different training schedules and augmentation techniques.

FWIW, my densenet169 numbers are very close to this repo's, and lower than my ResNet50 numbers at top-1 but better at top-5. I'm using Pillow-SIMD 5.3.0.post0.
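As a rough illustration of the fine-tuning fix mentioned above, here is a hedged sketch; the dataset path, batch size, learning rate, and epoch count are illustrative assumptions, not values from this thread:

```python
# Hedged sketch: fine-tune pretrained weights for a few epochs at a low
# learning rate so the model adapts to a changed preprocessing pipeline.
# Path, batch size, LR, and epoch count below are illustrative only.
import os
import torch
import torchvision
from PIL import Image
from torchvision import datasets, transforms

new_preprocess = transforms.Compose([                  # the *changed* pipeline
    transforms.Resize(256, interpolation=Image.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder(os.path.expanduser("~/imagenet/train"),
                                 transform=new_preprocess)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64,
                                           shuffle=True, num_workers=8)

model = torchvision.models.resnet50(pretrained=True).cuda()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

model.train()
for epoch in range(3):                                 # "several epochs"
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images.cuda()), labels.cuda())
        loss.backward()
        optimizer.step()
```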
@ankmathur96 @rwightman Thanks for finding this. I agree it's likely a PyTorch version / CUDA version incompatibility. Did either of you find a fix? Feel free to send a pull request for https://github.com/cgnorthcutt/benchmarking-keras-pytorch/blob/master/imagenet_pytorch_get_predictions.py
See these two URLs for the differences in bilinear resizing across libraries, or even within the same library and same function under different padding options: https://stackoverflow.com/questions/18104609/interpolating-1-dimensional-array-using-opencv. Also note that TFv2 now follows Pillow rather than OpenCV, if there is a difference between the two... which doesn't seem to be the case.
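To see that "bilinear" is not a single algorithm, a small check like the following will typically show nonzero pixel differences between the PIL and OpenCV implementations; Pillow, OpenCV, and NumPy are assumed installed, and the image filename is a placeholder:

```python
# Compare PIL and OpenCV bilinear resizing of the same image. The two
# libraries implement "bilinear" differently, which is one source of the
# accuracy shifts discussed above. "val_image.jpg" is a placeholder path.
import numpy as np
import cv2
from PIL import Image

img = Image.open("val_image.jpg").convert("RGB")

pil_out = np.asarray(img.resize((224, 224), Image.BILINEAR))
cv_out = cv2.resize(np.asarray(img), (224, 224), interpolation=cv2.INTER_LINEAR)

diff = np.abs(pil_out.astype(int) - cv_out.astype(int))
print("max abs pixel difference:", diff.max())
print("mean abs pixel difference:", diff.mean())
```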
@calebrob6: Caleb Robinson, "How to reproduce ImageNet validation results"
Hey there!
I came across your project from Jeremy Howard's Twitter. I think it's great to be benchmarking these numbers and keeping them in a single place!
I've tried running your script and ran into some problems that I was hoping you could help diagnose:
I ran
python imagenet_pytorch_get_predictions.py -m resnet50 -g 0 -b 64 ~/imagenet/
and got a top-1 validation accuracy of 75.02%. I'm using Python 3.7 and PyTorch 1.0.1.post2, and didn't change any of your code except for making the argparse parameter for batch_size be type=int.
I work pretty regularly with PyTorch and ResNet-50 and was surprised to see ResNet-50 reach only 75.02% validation accuracy. When I use the pretrained ResNet-50 with the standard PyTorch ImageNet example code, I get 76.138% top-1 and 92.864% top-5 accuracy. Specifically, I run:
python main.py -a resnet50 -e -b 64 -j 8 --pretrained ~/imagenet/
I'm using CUDA 9.2 and cuDNN 7.4.1, running inference on an NVIDIA V100 on a Google Cloud instance with Ubuntu 16.04.
I'm curious what might be going wrong here and why our results are different. To start with, which CUDA/cuDNN versions did your results come from?
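For reference, here is a minimal sketch of the kind of top-1/top-5 evaluation both commands above compute; the dataset path is a placeholder, and the preprocessing shown assumes torchvision's defaults rather than either script's exact code:

```python
# Minimal top-1 / top-5 evaluation of torchvision's pretrained ResNet-50 on
# the ImageNet validation set. The path is a placeholder; preprocessing uses
# torchvision defaults (bilinear resize to 256, 224x224 center crop).
import os
import torch
import torchvision
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

model = torchvision.models.resnet50(pretrained=True).cuda().eval()
val_set = datasets.ImageFolder(os.path.expanduser("~/imagenet/val"),
                               transform=preprocess)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=64, num_workers=8)

top1 = top5 = total = 0
with torch.no_grad():
    for images, labels in val_loader:
        logits = model(images.cuda())
        _, pred = logits.topk(5, dim=1)                # top-5 class indices
        correct = pred.eq(labels.cuda().view(-1, 1))   # per-sample hit matrix
        top1 += correct[:, 0].long().sum().item()
        top5 += correct.any(dim=1).long().sum().item()
        total += labels.size(0)

print("Prec@1 {:.3f}, Prec@5 {:.3f}".format(100.0 * top1 / total,
                                            100.0 * top5 / total))
```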