This repo contains experiments comparing the accuracy of transfer learning versus training from scratch, across several network architectures, for image classification.
I implemented several simple CNN architectures with an increasing number of feature maps at each layer plus dropout. I also transfer-learned from the convolutional bases of the pre-trained Inception-ResNet V2 and VGG16 architectures. On top of those bases, I experimented with:

- flattening versus global average pooling the final convolutional layer;
- the depth and regularization of the final fully connected layers;
- "fine-tune" training the last few layers of the pre-trained models at very low learning rates;
- "warm-start" training that continues from the most promising previously trained models;
- applying different image augmentations (stretching, rotating, cropping) to the training data set.
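For concreteness, here is a minimal sketch of the transfer-learning setup described above, assuming TensorFlow/Keras. The layer sizes, dropout rate, and optimizer are illustrative assumptions; see code/build_models.py for the exact configurations tested.

```python
# Sketch of a transfer-learning model: frozen VGG16 base, global average
# pooling, and a small regularized dense head. Values are illustrative.
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG16

# Load the pre-trained convolutional base without its classifier head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the base for the initial training phase

# Global average pooling (one of the two pooling strategies compared).
x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(6, activation="softmax")(x)  # 5 recyclable classes + trash

model = Model(base.input, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# For "fine-tune" training, unfreeze the last few base layers and recompile
# with a very low learning rate, e.g.:
# for layer in base.layers[-4:]:
#     layer.trainable = True
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), ...)
```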
This project is an attempt to practice iterative, recorded, and reproducible search for optimal hyper-parameters and architecture.
The nbrun package is used to execute a base experiment notebook with different combinations of parameters specifying the model architecture and other hyper-parameters. Each time an experiment is run, a copy of the notebook is saved (as .ipynb and .html), allowing reproducibility and later reference.
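A minimal sketch of launching such parameterized runs, assuming nbrun's `run_notebook(notebook, nb_kwargs=...)` interface; the keys passed in `nb_kwargs` are hypothetical, not this repo's exact parameter names.

```python
# Execute the base notebook once per parameter combination; nbrun injects
# nb_kwargs as variables into the notebook and saves an executed copy.
from nbrun import run_notebook

for model_name in ["VGG16 Model", "Inception-ResNet V2 w. Dropout Model"]:
    run_notebook(
        "code/Base Experiment.ipynb",
        nb_kwargs={"model_name": model_name, "epochs": 70},  # hypothetical keys
    )
```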
A logging framework is also defined which records metrics from each experiment as well as specifications of the data generators and models. Combined with saved model weights, the logged specifications allow the model pipeline to be reconstructed to predict on new data in a new Python session without re-training the models.
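A minimal sketch of what rebuilding a logged model in a fresh session might look like; the spec file layout and the `build_model` helper are hypothetical stand-ins for this repo's actual logging framework and build_models.py.

```python
# Rebuild a trained pipeline from its logged specification and saved weights.
# The spec keys and build_model helper below are hypothetical illustrations.
import json
import numpy as np

with open("spec.json") as f:                  # hypothetical path to a logged spec
    spec = json.load(f)

model = build_model(**spec["model_kwargs"])   # rebuild the architecture
model.load_weights(spec["weights_path"])      # restore trained weights

new_images = np.zeros((1, 224, 224, 3))       # placeholder image batch
predictions = model.predict(new_images)       # predict without re-training
```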
A plot of the training loss and accuracy is also saved from each experiment.
Example Training Plot
- `code/Base Experiment.ipynb` - the base notebook used to run experiments.
- `code/build_models.py` - defines all of the model architectures tested.
- `code/build_data_gens.py` - defines the data augmentation generators used (an illustrative sketch follows this list).
- `code/saved experiments/` - stores the saved copies of each experiment notebook.
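As an illustration of the kind of augmentation generator defined in code/build_data_gens.py, here is a hedged sketch using Keras's ImageDataGenerator; the parameter values and data directory are assumptions, not this repo's exact settings.

```python
# An augmentation generator applying the kinds of transforms described
# above (stretching, rotating, zooming); values are illustrative.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel values
    shear_range=0.2,          # stretching
    rotation_range=20,        # rotating
    zoom_range=0.2,           # zooming/cropping
    horizontal_flip=True,
).flow_from_directory(
    "data/train",             # hypothetical data directory
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)
```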
The data comes from this Kaggle data set: https://www.kaggle.com/asdasdasasdas/garbage-classification
The data consists of images of five classes of recyclable material and one category of trash. The high degree of regularity in the training images makes these models VERY poor at external generalization, i.e. while the models can accurately detect an image of paper from this dataset, they will do a poor job of identifying a random image of paper from the internet. That is because the images in the data set are all single items on a white background, under consistent lighting, at a uniform distance; random images from the internet will not share those features.
The top ten highest-performing models are listed below.
Note: See code/build_models.py for the exact model configurations represented by the (non-timestamp) names in the "MODEL" column.
Note: Names in the "MODEL" column that are timestamps represent "warm-started" models which continued training from a previously trained configuration.
Weighted Avg. Recall | EPOCH | MODEL | run_id |
---|---|---|---|
0.859756 | 70.0 | VGG16 Fine-tuning | 2020-02-09_21h53m57s |
0.841463 | 40.0 | 2020-02-07_01h10m05s | 2020-02-09_14h25m27s |
0.814024 | 40.0 | 2020-02-07_01h36m15s | 2020-02-09_14h07m18s |
0.807927 | 70.0 | 2020-02-08_23h29m06s | 2020-02-09_15h50m36s |
0.786585 | 100.0 | VGG16 Model | 2020-02-09_04h52m57s |
0.777439 | 40.0 | 2020-02-09_07h23m26s | 2020-02-09_17h27m39s |
0.777439 | 100.0 | Inception-ResNet V2 finetuning final-module | 2020-02-09_07h23m26s |
0.774390 | NaN | Lite Test | 2020-02-07_01h36m15s |
0.768293 | NaN | Lite Test | 2020-02-07_01h10m05s |
0.746951 | 70.0 | Inception-ResNet V2 w. Dropout Model | 2020-02-09_12h22m07s |