This project uses Keras, OpenCV, and NLTK to build classifier models on the dataset's images and their captions.
The dataset includes two directories, train and test, each containing images and sentences subdirectories for image and text classification, respectively.
You can get the dataset from here.
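As a rough sketch of how the layout described above could be traversed, the helper below collects the files under the images and sentences subdirectories of one split. The function name `list_split` and the `root` argument are hypothetical, and the snippet makes no assumption about file formats or about how classes are organised inside each subdirectory.

```python
from pathlib import Path


def list_split(root, split):
    """Collect file paths for one split ("train" or "test").

    Assumes only the layout <root>/<split>/images and <root>/<split>/sentences;
    anything deeper is walked recursively without interpreting it.
    """
    base = Path(root) / split
    result = {}
    for kind in ("images", "sentences"):
        folder = base / kind
        # rglob("*") walks nested folders, so per-class subfolders (if any) are included
        result[kind] = sorted(p for p in folder.rglob("*") if p.is_file())
    return result
```

Calling `list_split("path/to/dataset", "train")` would return a dictionary with two sorted lists of paths, which can then be passed to whatever image and text loaders the models use.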
- Python: 3.7.12
- Tensorflow: 2.7.0
- Scikit-learn: 1.0.2
- Numpy: 1.19.5
- Pandas: 1.3.5
- Opencv-python: 4.1.2.30
- Opencv-contrib-python: 4.1.2.30
- NLTK: 3.2.5
- Tensorflow-hub: 0.12.0
- Image Classification:
- Text Classification:
- Given a caption as input, find the 10 images in the database whose captions are closest to it. (Select images from the entire database, regardless of the input label.) Then use a genetic algorithm to find a coefficient for each image so that combining the 10 images with these coefficients yields an image that belongs to the category of the input caption. You can use a variational autoencoder to combine the images. This and this can be helpful.
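The caption retrieval and coefficient search described in the last task can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: caption closeness is measured with TF-IDF cosine similarity (one possible choice), images are combined by a plain weighted sum rather than through a variational autoencoder, and `fitness` is a hypothetical callable that would wrap the trained image classifier's score for the target category. The names `top_k_captions` and `evolve_coefficients` are invented for this example.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def top_k_captions(query, captions, k=10):
    """Return indices of the k captions most similar to `query` (TF-IDF + cosine)."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(captions)
    scores = cosine_similarity(vectorizer.transform([query]), matrix).ravel()
    return np.argsort(-scores, kind="stable")[:k]


def evolve_coefficients(images, fitness, pop_size=30, generations=40, seed=0):
    """Genetic search for non-negative weights (summing to 1) over `images`.

    images  : array of shape (n, ...) with the retrieved images (or latent codes)
    fitness : hypothetical callable, blended image -> score (higher is better)
    """
    rng = np.random.default_rng(seed)
    images = np.asarray(images, dtype=float)
    n = len(images)

    def normalize(w):
        # keep weights positive and make each weight vector sum to 1
        w = np.clip(w, 1e-8, None)
        return w / w.sum(axis=-1, keepdims=True)

    def blend(w):
        # weighted sum over the first axis of `images`
        return np.tensordot(w, images, axes=1)

    def score_all(pop):
        return np.array([fitness(blend(w)) for w in pop])

    pop = normalize(rng.random((pop_size, n)))
    for _ in range(generations):
        order = np.argsort(-score_all(pop))
        elite = pop[order[: pop_size // 2]]  # selection: keep the better half
        n_children = pop_size - len(elite)
        parents_a = elite[rng.integers(len(elite), size=n_children)]
        parents_b = elite[rng.integers(len(elite), size=n_children)]
        mask = rng.random(parents_a.shape) < 0.5  # uniform crossover
        children = np.where(mask, parents_a, parents_b)
        # mutation: small Gaussian noise, then re-normalize
        children = normalize(children + rng.normal(0.0, 0.05, children.shape))
        pop = np.vstack([elite, children])

    best = pop[np.argmax(score_all(pop))]
    return best, blend(best)
```

In practice `fitness` might return the classifier's predicted probability for the input caption's category, and `images` might hold VAE latent codes that are decoded after blending; both are assumptions about how the pieces would be wired together.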
Fixes and improvements are more than welcome, so raise an issue or send a PR!