- Only supports captchas of exactly 4 characters, both in the correct answers and in the image contents.
- Only supports alphanumeric combinations of lowercase English letters and digits.
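The two limitations above can be expressed as a simple check on a label string. The following is a minimal sketch, not code from this project: the names `LABEL_PATTERN` and `is_valid_label` are hypothetical, and the rule (exactly 4 characters, each a lowercase letter or a digit) is an assumption read from the two bullets.

```python
import re

# Assumed rule, read from the limitations above:
# exactly 4 characters, each a lowercase English letter or a digit.
LABEL_PATTERN = re.compile(r"[a-z0-9]{4}")


def is_valid_label(label: str) -> bool:
    """Return True if `label` is a 4-character lowercase alphanumeric string."""
    return LABEL_PATTERN.fullmatch(label) is not None
```

A check like this could be run over the file names before training to catch mislabeled images early, for instance labels containing uppercase letters or a wrong number of characters.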
Using Anaconda to set up a virtual development environment is recommended.
- Python 3.8
- TensorFlow 2.7
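To confirm that the active environment matches the versions listed above, a small check like the one below may help. It is a sketch only: `environment_report` is a hypothetical helper, and it assumes TensorFlow was installed under the distribution name `tensorflow` (a variant package name would be reported as missing).

```python
import sys
from importlib import metadata


def environment_report() -> dict:
    """Collect the running Python version and the installed TensorFlow version, if any."""
    try:
        # Assumes the distribution is named "tensorflow".
        tf_version = metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        tf_version = None  # TensorFlow is not installed in this environment
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "tensorflow": tf_version,
    }
```

Calling `environment_report()` inside the activated environment returns a dictionary whose values can be compared against Python 3.8 and TensorFlow 2.7.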
- Collect captcha-like images and rename each one to its correct answer. For example, if the image reads `5dc8` to your eyes, rename it to `5dc8.jpeg`.
- Put the major portion of the images in a root directory named `training` for training.
- Put another batch of images in another root directory named `validation` for validation.
- Run `python train.py`. A prompt will ask how many epochs you want; enter an integer.
- Run `python predict.py`. The images in the `validation` folder will go through the model.
- A series of predictions will show up in your terminal.
- At the end, a percentage reports the accuracy of the current model.
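The naming convention and the final percentage in the steps above can be illustrated as follows. This is a hedged sketch, not the contents of `train.py` or `predict.py`: the helper names are hypothetical, and it assumes the reported percentage is exact-match accuracy, meaning a prediction counts as correct only when all characters match the file name.

```python
from pathlib import Path


def label_from_filename(path) -> str:
    """Return the answer encoded in the file name, e.g. 'validation/5dc8.jpeg' -> '5dc8'."""
    return Path(path).stem


def exact_match_accuracy(labels, predictions) -> float:
    """Return the percentage of predictions that equal their label exactly."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        return 0.0  # nothing to evaluate
    correct = sum(1 for label, pred in zip(labels, predictions) if label == pred)
    return 100.0 * correct / len(labels)


# Example with made-up file names and made-up predictions.
files = ["validation/5dc8.jpeg", "validation/a1b2.jpeg"]
labels = [label_from_filename(f) for f in files]
predictions = ["5dc8", "a1b3"]
print(f"{exact_match_accuracy(labels, predictions):.1f}%")
```

Because the label lives in the file name, no separate annotation file is needed; the trade-off is that a single typo in a file name silently becomes a wrong training label.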
With more than 700 labeled images and a sufficient number of epochs, the accuracy can reach around 70%. The more images, the better the result.