how to process images as batch format? #113

Open
1 task done
mapengsen opened this issue Dec 22, 2024 · 2 comments

Comments

@mapengsen

Issue Type

Bug

Source

GitHub (source)

DECIMER Image Transformer Version

2.0

OS Platform and Distribution

No response

Python version

No response

Current Behaviour?

How can I run DECIMER on images in batches? I have 10,000 images to process.

Which images caused the issue? (This is mandatory for images related issues)

No response

Standalone code to reproduce the issue

How can I run DECIMER on images in batches? I have 10,000 images to process.

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@tgCam

tgCam commented Dec 22, 2024

Try this:

import csv
import glob
import os
import sys

# Check for the decimer package
try:
    print("\nInitializing DECIMER. This can take about 30 seconds...")
    from DECIMER import predict_SMILES
except ImportError:
    print("""\nDECIMER not found. Please install the package before running this script.
        # Create a conda wonderland
        conda create --name DECIMER python=3.10.0 -y
        conda activate DECIMER

        # Equip yourself with DECIMER
        pip install decimer""")
    sys.exit(1)

# Folder containing the images in PNG or JPG format.
# Replace the placeholder string with the path to your own folder.
folder_path = "PATH TO FOLDER WITH YOUR IMAGES HERE"
if not os.path.isdir(folder_path):
    print(f"\nFolder not found: {folder_path}\nExiting...")
    sys.exit(1)

# Collect all PNG/JPG/JPEG files, including those in subfolders
filelist = sorted(
    glob.glob(os.path.join(folder_path, '**/*.png'), recursive=True) +
    glob.glob(os.path.join(folder_path, '**/*.jpg'), recursive=True) +
    glob.glob(os.path.join(folder_path, '**/*.jpeg'), recursive=True)
)

out_dir = os.path.join(folder_path, 'out')
os.makedirs(out_dir, exist_ok=True)

# Pick an output file name that does not overwrite an earlier run
base_outfile_name = 'smiles_out'
outfile_name = f"{base_outfile_name}.csv"
counter = 1
while os.path.exists(os.path.join(out_dir, outfile_name)):
    outfile_name = f"{base_outfile_name}_{counter}.csv"
    counter += 1

with open(os.path.join(out_dir, outfile_name), 'w', newline='') as f:
    # csv.writer quotes fields, so commas in file names cannot break a row
    writer = csv.writer(f)
    writer.writerow(["Image", "SMILES"])

    for filename in filelist:
        # Unleash the power of DECIMER (one image per call)
        img_file = os.path.basename(filename)
        SMILES = predict_SMILES(filename)
        print(f"Image: {img_file} | Decoded SMILES: {SMILES}")

        # One row per image: image name in one column, SMILES in the other
        writer.writerow([img_file, SMILES])

if filelist:
    print("\nAll images processed. Check the 'out' directory for the SMILES csv file.")
else:
    print("\nNo PNG or JPG images found in the selected folder.")
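With around 10,000 images, a run that stops halfway is expensive to repeat, so it may help to skip images that an earlier run already wrote to the CSV. The following is only a sketch under assumptions: the helper names `load_processed` and `pending_files` are hypothetical, and it relies on the CSV having the `Image` header column written by the script above.

```python
import csv
import os


def load_processed(csv_path):
    """Return the set of image names already recorded in an earlier CSV.

    Assumes the CSV has a header row with an 'Image' column.
    """
    if not os.path.exists(csv_path):
        return set()
    with open(csv_path, newline="") as f:
        return {row["Image"] for row in csv.DictReader(f)}


def pending_files(filelist, processed):
    """Keep only the files whose base name is not yet in the CSV."""
    return [p for p in filelist if os.path.basename(p) not in processed]
```

To use it, the file list would be filtered with `pending_files(filelist, load_processed(previous_csv))` before the prediction loop. Because the script records only the base name of each image, two images with the same name in different subfolders are treated as the same entry.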

@mapengsen
Author

This script also processes only one image at a time (batch size 1). Can the batch size be increased so that many images are processed at once?
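This thread only shows `predict_SMILES` being called with a single image path, and no batched DECIMER call is confirmed here. As a stopgap, the file list can at least be split into chunks so that results are written out after every chunk. The sketch below is an assumption-laden illustration: `chunked` and `predict_in_chunks` are hypothetical helper names, and `predict_one` stands for any function that takes one image path and returns a SMILES string. It still runs the model once per image, so it does not speed up inference itself.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_chunks(
    paths: Sequence[str],
    predict_one: Callable[[str], str],
    chunk_size: int = 64,
) -> Iterator[List[Tuple[str, Optional[str]]]]:
    """Yield one list of (path, SMILES) pairs per chunk.

    A failed prediction is recorded as None so that one bad image
    does not stop the whole run.
    """
    for chunk in chunked(paths, chunk_size):
        results: List[Tuple[str, Optional[str]]] = []
        for path in chunk:
            try:
                results.append((path, predict_one(path)))
            except Exception:
                results.append((path, None))
        yield results
```

A caller would loop over `predict_in_chunks(filelist, predict_SMILES)` and write each yielded list to the CSV before moving on, so that at most one chunk of work is lost if the run is interrupted.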

@OBrink OBrink self-assigned this Dec 22, 2024
@OBrink OBrink added the enhancement New feature or request label Dec 22, 2024