Update bulk image conversion script to support multiple formats by Anushlinux · Pull Request #226 · Udayraj123/OMRChecker

Anushlinux · 2024-10-03T14:33:05Z

I have updated the conversion code and integrated my function into the convert_image_hook, as you requested.

Please review the code.
(P.S. Sorry for deleting the previous PR; I did it by mistake.)

Udayraj123

@Anushlinux you have understood and handled a lot of the code flows (in the local scripts) of OMRChecker. Great work on your first PR! 🎉
I've finished my review, and I think we will need some discussion to make it even more extensible. Let's discuss async in the Discord channel once you go through my comments.

Udayraj123 · 2024-10-07T00:34:31Z

-
-# Usual pre-processing commands for speedups (useful during re-runs)
-# python3 scripts/local/convert_images.py -i inputs/ --replace [--filter-ext png,jpg] --output-ext jpg
-# python3 scripts/local/resize_images.py -i inputs/ -o outputs --max-width=1500


Let's keep all these comments as a guide for the contributors to figure out what needs to be done here

Udayraj123 · 2024-10-07T00:42:20Z

+    pages = convert_from_path(input_path)
+    for i, page in enumerate(pages):
+        output_path = os.path.join(output_dir, f"page_{i + 1}.jpg")
+        page.save(output_path, 'JPEG')


let's also include file name here

Suggested change

-    pages = convert_from_path(input_path)
-    for i, page in enumerate(pages):
-        output_path = os.path.join(output_dir, f"page_{i + 1}.jpg")
-        page.save(output_path, 'JPEG')
+    pages = convert_from_path(input_path)
+    input_path = Path(PurePosixPath(input_path).as_posix())
+    file_name = input_path.name
+    for i, page in enumerate(pages):
+        output_path = os.path.join(output_dir, f"{file_name}_page_{i + 1}.jpg")
+        print(f"Saving page: {output_path}")
+        page.save(output_path, 'JPEG')

Also convert_image and convert_pdf_to_jpg should come from image_utils instead of bulk_ops_common.py.
I am thinking of keeping this file only for operations that are common across all sorts of bulk scripts (bulk resize, convert, watermark, etc)

Udayraj123 · 2024-10-07T01:17:10Z

+    """
+    pages = convert_from_path(input_path)
+    for i, page in enumerate(pages):
+        output_path = os.path.join(output_dir, f"page_{i + 1}.jpg")


I have a feature extension idea here, see if you can implement -
Let's add support for extracting only the first page based on a flag. In that case, the per-file output_dir shouldn't be created, and the single image should be written directly with the same filename, without the page_ prefix.
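One possible way to read this request is sketched below. It only plans the output paths and does not render anything, so it runs without pdf2image. The function name plan_pdf_output_paths, the first_page_only flag, the per-PDF sub-directory in multi-page mode, and the use of the file stem (name without the .pdf extension) are all assumptions for illustration, not code from this PR. If it helps, pdf2image's convert_from_path also accepts first_page and last_page arguments, which could be used to avoid rendering the remaining pages.

```python
from pathlib import Path
from typing import List


def plan_pdf_output_paths(
    input_path: str,
    page_count: int,
    output_root: str,
    first_page_only: bool = False,  # hypothetical flag name
) -> List[Path]:
    """Return the JPEG paths a PDF conversion would write (nothing is created here)."""
    stem = Path(input_path).stem  # assumption: drop the ".pdf" extension
    root = Path(output_root)
    if page_count < 1:
        return []
    if first_page_only:
        # Single image next to the other outputs: no per-PDF directory,
        # no "page_" part in the name.
        return [root / f"{stem}.jpg"]
    # Multi-page mode (assumed layout): one directory per PDF,
    # one numbered image per page.
    pdf_dir = root / stem
    return [pdf_dir / f"{stem}_page_{i}.jpg" for i in range(1, page_count + 1)]
```

The caller would create the per-PDF directory only when the returned paths actually live inside one, which keeps the first-page case free of empty directories.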

Udayraj123 · 2024-10-07T01:24:40Z

+    os.makedirs(output_dir, exist_ok=True)
+    extensions = ['png', 'jpg', 'jpeg', 'tiff', 'tif', 'pdf']
+
+    filepaths = walk_and_extract_files(input_dir, extensions)


snake case

Suggested change

-    filepaths = walk_and_extract_files(input_dir, extensions)
+    file_paths = walk_and_extract_files(input_dir, extensions)

Udayraj123 · 2024-10-07T01:26:20Z

+    Bulk converts images and PDFs to the specified format.
+    """
+    os.makedirs(output_dir, exist_ok=True)
+    extensions = ['png', 'jpg', 'jpeg', 'tiff', 'tif', 'pdf']


Let's keep pdf handling in a separate script (check other comments)

Suggested change

-    extensions = ['png', 'jpg', 'jpeg', 'tiff', 'tif', 'pdf']
+    extensions = ['png', 'jpg', 'jpeg', 'tiff', 'tif']

Udayraj123 · 2024-10-07T01:30:03Z

+    for input_path in filepaths:
+        relative_path = os.path.relpath(os.path.dirname(input_path), input_dir)
+        output_subdir = os.path.join(output_dir, relative_path) if not in_place else os.path.dirname(input_path)
+        os.makedirs(output_subdir, exist_ok=True)
+


The in_place flag handling can be a common bulk ops utility. I propose we define a function bulk_apply_on_files(func, input_dir, file_paths, in_place)

The first arg func can be convert_image or convert_pdf_to_image

The bulk_apply_on_files applies this passed function with the same arguments

if in_place is True and the conversion was successful, the input file gets removed (pdf or image) and the output files are added in the same directory

if in_place is False and the conversion was successful, the output files are added in a relative directory output_dir = f"{input_dir}/outputs/"

If the conversion was unsuccessful, no file changes happen
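A minimal sketch of how the proposed bulk_apply_on_files could behave, under a few assumptions that are not settled in this thread: func takes (input_path, output_dir) and raises on failure, outputs are staged in a temporary directory so that a failed conversion leaves the tree untouched, and the sub-directory layout is mirrored under outputs/ when in_place is False.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def bulk_apply_on_files(
    func: Callable[[Path, Path], None],  # assumed signature: func(input_path, output_dir)
    input_dir: str,
    file_paths: Iterable[str],
    in_place: bool,
) -> List[Path]:
    """Apply func to each file; return the inputs that were processed successfully."""
    input_root = Path(input_dir)
    processed: List[Path] = []
    for file_path in map(Path, file_paths):
        if in_place:
            target_dir = file_path.parent
        else:
            # Assumption: keep the relative folder structure under outputs/.
            relative_dir = file_path.parent.relative_to(input_root)
            target_dir = input_root / "outputs" / relative_dir
        with tempfile.TemporaryDirectory() as staging:
            staging_dir = Path(staging)
            try:
                func(file_path, staging_dir)
            except Exception as error:
                # Unsuccessful conversion: nothing was written outside staging.
                print(f"Skipping {file_path}: {error}")
                continue
            target_dir.mkdir(parents=True, exist_ok=True)
            if in_place:
                file_path.unlink()  # the input (pdf or image) is replaced
            for produced in staging_dir.iterdir():
                shutil.move(str(produced), str(target_dir / produced.name))
        processed.append(file_path)
    return processed
```

Staging first costs an extra move per file, but it is one simple way to get the "no file changes on failure" guarantee without tracking partially written outputs.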

Let's discuss more about this in the discord group if needed.

Udayraj123 · 2024-10-07T01:33:31Z

-
-    converted_count = convert_images_in_tree(args)
-    trigger_size = args["trigger_size"]
+    converted_count = convert_images(args.filenames, args.format.upper())


Here we can call the function bulk_apply_on_files(convert_image, input_dir, file_paths, in_place) or something similar based on our discussion (check other comments below)

Udayraj123 · 2025-05-24T12:52:18Z

Closing this for inactivity. Feel free to reopen it, or anyone else who wants to continue this task is welcome to pick it up.

Update image conversion script to support multiple formats

65ad6ae

Udayraj123 requested changes Oct 3, 2024

Comment thread scripts/hooks/convert_images_hook.py Outdated

Comment thread scripts/hooks/convert_images_hook.py Outdated

changes on the basis of hooks and bulk conversion

84ddb53

Anushlinux requested a review from Udayraj123 October 5, 2024 21:07

Udayraj123 requested changes Oct 7, 2024

Udayraj123 added the hacktoberfest label Oct 7, 2024

Udayraj123 closed this May 24, 2025
