Hash input file contents for imageproc output filenames #2886

lpulley · 2025-05-10T04:17:00Z

It doesn't seem quite right to me that resize_image's output filename hash reads the path of the input file; shouldn't it read the contents instead? That way:

- if an input file is modified in place, Zola will not reuse the now-stale processed file
- if an input file is moved but retains the original filename, Zola will be able to reuse the existing processed file
  - (This one raises the question: should the original filename be included in the processed output filename at all? Should the processed output filename just be a hash digest and an extension? It would minimize the impact of renames and maybe even allow multiple identical input files to reuse a single processed output file.)
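To make the difference concrete, here is a minimal sketch of the two naming strategies. It is not the actual imageproc code: `filename_from_path`, `filename_from_contents`, and the `op` string are hypothetical names used only for illustration, and the digest-plus-extension output format is the variant floated in the parenthetical above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical helper: the output name depends on the input *path*.
/// Moving the file changes the name; editing it in place does not.
fn filename_from_path(input_path: &str, op: &str, ext: &str) -> String {
    let mut hasher = DefaultHasher::new();
    hasher.write(input_path.as_bytes());
    hasher.write(op.as_bytes());
    format!("{:016x}.{}", hasher.finish(), ext)
}

/// Hypothetical helper: the output name depends on the input *contents*.
/// Editing the file changes the name; moving or renaming it does not.
fn filename_from_contents(contents: &[u8], op: &str, ext: &str) -> String {
    let mut hasher = DefaultHasher::new();
    // Length prefix keeps the contents and the op string from running together.
    hasher.write_usize(contents.len());
    hasher.write(contents);
    hasher.write(op.as_bytes());
    format!("{:016x}.{}", hasher.finish(), ext)
}
```

With the contents-based variant, two identical inputs at different paths map to the same output name, which is what would allow a single processed file to be shared.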

(Also, GetImageMetadata::call can just use the full path to the file as its cache key, so the return signature of search_for_file can be simplified.)

Since #2862/#2872 have changed the hash behavior on next, I figure this is maybe a good time to consider this, too, as it also affects the imageproc hash behavior.

lpulley · 2025-06-06T21:21:37Z

@Keats do I have a good idea here? Bumping because of:

> Since #2862/#2872 have changed the hash behavior on next, I figure this is maybe a good time to consider this, too, as it also affects the imageproc hash behavior.

lpulley · 2025-06-08T20:03:36Z

Hm... this is vulnerable to the input contents changing between enqueueing and image processing. I suppose the queue would have to hold the input contents instead of the input path?
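One way to picture the "queue holds the contents" idea is sketched below. `QueuedOp` and `enqueue` are hypothetical names, not Zola's types. The assumption is that the bytes are snapshotted when the operation is enqueued and the output name is derived from that same snapshot, so a later change on disk cannot make the name and the processed data disagree.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::sync::Arc;

/// Hypothetical queued operation: owns a snapshot of the input bytes
/// instead of a path that would be re-read at processing time.
struct QueuedOp {
    input: Arc<[u8]>,
    output_name: String,
}

/// Hypothetical enqueue step: hash the snapshot and keep it with the op.
fn enqueue(input: Arc<[u8]>, op: &str, ext: &str) -> QueuedOp {
    let mut hasher = DefaultHasher::new();
    hasher.write(&input);
    hasher.write(op.as_bytes());
    let output_name = format!("{:016x}.{}", hasher.finish(), ext);
    QueuedOp { input, output_name }
}
```

The trade-off is memory: every queued input stays resident until its operation has run.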

Keats · 2025-06-09T19:56:28Z

I think that makes sense

Keats

I like the idea, but the performance changes seem to be a net negative, even when you take into account regenerating the image if we change the filename.

Keats · 2025-06-09T20:00:49Z

components/imageproc/src/helpers.rs

+) -> Result<String> {
    let mut hasher = DefaultHasher::new();
-    hasher.write(input_src.as_ref());
+    hasher.write(


Does that change perf significantly? I can imagine reading a whole site's worth of images is going to be much more computationally expensive than hashing filenames.

Also, it doesn't make sense to read the file multiple times: once for the filename and once to actually operate on it.

lpulley · 2025-06-09

Yeah, I've been thinking about how to approach this. We'd essentially want to read each input once (at most), and for each operation to hold a reference or key to the in-memory contents of the input for processing.

I'll push something for that if I figure out a good solution.
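A rough sketch of the "read each input at most once" idea follows. `InputCache` and its `reads` counter are hypothetical and are not what the PR implements. It assumes single-threaded access; sharing it across worker threads would need a lock or a concurrent map.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Hypothetical cache: reads each input file from disk at most once and
/// hands out cheap shared references to the in-memory bytes.
#[derive(Default)]
struct InputCache {
    contents: HashMap<PathBuf, Arc<[u8]>>,
    /// Number of actual disk reads, kept only to illustrate the behavior.
    reads: usize,
}

impl InputCache {
    fn get(&mut self, path: &Path) -> io::Result<Arc<[u8]>> {
        if let Some(bytes) = self.contents.get(path) {
            // Already loaded: clone the Arc (a refcount bump), not the data.
            return Ok(Arc::clone(bytes));
        }
        let bytes: Arc<[u8]> = fs::read(path)?.into();
        self.reads += 1;
        self.contents.insert(path.to_path_buf(), Arc::clone(&bytes));
        Ok(bytes)
    }
}
```

Both the filename hash and every operation on the same input could then take their bytes from the returned `Arc<[u8]>`, so the file is not read once for hashing and again for processing.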

Keats reviewed Jun 9, 2025


lpulley force-pushed the imageproc-hash-files-not-paths branch from 3801386 to 35110a5 on July 8, 2025 02:20

Keats force-pushed the next branch from 4c3a3a7 to bd270af on July 14, 2025 21:02

lpulley added 4 commits July 26, 2025 12:37

- Change get_processed_filename to hash input file contents (2357f11)
- Don't return unified path from search_for_file (d35b345)
- Update docs (10871ce)
- Update tests (a7b7acb)

lpulley force-pushed the imageproc-hash-files-not-paths branch from 35110a5 to a7b7acb on July 26, 2025 17:45
