jpegtran optimized file size regression between mozjpeg 3.1 and 4.1.1 #433
Comments
I noticed the same, and I have concluded that it's not a problem. After examining the files, I found that jpegtran can add markers that should be there but are missing from the input. I have decided that whenever the output size isn't exactly the same as the input size, I accept the output as better. I have written a program that automates jpegtran, and I simply overwrite the input file with whatever I get. I think you should too. What I want is a flag such that if the output would be identical to the input file, no output file is written but you still get exit code 0. There's nothing to do with such a file. All I can "do" is delete it.
You can use something like jpegsnoop and compare the larger file with the smaller one. All I found was some added markers. I really think they should be there. Browsers and picture viewers seem capable of dealing with such files.
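Until such a flag exists, the behaviour asked for above can be approximated in a wrapper. The following is a minimal sketch, not something jpegtran provides: the function name `discard_if_unchanged` is hypothetical, and it assumes jpegtran has already written its output file.

```python
import filecmp
import os


def discard_if_unchanged(src: str, out: str) -> bool:
    """Delete `out` when it is byte-identical to `src`.

    Returns True if `out` was kept (it differs from `src`),
    False if it was identical and has been removed.
    """
    # shallow=False forces a byte-by-byte comparison instead of
    # trusting file size and modification time.
    if filecmp.cmp(src, out, shallow=False):
        os.remove(out)
        return False
    return True
```

This still costs one extra read of both files, so it only removes the useless output file; it does not avoid the work of producing it.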
@FredWahl: You seem to have a different problem: for you, jpegtran creates an output file larger than the input file. In this issue, the output file of jpegtran (both 3.1 and 4.1.1) is smaller than the input file (so far so good), and the problem is that the output of jpegtran 4.1.1 is larger than the output of jpegtran 3.1.
This is a file size regression: a newer version of jpegtran generates a larger output file, even though one of the main goals of jpegtran is to make the output file small. This is a problem, because as a user I'd like to use the latest jpegtran and get the smallest possible (and feasible) valid JPEG file. Right now, to get it, I have to run many versions of jpegtran (from mozjpeg) and pick the smallest generated output file.
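The workaround described above (run several jpegtran versions, keep the smallest result) can be sketched roughly as follows. The helper names and the binary paths in the usage note are hypothetical. The flags `-copy none`, `-optimize` and `-progressive` are standard jpegtran switches, but whether they match the commands actually used in this issue is an assumption.

```python
import subprocess
from pathlib import Path


def run_jpegtran(binary: str, src: Path) -> bytes:
    """Run one jpegtran binary and return the optimized JPEG bytes."""
    # Without -outfile, jpegtran writes the result to stdout.
    # The flag set is an assumption, not taken from the issue.
    result = subprocess.run(
        [binary, "-copy", "none", "-optimize", "-progressive", str(src)],
        check=True,
        stdout=subprocess.PIPE,
    )
    return result.stdout


def pick_smallest(candidates: dict[str, bytes]) -> tuple[str, bytes]:
    """Return the (label, data) pair with the fewest bytes."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    label = min(candidates, key=lambda name: len(candidates[name]))
    return label, candidates[label]


def optimize_with_all(binaries: dict[str, str], src: Path, dst: Path) -> str:
    """Try every binary, write the smallest output to dst, return its label."""
    outputs = {label: run_jpegtran(path, src) for label, path in binaries.items()}
    label, data = pick_smallest(outputs)
    dst.write_bytes(data)
    return label
```

A call might look like `optimize_with_all({"3.1": "/opt/mozjpeg-3.1/bin/jpegtran", "4.1.1": "/opt/mozjpeg-4.1.1/bin/jpegtran"}, Path("lab.jpg"), Path("lab.min.jpg"))`, with the paths adjusted to wherever each build is installed.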
Actually, the change was introduced between versions 4.0.0 and 4.0.3. I've updated my original post with this finding.
This is interesting. My beloved Mac Pro crashed, so I had to find a backup of the source and make it work with a newer version. Then I noticed that jpegtran could create larger files. I tried to find some documentation that explained this, but I could not. Finally, I decided that these files were probably better and only somewhat larger.
I think the JPEG format is complicated. It could be progressive, baseline or arithmetic. There are many programs that generate JPEG files and also many that modify them in unexpected ways. It turns out that if you open a JPEG in a Microsoft picture viewer and then save the image somewhere, you get a re-encoded JPEG that differs from the file you opened. Some browsers are capable of opening damaged files while others fail. I have not been able to find a good open source tool that can repair JPEG files. I guess some developers are happy if they can generate something that their viewer can open. It doesn't mean that it is well formed. If jpegtran can do something about it, I think it should. A problem with the files you uploaded is that you have no idea how they may have been manipulated before you found them. I think it makes sense to use jpegsnoop or similar to see why one is larger than the other while still encoded the same way. BTW, I have suggested that there should be a flag that prevents jpegtran from creating an output file if it hasn't actually done anything with the input file. Such files serve no useful purpose, certainly not to my C/C++ program. It just makes things slower.
It looks like we (people who have commented on this issue so far) don't have a good understanding of what changed between jpegtran 4.0.0 and 4.0.3 to make the output JPEG file larger (than the previous output JPEG file). I opened this issue to get the mozjpeg developers and experts involved, develop this understanding, and then (if possible) revert the change, or at least provide a command-line flag to disable it. To facilitate this, I've run the JPEG analysis tool jpegdump.py (written in Python 2.x) on both lab31.jpg and lab411.jpg. (Unfortunately I don't have a Windows machine, so I'm not able to run jpegsnoop, and it's also a GUI program, so it's not easy to copy-paste its output here.)
It looks like both output files are progressive JPEGs (SOF2), they contain the minimal JFIF segment without thumbnail (APP0), they don't contain metadata beyond the minimal JFIF segment, and they contain identical quantization tables (DQT). However, they do the lossless entropy coding very differently (lab31.jpg contains 11 DHT–SOS segment pairs, lab411.jpg contains 9), and the lossless entropy coding in lab31.jpg is more efficient, serving as an example of the optimized file size regression between mozjpeg 4.0.0 and 4.0.3. mozjpeg developers, do you have any insights?
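For readers who want to reproduce the DHT/SOS counts without jpegdump.py or jpegsnoop, a marker census can be sketched in a few lines. This is a simplified illustration, not the code of either tool: the function name `count_markers` is hypothetical, and only a handful of marker names are labelled.

```python
from collections import Counter

# Names for the markers discussed in this issue; others are shown in hex.
MARKER_NAMES = {
    0xD8: "SOI", 0xD9: "EOI", 0xC0: "SOF0", 0xC2: "SOF2",
    0xC4: "DHT", 0xDA: "SOS", 0xDB: "DQT", 0xE0: "APP0",
}


def count_markers(data: bytes) -> Counter:
    """Count JPEG marker segments, skipping entropy-coded scan data."""
    counts = Counter()
    i = 0
    while i + 1 < len(data):
        if data[i] != 0xFF:
            i += 1  # entropy-coded data after an SOS header: keep scanning
            continue
        marker = data[i + 1]
        if marker == 0xFF:
            i += 1  # fill byte before a marker
            continue
        if marker == 0x00 or 0xD0 <= marker <= 0xD7:
            i += 2  # stuffed FF 00 or a restart marker inside a scan
            continue
        counts[MARKER_NAMES.get(marker, f"0x{marker:02X}")] += 1
        if marker in (0xD8, 0xD9, 0x01):
            i += 2  # SOI, EOI and TEM carry no length field
            continue
        if i + 4 > len(data):
            break  # truncated segment header
        length = int.from_bytes(data[i + 2:i + 4], "big")
        i += 2 + length  # the length field counts itself plus the payload
    return counts
```

Running it on both outputs, for example `count_markers(open("lab31.jpg", "rb").read())`, should show how many DHT and SOS segments each file contains, which is the difference described above.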
@pts said:
https://github.com/ImageProcessing-ElectronicPublications/jpegsnoop (cross-platform, CLI).
That sounds useful, but it's out-of-scope for this issue. You may want to open a separate issue for that. As of now, jpegtran of mozjpeg 4.1.1 isn't a general-purpose tool to repair broken JPEG files. For example, if we assume the contrary, and we also assume that lab31.jpg is broken, and run it through jpegtran of mozjpeg 4.1.1, we get an identical output JPEG file, meaning that lab31.jpg was not broken. But then why does jpegtran of mozjpeg 4.1.1 generate a larger output JPEG file for lab.jpg if it's a JPEG optimizer and its previous version was able to generate a smaller, valid output JPEG? (That's the question I have asked and clarified many times in this issue.)
I fail to see a problem concerning manipulation here. What I see is that I download JPEG file lab.jpg from a random source, and jpegtran of mozjpeg 4.1.1 creates an optimized output JPEG file which is larger than the optimized output JPEG file created by jpegtran of mozjpeg 3.1 for the same lab.jpg file. It doesn't matter how the original JPEG file lab.jpg was created (and/or manipulated), a JPEG optimizer (such as jpegtran of mozjpeg) shouldn't have such a file size regression, at least not without the developers understanding and documenting it. Generating larger output files without a good reason defeats the purpose of the JPEG optimizer.
That also sounds useful for some use cases, but it's also out-of-scope for this issue, and it's also independent from it (fixing one doesn't help the other). You may want to open a separate issue for that, so that they can be fixed independently.
@zvezdochiot said:
Thank you! I've run jpegsnoop and added links to its output to the issue opening message.
I have a similar problem with cjpeg: I updated from 3.1 to 4.1.3, and when I run the same command, I get a much bigger output.
I also had a case where the input JPEG was smaller than the output JPEG with quality set to 90 ... 🤔 What can we do about this? Did some defaults change that we now have to set manually?
So the problem affects both cjpeg and jpegtran? That seems to indicate that the problem lies in some routine that is common to both. It is a good thing that people pay attention to detail. In my case it was by accident. I'm surprised this was not found by the developers. It seems like a good idea to run a newer tool on the output picture of an older tool as well as using identical input files. It quickly becomes complicated when you have many tools and many commands. The tools I have used as a developer would not suffice, I think.
mozjpegtran in mozjpeg 4.1.1 (same output as 4.0.3) creates larger output files than the output files in mozjpeg 3.1 (same output as 4.0.0, 3.3.1 and 3.0) for some inputs (e.g. lab.jpg). For some other input files (e.g. lenna.jpg), it's the other way round.
Examples:
Input files:
Output files:
Output of the CLI version of jpegsnoop on various files:
jpegsnoop reports "ERROR: Early EOF - file may be missing EOI", but all other image processing software I tried can read, display and process these JPEG files without an error.

In general, most JPEGs taken by recent mobile phones are larger when optimized with version 4.1.1 than with 3.1.
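The "Early EOF - file may be missing EOI" message can be cross-checked by hand by looking for the EOI marker (bytes FF D9) near the end of the file. The sketch below is only a rough check with a hypothetical function name. It does not reproduce whatever logic jpegsnoop uses, and the byte pair FF D9 can in principle also occur inside a metadata segment.

```python
def eoi_status(data: bytes) -> str:
    """Describe where the last FF D9 byte pair sits in the file."""
    pos = data.rfind(b"\xff\xd9")
    if pos == -1:
        return "missing"
    if pos + 2 == len(data):
        return "at end"
    trailing = len(data) - pos - 2
    return f"followed by {trailing} trailing byte(s)"
```

For example, `eoi_status(open("lab411.jpg", "rb").read())` returning "at end" would suggest the file is terminated normally and the warning has another cause.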
Am I using jpegtran correctly? Is it possible to specify command-line flags for version 4.1.1 so that the output JPEG won't be larger than of 3.1?
I have tried all command-line flags of mozjpegtran in version 4.1.1, especially -fastcrush, -restart ..., -maxmemory ..., -maxscans ... and -verbose -verbose -verbose -verbose -verbose, but they didn't make the JPEG output file lab411.jpg as small as lab31.jpg.

If not, then could you please fix it in the next version? I'd like to use jpegtran of a recent mozjpeg for lossless JPEG optimization, but I'd like to keep the output file as small as possible.