2/4 Support BigTIFF encoded TIFF data #584

don-vip · 2022-06-18T21:07:18Z

This is the second step towards solving #278

Port of these commits from drewnoakes/metadata-extractor-dotnet#302:

The PR is built on top of #581

drewnoakes

Looks good. As with your other PRs, I'm dropping my thoughts here as a note to myself for when I pull this down to play with it, hopefully later today.

drewnoakes · 2022-06-19T22:45:09Z

Source/com/drew/imaging/tiff/TiffReader.java

-        final int tiffMarker = reader.getUInt16(2);
+        final short tiffMarker = (short) reader.getUInt16(2);


I'm not totally sure about this change, as Java doesn't have an unsigned short type (unlike .NET where this change is being ported from). Generally throughout the Java library we widen the type to allow the full unsigned range to remain unsigned. In this case though it's not really a numeric value used for any kind of size or comparison, but just an identifier. That said, it does change the public API, which is a breaking change. There are other breaking changes in these PRs though.
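As a rough illustration of the trade-off described above, the sketch below shows what happens when a 16-bit unsigned value is narrowed to a Java `short`. The class name and the `toUnsigned` helper are hypothetical and not part of the library; the marker values 0x002A and 0x002B are the standard TIFF and BigTIFF version numbers.

```java
public class TiffMarkerDemo {
    // Version markers for standard TIFF (42) and BigTIFF (43).
    static final int STANDARD_TIFF_MARKER = 0x002A;
    static final int BIG_TIFF_MARKER = 0x002B;

    /** Hypothetical helper: widens a marker held in a short back to its unsigned value. */
    static int toUnsigned(short marker) {
        return marker & 0xFFFF; // same result as Short.toUnsignedInt(marker) on Java 8+
    }

    public static void main(String[] args) {
        // Small identifiers survive the narrowing cast unchanged.
        short standard = (short) STANDARD_TIFF_MARKER;
        System.out.println(standard == STANDARD_TIFF_MARKER); // true

        // Values at or above 0x8000 become negative when stored in a short.
        short high = (short) 0x8000;
        System.out.println(high);             // -32768
        System.out.println(toUnsigned(high)); // 32768
    }
}
```

Because both real marker values are far below 0x8000, the cast is harmless when the value is used purely as an identifier; the sign only matters if the value is ever compared or printed as a number.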

Just dumping out my thoughts here. I'm going to pull down your work here and take it for a spin before merging.

drewnoakes · 2022-06-19T22:47:51Z

Source/com/drew/lang/RandomAccessReader.java

+     * @return the 64 bit int value, between 0x0000000000000000 and 0xFFFFFFFFFFFFFFFF
+     * @throws IOException the buffer does not contain enough bytes to service the request, or index is negative
+     */
+    public long getUInt64(int index) throws IOException


Java doesn't have an unsigned long, so this method is redundant as far as I can tell. Keeping it does at least allow calling code to express the intent, despite the limitation of the language.

I notice though that, compared to getInt64 above, there are a bunch of & masks in that method that may be redundant (and possibly in others too).

don-vip · 2022-06-20

You're right, I wasn't sure what to do. It mainly helps to compare the .NET and Java codebases.
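To make the point about unsigned 64-bit values concrete, here is a minimal sketch, assuming Java 8 or later for the `Long.toUnsignedString` and `Long.compareUnsigned` helpers. The `readInt64BigEndian` method is a hypothetical stand-in, not the library's `getInt64` or `getUInt64`; it shows that a signed and an "unsigned" read produce the same 64 bits and differ only in how the caller interprets them.

```java
public class UnsignedLongDemo {
    /** Hypothetical stand-in for a big-endian 64-bit read; not the library's implementation. */
    static long readInt64BigEndian(byte[] buffer, int index) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            // Java bytes are signed, so mask each one to 0..255 before merging it in.
            value = (value << 8) | (buffer[index + i] & 0xFFL);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] allOnes = {-1, -1, -1, -1, -1, -1, -1, -1};
        long bits = readInt64BigEndian(allOnes, 0);

        System.out.println(bits);                               // -1
        System.out.println(Long.toUnsignedString(bits));        // 18446744073709551615
        System.out.println(Long.compareUnsigned(bits, 1L) > 0); // true
    }
}
```

The per-byte mask is the one that cannot be dropped, because without it a negative byte would sign-extend and overwrite the higher-order bits.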

drewnoakes · 2022-06-19T22:55:31Z

Source/com/drew/imaging/tiff/TiffReader.java

+                    ? reader.getUInt64(finalTagOffset)
+                    : reader.getUInt32(finalTagOffset);
+
+            if (nextIfdOffsetLong != 0 && nextIfdOffsetLong <= Integer.MAX_VALUE) {


The fact that getUInt64 might return a negative value is a problem here. I'm wondering if that method should be changed to throw if the value is actually negative. Otherwise we have to do things like check nextIfdOffsetLong < 0 here (and possibly elsewhere).

On the other hand, throwing might lead to failing to read data that would otherwise be safe to read.

I really wish Java had unsigned numeric types!
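One possible way to handle this without throwing is sketched below. It assumes the 64-bit offset arrives as a signed `long`, so any unsigned value above `Long.MAX_VALUE` shows up as negative. The class and method names are hypothetical; the point is only that the check in the diff (`!= 0` and `<= Integer.MAX_VALUE`) also accepts negative values, whereas `> 0` rejects them.

```java
public class IfdOffsetDemo {
    /**
     * Hypothetical helper: true when an offset read as a signed long is non-zero
     * and small enough to be used as an int index.
     */
    static boolean isUsableOffset(long offset) {
        // A negative value means the unsigned offset exceeded Long.MAX_VALUE,
        // so it is rejected together with zero ("no further IFD").
        return offset > 0 && offset <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(isUsableOffset(8L));                           // true
        System.out.println(isUsableOffset(0L));                           // false
        System.out.println(isUsableOffset(-1L));                          // false
        System.out.println(isUsableOffset((long) Integer.MAX_VALUE + 1)); // false
    }
}
```

With this shape the reader simply stops following the IFD chain for out-of-range offsets, and data read before that point stays available.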


don-vip · 2024-07-29T21:25:46Z

This one looks good too, I see the Java <> .NET differences disappear for BigTIFF files

don-vip marked this pull request as ready for review June 18, 2022 21:13

don-vip mentioned this pull request Jun 18, 2022

3/4 Make RandomAccessReader.isMotorolaByteOrder read only #585

Open

don-vip changed the title ~~Support BigTIFF encoded TIFF data~~ 2/4 Support BigTIFF encoded TIFF data Jun 19, 2022

drewnoakes reviewed Jun 19, 2022

don-vip and others added 3 commits July 20, 2024 16:37

Simplify TIFF processing by 'shifting' base of indexed reader, rather than passing TIFF header offsets around everywhere.

3a975db

Pass start offset when reading Exif

675ab95

Add IndexedReader.GetUInt64

f5ddfcd

don-vip force-pushed the bigtiff branch from b53e3ee to 4ba3ebe July 29, 2024 21:20

don-vip added 3 commits July 29, 2024 23:23

Support BigTIFF encoded TIFF data

6964deb

Note that while BigTIFF supports files greater than 2 GiB in size, our current implementation does not due to the pervasive use of Int32 throughout the code to represent offsets into the data.

Change ITiffHandler.SetTiffMarker to accept ushort

dec9b6b

This will only ever be a 16-bit value.

Pass Set for processed IFD offsets

499ea8a

This allows combining the add and test operations into a single lookup.
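The "single lookup" mentioned in this commit message relies on `java.util.Set.add` returning `false` when the element is already present. The sketch below shows the idea with a hypothetical `markVisited` helper and illustrative offsets; it is not the code from the commit.

```java
import java.util.HashSet;
import java.util.Set;

public class ProcessedIfdDemo {
    /** Hypothetical helper: records the offset and reports whether it was seen for the first time. */
    static boolean markVisited(Set<Integer> processedIfdOffsets, int ifdOffset) {
        // Set.add inserts and tests membership in one call, so no separate contains() is needed.
        return processedIfdOffsets.add(ifdOffset);
    }

    public static void main(String[] args) {
        Set<Integer> processed = new HashSet<Integer>();
        System.out.println(markVisited(processed, 8));   // true: first visit
        System.out.println(markVisited(processed, 128)); // true: first visit
        System.out.println(markVisited(processed, 8));   // false: already processed, skip it
    }
}
```

A `false` result is the signal to skip the IFD, which guards against files whose IFD chain loops back on itself.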

don-vip force-pushed the bigtiff branch from 4ba3ebe to 499ea8a July 29, 2024 21:24
