Skip to content

Conversation

@grantatspothero
Copy link
Contributor

Description

Add support for timestamp-nanos to NativeLogicalTypesAvroTypeBlockHandler.

Note this does not change the behavior of the hive connector, HiveAvroTypeBlockHandler still only supports timestamp-millis. timestamp-micros and timestamp-nanos logical types will continue to be ignored.

Additional context and related issues

Avro 1.12 spec added timestamp-nanos and local-timestamp-nanos logical types: https://avro.apache.org/docs/1.12.0/specification/

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

Change should not be visible by users, hive connector behavior is unchanged.

@cla-bot cla-bot bot added the cla-signed label Dec 3, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for the timestamp-nanos logical type to the native Avro type management system, following the Avro 1.12 specification. The change extends NativeLogicalTypesAvroTypeBlockHandler while maintaining backward compatibility by keeping the Hive connector's behavior unchanged (it continues to ignore timestamp-micros and timestamp-nanos logical types).

  • Added TimestampNanosLogicalType record class and associated constants
  • Implemented encoding/decoding logic for nanosecond-precision timestamps using LongTimestamp
  • Added comprehensive test coverage including edge cases with negative epoch timestamps

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
lib/trino-hive-formats/src/main/java/io/trino/hive/formats/avro/model/AvroLogicalType.java Added TimestampNanosLogicalType record and TIMESTAMP_NANOS/LOCAL_TIMESTAMP_NANOS constants with parsing logic
lib/trino-hive-formats/src/main/java/io/trino/hive/formats/avro/NativeLogicalTypesAvroTypeManager.java Added TIMESTAMP_NANOS_SCHEMA, type mapping, and bidirectional conversion functions between Trino timestamps and Avro nanosecond timestamps
lib/trino-hive-formats/src/main/java/io/trino/hive/formats/avro/NativeLogicalTypesAvroTypeBlockHandler.java Implemented TimestampNanosBlockBuildingDecoder to decode nanosecond timestamps into LongTimestamp blocks
lib/trino-hive-formats/src/main/java/io/trino/hive/formats/avro/HiveAvroTypeBlockHandler.java Updated pattern matching to ignore TimestampNanosLogicalType, maintaining existing Hive connector behavior
lib/trino-hive-formats/src/test/java/io/trino/hive/formats/avro/TestAvroPageDataReaderWithAvroNativeTypeManagement.java Added timestamp-nanos to existing tests and new comprehensive test for negative epoch timestamps, plus incidental bug fix for timestamp-micros test data

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

yield (block, position) ->
{
SqlTimestamp timestamp = ((SqlTimestamp) timestampType.getObjectValue(block, position));
return (timestamp.getEpochMicros() * NANOSECONDS_PER_MICROSECOND) + (timestamp.getPicosOfMicros() / PICOSECONDS_PER_NANOSECOND);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guard for overflow.
add a test for 9999-12-31 timestamps (surprisingly popular in real world data)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guarded for overflow and added test for longest representable timestamp nanos.

Comment on lines +318 to +348
try (AvroFileWriter fileWriter = new AvroFileWriter(
trinoLocalFilesystem.newOutputFile(testLocation).create(),
timestampsSchema,
new NativeLogicalTypesAvroTypeManager(),
AvroCompressionKind.NULL,
ImmutableMap.of(),
timestampsSchema.getFields().stream().map(Schema.Field::name).collect(toImmutableList()),
new NativeLogicalTypesAvroTypeBlockHandler().typeFor(timestampsSchema).getTypeParameters(), false)) {
fileWriter.write(timestampsPage);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be the only test. NativeLogicalTypesAvroTypeManager is our class and its logic may or may not be entirely correct.

There should be a test with hive writing the file. See example existing test

"TBLPROPERTIES ('avro.schema.literal'='%s')",
SCHEMA_LITERAL));

Copy link
Contributor Author

@grantatspothero grantatspothero Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added hive connector product test, note the test indicates no change in behavior for hive connector

@grantatspothero grantatspothero force-pushed the gn/supportAvroTimestampNanos branch from e303348 to b223b3c Compare December 9, 2025 15:57
@grantatspothero grantatspothero force-pushed the gn/supportAvroTimestampNanos branch from b223b3c to e4a37fb Compare December 9, 2025 16:02
Only support in native block handler, not hive block handler
@grantatspothero grantatspothero force-pushed the gn/supportAvroTimestampNanos branch from e4a37fb to 86fe3b5 Compare December 9, 2025 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants