GH-463: Improve TZ support for JDBC driver #464
.../apache/arrow/driver/jdbc/accessor/impl/calendar/ArrowFlightJdbcTimeStampVectorAccessor.java
@@ -120,7 +120,12 @@ public static int getSqlTypeIdFromArrowType(ArrowType arrowType) {
case Time:
return Types.TIME;
case Timestamp:
return Types.TIMESTAMP;
String tz = ((ArrowType.Timestamp) arrowType).getTimezone();
if (tz != null){
This makes sense.
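The hunk above is cut off after the null check, so the following is only a minimal sketch of the mapping the PR description implies: a timestamp type that carries a timezone maps to `Types.TIMESTAMP_WITH_TIMEZONE`, and one without maps to `Types.TIMESTAMP`. The class name `TimestampTypeSketch` and the method `sqlTypeForTimestamp` are hypothetical, and the timezone is passed as a plain string so the example does not depend on Arrow.

```java
import java.sql.Types;

public class TimestampTypeSketch {
    // Hypothetical helper: 'tz' stands in for ArrowType.Timestamp#getTimezone(),
    // which is null when the Arrow type carries no timezone.
    public static int sqlTypeForTimestamp(String tz) {
        // Assumption based on the PR description: TZ-aware vectors are
        // reported as TIMESTAMP_WITH_TIMEZONE, all others as TIMESTAMP.
        return tz != null ? Types.TIMESTAMP_WITH_TIMEZONE : Types.TIMESTAMP;
    }

    public static void main(String[] args) {
        System.out.println(sqlTypeForTimestamp(null) == Types.TIMESTAMP);
        System.out.println(sqlTypeForTimestamp("UTC") == Types.TIMESTAMP_WITH_TIMEZONE);
    }
}
```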
- I tested this PR against the scenarios in "`java.time.Instant` obtained through Arrow Flight JDBC Driver is 16 hours different from the original timestamp" (#636) and "Timestamps queried from InfluxDB 3 Core via JDBC Driver are inconsistent with those inserted" (influxdata/influxdb#25983). From my point of view, the current PR still handles `java.time.Instant` incorrectly.
- The `java.time.Instant` obtained from the query differs from the original timestamp by 8 hours, which is still unreasonable. See linghengqian/influxdb-3-core-jdbc-test@8ae1aaa.
sdk install java 21.0.6-ms
sdk use java 21.0.6-ms
git clone git@github.com:aiguofer/arrow-java.git -b improved_tz_support
cd ./arrow-java/
mvn clean install -DskipTests -Dspotless.check.skip=true
cd ../
git clone git@github.com:linghengqian/influxdb-3-core-jdbc-test.git
cd ./influxdb-3-core-jdbc-test/
sdk use java 21.0.6-ms
./mvnw -T 1C -Dtest=TimeDifferenceTest clean test
Error log of the unit test:
$ ./mvnw -T 1C -Dtest=TimeDifferenceTest clean test
[INFO] Scanning for projects...
[INFO]
[INFO] Using the MultiThreadedBuilder implementation with a thread count of 16
[INFO]
[INFO] ----------< io.github.linghengqian:influxdb-3-core-jdbc-test >----------
[INFO] Building influxdb-3-core-jdbc-test 1.0-SNAPSHOT
[INFO] from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- clean:3.2.0:clean (default-clean) @ influxdb-3-core-jdbc-test ---
[INFO] Deleting /home/linghengqian/TwinklingLiftWorks/git/public/influxdb-3-core-jdbc-test/target
[INFO]
[INFO] --- resources:3.3.1:resources (default-resources) @ influxdb-3-core-jdbc-test ---
[INFO] skip non existing resourceDirectory /home/linghengqian/TwinklingLiftWorks/git/public/influxdb-3-core-jdbc-test/src/main/resources
[INFO]
[INFO] --- compiler:3.13.0:compile (default-compile) @ influxdb-3-core-jdbc-test ---
[INFO] No sources to compile
[INFO]
[INFO] --- resources:3.3.1:testResources (default-testResources) @ influxdb-3-core-jdbc-test ---
[INFO] skip non existing resourceDirectory /home/linghengqian/TwinklingLiftWorks/git/public/influxdb-3-core-jdbc-test/src/test/resources
[INFO]
[INFO] --- compiler:3.13.0:testCompile (default-testCompile) @ influxdb-3-core-jdbc-test ---
[INFO] Recompiling the module because of changed source code.
[INFO] Compiling 7 source files with javac [debug target 21] to target/test-classes
[INFO]
[INFO] --- surefire:3.5.2:test (default-test) @ influxdb-3-core-jdbc-test ---
[INFO] Using auto detected provider org.apache.maven.surefire.junitplatform.JUnitPlatformProvider
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running io.github.linghengqian.TimeDifferenceTest
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
Feb 25, 2025 11:11:36 AM org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.memory.BaseAllocator <clinit>
INFO: Debug mode disabled. Enable with the VM option -Darrow.memory.debug.allocator=true.
Feb 25, 2025 11:11:36 AM org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.memory.DefaultAllocationManagerOption getDefaultAllocationManagerFactory
INFO: allocation manager type not specified, using netty as the default type
Feb 25, 2025 11:11:36 AM org.apache.arrow.driver.jdbc.shaded.org.apache.arrow.memory.CheckAllocator reportResult
INFO: Using DefaultAllocationManager at memory/netty/DefaultAllocationManagerFactory.class
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.419 s <<< FAILURE! -- in io.github.linghengqian.TimeDifferenceTest
[ERROR] io.github.linghengqian.TimeDifferenceTest.test -- Time elapsed: 3.354 s <<< FAILURE!
java.lang.AssertionError:
Expected: is <2025-02-25T03:11:24.614425710Z>
but: was <2025-02-24T19:11:24.614425710Z>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at io.github.linghengqian.TimeDifferenceTest.queryDataByJdbcDriver(TimeDifferenceTest.java:89)
at io.github.linghengqian.TimeDifferenceTest.test(TimeDifferenceTest.java:50)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TimeDifferenceTest.test:50->queryDataByJdbcDriver:89
Expected: is <2025-02-25T03:11:24.614425710Z>
but: was <2025-02-24T19:11:24.614425710Z>
[INFO]
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4.894 s (Wall Clock)
[INFO] Finished at: 2025-02-25T11:11:38+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.5.2:test (default-test) on project influxdb-3-core-jdbc-test: There are test failures.
[ERROR]
[ERROR] See /home/linghengqian/TwinklingLiftWorks/git/public/influxdb-3-core-jdbc-test/target/surefire-reports for the individual test results.
[ERROR] See dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
- Another issue not resolved by the current PR is that the files are not formatted, so the build needs the command-line argument `-Dspotless.check.skip=true`.
@linghengqian I looked at your test, but there are a few things that are not immediately clear.
If you're in the Shanghai TZ, it's possible the 8-hour difference you see is due to your local laptop timezone.
Does Influx adjust the TZ for the Point based on its location? (That is, does it convert `magicTime` to London TZ?)
- @aiguofer I'm just quickly quoting an influxdb committer. See "Timestamps queried from InfluxDB 3 Core via JDBC Driver are inconsistent with those inserted" (influxdata/influxdb#25983 (comment)).
FWIW `influxdb3` assumes the written time is UTC, and stores as UTC.
Which timestamp vector does it use for transport when querying `time`? Does it include TZ info or not?
- From my perspective, the time obtained by querying InfluxDB 3 Core through the InfluxDB 3 Java client carries no time zone information (see https://github.com/linghengqian/influxdb-3-core-jdbc-test/blob/master/src/test/java/io/github/linghengqian/influxdb3java/PointValuesTest.java). The time obtained by querying InfluxDB 3 Core through the Arrow Flight Java API is a `java.time.OffsetDateTime` converted from the timestamp in the UTC time zone, which is then converted from `OffsetDateTime` to the final `java.time.LocalDateTime` (see https://github.com/linghengqian/influxdb-3-core-jdbc-test/blob/master/src/test/java/io/github/linghengqian/FlightSqlTest.java). The handling via the Arrow Flight JDBC driver evidently requires the fix in the current PR.
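The conversion chain described in this comment (raw timestamp, then `OffsetDateTime` in UTC, then `LocalDateTime`) can be sketched with `java.time` alone. This is an illustration under assumptions, not the code from the linked test: the class name `UtcChainSketch`, the helper `toUtcLocalDateTime`, and the choice of nanoseconds since the epoch as the raw unit are all hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

public class UtcChainSketch {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Hypothetical helper: raw epoch nanoseconds -> OffsetDateTime at UTC -> LocalDateTime.
    public static LocalDateTime toUtcLocalDateTime(long epochNanos) {
        Instant instant = Instant.ofEpochSecond(
                Math.floorDiv(epochNanos, NANOS_PER_SECOND),
                Math.floorMod(epochNanos, NANOS_PER_SECOND));
        OffsetDateTime utc = instant.atOffset(ZoneOffset.UTC);
        // Dropping the offset keeps the UTC wall-clock fields unchanged.
        return utc.toLocalDateTime();
    }

    public static void main(String[] args) {
        // Timestamp taken from the failing test's expected value.
        Instant original = Instant.parse("2025-02-25T03:11:24.614425710Z");
        long epochNanos = original.getEpochSecond() * NANOS_PER_SECOND + original.getNano();
        System.out.println(toUtcLocalDateTime(epochNanos));
    }
}
```

Because the offset is fixed to UTC before it is dropped, the result does not depend on the JVM's default time zone.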
Hopefully/presumably the `Instant` value should be correct, though? Although I see it's an `Instant` obtained from the `java.sql.Timestamp`, which is already suspect (it's never been clear to me what the expectation is for the semantics of that class).
- @lidavidm Sorry, I made a mistake earlier. I forgot that `java.sql.Timestamp#toInstant()` will use the device's time zone information. I have updated the processing at linghengqian/influxdb-3-core-jdbc-test@489f4d7 and verified that the current PR handles `java.time.Instant` correctly. My WSL environment uses the `Asia/Shanghai` time zone.
- Overall, I think there are no issues with the current PR except for the unformatted files, and it would be great to see the current PR merged in the `19.0.0` milestone.
$ mvn clean install -DskipTests
[ERROR] Failed to execute goal com.diffplug.spotless:spotless-maven-plugin:2.30.0:check (spotless-check) on project flight-sql-jdbc-core: The following files had format violations:
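The 8-hour gap in the earlier test failure can be reproduced without a database. The sketch below assumes the stored value is a UTC wall-clock time and compares reading it as UTC with reading it in `Asia/Shanghai`; the class `TimestampZoneSketch` and its helper methods are hypothetical names for illustration. The last line of `main` shows the legacy path through `java.sql.Timestamp`, whose result depends on the JVM's default time zone.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampZoneSketch {
    // Reads the wall-clock fields as if they were local to 'zone'. This mimics
    // building a java.sql.Timestamp from date-time fields in a JVM whose
    // default zone is 'zone' and then calling toInstant().
    public static Instant interpretedIn(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toInstant();
    }

    // Reads the wall-clock fields as UTC, independent of the JVM default zone.
    public static Instant interpretedAsUtc(LocalDateTime wallClock) {
        return wallClock.toInstant(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        LocalDateTime stored = LocalDateTime.parse("2025-02-25T03:11:24.614425710");
        System.out.println(interpretedAsUtc(stored));
        System.out.println(interpretedIn(stored, ZoneId.of("Asia/Shanghai")));
        // Legacy path: the printed instant varies with the JVM default time zone.
        System.out.println(Timestamp.valueOf(stored).toInstant());
    }
}
```

Read as UTC, the value matches the test's expected instant; read in `Asia/Shanghai` (UTC+8), it lands 8 hours earlier, which matches the instant the failing test actually reported.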
Awesome! I can easily make the formatting fixes for now and remove the draft status to get some more reviews. Depending on how others feel about unit tests, it would be great if someone could help contribute the tests.
Ok I tried to fix the existing tests but now I'm worried that the change to not modify the offset for Timestamp vectors w/o TZ info might have an inconsistent behavior with I also noticed that what we currently do for There's a lot of inter-related parts here and breaking changes could be troublesome. I'm going to change behavior so that usage of the "legacy" JDBC date/time objects stays the same and we can simply recommend that users use
Force-pushed from ddd4bb3 to ee121e5.
I'll take a look. That sounds reasonable as a first step, but we should probably fix things longer term if we can. @laurentgo not sure if you have ideas on how the native JDBC datetime types are supposed to behave; I found the documentation rather under-specified...
...che/arrow/driver/jdbc/accessor/impl/calendar/ArrowFlightJdbcTimeStampVectorAccessorTest.java
Awesome, thanks! I updated a few more things including a table in the PR description showing different behaviors for I'm not sure how much more I'll be able to work on this over the next few weeks but I'll try to keep up with it!
This PR adds support for natively fetching `java.time.*` objects through the JDBC driver.
- `DateVector`
- `DateTimeVector`
- `TimeVector`
This PR also changes the behavior for vectors that include TZ info. These will now be returned as `TIMESTAMP_WITH_TIMEZONE`.
The behavior for the different ways to access a `TimeStampVector` is as follows:
Closes #463