Skip to content

Conversation

Aggarwal-Raghav
Copy link
Contributor

@Aggarwal-Raghav Aggarwal-Raghav commented Oct 6, 2025

What changes were proposed in this pull request?

HIVE-29246, Upgrade derby version to 10.17.1.0

Derby project has done refactoring:

  1. org.apache.derby.jdbc.EmbeddedDriver is now part of derbytools jar
  2. org.apache.derby.security.SystemPermission → org.apache.derby.shared.common.security.SystemPermission (this is part of derbyshared jar which is compile time dependency for org.apache.derby:derby)
  3. org.apache.derby.jdbc.AutoloadedDriver → org.apache.derby.iapi.jdbc.AutoloadedDriver

Why are the changes needed?

This will help with the CVE-2022-46337 and derby 10.17.1.0 uses JDK21, its inline with hive master branch

Does this PR introduce any user-facing change?

If users are using org.apache.derby.jdbc.EmbeddedDriver then they need to change the DriverName to org.apache.derby.iapi.jdbc.AutoloadedDriver

How was this patch tested?

Local setup testing. Will see CI outcome

@Aggarwal-Raghav
Copy link
Contributor Author

CC @zabetak , if you have bandwidth can you please review this?

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes LGTM. One thing that I wanted to clarify is the distinction between derby and derbytools artifacts. Do we need both of them in the project? If the driver is now inside derbytools what does derby contain?

public class DerbySQLConnectorProvider extends AbstractJDBCConnectorProvider {
private static Logger LOG = LoggerFactory.getLogger(DerbySQLConnectorProvider.class);

// private static final String DRIVER_CLASS = "org.apache.derby.jdbc.EmbeddedDriver".intern();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Drop the comment

If other changes are needed for the PR then we can also tackle this nit otherwise let's leave it as is. Having a full precommit run just for deleting a comment isn't worth it.

@zabetak
Copy link
Member

zabetak commented Oct 9, 2025

Following up on the question about derby and derbytools I am a bit unsure why the tools are needed. I was under the impression that the derby artifact has everything needed to run an embedded database.

@Aggarwal-Raghav
Copy link
Contributor Author

Aggarwal-Raghav commented Oct 9, 2025

The changes LGTM. One thing that I wanted to clarify is the distinction between derby and derbytools artifacts. Do we need both of them in the project? If the driver is now inside derbytools what does derby contain?

Thanks for the review @zabetak , Regarding this question.
I don't know the details as the commits in which this change is done is not very insightful :-)
Screenshot 2025-10-09 at 2 48 51 PM

But without this change the schema initialization fails as the https://github.com/apache/derby/blob/trunk/java/org.apache.derby.tools/org/apache/derby/jdbc/EmbeddedDriver.java looks for JDBCBoot class which is part of derby jar

(The screenshot is without derby jar in classpath, only derbytools and derbyshared are present)
Screenshot 2025-10-09 at 2 41 46 PM

@Aggarwal-Raghav
Copy link
Contributor Author

Following up on the question about derby and derbytools I am a bit unsure why the tools are needed. I was under the impression that the derby artifact has everything needed to run an embedded database.

In short,
org.apache.derby.jdbc.EmbeddedDriver is part of => derbytools.jar
EmbeddedDriver needs JDBCBoot which is part of => derby.jar

@zabetak
Copy link
Member

zabetak commented Oct 9, 2025

My interpretation of the Derby changes is that the derbytools is the optional part not the other way around. Did you try using Derby without adding derbytools in the classpath?

@Aggarwal-Raghav
Copy link
Contributor Author

My interpretation of the Derby changes is that the derbytools is the optional part not the other way around. Did you try using Derby without adding derbytools in the classpath?

(derbytools jar is removed and derby, derbyshared are present)
Screenshot 2025-10-09 at 4 16 00 PM

@zabetak
Copy link
Member

zabetak commented Oct 9, 2025

Explicitly loading a driver (using Class.forName) shouldn't be necessary in recent Java versions. Anyways, the code is there at the moment so probably this is outside of scope for the current PR. However, since the code wants to load a driver explicitly can we use AutoloadedDriver instead of the EmbeddedDriver by changing the appropriate Metastore conf?

@Aggarwal-Raghav
Copy link
Contributor Author

Explicitly loading a driver (using Class.forName) shouldn't be necessary in recent Java versions. Anyways, the code is there at the moment so probably this is outside of scope for the current PR. However, since the code wants to load a driver explicitly can we use AutoloadedDriver instead of the EmbeddedDriver by changing the appropriate Metastore conf?

let me check on this and will update the PR/comment accordingly. Thanks for your input.

@Aggarwal-Raghav
Copy link
Contributor Author

Explicitly loading a driver (using Class.forName) shouldn't be necessary in recent Java versions. Anyways, the code is there at the moment so probably this is outside of scope for the current PR. However, since the code wants to load a driver explicitly can we use AutoloadedDriver instead of the EmbeddedDriver by changing the appropriate Metastore conf?

So I tried using AutoloadedDriver in my setup (without derbytools) and it is working nicely. I also replaced all the EmbeddedDriver with AutoloadedDriver in code as well and it's compiling/building well.

I read about Service Provider Interface (SPI) and in (derby.jar) META-INF/services/java.sql.Driver contains org.apache.derby.iapi.jdbc.AutoloadedDriver and EmbeddedDriver is also using AutoloadedDriver internally.

https://github.com/apache/derby/blob/4253dcf4aa37dc64cf7235d494cd2f00f72e678a/java/org.apache.derby.tools/org/apache/derby/jdbc/EmbeddedDriver.java#L184

So, i think we can do the following:

  1. Remove derbytools from all pom.xml
  2. Replace org.apache.derby.jdbc.EmbeddedDriver -> org.apache.derby.iapi.jdbc.AutoloadedDriver everywhere
  3. Remove Class.forName() from HiveSchemaHelper.java

Let me know your thoughts on this: Aggarwal-Raghav@70f740e Will update PR accordingly

@zabetak
Copy link
Member

zabetak commented Oct 10, 2025

I think that Aggarwal-Raghav@70f740e is in the right direction. I am OK with items 1 and 2 but slightly unsure about 3 (removing Class.forName).

If possible let's tackle item 3 in a separate ticket/PR and do it in a more holistic way checking for other places where we load JDBC drivers explicitly. We should also verify that all supported drivers can be loaded automatically via the SPI mechanism.

@Aggarwal-Raghav
Copy link
Contributor Author

I think that Aggarwal-Raghav@70f740e is in the right direction. I am OK with items 1 and 2 but slightly unsure about 3 (removing Class.forName).

If possible let's tackle item 3 in a separate ticket/PR and do it in a more holistic way checking for other places where we load JDBC drivers explicitly. We should also verify that all supported drivers can be loaded automatically via the SPI mechanism.

Sure, I have updated the PR and and updated the description "User-facing change" and filed HIVE-29258 for handling [3]

Copy link

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM will merge soon! Please check if there is any of the doc in the Website that needs to be updated.

@Aggarwal-Raghav
Copy link
Contributor Author

LGTM will merge soon! Please check if there is any of the doc in the Website that needs to be updated.

thanks for the review @zabetak . Yes there are few mention of EmbeddedDriver in Website. Will raise a pr for the same shortly.

@Aggarwal-Raghav
Copy link
Contributor Author

Raised apache/hive-site#67.

Also, can we please add the following/similar line in the commit message:

HIVE-29246: Upgrade derby version to 10.17.1.0

* Migration from org.apache.derby.jdbc.EmbeddedDriver -> org.apache.derby.iapi.jdbc.AutoloadedDriver

@zabetak zabetak merged commit 009dea2 into apache:master Oct 13, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants