-
Notifications
You must be signed in to change notification settings - Fork 4.4k
GCS client library migration in Java SDK - part 1 #36876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… of gcsio in GcsUtil
| import com.google.api.services.storage.model.StorageObject; | ||
| import com.google.auth.Credentials; | ||
| import com.google.auto.value.AutoValue; | ||
| import com.google.cloud.hadoop.gcsio.CreateObjectOptions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
References to com.google.cloud.hadoop.gcsio.* and com.google.cloud.hadoop.util.* have been removed from GcsUtil.java. They are kept in GcsUtilLegacy.java though during the migration process.
| } | ||
|
|
||
| @VisibleForTesting | ||
| void setCloudStorageImpl(GoogleCloudStorage g) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the two setCloudStorageImpl functions are no longer surfaced for testing in GcsUtil.java, we still keep them in GcsUtilLegacy.java.
| } | ||
| } | ||
|
|
||
| GoogleCloudStorage createGoogleCloudStorage( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is also removed from GcsUtil.java, but will remain in GcsUtilLegacy.java.
| * @return a SeekableByteChannel that can read the object data | ||
| */ | ||
| public SeekableByteChannel open(GcsPath path) throws IOException { | ||
| return open(path, this.googleCloudStorageOptions.getReadChannelOptions()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparing with the previous implementation
Lines 593 to 600 in b20ccbf
| public SeekableByteChannel open(GcsPath path) throws IOException { | |
| String bucket = path.getBucket(); | |
| SeekableByteChannel channel = | |
| googleCloudStorage.open( | |
| new StorageResourceId(path.getBucket(), path.getObject()), | |
| this.googleCloudStorageOptions.getReadChannelOptions()); | |
| return wrapInCounting(channel, bucket); | |
| } |
we will invoke the open() function below to include the IO request counters similar to the create() functions below.
|
cc'ed @clairemcginty Context: gcsio is sunsetting (https://s.apache.org/beam-gcsutil-modernization), and we are on the way to migrate to the GCS client library for Beam Java SDK. |
| import org.apache.beam.runners.core.metrics.MonitoringInfoMetricName; | ||
| import org.apache.beam.sdk.extensions.gcp.auth.TestCredential; | ||
| import org.apache.beam.sdk.extensions.gcp.options.GcsOptions; | ||
| import org.apache.beam.sdk.extensions.gcp.util.GcsUtil.BatchInterface; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BatchInterface and RewriteOp are only used in Tests, so it is fine to get rid of them in GcsUtil and keep them in GcsUtilLegacy.
|
Assigning reviewers: R: @Abacn for label java. Note: If you would like to opt out of this review, comment Available commands:
The PR bot will only process comments in the main thread (not review comments). |
This PR lays the groundwork for the GCS client library migration in the Java SDK. It refactors the existing GCS client implementation to prepare for a new implementation while maintaining backward compatibility.
Key changes include:
GcsUtil.java.GcsUtilLegacy.javaandGcsOptions.javawill be removed in a future PR.openAPI in the legacy module has been rerouted to include previously implemented but unused IO request count metrics, as discussed in [BEAM-11980] Java GCS - Implement IO Request Count metrics #15394 (comment).