Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-34930] Fork existing code from bahir-flink #1

Merged
merged 44 commits into from
May 14, 2024

Conversation

ferenc-csaky
Copy link
Contributor

@ferenc-csaky ferenc-csaky commented Mar 28, 2024

For this step, I tried to not introduce changes in the ported logic. I separated semantically different changes into multiple commits, so the [FLINK-34930] prefixed commits summarize the actual changes.

The spotless and checkstyle changes are fairly big, but most of the stuff were about adding javadoc and code/import reformat.

TODO

  • Explicitly state in the repo that the code was forked from bahir-flink.

@ferenc-csaky ferenc-csaky force-pushed the FLINK-34930 branch 2 times, most recently from d62ac38 to 82f84c8 Compare April 10, 2024 12:13
@ferenc-csaky
Copy link
Contributor Author

@MartijnVisser can you trigger another CI run pls?

ferenc-csaky and others added 25 commits April 11, 2024 20:53
When running docker based integration tests locally,
fail silentily if env requirements not available.

Closes #38
Closes #35
By default, KuduSink processing message one by one
without checkpoint. When checkoint is enabled, throughput
is improved by using FlushMode.AUTO_FLUSH_BACKGROUND,
and use checkpoint to ensure at-least-once.

Closes #50
* resolve eventual consistency issues
* improve speed special on eventual consistency stream
* Update Readme
Sometimes we don't want to upsert all columns of a kudu table. 
So we need to support the function that upsert part of columns
of a kudu table.
Fix the problem "Caused by: java.lang.IllegalStateException:
Cannot proceed, the client has already been closed" when 
running in Flink local mode
Kudu connector rework including the addition of a
connector to the Table API for it. 

Co-authored-by: Gyula Fora <[email protected]>
Co-authored-by: Balazs Varga <[email protected]>
Update the KuduTableSource to inherit from InputFormatTableSource
in order to support both streaming SQL and Batch SQL at the same time.
In order to reduce unnecessary data transmission, the filter push down
was also added to the KuduTableSource.
@ferenc-csaky
Copy link
Contributor Author

ferenc-csaky commented Apr 12, 2024

I am not sure about the current failure with the 1.18.1 + JDK17 combo, I could not reproduce it locally, maybe there is some flakiness here. I'll check a bit deeper soon.

@ferenc-csaky
Copy link
Contributor Author

After I took a look at the logs, the error was environmental, it failed to connect to the Dockerized Kudu:

ERROR org.apache.kudu.client.Connection [] - [peer master-0:0:0:0:0:0:0:1/<unresolved>:32780(0:0:0:0:0:0:0:1/<unresolved>:32780)] server sent error Service unavailable: service kudu.master.MasterService not registered on TabletServer

@MartijnVisser can that one case be retriggered? Or of not, I hope another run would be green.

@MartijnVisser
Copy link
Contributor

@MartijnVisser can that one case be retriggered? Or of not, I hope another run would be green.

Re-triggered all failing jobs!

@ferenc-csaky
Copy link
Contributor Author

Thanks, it passed now! is it okay to merge this change at this point for now?

@ferenc-csaky ferenc-csaky changed the title [FLINK-34930] Externalize existing code from bahir-flink [FLINK-34930] Fork existing code from bahir-flink Apr 15, 2024
Copy link
Contributor

@MartijnVisser MartijnVisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ferenc-csaky I was double checking for legal implications (since this is coming from the Bahir project, that's in the attic).

Looking at https://issues.apache.org/jira/browse/LEGAL-584?focusedCommentId=17402556&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17402556 I think what's missing in this PR are the actual LICENSE and NOTICE files. WDYT?

@MartijnVisser
Copy link
Contributor

@ferenc-csaky I've asked the Legal team for a confirmation on this topic, since the ASF owns both Bahir and Flink code. You can track https://issues.apache.org/jira/browse/LEGAL-675

@ferenc-csaky
Copy link
Contributor Author

@ferenc-csaky I've asked the Legal team for a confirmation on this topic, since the ASF owns both Bahir and Flink code. You can track https://issues.apache.org/jira/browse/LEGAL-675

Thank you for taking the time and look into it! For the time being, I migrated the Bahir header to the NOTICE. If it is okay to change it, I'll remove that commit.

Copy link
Contributor

@MartijnVisser MartijnVisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've compared this PR to https://github.com/apache/bahir-flink/commits/master/flink-connector-kudu and it looks complete; https://issues.apache.org/jira/browse/LEGAL-675 also confirmed that we didn't do anything with the NOTICE files. LGTM

@MartijnVisser MartijnVisser merged commit 8674446 into apache:main May 14, 2024
9 checks passed
@ferenc-csaky ferenc-csaky deleted the FLINK-34930 branch May 14, 2024 07:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet