Fixes to improve starrocks driver support#2449
Closed
phaethon wants to merge 3 commits intodlt-hub:develfrom
Closed
Fixes to improve starrocks driver support#2449phaethon wants to merge 3 commits intodlt-hub:develfrom
phaethon wants to merge 3 commits intodlt-hub:develfrom
Conversation
✅ Deploy Preview for dlt-hub-docs canceled.
|
rudolfix
reviewed
Apr 1, 2025
Collaborator
rudolfix
left a comment
There was a problem hiding this comment.
@phaethon we can merge those fixes but as a part of a larger refactor of sql alchemy destinations that will allow to set destination capabilities per dialect. in this case
- starrocks naming convention should be plugged in
- table adapter should be plugged in that can change the table layout before it is created
Author
|
I am currently working on a larger update, which implements starrocks as a separate destination. It implements starrocks specific loading mechanisms (INSERT FROM FILES, Stream Load), creates naming convention (which adds suffix to reserved keywords). It inherits many of the classes from sqlalchemy destination. WIP can be seen: https://github.com/phaethon/dlt/tree/starrocks As I currently see it, this PR should be cancelled giving preference to separate destination. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Starrocks analytical database has multiple table types (duplicate key, primary key, aggregate). Currently, dlt does not support starrocks as a separate destination, but it is partly usable through sqlalchemy.
Using starrocks via sqlalchemy dlt creates only "duplicate key" tables. With write dispositions "append" and "merge" this does not work, as generated query for merge (delete with subquery) in Starrocks is supported only for table type "primary key". This pull request allows creating of "primary key" tables if primary key is set for table in hints and create_primary_keys = true. And this fixes usage of "append" and "merge" write dispositions for incremental loading.
Pull request consists of 2 changes:
If reordering of columns is a problem for users of other databases, additional
ifcan be added to enable the part of the code only when driver is starrocks. This can, also, be disabled by setting create_primary_keys = false. I have not researched how many other databases might have such a requirement and to what extent it is unique for starrocks.For a reference: starrocks documentation page explicitely saying that for "duplicate key" tables only simple conditions for delete are allowed. See "DML DELETE" row in the "Data change" table.