[opt](arrow-flight-sql) Support arrow-flight-sql protocol getStreamCatalogs, getStreamSchemas, getStreamTables #46217

Merged
xinyiZzz merged 4 commits into apache:master on Jan 2, 2025

xinyiZzz
Contributor

@xinyiZzz xinyiZzz commented Dec 31, 2024

What problem does this PR solve?

Implement the getStreamCatalogs, getStreamSchemas, and getStreamTables methods of the arrow-flight-sql protocol, so that BI tools can correctly display the metadata tree when they connect to Doris through the arrow-flight-sql driver.

DBeaver connected to Doris through the arrow-flight-sql driver:

  1. List all catalogs and show their properties
    image
  2. List dbSchemas and show their properties
    image
  3. List tables and their columns
    image
  4. Browse an external catalog
    image
    image

How to connect to Doris (to be written up as documentation later):
https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client
https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_gcl_au*MjUyNjE1ODM0LjE3MzQwMDExNDg.
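
For reference, a minimal sketch of the metadata walk that a JDBC-based tool such as DBeaver performs. It uses only the standard `java.sql` API. My understanding is that the Arrow Flight SQL JDBC driver turns `getCatalogs`, `getSchemas`, and `getTables` into the Flight SQL metadata requests that the three methods in this PR answer, but treat that mapping as an assumption rather than a description of the driver internals. The class name `MetadataTreeSketch`, the host, the port, the credentials, and the `useEncryption=false` setting are all placeholders for illustration.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

public class MetadataTreeSketch {

    // Builds a JDBC URL for the Arrow Flight SQL driver.
    // Host and port are placeholders; useEncryption=false assumes no TLS.
    static String buildUrl(String host, int port) {
        return "jdbc:arrow-flight-sql://" + host + ":" + port + "?useEncryption=false";
    }

    // Walks catalogs -> schemas -> tables through the standard JDBC metadata API.
    static void printTree(Connection conn) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        try (ResultSet catalogs = meta.getCatalogs()) {
            while (catalogs.next()) {
                String catalog = catalogs.getString("TABLE_CAT");
                System.out.println(catalog);
                try (ResultSet schemas = meta.getSchemas(catalog, null)) {
                    while (schemas.next()) {
                        String schema = schemas.getString("TABLE_SCHEM");
                        System.out.println("  " + schema);
                        // The schema argument is a LIKE-style pattern in JDBC,
                        // so names containing '_' or '%' may match more broadly.
                        try (ResultSet tables = meta.getTables(catalog, schema, null, null)) {
                            while (tables.next()) {
                                System.out.println("    " + tables.getString("TABLE_NAME"));
                            }
                        }
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical endpoint and credentials; adjust to the actual FE settings.
        String url = buildUrl("127.0.0.1", 9090);
        try (Connection conn = DriverManager.getConnection(url, "root", "")) {
            printTree(conn);
        } catch (SQLException e) {
            // Reached when the driver is not on the classpath or the server is unreachable.
            System.err.println("Could not connect: " + e.getMessage());
        }
    }
}
```

Running `main` against a real cluster requires the Arrow Flight SQL JDBC driver on the classpath. Without it, the sketch only reports the connection failure.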

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Dec 31, 2024

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@xinyiZzz
Contributor Author

xinyiZzz commented Jan 1, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32639 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cf3c3c7d0dcc0edce328a207589276b4716eda5a, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17599	6224	6109	6109
q2	2045	286	170	170
q3	10546	1254	745	745
q4	10230	873	448	448
q5	7632	2183	2028	2028
q6	207	180	145	145
q7	908	756	609	609
q8	9224	1372	1173	1173
q9	5260	4985	4903	4903
q10	6762	2298	1833	1833
q11	469	278	258	258
q12	339	354	219	219
q13	17762	3622	2972	2972
q14	247	238	207	207
q15	547	489	491	489
q16	643	626	593	593
q17	573	852	335	335
q18	7026	6434	6361	6361
q19	1976	955	557	557
q20	300	318	189	189
q21	2779	2159	1993	1993
q22	354	325	303	303
Total cold run time: 103428 ms
Total hot run time: 32639 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	6263	6227	6197	6197
q2	237	321	242	242
q3	2218	2610	2342	2342
q4	1391	1832	1345	1345
q5	4300	4736	4809	4736
q6	181	175	136	136
q7	2075	1972	1829	1829
q8	2614	2790	2657	2657
q9	7277	7280	7129	7129
q10	3065	3351	2800	2800
q11	602	510	485	485
q12	664	800	620	620
q13	3456	3735	3062	3062
q14	284	291	276	276
q15	560	496	500	496
q16	648	701	632	632
q17	1198	1725	1252	1252
q18	7629	7258	7219	7219
q19	779	1117	1010	1010
q20	1964	2014	1820	1820
q21	5460	5072	4877	4877
q22	643	624	566	566
Total cold run time: 53508 ms
Total hot run time: 51728 ms

@doris-robot

TPC-DS: Total hot run time: 189818 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cf3c3c7d0dcc0edce328a207589276b4716eda5a, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	973	376	370	370
query2	6511	2393	2297	2297
query3	6708	209	211	209
query4	34000	23855	23539	23539
query5	4405	628	448	448
query6	287	200	189	189
query7	4641	485	304	304
query8	289	244	232	232
query9	9546	2648	2629	2629
query10	456	305	249	249
query11	18315	15381	15238	15238
query12	150	108	108	108
query13	1643	529	422	422
query14	10574	7201	6737	6737
query15	232	211	188	188
query16	8182	605	427	427
query17	1578	731	555	555
query18	2125	410	312	312
query19	219	187	160	160
query20	122	119	114	114
query21	211	123	104	104
query22	4385	4282	4244	4244
query23	35193	33534	33474	33474
query24	6896	2244	2224	2224
query25	482	471	401	401
query26	1057	252	157	157
query27	2110	476	348	348
query28	5141	2427	2433	2427
query29	557	576	405	405
query30	228	176	147	147
query31	991	868	799	799
query32	96	59	63	59
query33	509	345	301	301
query34	741	856	509	509
query35	808	818	741	741
query36	1011	1027	956	956
query37	117	102	78	78
query38	4104	4161	4067	4067
query39	1491	1448	1440	1440
query40	204	115	101	101
query41	50	90	44	44
query42	125	102	103	102
query43	518	515	477	477
query44	1296	815	819	815
query45	181	176	179	176
query46	846	1046	640	640
query47	1915	1927	1840	1840
query48	398	421	320	320
query49	772	475	386	386
query50	626	656	382	382
query51	7124	7204	7043	7043
query52	103	99	92	92
query53	222	253	183	183
query54	477	493	401	401
query55	78	79	81	79
query56	248	292	245	245
query57	1202	1171	1095	1095
query58	247	226	237	226
query59	3019	3079	2867	2867
query60	317	264	243	243
query61	116	112	106	106
query62	897	795	743	743
query63	224	187	201	187
query64	4615	998	653	653
query65	3228	3181	3254	3181
query66	1087	432	308	308
query67	15880	15951	15499	15499
query68	10040	756	516	516
query69	476	299	257	257
query70	1168	1101	1129	1101
query71	432	280	256	256
query72	5855	3868	3882	3868
query73	888	750	369	369
query74	9941	9460	8832	8832
query75	4436	3177	2691	2691
query76	5542	1200	769	769
query77	1016	354	269	269
query78	10108	10233	9690	9690
query79	2731	888	603	603
query80	695	516	434	434
query81	467	274	240	240
query82	621	155	126	126
query83	199	169	144	144
query84	286	94	76	76
query85	801	347	342	342
query86	350	321	312	312
query87	4561	4580	4328	4328
query88	3450	2221	2202	2202
query89	404	348	299	299
query90	1912	186	182	182
query91	132	131	105	105
query92	65	59	59	59
query93	1279	894	533	533
query94	672	400	295	295
query95	336	259	249	249
query96	486	608	280	280
query97	2728	2799	2712	2712
query98	230	193	195	193
query99	1705	1588	1444	1444
Total cold run time: 297111 ms
Total hot run time: 189818 ms

@doris-robot

ClickBench: Total hot run time: 31.67 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit cf3c3c7d0dcc0edce328a207589276b4716eda5a, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.07	0.04	0.03
query3	0.24	0.08	0.07
query4	1.59	0.10	0.11
query5	0.42	0.41	0.42
query6	1.17	0.66	0.64
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.58	0.50	0.51
query10	0.56	0.55	0.54
query11	0.13	0.10	0.11
query12	0.13	0.10	0.11
query13	0.60	0.62	0.59
query14	2.72	2.75	2.73
query15	0.90	0.82	0.82
query16	0.39	0.38	0.40
query17	0.96	1.00	1.06
query18	0.23	0.21	0.20
query19	1.93	1.88	2.03
query20	0.02	0.01	0.01
query21	15.36	0.92	0.58
query22	0.74	1.10	0.69
query23	15.00	1.42	0.61
query24	3.23	1.30	1.93
query25	0.14	0.12	0.09
query26	0.22	0.15	0.14
query27	0.07	0.06	0.07
query28	14.47	1.48	1.05
query29	12.60	3.90	3.24
query30	0.25	0.08	0.06
query31	2.82	0.61	0.39
query32	3.24	0.53	0.46
query33	3.06	3.11	3.15
query34	16.70	5.14	4.50
query35	4.50	4.44	4.47
query36	0.66	0.49	0.48
query37	0.10	0.06	0.06
query38	0.04	0.03	0.03
query39	0.04	0.03	0.02
query40	0.16	0.14	0.12
query41	0.07	0.02	0.02
query42	0.04	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 106.29 s
Total hot run time: 31.67 s

Contributor

github-actions bot commented Jan 2, 2025

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Jan 2, 2025
Contributor

github-actions bot commented Jan 2, 2025

PR approved by anyone and no changes requested.

Contributor

@wangbo wangbo left a comment

LGTM

@xinyiZzz xinyiZzz merged commit d8d5942 into apache:master Jan 2, 2025
24 of 25 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 2, 2025
…mCatalogs`, `getStreamSchemas`, `getStreamTables` (#46217)

### What problem does this PR solve?

Implement the `getStreamCatalogs`, `getStreamSchemas`, and `getStreamTables`
methods of the arrow-flight-sql protocol, so that BI tools can correctly
display the metadata tree when they connect to Doris through the
`arrow-flight-sql` driver.

DBeaver uses the `arrow-flight-sql` Driver connecting to Doris:

1. list all catalogs and show properties

![image](https://github.com/user-attachments/assets/f1de6e87-ba5d-4d67-a7bb-06fd91b6cbb1)
2. list dbSchemas and show properties

![image](https://github.com/user-attachments/assets/e706a065-420a-4137-a6a1-6e4f807fad8a)
3. list tables and list table columns.

![image](https://github.com/user-attachments/assets/f9929da9-8cc3-4d74-9f73-b3837854c349)
4. external catalog

![image](https://github.com/user-attachments/assets/ef9ebee1-36d8-4f7f-b97a-e4720ed45d1c)

![image](https://github.com/user-attachments/assets/58f1e0d9-17ed-48a9-be89-dfb818672525)

How to connect to Doris: (will be organized into documents later)

https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client

https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_gcl_au*MjUyNjE1ODM0LjE3MzQwMDExNDg.
github-actions bot pushed a commit that referenced this pull request Jan 2, 2025
…mCatalogs`, `getStreamSchemas`, `getStreamTables` (#46217)

yiguolei pushed a commit that referenced this pull request Jan 2, 2025
…ol `getStreamCatalogs`, `getStreamSchemas`, `getStreamTables` #46217 (#46268)

Cherry-picked from #46217

Co-authored-by: Xinyi Zou
Labels
approved, dev/2.1.8-merged, dev/3.0.x, reviewed
5 participants