[FEATURE][SanityTest] Support parquet format data #736

LantaoJin · 2024-10-03T04:25:36Z

Is your feature request related to a problem?
In the sanity tests, we only test JSON-format data, and each query scans 1045 JSON files. That file scan is the primary cause of slowness during Spark execution (~90% of the time is spent on it). Could the Parquet file format be used instead? It could be much faster. We have not tested any other data formats.

What solution would you like?
I am not sure whether this is already supported. Please close this issue if it is.
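
If Parquet is not covered yet, one possible way to prepare the test data is a one-off conversion of the existing JSON files. The following is only a minimal sketch under assumptions: the function name `convert_json_to_parquet` and both directory arguments are hypothetical and are not part of the sanity test tooling. It relies on the standard PySpark calls `spark.read.json` and `DataFrame.write.parquet`.

```python
def convert_json_to_parquet(spark, json_dir, parquet_dir):
    """Read every JSON file under json_dir and rewrite the data as Parquet.

    `spark` is an existing pyspark.sql.SparkSession. Both paths are
    hypothetical placeholders, not locations used by the sanity tests.
    """
    # DataFrameReader.json accepts a directory and infers the schema
    # from the JSON records it finds there.
    df = spark.read.json(json_dir)
    # Overwrite mode keeps the conversion repeatable across test runs.
    df.write.mode("overwrite").parquet(parquet_dir)
    return df
```

With a session obtained from `SparkSession.builder.getOrCreate()`, the conversion would run once before the queries, which could then read the Parquet copy instead of scanning the JSON files each time. Whether this removes the file-scan cost reported above would still need to be measured.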

What alternatives have you considered?
A clear and concise description of any alternative solutions or features you've considered.

Do you have any additional context?
Add any other context or screenshots about the feature request here.

LantaoJin added enhancement New feature or request untriaged labels Oct 3, 2024

LantaoJin changed the title ~~[FEATURE] Support parquet format data~~ [FEATURE][SanityTest] Support parquet format data Oct 3, 2024

YANG-DB added the Lang:PPL Pipe Processing Language support label Oct 3, 2024
