[GLUTEN-8891][VL] Allow to reuse local SSD cache on Spark context restart #8892

Open: wants to merge 5 commits into main

Conversation

zhouyuan
Contributor

@zhouyuan zhouyuan commented Mar 4, 2025

What changes were proposed in this pull request?

This patch adds a config option that allows the local SSD cache to be reused.
Before this patch, the SSD cache is discarded when the Spark context shuts down. With this patch and the new config (ssdReuse) enabled, the cache is kept on disk across context restarts, and users must delete it manually.
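
The lifecycle described above can be illustrated with a minimal sketch. This is plain Python written only for illustration, not Gluten or Velox code: the function names, the `ssd_reuse` flag, and the `cache.bin` file are hypothetical stand-ins for the real cache and the ssdReuse config.

```python
import shutil
import tempfile
from pathlib import Path


def shutdown_cache(cache_dir: Path, ssd_reuse: bool) -> None:
    """Hypothetical shutdown hook: discard the cache unless reuse is enabled."""
    if not ssd_reuse and cache_dir.exists():
        shutil.rmtree(cache_dir)


def run_session(cache_dir: Path, ssd_reuse: bool) -> bool:
    """Simulate one context lifetime; return True if an earlier cache was found."""
    cache_file = cache_dir / "cache.bin"  # hypothetical cache file name
    found_existing = cache_file.exists()
    cache_dir.mkdir(parents=True, exist_ok=True)
    if not found_existing:
        cache_file.write_bytes(b"cached data")  # populate a cold cache
    shutdown_cache(cache_dir, ssd_reuse)
    return found_existing


def demo() -> dict:
    """Run two consecutive sessions with reuse disabled, then with reuse enabled."""
    results = {}
    for reuse in (False, True):
        root = Path(tempfile.mkdtemp())
        cache_dir = root / "ssd_cache"
        first = run_session(cache_dir, reuse)
        second = run_session(cache_dir, reuse)
        results[reuse] = (first, second)
        # With reuse enabled, nothing removes the cache automatically,
        # so this stands in for the manual deletion step.
        shutil.rmtree(root, ignore_errors=True)
    return results
```

In this sketch the second session finds a warm cache only when `ssd_reuse` is true; with it false, every session starts cold because shutdown removes the directory.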

Fixes: #8891

How was this patch tested?

manual tests

@github-actions github-actions bot added the VELOX label Mar 4, 2025

github-actions bot commented Mar 4, 2025

#8891

@zhouyuan zhouyuan changed the title [GLUTEN-8891] Allow to reuse local SSD cache on Spark context restart [GLUTEN-8891][VL] Allow to reuse local SSD cache on Spark context restart Mar 4, 2025
@zhouyuan zhouyuan force-pushed the wip_local_cache_keep branch from b9e196d to a36fe5c Compare March 4, 2025 09:00
Signed-off-by: Yuan <[email protected]>
@zhouyuan zhouyuan force-pushed the wip_local_cache_keep branch 2 times, most recently from 2bf5bcf to 5e91dd6 Compare March 4, 2025 11:16
@zhouyuan zhouyuan force-pushed the wip_local_cache_keep branch from 5e91dd6 to b97730f Compare March 4, 2025 11:18
@FelixYBW
Contributor

FelixYBW commented Mar 4, 2025

When is the data on the local SSD deleted, with and without this PR?

@zhouyuan
Contributor Author

zhouyuan commented Mar 5, 2025

When is the data on the local SSD deleted, with and without this PR?

Before this patch, the SSD cache is discarded when the Spark context shuts down. With this patch and the new config (ssdReuse) enabled, the cache is kept on disk, and users must delete it manually.

Successfully merging this pull request may close these issues.

[VL] Allow to reuse local SSD cache on Spark context restart
3 participants