[DOC] Add document introducing StageLevel Resource Profile Adjust #8908

Open
wants to merge 5 commits into
base: main

zjuwangg (Contributor) commented Mar 5, 2025

What changes were proposed in this pull request?

Add a document describing how to use the stage-level resource auto-adjustment feature introduced in #8209.

How was this patch tested?

NA

github-actions bot added the DOCS label Mar 5, 2025

github-actions bot commented Mar 5, 2025

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and pull request title to follow this format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

---

### **Overview**
Apache Gluten introduces a stage-level resource auto-adjustment framework to mitigate heap out-of-memory (OOM) errors caused by memory demands that vary across the stages of a Spark application. This feature dynamically adjusts task and executor resource profiles (e.g., heap and off-heap memory allocation) based on stage characteristics, such as the presence of fallback operators or heavy shuffle workloads (to be supported).
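
As a rough illustration of the idea, the sketch below picks a heap/off-heap split for a stage depending on whether it contains fallback operators. It is not Gluten's implementation: the names `MemoryProfile` and `choose_profile` and the percentage values are hypothetical, chosen only to show the shape of the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryProfile:
    """Hypothetical per-stage memory split, in MiB."""
    heap_mb: int
    offheap_mb: int


def choose_profile(total_mb: int, has_fallback: bool,
                   native_heap_pct: int = 20,
                   fallback_heap_pct: int = 60) -> MemoryProfile:
    """Split a fixed executor memory budget between heap and off-heap.

    Assumption (not taken from Gluten): a fully offloaded stage needs little
    heap, while a stage with fallback operators runs on the JVM and needs
    more. The percentages are illustrative defaults, not real settings.
    """
    heap_pct = fallback_heap_pct if has_fallback else native_heap_pct
    heap_mb = total_mb * heap_pct // 100
    return MemoryProfile(heap_mb=heap_mb, offheap_mb=total_mb - heap_mb)


# A fully offloaded stage keeps most of the budget off-heap.
native = choose_profile(10_000, has_fallback=False)
# A stage with fallback operators shifts the budget toward the heap.
fallback = choose_profile(10_000, has_fallback=True)
```

The total budget stays constant in this sketch; only the split changes per stage, which is one plausible reading of "adjusting resource profiles based on stage characteristics".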

Contributor review comment:

Suggested wording, by ChatGPT:
"One major advantage of Apache Gluten is its ability to significantly reduce memory requirements per executor—potentially by up to half—when entire stages are offloaded to the native engine. This engine primarily relies on off-heap memory with minimal on-heap usage. However, when stages contain fallback operators that utilize the JVM engine, the on-heap memory size must be increased, leading to even higher memory demands per executor. This challenge has posed significant barriers during the adoption of Apache Gluten.

To address this issue, Apache Gluten introduces a stage-level resource auto-adjustment framework. This feature dynamically optimizes task and executor resource profiles, such as heap and off-heap memory allocation, based on the specific characteristics of each stage, including the presence of fallback operators. Additionally, this framework is designed with future enhancements in mind, allowing for adjustments to accommodate other requirements, such as heavy shuffle workloads."

FelixYBW (Contributor) commented Mar 6, 2025

Thank you, @zjuwangg

zjuwangg added 2 commits March 6, 2025 11:32
[DOC] Add document introducing StageLevel Resource Profile Adjust

zjuwangg (Contributor Author) commented Mar 6, 2025

@FelixYBW Thanks for your detailed review. I have just addressed the comments.
