-
Notifications
You must be signed in to change notification settings - Fork 5.9k
[Eager] Add missing attribute copy when PyLayer grad node copied
#77150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Eager] Add missing attribute copy when PyLayer grad node copied
#77150
Conversation
3c59c47 to
73fa547
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a bug in the PyLayer gradient node copy constructor that was causing segmentation faults when using paddle.grad. The issue occurred because several member attributes were not being copied when the GradNode was copied, leading to missing or uninitialized data during gradient computation.
Key Changes:
- Added copying of missing attributes (
name_,forward_outputs_dist_attr_,forward_outputs_global_dims_,forward_outputs_is_dist_meta_,grad_in_dtype_consistent_) in theGradNodePyLayercopy constructor - Added a test case to verify PyLayer works correctly with
paddle.gradand partial gradients - Refactored
list_varsmethod inpir.ccto usepy::listfor better Python integration
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| paddle/fluid/eager/pylayer/py_layer_node.h | Added copying of 5 previously missing member attributes in the copy constructor to prevent segfaults when using paddle.grad |
| test/legacy_test/test_pylayer_op.py | Added test case that reproduces and validates the fix for the PyLayer copy constructor bug with paddle.grad |
| paddle/fluid/pybind/pir.cc | Refactored list_vars method to use py::list instead of std::vector<pir::Value> for better Python API consistency |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| this->forward_outputs_dist_attr_ = other.forward_outputs_dist_attr_; | ||
| this->forward_outputs_global_dims_ = other.forward_outputs_global_dims_; | ||
| this->forward_outputs_is_dist_meta_ = other.forward_outputs_is_dist_meta_; | ||
| this->grad_in_dtype_consistent_ = other.grad_in_dtype_consistent_; |
Copilot
AI
Dec 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The member variable grad_in_dtype_consistent_ is being copied in the copy constructor but it's not initialized in the primary constructor. This could lead to undefined behavior when a GradNodePyLayer object is created using the primary constructor, as the boolean member will have an indeterminate value. Consider adding initialization in the primary constructor's member initializer list or providing a default value in the class declaration.
Co-authored-by: Copilot <[email protected]>
wanghuancoder
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
你的PR提交成功,感谢你对开源项目的贡献! |
…77150) --------- Co-authored-by: Copilot <[email protected]>
|
✅ Cherry-pick successful! Created PR: #77162 |
…addlePaddle#77150) --------- Co-authored-by: Copilot <[email protected]>
…addlePaddle#77150) --------- Co-authored-by: Copilot <[email protected]>
|
✅ Cherry-pick successful! Created PR: #77293 |
PR Category
Execute Infrastructure
PR Types
Bug fixes
Description
PyLayer拷贝构造函数里缺失部分 attribute 的 copy,这会导致在使用paddle.grad时,由于 GradNode 被 copy,部分属性缺失,进而导致段错误的出现该问题仅在
paddle.grad部分出现PyLayer,且其中部分输出不需要梯度时才会复现,由于tensor.backward并不涉及 GradNode 的 copy,因此没问题