Hi, when running the multi-human policies such as sarl and lstm-rl, I noticed a drastic memory increase as training goes on: the memory used grew from about 4 GB to 20 GB after 100 training episodes. I have debugged for a long time but still have no clue about what is going wrong. @ChanganVR please have a look.
@huiwenzhang No such issue has been reported before. Maybe you could check whether your PyTorch and CUDA versions are compatible; sometimes that can affect memory consumption.
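For reference, a quick way to check which CUDA build PyTorch was installed with is to query it from Python. These are standard PyTorch calls, not project-specific code:

```python
import torch

# Report the PyTorch build and the CUDA toolkit it was compiled against.
print("PyTorch version:", torch.__version__)
print("Built with CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```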
I used PyTorch 2.0.1 built against CUDA 11.8, while the locally installed CUDA version is 12.1; according to the official PyTorch documentation, a newer local CUDA version is also supported. Besides, I am not using the GPU, so the compatibility issue you suggested should not apply, yet the problem still exists. Training with the cadrl and rgl policies is fine. Do you have any other guesses about the memory leak?
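One way to confirm and localize the growth, assuming CPU-only training, is to log the process's resident memory and the largest Python-level allocation diffs once per episode. This is a generic sketch using `psutil` and the standard-library `tracemalloc`; `run_episode` is a placeholder for whatever drives one training episode in your setup:

```python
import os
import tracemalloc

import psutil


def log_memory(episode, prev_snapshot):
    """Print resident memory and the top allocation growth since the last snapshot."""
    # RSS covers the whole process, including tensor storage allocated by PyTorch's C++ side.
    rss_mb = psutil.Process(os.getpid()).memory_info().rss / 1e6
    print(f"episode {episode}: RSS = {rss_mb:.1f} MB")
    # tracemalloc only sees Python-level allocations, which is still useful for
    # spotting lists or buffers of Python objects that keep growing.
    snapshot = tracemalloc.take_snapshot()
    if prev_snapshot is not None:
        for stat in snapshot.compare_to(prev_snapshot, "lineno")[:5]:
            print("  ", stat)
    return snapshot


tracemalloc.start()
snapshot = None
for episode in range(100):
    run_episode()  # placeholder for one training episode in your code
    snapshot = log_memory(episode, snapshot)
```

If the RSS climbs every episode while the tracemalloc diff stays flat, the growth is more likely in tensors being kept alive than in ordinary Python objects.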
@huiwenzhang I see. I don't have a clue what could be causing the issue. You could debug by removing all the code and adding it back piece by piece until the issue reappears.
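One pattern worth checking for while bisecting, since it is a common cause of steadily growing memory in PyTorch training loops, is accumulating loss (or state) tensors that still carry autograd history, e.g. appending the raw loss to a list or pushing undetached tensors into a replay buffer. The snippet below is a hypothetical, self-contained illustration of that pattern, not code from this repository:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the policy network and a training batch.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

losses = []
for step in range(1000):
    inputs = torch.randn(32, 10)
    targets = torch.randn(32, 1)

    loss = criterion(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Leaky pattern: appending the tensor itself keeps each step's autograd
    # history referenced, so memory grows with the number of steps.
    # losses.append(loss)

    # Safe pattern: .item() (or .detach()) stores a plain number and lets
    # the graph be freed.
    losses.append(loss.item())
```

If the leaking policies store per-human state tensors somewhere between episodes, checking whether those tensors are detached before being kept would be a good first candidate in the bisection.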