yardenas
Contributor

@yardenas yardenas commented Apr 28, 2025

  • Run PPO on the walk and run tasks (5 seeds)
  • Run SAC on the walk and run tasks (5 seeds)
  • Sample initial state based on _find_non_contacting_height
  • Add docstrings
  • Clean up hydra yaml files (!)
  • Clean up train_brax.py

Runs are available @ https://api.wandb.ai/links/yardas/y54nyaed

@ShahhhVihaan

@yardenas I'm sure you already have a plan for this PR, but if you'd like help with any part of it, even the boring or cleanup tasks, I'd be happy to pitch in. I'm moving away from theoretical RL and trying to get more hands-on experience with practical implementations.

@yardenas
Contributor Author

@ShahhhVihaan sounds good! There's def a lot that can be done. I think that adding the escape or fetch tasks would be a solid contribution that's currently missing. I have a general idea of how to do that so I'm happy to guide you through it.

@ShahhhVihaan

Great, I'll take a look at both dm_control and its version here. Yeah, how do you want me to implement it?

Also, I can fork your fork and make a new branch from the add-quadruped branch. Does that sound good?

@yardenas
Contributor Author

@ShahhhVihaan, sounds great! How can I reach you (by email, for instance)?
That would make it easier to coordinate the details.

@ShahhhVihaan

You can reach me at [email protected]

@yardenas
Contributor Author

yardenas commented May 1, 2025

@ShahhhVihaan, just sent you an email

@yardenas
Contributor Author

yardenas commented May 1, 2025

@btaba, two questions about the design of the environment:

  1. Currently, we do not sample the quadruped's initial position the way the original implementation does, because doing so slows down training (by about 5 additional minutes, I think). What's your opinion on that? I could try to optimize my implementation of _find_non_contacting_height, but I'm not sure how much we would gain.
  2. I disabled some contacts between the knees and the ground to improve performance. What do you think about that?
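
For context, the height search behind question 1 can be sketched roughly as below. This is not the code in this PR: `count_contacts` is a hypothetical callback standing in for a physics query, and the step size and attempt cap are illustrative assumptions. It only shows the general idea of raising the body in small steps until nothing is in contact, which means one contact query per step.

```python
from typing import Callable


def find_non_contacting_height(
    count_contacts: Callable[[float], int],  # hypothetical: contacts at height z
    z_start: float = 0.0,
    z_step: float = 0.01,       # assumed step size, not taken from the PR
    max_attempts: int = 10_000,  # assumed cap, not taken from the PR
) -> float:
    """Raise the body in fixed steps until the contact count is zero."""
    z = z_start
    for _ in range(max_attempts):
        if count_contacts(z) == 0:
            return z
        z += z_step
    raise RuntimeError("no contact-free height found")


# Toy stand-in for a physics query: the body touches the ground below 0.25.
def toy_contacts(z: float) -> int:
    return 1 if z < 0.25 else 0


height = find_non_contacting_height(toy_contacts)
```

A JIT-compiled JAX version would likely need `jax.lax.while_loop` in place of the Python loop, since the number of iterations depends on the sampled state.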

@yardenas
Contributor Author

@btaba let's make it happen, no?

@btaba
Collaborator

btaba commented Aug 13, 2025

Hi @yardenas, the PR LGTM. Would it be possible to clean up the config and training scripts and add just the env? If you have some training curve screenshots and a video, that would be awesome. I'll give it an "Approval" and submit.

@btaba btaba self-requested a review August 13, 2025 23:20
@yardenas
Contributor Author

@btaba, sounds great, thank you!

I'll get the PR ready for merge ASAP :)

SAC on the Walk & Run tasks

image

PPO on the Walk & Run tasks

image

Video

8937bfaa-9144-40cc-93f6-82a5a04c57b4

Other metrics can be found here: https://api.wandb.ai/links/yardas/y54nyaed

@yardenas yardenas marked this pull request as ready for review August 14, 2025 17:49