Nice work! I really appreciate your effort on creating a unified framework.
I'm currently trying to apply these algorithms with my own reward on Wan2.1-1.3B, but I'm having difficulty even when training with rewards like PickScore. Could you share some configs that have trained successfully? Thank you so much!
Also, is there any plan to implement video-based scores like video align? I can try implementing it myself and send a pull request.
Again, thank you for sharing your work!