Official code for "Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL".
This project makes use of the following open-source projects:
For installation instructions, please refer to the CORL repository.
The example below runs O2SAC starting from the results of CQL offline pre-training.
To run the offline pre-training, use the following command:
cd offline
python cql.py --env hopper-medium-v2 --seed 0
To perform online fine-tuning, use the following command:
cd finetune
python O2SAC.py --env hopper-medium-v2 --seed 0
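The two stages above can also be chained from one script. The following is a minimal sketch, not part of the official code. It assumes that `offline/` and `finetune/` both sit directly under the repository root, and that the scripts take only the `--env` and `--seed` flags shown above. The helper names `build_commands` and `run_pipeline` are hypothetical, and `sys.executable` is used in place of a bare `python` so that both stages run under the same interpreter.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(env, seed):
    """Return (subdirectory, argv) pairs for the two stages, in run order."""
    args = ["--env", env, "--seed", str(seed)]
    return [
        ("offline", [sys.executable, "cql.py", *args]),     # offline pre-training
        ("finetune", [sys.executable, "O2SAC.py", *args]),  # online fine-tuning
    ]


def run_pipeline(repo_root, env, seed, dry_run=True):
    """Print each stage's command; execute it only when dry_run is False."""
    for subdir, argv in build_commands(env, seed):
        cwd = Path(repo_root) / subdir  # assumed layout: <repo_root>/<subdir>
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops the pipeline if a stage exits with an error
            subprocess.run(argv, cwd=cwd, check=True)


# Dry run: only prints the two commands, nothing is executed.
run_pipeline(".", "hopper-medium-v2", 0)
```

Calling `run_pipeline(".", "hopper-medium-v2", 0, dry_run=False)` from the repository root would execute both stages in order. Whether the fine-tuning script finds the pre-trained results automatically depends on the repository's own conventions, which this sketch does not cover.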