
Replaying data from Roboset FK1-v4 (human) dataset with FK1_RelaxFixed-v4 environment #124

Open
omeryagmurlu opened this issue Dec 5, 2023 · 4 comments

@omeryagmurlu

Hello,

I'm trying to replay the Roboset FK1-v4 (human) dataset and I'm running into problems with the new v4 kitchen environment. I'm able to replay the training data using the kitchen_relax-v1 environment from Relay Policy Learning, but I am unable to replay it with the FK1_RelaxFixed-v4 environment: the arm moves seemingly at random instead of following the trajectory from the data. Below are both code snippets, the working kitchen_relax-v1 replay and the non-working FK1_RelaxFixed-v4 one. Thank you very much!

Apart from that, do you happen to have an estimated release date for the other multi-task suites besides the kitchen? Thanks!

FK1_RelaxFixed-v4, not working:

import torch
import h5py
import numpy as np
import gym
import time
from tqdm import tqdm
import robohive

torch.cuda.empty_cache()

trace = '/SNIP/datasets/human_demos_playdata/FK1_RelaxFixed_v2d-v4_60_20230506-111653_trace.h5'
with h5py.File(trace, 'r') as file:
    h = dict()
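    # copy every top-level dataset of this trial; for env_infos, keep only the state qpos/qvel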
    # kettle to top left, bottom stove, right slider, left cupboard
    for key in file['Trial60'].keys():
        if key == 'env_infos':
            h['qpos'] = file['Trial60/env_infos/state/qpos'][()]
            h['qvel'] = file['Trial60/env_infos/state/qvel'][()]
            continue
        h[key] = file['Trial60'][key][()]
    
    print('loaded 60')

actions = h['actions']
qpos = h['qpos'][0]
qvel = h['qvel'][0]

speedup = 1

env_name = 'FK1_RelaxFixed-v4'
# env_name = 'kitchen-v2'
env = gym.make(env_name)

env.reset()
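# overwrite the post-reset sim state with the first recorded frame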
init_qpos = qpos.copy()
init_qvel = qvel.copy()
env.sim.data.qpos[:] = init_qpos
env.sim.data.qvel[:] = init_qvel
env.sim.forward()

# manually de-normalize actions: map each normalized ctrl to act = act_mid + ctrl * act_amp
act_mid = np.zeros(env.sim.model.nu)
act_amp = 2 * np.ones(env.sim.model.nu)

env.mj_render()

obs = env.get_obs()
for i in tqdm(range(actions.shape[0] - 1)):
    ctrl = actions[i]

    # act = ctrl
    act = act_mid + ctrl * act_amp
    next_obs, reward, done, env_info = env.step(act)

    # if i % render_skip == 0:
    env.mj_render()
    time.sleep(env.dt / speedup)

    obs = next_obs
    if done:
        break

env.close()

kitchen_relax-v1, working:

import torch
import h5py
import numpy as np
import gym
import time
from tqdm import tqdm
import adept_envs.franka
torch.cuda.empty_cache()

trace = '/SNIP/datasets/human_demos_playdata/FK1_RelaxFixed_v2d-v4_60_20230506-111653_trace.h5'
with h5py.File(trace, 'r') as file:
    h = dict()
    # put kettle on top left, bottom stove, right slider, left cupboard
    for key in file['Trial60'].keys():
        if key == 'env_infos':
            h['qpos'] = file['Trial60/env_infos/state/qpos'][()]
            h['qvel'] = file['Trial60/env_infos/state/qvel'][()]
            continue
        h[key] = file['Trial60'][key][()]
    
    print('loaded 60')

actions = h['actions']
qpos = h['qpos'][0]
qvel = h['qvel'][0]

speedup = 1

env = gym.make('kitchen_relax-v1')

env.reset()
init_qpos = qpos.copy()
init_qvel = qvel.copy()
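# write the recorded state into the leading entries of qpos/qvel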
env.sim.data.qpos[:init_qpos.shape[0]] = init_qpos
env.sim.data.qvel[:init_qvel.shape[0]] = init_qvel
env.sim.forward()

env.mj_render()

print(f'act_mid: {env.act_mid}, {env.act_mid.shape}\nact_amp: {env.act_amp}, {env.act_amp.shape}\nskip: {env.skip}\nframe_skip: {env.frame_skip}\nmodel.opt.timestep: {env.model.opt.timestep}\n')

for i in tqdm(range(actions.shape[0] - 1)):
    act = actions[i]

    observation, reward, done, info = env.step(act)
    env.mj_render()
    time.sleep((env.model.opt.timestep * env.frame_skip) / speedup)
    if done:
        break

env.close()
@gaoyuezhou
Collaborator

gaoyuezhou commented Dec 11, 2023

Thank you for your question. We are taking a look at this issue and will post updates here.

Does this issue occur for other expert or human (e.g. human_demos_by_task) datasets or only the human play datasets? Any additional context you can provide would be very helpful. Thank you for your patience!

@omeryagmurlu
Author

Hello,

Thank you for your response. I've tried replaying the FK1_Knob1OnRandom-v4 dataset from human_demos_by_task and had the same issue with it. I've also tried replaying the DAPG(human)/door_v2d-v4 dataset with its corresponding environment, and that worked without any problems. Here's a recording showing Trace0 from the play dataset with the FK1_RelaxFixed-v4 env, using the code snippet from my first post:

Screencast.from.2023-12-15.13-40-00.webm

Thank you for your help.

@gaoyuezhou
Collaborator

Hi,

Thank you for the info. For replaying RoboHive datasets, you should be able to use the recorded 'actions' directly, rather than scaling them as in the script from your first post. Additionally, we have provided a script for replaying the datasets. The following command should successfully replay the FK1_Knob1OnRandom-v4 datasets:

python logger/examine_logs.py -e FK1_Knob1OnRandom_v2d-v4 -p <path to your dataset file>/FK1_Knob1OnRandom_v2d-v4_0_20230529-204609_trace.h5
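
For reference, a minimal sketch of what that direct-action replay looks like, adapted from the first snippet above (assuming env, actions, qpos, and qvel are set up and imported as shown there):

env.reset()
# restore the recorded initial state before stepping
env.sim.data.qpos[:] = qpos.copy()
env.sim.data.qvel[:] = qvel.copy()
env.sim.forward()

for i in tqdm(range(actions.shape[0] - 1)):
    # pass the recorded action through unchanged: no act_mid/act_amp rescaling
    next_obs, reward, done, env_info = env.step(actions[i])
    env.mj_render()
    time.sleep(env.dt)
    if done:
        break

env.close()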

Let us know if this solves the issue. Thanks!

@rgong-bdai

rgong-bdai commented Jun 26, 2024

Hi, does the script work for playdata?

For example: FK1_RelaxFixed_v2d-v4_0_20230506-110624_trace
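
(By analogy with the command above, the corresponding invocation would presumably be:

python logger/examine_logs.py -e FK1_RelaxFixed_v2d-v4 -p <path to your dataset file>/FK1_RelaxFixed_v2d-v4_0_20230506-110624_trace.h5

though whether examine_logs.py handles the play data the same way is exactly the open question here.)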
