Question about LongVA-TPO-10k data format and DATA_PATH for TPO training #4

@xzp9999

I downloaded LongVA-TPO-10k from Hugging Face (https://huggingface.co/datasets/ruili0/LongVA-TPO-10k). It contains CSV files and a videos/ folder but no JSON files, whereas the training loader appears to read JSON (or a YAML file that points to JSON).

Could you confirm what DATA_PATH should point to in the training script?
If a JSON/JSONL version is expected, is there an official file or a recommended CSV→JSON conversion/schema?
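
To make the second question concrete, this is the kind of generic conversion I could run myself in the meantime. It is only a sketch: the function name `csv_to_jsonl` and the file names in the usage note are my own placeholders, the keys are simply whatever the CSV header row contains, and I am not assuming this matches the schema the training loader expects.

```python
import csv
import json


def csv_to_jsonl(csv_path, jsonl_path):
    """Write each CSV row as one JSON object per line.

    Keys come straight from the CSV header row; no field is renamed,
    so this does NOT produce any particular training schema.
    Returns the number of rows written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            # All values stay strings, exactly as they appear in the CSV.
            dst.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count
```

For example, `csv_to_jsonl("train.csv", "train.jsonl")` (hypothetical file names) would give one JSON object per CSV row. What I cannot infer from the CSV alone is which keys the loader needs and how the video paths should be written, which is why I am asking.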

Any guidance would be greatly appreciated. Thank you for releasing the code and dataset!
