
Question about dataset generation. #191

Open
Yyb-XJTU opened this issue Nov 28, 2024 · 5 comments

Comments

@Yyb-XJTU

Your processing steps first convert the db files into pkl files and then generate a dataset in arrow format. However, the original nuPlan dataset is hierarchical (organized by map). Do I need to process each part one by one?

@JohnZhan2023
Collaborator

JohnZhan2023 commented Nov 29, 2024

No, the two parts are completely independent. You can run the following concurrently:

    python generation.py --num_proc 40 --sample_interval 100 \
        --dataset_name boston_index_demo --starting_file_num 0 \
        --ending_file_num 10000 --cache_folder {PATH_TO_CACHE_FOLDER} \
        --data_path {PATH_TO_DATASET_FOLDER} --only_data_dic

to generate the pkl files, and

    python generation.py --num_proc 40 --sample_interval 100 \
        --dataset_name boston_index_interval100 --starting_file_num 0 \
        --ending_file_num 10000 --cache_folder {PATH_TO_CACHE_FOLDER} \
        --data_path {PATH_TO_DATASET_FOLDER} --only_index

to generate the arrow dataset.
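
Since the two runs are independent, here is a minimal shell sketch that launches both in parallel (the placeholder paths are the same as above and must be filled in; the `&`/`wait` pattern is just one way to run them concurrently):

    # Launch both generation steps as background jobs and wait for both.
    # {PATH_TO_CACHE_FOLDER} and {PATH_TO_DATASET_FOLDER} are placeholders.
    python generation.py --num_proc 40 --sample_interval 100 \
        --dataset_name boston_index_demo --starting_file_num 0 \
        --ending_file_num 10000 --cache_folder {PATH_TO_CACHE_FOLDER} \
        --data_path {PATH_TO_DATASET_FOLDER} --only_data_dic &
    python generation.py --num_proc 40 --sample_interval 100 \
        --dataset_name boston_index_interval100 --starting_file_num 0 \
        --ending_file_num 10000 --cache_folder {PATH_TO_CACHE_FOLDER} \
        --data_path {PATH_TO_DATASET_FOLDER} --only_index &
    wait  # block until both background jobs finish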

@Yyb-XJTU
Author

Thanks for your reply. Should I split the nuPlan dataset into train, val, and test? Then, should I perform the above two steps to generate the pkl and arrow files for each subset?

@JohnZhan2023
Collaborator

You don't need to process each subset separately; the script will automatically generate all the subsets.

@Yyb-XJTU
Author

Yyb-XJTU commented Dec 1, 2024

This is my nuPlan dataset file structure (all db files in one folder):
[screenshot of the directory listing]
I read the generation.py code and found that each subset (train, val, and test) needs to be processed separately; there is no automatic processing logic.

@JohnZhan2023
Collaborator

Thank you for pointing out my mistake. You are right: the Python file should be run separately for each subset.
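
For example, a minimal shell sketch of such per-subset runs (the split names, per-split data paths, and dataset-name suffixes here are assumptions; adjust them to your directory layout):

    # Hypothetical wrapper: run both generation steps once per nuPlan split.
    # Assumes the db files have been separated into train/val/test subfolders.
    for SPLIT in train val test; do
        for FLAG in --only_data_dic --only_index; do
            python generation.py --num_proc 40 --sample_interval 100 \
                --dataset_name "boston_index_${SPLIT}" \
                --starting_file_num 0 --ending_file_num 10000 \
                --cache_folder {PATH_TO_CACHE_FOLDER} \
                --data_path {PATH_TO_DATASET_FOLDER}/"${SPLIT}" \
                "${FLAG}"
        done
    done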
