Add vanilla train/val/test code for SemanticKITTI/nuScenes/Waymo

cardwing · Jun 19, 2022 · 915dbc9
1 parent 8b95f28
commit 915dbc9
Showing 36 changed files with 4,732 additions and 26 deletions.
diff --git a/README.md b/README.md
@@ -1,7 +1,6 @@
-# Codes-for-PVKD
 Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation (CVPR 2022)
 
-Our model achieves state-of-the-art performance on three benchmarks, i.e., ranks **1st** in [Waymo 3D Semantic Segmentation Challenge](https://waymo.com/open/challenges/2022/3d-semantic-segmentation/) (the "Cylinder3D" and "Offboard_SemSeg" entities), ranks **1st** in [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (single-scan, the "Point-Voxel-KD" entity), ranks **3rd** in [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (multi-scan, the "PVKD" entity). Our trained model has been used in one NeurIPS 2022 submission! Please do not hesitate to use our trained models!
+Our model achieves state-of-the-art performance on three benchmarks, i.e., ranks **1st** in [Waymo 3D Semantic Segmentation Challenge](https://waymo.com/open/challenges/2022/3d-semantic-segmentation/) (the "Cylinder3D" and "Offboard_SemSeg" entities), ranks **1st** in [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (single-scan, the "Point-Voxel-KD" entity), ranks **2nd** in [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (multi-scan, the "PVKD" entity). Our trained model has been used in one NeurIPS 2022 submission! Do not hesitate to use our trained models!
 
 ## Installation
 
@@ -65,26 +64,27 @@ Our model achieves state-of-the-art performance on three benchmarks, i.e., ranks
 ```
 
 ## Test
+We take evaluation on the SemanticKITTI test set (single-scan) as an example.
 
-First, download the [pre-trained models]() and put them in ./model_load_dir.
+1. Download the [pre-trained models]() and put them in `./model_load_dir`.
 
-Then, generate predictions on the SemanticKITTI test set.
+2. Generate predictions on the SemanticKITTI test set.
 
 ```
 CUDA_VISIBLE_DEVICES=0 python -u test_cyl_sem_tta.py
 ```
 
-We perform test-time augmentation to boost the performance. The model predictions will be saved in ./out_cyl/test by default.
+We perform test-time augmentation to boost the performance. The model predictions will be saved in `./out_cyl/test` by default.
 
 
-Convert label number back to the original dataset format before submitting:
+3. Convert the label numbers back to the original dataset format before submitting:
 ```
 python remap_semantic_labels.py -p out_cyl/test -s test --inverse
 cd out_cyl/test
 zip -r out_cyl.zip sequences/
 ```
 
-Finally, upload out_cyl.zip to the [SemanticKITTI online server](https://competitions.codalab.org/competitions/20331#participate).
+4. Upload `out_cyl.zip` to the [SemanticKITTI online server](https://competitions.codalab.org/competitions/20331#participate).
 
 ## Train
 
@@ -99,48 +99,60 @@ Currently, we only support vanilla training.
 
 1. SemanticKITTI test set (single-scan):
 
-|Model|Reported|Reproduced|Gain|
-|:---:|:---:|:---:|:---:|
-|SPVNAS|66.4%|--|--|
-|Cylinder3D|68.9%|71.8%|**2.9%**|
-|Cylinder3D_0.5x|71.2%|71.4%|0.2%|
+|Model|Reported|Reproduced|Gain|Weight|
+|:---:|:---:|:---:|:---:|:---:|
+|SPVNAS|66.4%|--|--|--|
+|Cylinder3D|68.9%|71.8%|**2.9%**|[cyl_sem_1.0x_71_8.pt]()|
+|Cylinder3D_0.5x|71.2%|71.4%|0.2%|[cyl_sem_0.5x_71_2.pt]()|
 
 2. SemanticKITTI test set (multi-scan):
 
+|Model|Reported|Reproduced|Gain|Weight|
+|:---:|:---:|:---:|:---:|:---:|
+|Cylinder3D|52.5%|--|--|--|
+|Cylinder3D_0.5x|58.2%|--|--|[cyl_sem_ms_0.5x_58_2.pt]()|
 
 3. Waymo test set:
 
+|Model|Reported|Reproduced|Gain|
+|:---:|:---:|:---:|:---:|
+|Cylinder3D|71.18%|71.18%|0%|
+|Cylinder3D_0.5x|--|--|--|
 
 4. nuScenes val set:
 
+|Model|Reported|Reproduced|Gain|
+|:---:|:---:|:---:|:---:|
+|Cylinder3D|76.1%|--|--|
+|Cylinder3D_0.5x|76.0%|75.7%|-0.3%|
 
 ## Citation
 If you use the codes, please cite the following publications:
 ```
-@InProceedings{Hou_2022_CVPR,
-    author    = {Hou, Yuenan and Zhu, Xinge and Ma, Yuexin and Loy, Chen Change and Li, Yikang},
+@inproceedings{Hou_2022_CVPR,
     title     = {Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation},
+    author    = {Hou, Yuenan and Zhu, Xinge and Ma, Yuexin and Loy, Chen Change and Li, Yikang},
     booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
-    year      = {2022},
-    pages     = {8479-8488}
+    pages     = {8479-8488},
+    year      = {2022},
 }
 
 @inproceedings{zhu2021cylindrical,
-  title={Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation},
-  author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Ma, Yuexin and Li, Wei and Li, Hongsheng and Lin, Dahua},
-  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
-  pages={9939--9948},
-  year={2021}
+    title={Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation},
+    author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Ma, Yuexin and Li, Wei and Li, Hongsheng and Lin, Dahua},
+    booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
+    pages={9939--9948},
+    year={2021}
 }
 
 @article{zhu2021cylindrical-tpami,
-  title={Cylindrical and Asymmetrical 3D {C}onvolution {N}etworks for LiDAR-based Perception},
-  author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Li, Wei and Ma, Yuexin and Li, Hongsheng and Yang, Ruigang and Lin, Dahua},
-  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
-  year={2021},
-  publisher={IEEE}
+    title={Cylindrical and Asymmetrical 3D {C}onvolution {N}etworks for LiDAR-based Perception},
+    author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Li, Wei and Ma, Yuexin and Li, Hongsheng and Yang, Ruigang and Lin, Dahua},
+    journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+    year={2021},
+    publisher={IEEE}
 }
 ```
 
-## Acknowledgments
+## Acknowledgements
 This repo is built upon the awesome [Cylinder3D](https://github.com/xinge008/Cylinder3D).
diff --git a/builder/__init__.py b/builder/__init__.py
@@ -0,0 +1,3 @@
+# -*- coding:utf-8 -*-
+# author: Xinge
+# @file: __init__.py
diff --git a/builder/data_builder.py b/builder/data_builder.py
@@ -0,0 +1,96 @@
+# -*- coding:utf-8 -*-
+# author: Xinge
+# @file: data_builder.py 
+
+import torch
+from dataloader.dataset_semantickitti import get_model_class, collate_fn_BEV, collate_fn_BEV_tta, collate_fn_BEV_ms, collate_fn_BEV_ms_tta
+from dataloader.pc_dataset import get_pc_model_class
+
+
+def build(dataset_config,
+          train_dataloader_config,
+          val_dataloader_config,
+          grid_size=[480, 360, 32],
+          use_tta=False,
+          use_multiscan=False,
+          use_waymo=False):
+    data_path = train_dataloader_config["data_path"]
+    train_imageset = train_dataloader_config["imageset"]
+    val_imageset = val_dataloader_config["imageset"]
+    train_ref = train_dataloader_config["return_ref"]
+    val_ref = val_dataloader_config["return_ref"]
+
+    label_mapping = dataset_config["label_mapping"]
+
+    SemKITTI = get_pc_model_class(dataset_config['pc_dataset_type'])
+
+    nusc=None
+    if "nusc" in dataset_config['pc_dataset_type']:
+        from nuscenes import NuScenes
+        nusc = NuScenes(version='v1.0-trainval', dataroot=data_path, verbose=True)
+
+    train_pt_dataset = SemKITTI(data_path, imageset=train_imageset,
+                                return_ref=train_ref, label_mapping=label_mapping, nusc=nusc)
+    val_pt_dataset = SemKITTI(data_path, imageset=val_imageset,
+                              return_ref=val_ref, label_mapping=label_mapping, nusc=nusc)
+
+    train_dataset = get_model_class(dataset_config['dataset_type'])(
+        train_pt_dataset,
+        grid_size=grid_size,
+        flip_aug=True,
+        fixed_volume_space=dataset_config['fixed_volume_space'],
+        max_volume_space=dataset_config['max_volume_space'],
+        min_volume_space=dataset_config['min_volume_space'],
+        ignore_label=dataset_config["ignore_label"],
+        rotate_aug=True,
+        scale_aug=True,
+        transform_aug=True
+    )
+
+    if use_tta:
+        val_dataset = get_model_class(dataset_config['dataset_type'])(
+            val_pt_dataset,
+            grid_size=grid_size,
+            flip_aug=True,
+            fixed_volume_space=dataset_config['fixed_volume_space'],
+            max_volume_space=dataset_config['max_volume_space'],
+            min_volume_space=dataset_config['min_volume_space'],
+            ignore_label=dataset_config["ignore_label"],
+            rotate_aug=True,
+            scale_aug=True,
+            return_test=True,
+            use_tta=True,
+        )
+        if use_multiscan:
+            collate_fn_BEV_tmp = collate_fn_BEV_ms_tta
+        else:
+            collate_fn_BEV_tmp = collate_fn_BEV_tta
+    else:
+        val_dataset = get_model_class(dataset_config['dataset_type'])(
+            val_pt_dataset,
+            grid_size=grid_size,
+            fixed_volume_space=dataset_config['fixed_volume_space'],
+            max_volume_space=dataset_config['max_volume_space'],
+            min_volume_space=dataset_config['min_volume_space'],
+            ignore_label=dataset_config["ignore_label"],
+        )
+        if use_multiscan or use_waymo:
+            collate_fn_BEV_tmp = collate_fn_BEV_ms
+        else:
+            collate_fn_BEV_tmp = collate_fn_BEV
+
+    train_dataset_loader = torch.utils.data.DataLoader(dataset=train_dataset,
+                                                       batch_size=train_dataloader_config["batch_size"],
+                                                       collate_fn=collate_fn_BEV_tmp,
+                                                       shuffle=train_dataloader_config["shuffle"],
+                                                       num_workers=train_dataloader_config["num_workers"])
+    val_dataset_loader = torch.utils.data.DataLoader(dataset=val_dataset,
+                                                     batch_size=val_dataloader_config["batch_size"],
+                                                     collate_fn=collate_fn_BEV_tmp,
+                                                     shuffle=val_dataloader_config["shuffle"],
+                                                     num_workers=val_dataloader_config["num_workers"])
+
+    if use_tta:
+        return train_dataset_loader, val_dataset_loader, val_pt_dataset
+    else:
+        return train_dataset_loader, val_dataset_loader
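
The collate function chosen by `build` in `builder/data_builder.py` depends on three flags. The sketch below mirrors that branching with plain strings in place of the real functions, so it runs without `torch` or the `dataloader` package. `select_collate_name` is a hypothetical helper used only for illustration; it is not part of this commit.

```python
def select_collate_name(use_tta=False, use_multiscan=False, use_waymo=False):
    """Mirror the collate-function branching in builder/data_builder.py.

    Returns the *name* of the function the builder would pick. This helper
    is hypothetical and exists only to make the branching easy to inspect.
    """
    if use_tta:
        # The TTA branch only distinguishes single-scan from multi-scan;
        # use_waymo is not consulted here.
        return "collate_fn_BEV_ms_tta" if use_multiscan else "collate_fn_BEV_tta"
    # Without TTA, multi-scan and Waymo share the same collate function.
    if use_multiscan or use_waymo:
        return "collate_fn_BEV_ms"
    return "collate_fn_BEV"


if __name__ == "__main__":
    # Print the choice for a few flag combinations.
    for flags in [(False, False, False), (False, False, True),
                  (True, False, False), (True, True, False)]:
        print(flags, "->", select_collate_name(*flags))
```

In the diff, the selected function is passed to both the training and the validation `DataLoader`. `build` also returns three values when `use_tta=True` (both loaders plus `val_pt_dataset`) and two otherwise, so callers need to unpack the result accordingly.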
diff --git a/builder/loss_builder.py b/builder/loss_builder.py
@@ -0,0 +1,20 @@
+# -*- coding:utf-8 -*-
+# author: Xinge
+# @file: loss_builder.py 
+
+import torch
+from utils.lovasz_losses import lovasz_softmax
+
+
+def build(wce=True, lovasz=True, num_class=20, ignore_label=0):
+
+    loss_funs = torch.nn.CrossEntropyLoss(ignore_index=ignore_label)
+
+    if wce and lovasz:
+        return loss_funs, lovasz_softmax
+    elif wce and not lovasz:
+        return loss_funs  # return the loss callable, not the boolean flag
+    elif not wce and lovasz:
+        return lovasz_softmax
+    else:
+        raise NotImplementedError
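
A minimal sketch of the selection logic in `builder/loss_builder.py`, written so that every successful branch hands back a callable. The two loss functions are hypothetical stand-ins (their names and simplified signatures are assumptions) so the example runs without `torch` or `utils.lovasz_losses`.

```python
def cross_entropy_stub(logits, labels):
    """Hypothetical stand-in for torch.nn.CrossEntropyLoss(ignore_index=...)."""
    return 0.0


def lovasz_softmax_stub(probas, labels, ignore=None):
    """Hypothetical stand-in for utils.lovasz_losses.lovasz_softmax."""
    return 0.0


def build_losses(wce=True, lovasz=True):
    """Pick loss callables from two boolean flags.

    Returns a (cross-entropy, Lovasz) tuple when both flags are set,
    a single callable when only one is set, and raises otherwise.
    """
    if wce and lovasz:
        return cross_entropy_stub, lovasz_softmax_stub
    if wce:
        return cross_entropy_stub
    if lovasz:
        return lovasz_softmax_stub
    raise NotImplementedError("at least one of wce / lovasz must be enabled")


if __name__ == "__main__":
    ce, lovasz_fn = build_losses()      # both enabled: a 2-tuple
    only_ce = build_losses(lovasz=False)  # single callable
    print(callable(ce), callable(lovasz_fn), callable(only_ce))
```

The return type differs between a tuple and a single callable, so code that unpacks two values from `build` only works with both flags enabled.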
diff --git a/builder/model_builder.py b/builder/model_builder.py
@@ -0,0 +1,37 @@
+# -*- coding:utf-8 -*-
+# author: Xinge
+# @file: model_builder.py 
+
+from network.cylinder_spconv_3d import get_model_class
+from network.segmentator_3d_asymm_spconv import Asymm_3d_spconv
+from network.cylinder_fea_generator import cylinder_fea
+
+
+def build(model_config):
+    output_shape = model_config['output_shape']
+    num_class = model_config['num_class']
+    num_input_features = model_config['num_input_features']
+    use_norm = model_config['use_norm']
+    init_size = model_config['init_size']
+    fea_dim = model_config['fea_dim']
+    out_fea_dim = model_config['out_fea_dim']
+
+    cylinder_3d_spconv_seg = Asymm_3d_spconv(
+        output_shape=output_shape,
+        use_norm=use_norm,
+        num_input_features=num_input_features,
+        init_size=init_size,
+        nclasses=num_class)
+
+    cy_fea_net = cylinder_fea(grid_size=output_shape,
+                              fea_dim=fea_dim,
+                              out_pt_fea_dim=out_fea_dim,
+                              fea_compre=num_input_features)
+
+    model = get_model_class(model_config["model_architecture"])(
+        cylin_model=cy_fea_net,
+        segmentator_spconv=cylinder_3d_spconv_seg,
+        sparse_shape=output_shape
+    )
+
+    return model
diff --git a/config/__init__.py b/config/__init__.py
@@ -0,0 +1,3 @@
+# -*- coding:utf-8 -*-
+# author: Xinge
+# @file: __init__.py
diff --git a/config/config.py b/config/config.py
@@ -0,0 +1,103 @@
+# -*- coding:utf-8 -*-
+# author: Xinge
+
+from pathlib import Path
+
+from strictyaml import Bool, Float, Int, Map, Seq, Str, as_document, load
+
+model_params = Map(
+    {
+        "model_architecture": Str(),
+        "output_shape": Seq(Int()),
+        "fea_dim": Int(),
+        "out_fea_dim": Int(),
+        "num_class": Int(),
+        "num_input_features": Int(),
+        "use_norm": Bool(),
+        "init_size": Int(),
+    }
+)
+
+dataset_params = Map(
+    {
+        "dataset_type": Str(),
+        "pc_dataset_type": Str(),
+        "ignore_label": Int(),
+        "return_test": Bool(),
+        "fixed_volume_space": Bool(),
+        "label_mapping": Str(),
+        "max_volume_space": Seq(Float()),
+        "min_volume_space": Seq(Float()),
+    }
+)
+
+
+train_data_loader = Map(
+    {
+        "data_path": Str(),
+        "imageset": Str(),
+        "return_ref": Bool(),
+        "batch_size": Int(),
+        "shuffle": Bool(),
+        "num_workers": Int(),
+    }
+)
+
+val_data_loader = Map(
+    {
+        "data_path": Str(),
+        "imageset": Str(),
+        "return_ref": Bool(),
+        "batch_size": Int(),
+        "shuffle": Bool(),
+        "num_workers": Int(),
+    }
+)
+
+
+train_params = Map(
+    {
+        "model_load_path": Str(),
+        "model_save_path": Str(),
+        "checkpoint_every_n_steps": Int(),
+        "max_num_epochs": Int(),
+        "eval_every_n_steps": Int(),
+        "learning_rate": Float()
+     }
+)
+
+schema_v4 = Map(
+    {
+        "format_version": Int(),
+        "model_params": model_params,
+        "dataset_params": dataset_params,
+        "train_data_loader": train_data_loader,
+        "val_data_loader": val_data_loader,
+        "train_params": train_params,
+    }
+)
+
+
+SCHEMA_FORMAT_VERSION_TO_SCHEMA = {4: schema_v4}
+
+
+def load_config_data(path: str) -> dict:
+    yaml_string = Path(path).read_text()
+    cfg_without_schema = load(yaml_string, schema=None)
+    schema_version = int(cfg_without_schema["format_version"])
+    if schema_version not in SCHEMA_FORMAT_VERSION_TO_SCHEMA:
+        raise Exception(f"Unsupported schema format version: {schema_version}.")
+
+    strict_cfg = load(yaml_string, schema=SCHEMA_FORMAT_VERSION_TO_SCHEMA[schema_version])
+    cfg: dict = strict_cfg.data
+    return cfg
+
+
+def config_data_to_config(data):  # type: ignore
+    return as_document(data, schema_v4)
+
+
+def save_config_data(data: dict, path: str) -> None:
+    cfg_document = config_data_to_config(data)
+    with open(Path(path), "w") as f:
+        f.write(cfg_document.as_yaml())
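
`load_config_data` parses the YAML twice: once without a schema to read `format_version`, then again against the schema registered for that version. The stdlib-only sketch below illustrates the same version-dispatch idea on a plain dict. It checks top-level section names only and is not a substitute for strictyaml's validation; `check_sections` and the empty section bodies are hypothetical.

```python
# Top-level sections listed in schema_v4 in config/config.py.
REQUIRED_SECTIONS_V4 = (
    "format_version",
    "model_params",
    "dataset_params",
    "train_data_loader",
    "val_data_loader",
    "train_params",
)

# Same idea as SCHEMA_FORMAT_VERSION_TO_SCHEMA: one entry per supported version.
SECTIONS_BY_VERSION = {4: REQUIRED_SECTIONS_V4}


def check_sections(cfg: dict) -> list:
    """Return the required top-level sections that are missing from cfg."""
    version = int(cfg["format_version"])
    if version not in SECTIONS_BY_VERSION:
        raise ValueError(f"Unsupported schema format version: {version}.")
    return [name for name in SECTIONS_BY_VERSION[version] if name not in cfg]


# Placeholder config: section bodies are left empty on purpose.
example_cfg = {
    "format_version": 4,
    "model_params": {},
    "dataset_params": {},
    "train_data_loader": {},
    "val_data_loader": {},
    "train_params": {},
}

if __name__ == "__main__":
    print(check_sections(example_cfg))
```

Reading the version first keeps the door open for adding a later schema alongside `schema_v4` without breaking existing config files.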