Add vanilla train/val/test codes of SemanticKITTI/nuScenes/Waymo
cardwing committed Jun 19, 2022
1 parent 8b95f28 commit 915dbc9
Showing 36 changed files with 4,732 additions and 26 deletions.
64 changes: 38 additions & 26 deletions README.md
@@ -1,7 +1,6 @@
# Codes-for-PVKD
Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation (CVPR 2022)

Our model achieves state-of-the-art performance on three benchmarks, i.e., ranks **1st** in [Waymo 3D Semantic Segmentation Challenge](https://waymo.com/open/challenges/2022/3d-semantic-segmentation/) (the "Cylinder3D" and "Offboard_SemSeg" entities), ranks **1st** in [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (single-scan, the "Point-Voxel-KD" entity), ranks **3rd** in [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (multi-scan, the "PVKD" entity). Our trained model has been used in one NeurIPS 2022 submission! Please do not hesitate to use our trained models!
Our model achieves state-of-the-art performance on three benchmarks: it ranks **1st** in the [Waymo 3D Semantic Segmentation Challenge](https://waymo.com/open/challenges/2022/3d-semantic-segmentation/) (the "Cylinder3D" and "Offboard_SemSeg" entities), **1st** in the [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (single-scan, the "Point-Voxel-KD" entity), and **2nd** in the [SemanticKITTI LiDAR Semantic Segmentation Challenge](https://competitions.codalab.org/competitions/20331#results) (multi-scan, the "PVKD" entity). Our trained model has been used in one NeurIPS 2022 submission! Do not hesitate to use our trained models!

## Installation

@@ -65,26 +64,27 @@ Our model achieves state-of-the-art performance on three benchmarks, i.e., ranks
```

## Test
We take evaluation on the SemanticKITTI test set (single-scan) as an example.

First, download the [pre-trained models]() and put them in ./model_load_dir.
1. Download the [pre-trained models]() and put them in `./model_load_dir`.

Then, generate predictions on the SemanticKITTI test set.
2. Generate predictions on the SemanticKITTI test set.

```
CUDA_VISIBLE_DEVICES=0 python -u test_cyl_sem_tta.py
```

We perform test-time augmentation to boost the performance. The model predictions will be saved in ./out_cyl/test by default.
We apply test-time augmentation to improve performance. Model predictions are saved in `./out_cyl/test` by default.


Convert label number back to the original dataset format before submitting:
3. Convert the label numbers back to the original dataset format before submitting:
```
python remap_semantic_labels.py -p out_cyl/test -s test --inverse
cd out_cyl/test
zip -r out_cyl.zip sequences/
```

Finally, upload out_cyl.zip to the [SemanticKITTI online server](https://competitions.codalab.org/competitions/20331#participate).
4. Upload `out_cyl.zip` to the [SemanticKITTI online server](https://competitions.codalab.org/competitions/20331#participate).
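
The inverse remapping in step 3 can be pictured as a table lookup from training label IDs back to the dataset's original IDs. The following is a minimal sketch of that idea, not the contents of `remap_semantic_labels.py`: the function `remap_inverse` and the three-entry table are hypothetical, and the real table comes from the dataset's label-mapping file.

```python
# Toy sketch of inverse label remapping (hypothetical names and table values).


def remap_inverse(pred_labels, inverse_map):
    """Map each training-time label ID back to an original dataset ID."""
    return [inverse_map[label] for label in pred_labels]


# Hypothetical training-ID -> original-ID table; the real values live in the
# dataset's label-mapping config.
TOY_INVERSE_MAP = {0: 0, 1: 10, 2: 11}

predictions = [1, 1, 2, 0]
original_ids = remap_inverse(predictions, TOY_INVERSE_MAP)
print(original_ids)
```

A prediction containing an ID that is missing from the table raises `KeyError` in this sketch, which makes an incomplete mapping visible before a submission is zipped.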

## Train

@@ -99,48 +99,60 @@ Currently, we only support vanilla training.

1. SemanticKITTI test set (single-scan):

|Model|Reported|Reproduced|Gain|
|:---:|:---:|:---:|:---:|
|SPVNAS|66.4%|--|--|
|Cylinder3D|68.9%|71.8%|**2.9%**|
|Cylinder3D_0.5x|71.2%|71.4%|0.2%|
|Model|Reported|Reproduced|Gain|Weight|
|:---:|:---:|:---:|:---:|:---:|
|SPVNAS|66.4%|--|--|--|
|Cylinder3D|68.9%|71.8%|**2.9%**|[cyl_sem_1.0x_71_8.pt]()|
|Cylinder3D_0.5x|71.2%|71.4%|0.2%|[cyl_sem_0.5x_71_2.pt]()|

2. SemanticKITTI test set (multi-scan):

|Model|Reported|Reproduced|Gain|Weight|
|:---:|:---:|:---:|:---:|:---:|
|Cylinder3D|52.5%|--%|**--%**|--|
|Cylinder3D_0.5x|58.2%|--%|--%|[cyl_sem_ms_0.5x_58_2.pt]()|

3. Waymo test set:

|Model|Reported|Reproduced|Gain|
|:---:|:---:|:---:|:---:|
|Cylinder3D|71.18%|71.18%|0|
|Cylinder3D_0.5x|--|--|--|

4. nuScenes val set:

|Model|Reported|Reproduced|Gain|
|:---:|:---:|:---:|:---:|
|Cylinder3D|76.1%|--|--|
|Cylinder3D_0.5x|76.0%|75.7%|-0.3%|
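
Judging from the filled-in rows, the Gain column appears to be the reproduced score minus the reported score, in percentage points. A small sketch that recomputes it from the rows with both values present (the helper `gain` is hypothetical and only illustrates the arithmetic):

```python
# Recompute the "Gain" column as reproduced minus reported (percentage points).
# Only rows that list both numbers in the tables are included.
rows = [
    ("SemanticKITTI single-scan", "Cylinder3D", 68.9, 71.8),
    ("SemanticKITTI single-scan", "Cylinder3D_0.5x", 71.2, 71.4),
    ("Waymo", "Cylinder3D", 71.18, 71.18),
    ("nuScenes val", "Cylinder3D_0.5x", 76.0, 75.7),
]


def gain(reported, reproduced):
    """Difference in percentage points, rounded to hide float noise."""
    return round(reproduced - reported, 2)


gains = {(bench, model): gain(rep, repro) for bench, model, rep, repro in rows}
for (bench, model), value in gains.items():
    print(f"{bench:28s} {model:16s} {value:+.2f}")
```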

## Citation
If you use this code, please cite the following publications:
```
@InProceedings{Hou_2022_CVPR,
author = {Hou, Yuenan and Zhu, Xinge and Ma, Yuexin and Loy, Chen Change and Li, Yikang},
@inproceedings{Hou_2022_CVPR,
title = {Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation},
author = {Hou, Yuenan and Zhu, Xinge and Ma, Yuexin and Loy, Chen Change and Li, Yikang},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2022},
pages = {8479-8488}
year = {2022},
}
@inproceedings{zhu2021cylindrical,
title={Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation},
author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Ma, Yuexin and Li, Wei and Li, Hongsheng and Lin, Dahua},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
pages={9939--9948},
year={2021}
title={Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation},
author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Ma, Yuexin and Li, Wei and Li, Hongsheng and Lin, Dahua},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
pages={9939--9948},
year={2021}
}
@article{zhu2021cylindrical-tpami,
title={Cylindrical and Asymmetrical 3D {C}onvolution {N}etworks for LiDAR-based Perception},
author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Li, Wei and Ma, Yuexin and Li, Hongsheng and Yang, Ruigang and Lin, Dahua},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2021},
publisher={IEEE}
title={Cylindrical and Asymmetrical 3D {C}onvolution {N}etworks for LiDAR-based Perception},
author={Zhu, Xinge and Zhou, Hui and Wang, Tai and Hong, Fangzhou and Li, Wei and Ma, Yuexin and Li, Hongsheng and Yang, Ruigang and Lin, Dahua},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2021},
publisher={IEEE}
}
```

## Acknowledgments
## Acknowledgements
This repo is built upon the awesome [Cylinder3D](https://github.com/xinge008/Cylinder3D).
3 changes: 3 additions & 0 deletions builder/__init__.py
@@ -0,0 +1,3 @@
# -*- coding:utf-8 -*-
# author: Xinge
# @file: __init__.py
96 changes: 96 additions & 0 deletions builder/data_builder.py
@@ -0,0 +1,96 @@
# -*- coding:utf-8 -*-
# author: Xinge
# @file: data_builder.py

import torch
from dataloader.dataset_semantickitti import get_model_class, collate_fn_BEV, collate_fn_BEV_tta, collate_fn_BEV_ms, collate_fn_BEV_ms_tta
from dataloader.pc_dataset import get_pc_model_class


def build(dataset_config,
          train_dataloader_config,
          val_dataloader_config,
          grid_size=[480, 360, 32],
          use_tta=False,
          use_multiscan=False,
          use_waymo=False):
    data_path = train_dataloader_config["data_path"]
    train_imageset = train_dataloader_config["imageset"]
    val_imageset = val_dataloader_config["imageset"]
    train_ref = train_dataloader_config["return_ref"]
    val_ref = val_dataloader_config["return_ref"]

    label_mapping = dataset_config["label_mapping"]

    SemKITTI = get_pc_model_class(dataset_config['pc_dataset_type'])

    nusc = None
    if "nusc" in dataset_config['pc_dataset_type']:
        from nuscenes import NuScenes
        nusc = NuScenes(version='v1.0-trainval', dataroot=data_path, verbose=True)

    train_pt_dataset = SemKITTI(data_path, imageset=train_imageset,
                                return_ref=train_ref, label_mapping=label_mapping, nusc=nusc)
    val_pt_dataset = SemKITTI(data_path, imageset=val_imageset,
                              return_ref=val_ref, label_mapping=label_mapping, nusc=nusc)

    train_dataset = get_model_class(dataset_config['dataset_type'])(
        train_pt_dataset,
        grid_size=grid_size,
        flip_aug=True,
        fixed_volume_space=dataset_config['fixed_volume_space'],
        max_volume_space=dataset_config['max_volume_space'],
        min_volume_space=dataset_config['min_volume_space'],
        ignore_label=dataset_config["ignore_label"],
        rotate_aug=True,
        scale_aug=True,
        transform_aug=True
    )

    if use_tta:
        val_dataset = get_model_class(dataset_config['dataset_type'])(
            val_pt_dataset,
            grid_size=grid_size,
            flip_aug=True,
            fixed_volume_space=dataset_config['fixed_volume_space'],
            max_volume_space=dataset_config['max_volume_space'],
            min_volume_space=dataset_config['min_volume_space'],
            ignore_label=dataset_config["ignore_label"],
            rotate_aug=True,
            scale_aug=True,
            return_test=True,
            use_tta=True,
        )
        if use_multiscan:
            collate_fn_BEV_tmp = collate_fn_BEV_ms_tta
        else:
            collate_fn_BEV_tmp = collate_fn_BEV_tta
    else:
        val_dataset = get_model_class(dataset_config['dataset_type'])(
            val_pt_dataset,
            grid_size=grid_size,
            fixed_volume_space=dataset_config['fixed_volume_space'],
            max_volume_space=dataset_config['max_volume_space'],
            min_volume_space=dataset_config['min_volume_space'],
            ignore_label=dataset_config["ignore_label"],
        )
        if use_multiscan or use_waymo:
            collate_fn_BEV_tmp = collate_fn_BEV_ms
        else:
            collate_fn_BEV_tmp = collate_fn_BEV

    train_dataset_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                                       batch_size=train_dataloader_config["batch_size"],
                                                       collate_fn=collate_fn_BEV_tmp,
                                                       shuffle=train_dataloader_config["shuffle"],
                                                       num_workers=train_dataloader_config["num_workers"])
    val_dataset_loader = torch.utils.data.DataLoader(dataset=val_dataset,
                                                     batch_size=val_dataloader_config["batch_size"],
                                                     collate_fn=collate_fn_BEV_tmp,
                                                     shuffle=val_dataloader_config["shuffle"],
                                                     num_workers=val_dataloader_config["num_workers"])

    if use_tta:
        return train_dataset_loader, val_dataset_loader, val_pt_dataset
    else:
        return train_dataset_loader, val_dataset_loader
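
The branching in `build` that picks a collate function depends on three flags. As a rough illustration, the sketch below reproduces only that decision and returns the function name as a string, so it runs without the repository installed. The helper `pick_collate_name` is hypothetical and not part of the commit.

```python
# Sketch of the collate-function choice made inside build().
# Returns names as strings; the real code assigns the imported functions.


def pick_collate_name(use_tta=False, use_multiscan=False, use_waymo=False):
    if use_tta:
        # TTA path: only the multi-scan flag is consulted here.
        return "collate_fn_BEV_ms_tta" if use_multiscan else "collate_fn_BEV_tta"
    # Non-TTA path: multi-scan and Waymo share one collate function.
    if use_multiscan or use_waymo:
        return "collate_fn_BEV_ms"
    return "collate_fn_BEV"


for flags in [
    {},
    {"use_waymo": True},
    {"use_tta": True},
    {"use_tta": True, "use_multiscan": True},
]:
    print(flags, "->", pick_collate_name(**flags))
```

Two details are worth keeping in mind when calling `build`: the same collate function is passed to both the training and validation loaders, and the function returns three values when `use_tta=True` but two otherwise, so callers need to unpack accordingly.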
20 changes: 20 additions & 0 deletions builder/loss_builder.py
@@ -0,0 +1,20 @@
# -*- coding:utf-8 -*-
# author: Xinge
# @file: loss_builder.py

import torch
from utils.lovasz_losses import lovasz_softmax


def build(wce=True, lovasz=True, num_class=20, ignore_label=0):

    loss_funs = torch.nn.CrossEntropyLoss(ignore_index=ignore_label)

    if wce and lovasz:
        return loss_funs, lovasz_softmax
    elif wce and not lovasz:
        # Return the cross-entropy loss object, not the boolean flag.
        return loss_funs
    elif not wce and lovasz:
        return lovasz_softmax
    else:
        raise NotImplementedError
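
`torch.nn.CrossEntropyLoss(ignore_index=ignore_label)` leaves every target equal to `ignore_label` out of the average. The pure-Python sketch below mimics that behavior for the unweighted, mean-reduced case so the effect is easy to see. The helper `cross_entropy_ignore` is hypothetical and is not part of the repository, and returning `0.0` when every target is ignored is a choice made for this sketch only.

```python
import math


def cross_entropy_ignore(logits, targets, ignore_label=0):
    """Mean cross-entropy over targets that differ from ignore_label."""
    losses = []
    for row, target in zip(logits, targets):
        if target == ignore_label:
            continue  # ignored points contribute nothing to the loss
        # Numerically stable negative log-softmax of the target class.
        peak = max(row)
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        losses.append(log_sum_exp - row[target])
    # Sketch-only assumption: return 0.0 when every target is ignored.
    return sum(losses) / len(losses) if losses else 0.0


# Two points with two classes; the second point carries the ignored label 0.
logits = [[0.0, 0.0], [2.0, 0.0]]
targets = [1, 0]
loss = cross_entropy_ignore(logits, targets, ignore_label=0)
print(loss)
```

Only the first point counts here, so the result equals the cross-entropy of a uniform two-class prediction.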
37 changes: 37 additions & 0 deletions builder/model_builder.py
@@ -0,0 +1,37 @@
# -*- coding:utf-8 -*-
# author: Xinge
# @file: model_builder.py

from network.cylinder_spconv_3d import get_model_class
from network.segmentator_3d_asymm_spconv import Asymm_3d_spconv
from network.cylinder_fea_generator import cylinder_fea


def build(model_config):
    output_shape = model_config['output_shape']
    num_class = model_config['num_class']
    num_input_features = model_config['num_input_features']
    use_norm = model_config['use_norm']
    init_size = model_config['init_size']
    fea_dim = model_config['fea_dim']
    out_fea_dim = model_config['out_fea_dim']

    cylinder_3d_spconv_seg = Asymm_3d_spconv(
        output_shape=output_shape,
        use_norm=use_norm,
        num_input_features=num_input_features,
        init_size=init_size,
        nclasses=num_class)

    cy_fea_net = cylinder_fea(grid_size=output_shape,
                              fea_dim=fea_dim,
                              out_pt_fea_dim=out_fea_dim,
                              fea_compre=num_input_features)

    model = get_model_class(model_config["model_architecture"])(
        cylin_model=cy_fea_net,
        segmentator_spconv=cylinder_3d_spconv_seg,
        sparse_shape=output_shape
    )

    return model
3 changes: 3 additions & 0 deletions config/__init__.py
@@ -0,0 +1,3 @@
# -*- coding:utf-8 -*-
# author: Xinge
# @file: __init__.py
103 changes: 103 additions & 0 deletions config/config.py
@@ -0,0 +1,103 @@
# -*- coding:utf-8 -*-
# author: Xinge

from pathlib import Path

from strictyaml import Bool, Float, Int, Map, Seq, Str, as_document, load

model_params = Map(
    {
        "model_architecture": Str(),
        "output_shape": Seq(Int()),
        "fea_dim": Int(),
        "out_fea_dim": Int(),
        "num_class": Int(),
        "num_input_features": Int(),
        "use_norm": Bool(),
        "init_size": Int(),
    }
)

dataset_params = Map(
    {
        "dataset_type": Str(),
        "pc_dataset_type": Str(),
        "ignore_label": Int(),
        "return_test": Bool(),
        "fixed_volume_space": Bool(),
        "label_mapping": Str(),
        "max_volume_space": Seq(Float()),
        "min_volume_space": Seq(Float()),
    }
)


train_data_loader = Map(
    {
        "data_path": Str(),
        "imageset": Str(),
        "return_ref": Bool(),
        "batch_size": Int(),
        "shuffle": Bool(),
        "num_workers": Int(),
    }
)

val_data_loader = Map(
    {
        "data_path": Str(),
        "imageset": Str(),
        "return_ref": Bool(),
        "batch_size": Int(),
        "shuffle": Bool(),
        "num_workers": Int(),
    }
)


train_params = Map(
    {
        "model_load_path": Str(),
        "model_save_path": Str(),
        "checkpoint_every_n_steps": Int(),
        "max_num_epochs": Int(),
        "eval_every_n_steps": Int(),
        "learning_rate": Float()
    }
)

schema_v4 = Map(
    {
        "format_version": Int(),
        "model_params": model_params,
        "dataset_params": dataset_params,
        "train_data_loader": train_data_loader,
        "val_data_loader": val_data_loader,
        "train_params": train_params,
    }
)


SCHEMA_FORMAT_VERSION_TO_SCHEMA = {4: schema_v4}


def load_config_data(path: str) -> dict:
    yaml_string = Path(path).read_text()
    cfg_without_schema = load(yaml_string, schema=None)
    schema_version = int(cfg_without_schema["format_version"])
    if schema_version not in SCHEMA_FORMAT_VERSION_TO_SCHEMA:
        raise Exception(f"Unsupported schema format version: {schema_version}.")

    strict_cfg = load(yaml_string, schema=SCHEMA_FORMAT_VERSION_TO_SCHEMA[schema_version])
    cfg: dict = strict_cfg.data
    return cfg


def config_data_to_config(data):  # type: ignore
    return as_document(data, schema_v4)


def save_config_data(data: dict, path: str) -> None:
    cfg_document = config_data_to_config(data)
    with open(Path(path), "w") as f:
        f.write(cfg_document.as_yaml())
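
For orientation, `schema_v4` lists the sections and keys below with no `Optional` wrappers, so a config file is expected to provide all of them. The sketch mirrors the schema as plain Python sets and checks key coverage without `strictyaml`. `REQUIRED_KEYS` and `missing_keys` are hypothetical helpers that are not part of the repository, and the config values are `None` placeholders rather than recommended settings.

```python
# Key layout copied from schema_v4; helper names are hypothetical.
LOADER_KEYS = {"data_path", "imageset", "return_ref", "batch_size",
               "shuffle", "num_workers"}

REQUIRED_KEYS = {
    "model_params": {"model_architecture", "output_shape", "fea_dim",
                     "out_fea_dim", "num_class", "num_input_features",
                     "use_norm", "init_size"},
    "dataset_params": {"dataset_type", "pc_dataset_type", "ignore_label",
                       "return_test", "fixed_volume_space", "label_mapping",
                       "max_volume_space", "min_volume_space"},
    "train_data_loader": LOADER_KEYS,
    "val_data_loader": LOADER_KEYS,
    "train_params": {"model_load_path", "model_save_path",
                     "checkpoint_every_n_steps", "max_num_epochs",
                     "eval_every_n_steps", "learning_rate"},
}


def missing_keys(cfg):
    """Return 'section.key' names that schema_v4 lists but cfg lacks."""
    missing = [] if "format_version" in cfg else ["format_version"]
    for section, keys in REQUIRED_KEYS.items():
        present = set(cfg.get(section, {}))
        missing.extend(f"{section}.{key}" for key in sorted(keys - present))
    return missing


# Build a placeholder config with every key present, then drop one.
cfg = {"format_version": 4}
cfg.update({section: dict.fromkeys(keys) for section, keys in REQUIRED_KEYS.items()})
complete_report = missing_keys(cfg)
del cfg["train_params"]["learning_rate"]
incomplete_report = missing_keys(cfg)
print(complete_report, incomplete_report)
```

A check like this only covers key presence. Type validation (`Int`, `Float`, `Bool`, `Seq`) is still left to `load_config_data`.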