(PSI2.0 is an extension of the PSI1.0 dataset.)
- 20230910: All PSI data, including videos, CV annotations, and cognitive annotations (PSI1.0 & PSI2.0), are public for download and future exploration! [Google Drive][PSI Homepage] 🎇⚡
- 20230606: The official [Homepage], [GitHub], and [CodaLab (Track 1)(Track 2)(Track 3)] of the IEEE ITSS Student Competition are public.
Please refer to the PSI dataset for details of the dataset and its data structure.
db = {
    'video_name': *video_name*,
    'frames': {
        'frame_*frameId*': {
            'cognitive_annotation': {
                '*objType*_track_*trackId*': {
                    *annotatorId1*: {
                        'driving_decision_speed': str,  # ['increaseSpeed', 'decreaseSpeed', 'maintainSpeed']
                        'driving_decision_direction': str,  # ['goStraight', 'turnLeft', 'turnRight']
                        'explanation': str,
                        'key_frame': int  # {0: not key frame, 1: key frame}
                    },
                    *annotatorId2*: {
                        'driving_decision_speed': str,
                        'driving_decision_direction': str,
                        'explanation': str,
                        'key_frame': int
                    },
                    ...
                }
            }
        }
    }
}
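A minimal sketch of walking this structure, assuming each video's cognitive annotation is stored as a single JSON file (the path below is hypothetical):

```python
import json

# Hypothetical path; adjust to wherever the cognitive annotation JSON is stored.
with open('cognitive_annotation/video_0001.json', 'r') as f:
    db = json.load(f)

for frame_key, frame in db['frames'].items():                            # e.g. 'frame_0015'
    for track_key, annotators in frame['cognitive_annotation'].items():  # e.g. 'pedestrian_track_1'
        for annotator_id, ann in annotators.items():
            print(db['video_name'], frame_key, track_key, annotator_id,
                  ann['driving_decision_speed'],
                  ann['driving_decision_direction'],
                  ann['key_frame'])
```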
(0) Arguments
# Experimental Setting
Input: Observed video sequence
Output: Driving decision prediction (Speed: increase/decrease/maintain speed/stop; Direction: go straight/turn left/turn right)
Observed sequence length: 15 frames (0.5s for 30 fps)
Prediction: 2 outputs - driving decision(s) (speed + direction)
Overlap rate: 0.9 for training/validation, 1 for test
(Tracks are sampled with stride length = observed_sequence_length * overlap_rate; see the sketch below.)
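For illustration, track sampling under this stride rule could look like the following (the function name and the integer rounding are assumptions, not the repository's exact code):

```python
def sample_tracks(frames, observed_length=15, overlap_rate=0.9):
    """Cut a pedestrian's frame list into fixed-length observation windows.

    stride = observed_length * overlap_rate, rounded down to an integer
    (the rounding choice here is an assumption).
    """
    stride = max(1, int(observed_length * overlap_rate))
    return [frames[start:start + observed_length]
            for start in range(0, len(frames) - observed_length + 1, stride)]

# e.g. a pedestrian visible for 60 frames, training setting (overlap 0.9 -> stride 13)
windows = sample_tracks(list(range(60)))
```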
Video Splits:
PSI2.0 ('./splits/PSI200_split.json')
- Train: Video_0001 ~ Video_0110
- Val: Video_0111 ~ Video_0146
- Test: Video_0147 ~ Video_0204
PSI1.0 ('./splits/PSI100_split.json')
- Train: Video_0001 ~ Video_0082
- Val: Video_0083 ~ Video_0088
- Test: Video_0089 ~ Video_0110
(1) Generate database
./database/create_database(args)
Organize the data into the following format:
db = {
    *video_name*: {  # video name
        'frames': [0, 1, 2, ...],  # list of frames in which the target pedestrian appears
        'speed': [],
        'gps': [],
        'nlp_annotations': {
            *annotator_id*: {  # annotator's id/name
                'speed': [],  # driving decision (speed) at each frame, extended from the key-frame annotations
                'direction': [],  # driving decision (direction) at each frame, extended from the key-frame annotations
                'description': [],  # explanation of the driving decision at each frame from the current annotator_id
                'key_frame': []  # whether the frame is a key frame directly annotated by the annotator: 0 - not key frame, 1 - key frame
            },
            ...
        }
    }
}
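The "extended from key-frame annotations" step can be pictured as a forward fill over the frame list. A sketch of the idea, not the actual create_database implementation:

```python
def extend_key_frame_annotations(frames, key_frame_anns):
    """Propagate key-frame driving decisions to every frame of the track.

    frames: ordered frame ids in which the pedestrian appears
    key_frame_anns: {frame_id: (speed, direction, explanation)} for key frames only
    Returns per-frame lists aligned with `frames`, forward-filled between key frames.
    """
    speed, direction, description, key_frame = [], [], [], []
    current = (None, None, '')
    for fid in frames:
        if fid in key_frame_anns:
            current = key_frame_anns[fid]
            key_frame.append(1)
        else:
            key_frame.append(0)
        speed.append(current[0])
        direction.append(current[1])
        description.append(current[2])
    return {'speed': speed, 'direction': direction,
            'description': description, 'key_frame': key_frame}
```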
Driving decision ground truth:
In this baseline, we use a majority-voting strategy: the speed/direction category with the most agreements among all annotators is taken as the ground-truth driving decision annotation.
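A minimal sketch of this majority vote for a single frame (Counter.most_common breaks ties arbitrarily; the baseline's actual tie-breaking rule is not specified here):

```python
from collections import Counter

def majority_vote(labels):
    """Return the label with the most agreements among annotators."""
    return Counter(labels).most_common(1)[0][0]

# e.g. ground truth for one frame annotated by three annotators
speed_gt = majority_vote(['maintainSpeed', 'decreaseSpeed', 'maintainSpeed'])  # 'maintainSpeed'
direction_gt = majority_vote(['goStraight', 'goStraight', 'turnLeft'])         # 'goStraight'
```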
(2) Training / validation / test split
Our splits are provided in ./splits. Specifically, for PSI1.0 ('./splits/PSI100_split.json'), the 110 videos are split into train/val/test sets of 82/6/22 videos; for PSI2.0 ('./splits/PSI200_split.json'), the 204 videos are split into 110/36/58 videos.
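Loading a split file might look like this (the 'train'/'val'/'test' keys are assumptions; check the JSON for the exact schema):

```python
import json

with open('./splits/PSI200_split.json', 'r') as f:
    splits = json.load(f)

train_videos = splits['train']   # e.g. ['video_0001', ..., 'video_0110']
val_videos = splits['val']
test_videos = splits['test']
```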
(3) Run training
python main.py
(4) Evaluation Metrics
Acc-speed: Overall accuracy of speed driving decision prediction
mAcc-speed: Class-wise average accuracy of speed driving decision prediction
Acc-direction: Overall accuracy of wheel direction driving decision prediction
mAcc-direction: Class-wise average accuracy of wheel direction driving decision prediction
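One common reading of these metrics is overall accuracy plus the unweighted mean of per-class recalls; a sketch with scikit-learn (which calls the latter balanced accuracy):

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score

def acc_and_macc(y_true, y_pred):
    """Acc: overall accuracy; mAcc: class-wise average accuracy (mean per-class recall)."""
    return accuracy_score(y_true, y_pred), balanced_accuracy_score(y_true, y_pred)

# e.g. speed predictions encoded as class ids
acc_speed, macc_speed = acc_and_macc(np.array([0, 1, 2, 1]), np.array([0, 1, 1, 1]))
```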
(5) Environment
Python 3.8
PyTorch 1.10.0 + CUDA 11.1
Tensorboard 2.10.1
(6) Notes
This baseline only takes the bounding-box sequence of the target pedestrian as input. However, PSI contains various multi-modal annotations and information available for further exploration to improve driving decision prediction, e.g., the video sequence, other road users' bounding boxes, detailed text-based explanation annotations, etc.
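As a rough picture of such a bounding-box-only baseline, the sketch below encodes the 15-frame box sequence with an LSTM and predicts the two decisions with separate heads; layer sizes and class counts are assumptions, not the model in main.py:

```python
import torch
import torch.nn as nn

class BBoxDecisionLSTM(nn.Module):
    """Encode a bbox sequence and predict speed + direction driving decisions."""
    def __init__(self, hidden=128, n_speed=3, n_direction=3):
        super().__init__()
        self.encoder = nn.LSTM(input_size=4, hidden_size=hidden, batch_first=True)
        self.speed_head = nn.Linear(hidden, n_speed)
        self.direction_head = nn.Linear(hidden, n_direction)

    def forward(self, bboxes):            # bboxes: (batch, 15, 4), e.g. normalized [x1, y1, x2, y2]
        _, (h, _) = self.encoder(bboxes)
        h = h[-1]                         # hidden state of the last LSTM layer
        return self.speed_head(h), self.direction_head(h)

model = BBoxDecisionLSTM()
speed_logits, direction_logits = model(torch.rand(8, 15, 4))
```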
[1] Tina Chen, Taotao Jing, Renran Tian, Yaobin Chen, Joshua Domeyer, Heishiro Toyoda, Rini Sherony, and Zhengming Ding. "PSI: A pedestrian behavior dataset for socially intelligent autonomous car." arXiv preprint arXiv:2112.02604 (2021).
[2] Tina Chen, Renran Tian, and Zhengming Ding. "Visual reasoning using graph convolutional networks for predicting pedestrian crossing intention." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3103-3109, 2021.
Please feel free to send any questions or comments to [email protected]