(PSI2.0 is an extension dataset based on the PSI1.0 dataset.)
- 20230910: All PSI data including videos, CV annotations, and Cognitive Annotation (PSI1.0 & PSI 2.0) are public for download and future exploration!
[Google Drive][PSI Homepage] 🎇⚡ - 20230606: The official [Homepage], [GitHub], and [CodaLab (Track 1)(Track 2)(Track 3)] of the IEEE ITSS Student Competition are public.
Please refer to PSI dataset for the details of PSI dataset and data structure.
(0) Arguments
# Experimental Setting Input: Observed video sequence + target pedestrian bounding box sequence Output: Future trajectory prediction (bounding boxes sequence) Observed sequence length: 15 frames (0.5s for 30 fps) Predicted sequence length: 45 frames (1.5s for 30 fps) Overlap rate: 0.9 for traingin/validation, 1 for test (To sample tracks with stride length = len(observed_sequence_length) * overlap rate Video Splits: ('./splits/PSI200_split.json') - Train: Video_0001 ~ Video_0110 - Val: Video_0111 ~ Video_0146 - Test: Video_0147 ~ Video_0204 ('./splits/PSI100_split.json') - Train: Video_0001 ~ Video_0082 - Val: Video_0083 ~ Video_0088 - Test: Video_0089 ~ Video_0110 (1) Generate database
./database/create_database(args) Organize the data into format as:
db = { - *video_name*: { # video name - *pedestrian_id*: { # target pedestrian name - 'frames': [0, 1, 2, ...], # list of frames that the target pedestrian appear - 'cv_annotations': { - 'track_id': *track_id*, # target pedestrian's track_id, obtained from raw annoation - 'bbox': [[xtl, ytl, xbr, ybr], [], [], ...], # list of bounding boxes of the target pedestrian appeared }, - 'nlp_annotations': { - *annotator_id*: { # annotator's id/name - 'intent': [], # list of crossing itnent at speific frame, extended from key-frame annotations - 'description': [], # list of explanation of the intent estimation for every frame from the current annotator_id - 'key_frame': [] # if the specific frame is key-frame, directly annotated by the annotator. 0-NOT key-frame, 1-key-frame }, ... } } } }Pedestrian trajectory ground-truth:
The ground-truth trajectory is the sequence of positions (bounding boxes) of the target pedestrian in the future time steps.
(2) training / validation / test split
Our splits are provided in ./splits. Specifically, for PSI100, all videos are splited into train/val/test as ratio
(3) Run training
python main.py(4) Evaluation Metrics
ADE: Average Displacement Error FDE: Final Displacement Error ARB: FRB: (4). Environment
Python 3.8 PyTorch 1.10.0 + Cuda 111 Tensorboard 2.10.1 (5) Notes
This baseline only take the bounding boxes sequence of the target pedestrian as input. However, PSI contains various multi-modal annotations and information available for further exploration to contribute to the intent prediction. E.g., Video sequence, other road users bounding boxes, detailed text-based explanation annotations, etc.
[1] Tina Chen, Taotao Jing, Renran Tian, Yaobin Chen, Joshua Domeyer, Heishiro Toyoda, Rini Sherony, Zhengming Ding. "Psi: A pedestrian behavior dataset for socially intelligent autonomous car." arXiv preprint arXiv:2112.02604 (2021).
[2] Chen, Tina, Renran Tian, and Zhengming Ding. "Visual reasoning using graph convolutional networks for predicting pedestrian crossing intention." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3103-3109. 2021.
Please feel free to send any questions or comments to psi.intent.benchmark@gmail.com

