Predicting whether pedestrians intend to cross the path of a vehicle, particularly at intersections and crosswalks, is critical for autonomous driving systems. Recent studies have shown that deep learning models based on computer vision are effective for this task, but current models often lack the level of confidence required for integration into autonomous systems, and several issues remain unresolved. Our proposed model addresses this challenge by using convolutional neural networks to predict pedestrian crossing intention from non-visual input data, namely body pose, car velocity, and the pedestrian bounding box, across sequential video frames. By arranging the non-visual features logically in a 2D matrix and using an RGB semantic map to help the network interpret and distinguish the fused features, the model predicts crossing intention more accurately than previous approaches. Evaluation on the JAAD dataset benchmark for pedestrian crossing intention prediction shows significant improvements over prior studies.
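As a rough illustration of arranging non-visual features in a 2D matrix, the sketch below stacks one row per video frame, with each row holding normalized pose keypoints, the bounding box, and the car velocity. This is not the authors' layout, which the abstract does not specify. The 18-keypoint skeleton, the column order, the 1920x1080 frame size, the 30 m/s speed cap, and all function names are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

NUM_KEYPOINTS = 18  # assumption: an OpenPose-style 18-joint skeleton
ROW_WIDTH = NUM_KEYPOINTS * 2 + 4 + 1  # pose (x, y) pairs + bbox + speed = 41


def _unit(value: float, scale: float) -> float:
    """Scale a value by `scale` and clamp the result to [0, 1]."""
    return max(0.0, min(value / scale, 1.0))


def frame_row(
    pose_xy: Sequence[Tuple[float, float]],
    bbox: Tuple[float, float, float, float],
    speed: float,
    img_w: float,
    img_h: float,
    max_speed: float,
) -> List[float]:
    """Flatten the features of one frame into a single normalized row."""
    if len(pose_xy) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(pose_xy)}")
    row: List[float] = []
    for x, y in pose_xy:  # pose block: x and y of every joint
        row.extend((_unit(x, img_w), _unit(y, img_h)))
    x1, y1, x2, y2 = bbox  # bounding-box block: two corner points
    row.extend((_unit(x1, img_w), _unit(y1, img_h),
                _unit(x2, img_w), _unit(y2, img_h)))
    row.append(_unit(speed, max_speed))  # ego-vehicle speed
    return row


def build_feature_matrix(
    frames: Sequence[Dict],
    img_w: float = 1920.0,    # assumed frame width in pixels
    img_h: float = 1080.0,    # assumed frame height in pixels
    max_speed: float = 30.0,  # assumed speed cap in m/s
) -> List[List[float]]:
    """Stack per-frame rows into a (num_frames x ROW_WIDTH) matrix."""
    return [
        frame_row(f["pose"], f["bbox"], f["speed"], img_w, img_h, max_speed)
        for f in frames
    ]
```

For a 16-frame observation window this yields a 16 x 41 matrix with all values in [0, 1], which a convolutional network can treat like a single-channel image. Keeping each feature type in a contiguous block of columns is one way to let small convolution kernels see related values together.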
Pakdel, A., Nazari, B., & Sadri, S. (2023). Predicting Pedestrian Intentions in Self-Driving Cars: Leveraging Non-Visual Features and Semantic Mapping. Modeling and Simulation in Electrical and Electronics Engineering, 3(2), 21-28. doi: 10.22075/mseee.2024.33643.1154