ANALYSIS OF SPATIO-TEMPORAL CONVOLUTIONAL NEURAL NETWORKS FOR THE ACTION DETECTION TASKS

Authors

DOI:

https://doi.org/10.26577/jpcsit2024-v2-i4-a3

Keywords:

action detection, convolutional neural networks, spatio-temporal convolutional neural networks, YOWO

Abstract

This study investigates the effectiveness of spatio-temporal convolutional neural networks (ST-CNNs) for action detection through a comparison of state-of-the-art models including You Only Watch Once (YOWO), YOWOv2, YOWO-Frame, and YOWO-Plus. Experiments on benchmark datasets such as UCF-101, HMDB-51, and AVA evaluate these architectures in terms of frame-based mean average precision (frame-mAP), video-mAP, computational efficiency measured in frames per second (FPS), and scalability. The YOWO family is also tested in real time on a stream from an IP camera over the RTSP protocol to assess practical applicability. The results highlight the superior accuracy of YOWO-Plus in capturing complex spatio-temporal dynamics, achieved at the cost of processing speed, and the efficiency of YOWO-Frame in live applications. The analysis underscores the trade-off between speed and accuracy inherent in single-stage ST-CNN architectures. The comparative findings provide a basis for developing real-time action detection systems that operate efficiently and reliably.
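As an illustration of the frame-mAP metric named in the abstract, the following is a minimal sketch, not the evaluation code used in the study. It assumes boxes in (x1, y1, x2, y2) form, an IoU threshold of 0.5, and a non-interpolated average precision; benchmark toolkits may use an interpolated variant. The function names `iou`, `frame_ap`, and `frame_map` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), assumed layout
Detection = Tuple[int, float, Box]        # (frame_id, confidence score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def frame_ap(detections: List[Detection],
             ground_truth: Dict[int, List[Box]],
             iou_thr: float = 0.5) -> float:
    """Non-interpolated average precision for one action class.

    Detections are ranked by score; each one is a true positive if it
    overlaps a not-yet-matched ground-truth box in the same frame with
    IoU >= iou_thr (0.5 is an assumed, commonly used threshold).
    """
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    if n_gt == 0:
        return 0.0
    matched = {f: [False] * len(b) for f, b in ground_truth.items()}
    tp_flags: List[bool] = []
    for frame_id, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(ground_truth.get(frame_id, [])):
            if matched[frame_id][j]:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched[frame_id][best_j] = True
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    # Sum the precision at the rank of every true positive, divide by #GT.
    ap, tp = 0.0, 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        if is_tp:
            tp += 1
            ap += tp / rank
    return ap / n_gt


def frame_map(ap_per_class: Dict[str, float]) -> float:
    """Mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class) if ap_per_class else 0.0
```

For example, with one ground-truth box in each of two frames and three ranked detections that are a hit, a miss, and a hit, `frame_ap` gives (1/1 + 2/3) / 2, or about 0.83. Video-mAP follows the same idea but matches linked action tubes across frames instead of individual boxes.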

Author Biographies

Nurtugan Azatbekuly, Al-Farabi Kazakh National University, Almaty, Kazakhstan

Master’s student in the Computer Science Department at Al-Farabi Kazakh National University (Almaty, Kazakhstan, nurtugang17@gmail.com). His research interests focus on the analysis and development of computer vision algorithms.

Bazargul Matkerim, Al-Farabi Kazakh National University, Almaty, Kazakhstan

PhD in the Computer Science Department at Al-Farabi Kazakh National University (Almaty, Kazakhstan, bazargul.matkerim@gmail.com). Her research interests include parallel computing and applications of machine learning.

Aksultan Mukhanbet, Al-Farabi Kazakh National University, Almaty, Kazakhstan

PhD student in the Computer Science Department at Al-Farabi Kazakh National University (Almaty, Kazakhstan, mukhanbetaksultan0414@gmail.com). His research interests include machine learning and computer vision.

How to Cite

Azatbekuly, N., Matkerim, B., & Mukhanbet, A. (2024). ANALYSIS OF SPATIO-TEMPORAL CONVOLUTIONAL NEURAL NETWORKS FOR THE ACTION DETECTION TASKS. Journal of Problems in Computer Science and Information Technologies, 2(4), 26–33. https://doi.org/10.26577/jpcsit2024-v2-i4-a3