Hybrid Approach to Human Action Recognition
Keywords: human action recognition, hybrid model, MediaPipe
Abstract
The article proposes a hybrid approach to human action recognition that combines neural-network-based extraction of skeletal features with deterministic geometric analysis based on vector algebra and three-dimensional affine transformations. Unlike traditional solutions, which require model retraining whenever a new action is added, the developed system lets the user define and modify the set of recognizable actions dynamically, without involving a machine learning specialist. Each action is defined as a sequence of poses, and each pose is described by the relative positions of body keypoints. The current pose is compared with a reference pose using the averaged cosine similarity of vectors, and robustness to viewpoint changes is achieved by iterating over the angles of affine transformations in 3D space. The software prototype is implemented in Python using the MediaPipe and OpenCV frameworks, features an intuitive graphical interface, and works with a standard webcam. Experimental testing confirmed correct recognition of the defined actions with an accuracy of at least 76% under natural execution conditions, as well as resilience to input data errors. The solution is intended for computing systems and complexes in which configuration flexibility, interpretability, and a low entry threshold are important.
doi: 10.54708/19926502_2025_29411030
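The pose comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bone list `BONES`, the keypoint names, the restriction to yaw and pitch rotations, and the angle grid (`step_deg`, `max_deg`) are all assumptions made for illustration. Poses are assumed to be dictionaries mapping a keypoint name to an (x, y, z) tuple, such as might be built from the landmarks a pose estimator returns.

```python
import math
from itertools import product

# Hypothetical skeleton: each bone is a pair of keypoint names (assumption).
BONES = [("shoulder", "elbow"), ("elbow", "wrist")]


def bone_vectors(pose, bones):
    """Turn keypoint positions into relative vectors, one per bone."""
    return [
        tuple(b - a for a, b in zip(pose[start], pose[end]))
        for start, end in bones
    ]


def cosine(u, v):
    """Cosine of the angle between two 3D vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def mean_cosine_similarity(vecs_a, vecs_b):
    """Average the per-bone cosine similarities of two poses."""
    return sum(cosine(u, v) for u, v in zip(vecs_a, vecs_b)) / len(vecs_a)


def rotate(v, yaw, pitch):
    """Rotate a vector about the y axis by yaw, then about the x axis by pitch."""
    x, y, z = v
    cy, sy = math.cos(yaw), math.sin(yaw)
    x, z = cy * x + sy * z, -sy * x + cy * z
    cp, sp = math.cos(pitch), math.sin(pitch)
    y, z = cp * y - sp * z, sp * y + cp * z
    return (x, y, z)


def best_similarity(current, reference, bones, step_deg=15, max_deg=60):
    """Search a grid of rotation angles and keep the best averaged similarity."""
    cur = bone_vectors(current, bones)
    ref = bone_vectors(reference, bones)
    angles = [math.radians(a) for a in range(-max_deg, max_deg + 1, step_deg)]
    return max(
        mean_cosine_similarity([rotate(v, yaw, pitch) for v in cur], ref)
        for yaw, pitch in product(angles, angles)
    )


# Usage: the same pose seen from a viewpoint rotated by 30 degrees.
reference = {"shoulder": (0.0, 0.0, 0.0), "elbow": (1.0, 0.0, 0.0), "wrist": (2.0, 1.0, 0.0)}
current = {name: rotate(p, math.radians(30), 0.0) for name, p in reference.items()}
score = best_similarity(current, reference, BONES)
```

A pose could then be accepted when `score` exceeds a user-chosen threshold, and an action recognized when its reference poses are matched in order. Both the threshold and the matching rule are design choices that this sketch leaves open.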
Published
2025-12-25