Automatically human action recognition (HAR) with view variation from skeleton means of adaptive transformer network
Mehmood, Faisal; Chen, Enqing; Abbas, Touqeer; Akbar, Muhammad Azeem; Khan, Arif Ali (2023-04-17)
Post-print / Final draft
Soft Computing
Springer Nature
School of Engineering Science
All rights reserved.
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023
The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2023042037792
Abstract
Human action recognition using skeletons has become increasingly appealing to a growing number of researchers in recent years. Recognizing actions captured from different angles is particularly challenging because of the many variations in their representations. This paper proposes an automatic, learning-based, data-driven strategy for determining virtual observation viewpoints to solve the problem of view variation throughout an action. Our VA-CNN and VA-RNN networks, built on convolutional and recurrent neural networks with long short-term memory, offer an alternative to the conventional approach of reorienting skeletons according to a predefined human criterion. Using a novel view adaptation module, each network first identifies the best observation viewpoints, then transforms the skeletons accordingly, and finally passes them to the main classification network for end-to-end recognition. The proposed view adaptive models can produce significantly more consistent virtual viewpoints from skeletons captured at different perspectives. By factoring out viewpoint variation, the models allow the networks to learn action-specific properties more efficiently. Furthermore, we developed a two-stream scheme (referred to as VA-fusion) that combines the outputs of the two networks to obtain an improved prediction. Random rotation of skeleton sequences is applied during training to avoid overfitting and to improve the robustness of the view adaptation models. Extensive experiments demonstrate that our proposed view adaptive networks outperform existing solutions on five challenging benchmarks.
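Two ingredients of the abstract lend themselves to a short illustration: the random rotation of skeleton sequences used as training-time augmentation, and the VA-fusion late combination of the two streams' predictions. The sketch below is not the authors' implementation; the function names, the `(frames, joints, 3)` layout, the ±30° angle range, and the equal fusion weight are illustrative assumptions.

```python
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    """Compose rotations about the X, Y, and Z axes (angles in radians)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def random_rotate(seq, max_angle=np.pi / 6, rng=None):
    """Apply one random 3-D rotation to a whole skeleton sequence.

    seq: array of shape (frames, joints, 3); the angle range is an
    assumption, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    angles = rng.uniform(-max_angle, max_angle, size=3)
    R = rotation_matrix(*angles)
    # Same rotation for every frame, so the motion itself is unchanged.
    return seq @ R.T

def fuse_scores(scores_cnn, scores_rnn, w=0.5):
    """Late fusion of the two streams' class scores (the VA-fusion idea)."""
    return w * np.asarray(scores_cnn) + (1 - w) * np.asarray(scores_rnn)
```

Because the rotation matrix is orthogonal, the augmentation preserves all bone lengths and joint distances; it only changes the apparent viewpoint, which is exactly the variation the view adaptation module is meant to learn to undo.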
Citation
Mehmood, F., Chen, E., Abbas, T., Akbar, M.A., Khan, A.A. (2023). Automatically human action recognition (HAR) with view variation from skeleton means of adaptive transformer network. Soft Computing. DOI: https://doi.org/10.1007/s00500-023-08008-z
Collections
- Scientific publications [1522]