Micro-gesture recognition using Mamba
Partha, Durbar Hore (2025)
Master's thesis
Partha, Durbar Hore
2025
School of Engineering Science, Computational Engineering
All rights reserved.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe2025062472880
Abstract
Human-computer interaction is a significant area of computer vision research, and its scope continues to grow as the field advances. Studies of human emotion and gesture account for a large share of recent deep learning work in computer vision. Micro-gestures are an important and widely researched topic for understanding and analyzing a person's true emotional state: through subtle movements of different parts of the body, people reveal their emotions unintentionally. Most micro-gesture recognition studies rely on deep learning methods that classify human emotions from these subtle body movements. Mamba-based models offer a novel approach to micro-gesture studies. Video Mamba is an advanced technique that uses a State Space Model (SSM) to extract features from video data and classify human gestures. In this thesis, conventional CNN- and Transformer-based architectures for micro-gesture recognition are compared with the Video Mamba model to assess its significance and competence in processing long video sequences. The Video Mamba model outperforms state-of-the-art CNN- and Transformer-based models on three popular publicly available datasets: iMiGUE, SMG, and MA-52. These results establish the Video Mamba model's competence in micro-gesture studies, based on its capacity for long video sequences, linear complexity, lower memory consumption, lower time consumption, and higher accuracy.
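To illustrate the linear complexity attributed to State Space Models above, the following is a minimal sketch of a plain diagonal linear SSM scanned over a one-dimensional sequence. It is not the Video Mamba model and not the thesis implementation: Mamba makes the SSM parameters input-dependent (selective) and operates on high-dimensional video tokens, whereas the function name `ssm_scan`, the scalar inputs, the fixed parameters `a`, `b`, `c`, and the step size `delta` here are all assumptions chosen for illustration.

```python
import math


def ssm_scan(xs, a, b, c, delta):
    """Run a diagonal linear state space model over a 1-D sequence.

    Hypothetical helper for illustration only.
    xs:      input sequence (list of floats)
    a, b, c: per-state parameters (lists of equal length); entries of `a`
             must be nonzero, and negative values keep the state stable
    delta:   discretization step size
    """
    # Zero-order-hold discretization of the continuous-time parameters.
    a_bar = [math.exp(delta * ai) for ai in a]
    b_bar = [(abi - 1.0) / ai * bi for abi, ai, bi in zip(a_bar, a, b)]

    h = [0.0] * len(a)  # hidden state, starts at zero
    ys = []
    for x in xs:
        # One recurrence step: h_t = a_bar * h_{t-1} + b_bar * x_t
        h = [abi * hi + bbi * x for abi, hi, bbi in zip(a_bar, h, b_bar)]
        # Readout: y_t = c . h_t
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys


# Impulse response of a single decaying state.
impulse = ssm_scan([1.0, 0.0, 0.0], a=[-1.0], b=[1.0], c=[1.0], delta=1.0)
```

Each input element is visited once and the state has a fixed size, so the cost of the scan grows linearly with sequence length and the memory for the state does not grow with it. Self-attention, by contrast, compares every pair of positions, which is quadratic in sequence length.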