AU-TTT: Vision Test-Time Training model for Facial Action Unit Detection

Xing, Bohao; Yuan, Kaishen; Yu, Zitong; Liu, Xin; Kälviäinen, Heikki (2025-10-30)

View/Open
xing_et_al_au-ttt_vision_aam.pdf (902.2Kb)

Post-print / Final draft

IEEE

School of Engineering Science

https://doi.org/10.1109/ICME59968.2025.11209184
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe20251103104902

Abstract

Facial Action Unit (AU) detection is a cornerstone of objective facial expression analysis and a critical focus in affective computing. Despite its importance, AU detection faces significant challenges, such as the high cost of AU annotation and the limited availability of datasets. These constraints often lead to overfitting in existing methods, resulting in substantial performance degradation when applied across diverse datasets. Addressing these issues is essential for improving the reliability and generalizability of AU detection methods. Moreover, many current approaches leverage Transformers for their effectiveness in long-context modeling, but they are hindered by the quadratic complexity of self-attention. Recently, Test-Time Training (TTT) layers have emerged as a promising solution for long-sequence modeling. Additionally, TTT applies self-supervised learning for iterative updates during both training and inference, offering a potential pathway to mitigate the generalization challenges inherent in AU detection tasks. In this paper, we propose a novel vision backbone tailored for AU detection, incorporating bidirectional TTT blocks, named AU-TTT. Our approach introduces TTT Linear to the AU detection task and optimizes image scanning mechanisms for enhanced performance. Additionally, we design an AU-specific Region of Interest (RoI) scanning mechanism to capture fine-grained facial features critical for AU detection. Experimental results demonstrate that our method achieves competitive performance in both within-domain and cross-domain scenarios.
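The record contains no code, so the following is only a minimal PyTorch sketch of the general TTT-Linear idea referenced in the abstract: the layer's "hidden state" is a small linear model that is updated with a self-supervised reconstruction loss as tokens are processed, including at inference time. The class name, q/k/v projections, identity initialization, and inner learning rate are illustrative assumptions and not the authors' implementation; a bidirectional block, as used in AU-TTT, would additionally run a second pass over the reversed token order.

```python
import torch
import torch.nn as nn


class TTTLinearSketch(nn.Module):
    """Toy TTT-Linear-style layer: the per-sequence 'hidden state' is a
    linear map W, updated token by token with a self-supervised
    reconstruction loss and then applied to produce the output token."""

    def __init__(self, dim: int, inner_lr: float = 0.1):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim, bias=False)  # "query" view of each token
        self.k_proj = nn.Linear(dim, dim, bias=False)  # corrupted/input view
        self.v_proj = nn.Linear(dim, dim, bias=False)  # reconstruction target
        self.inner_lr = inner_lr                       # step size of the inner update

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim), e.g. a sequence of image-patch tokens
        b, t, d = x.shape
        # One inner linear model per sequence, initialized to the identity.
        W = torch.eye(d, device=x.device, dtype=x.dtype).expand(b, d, d).clone()
        outputs = []
        for i in range(t):
            q = self.q_proj(x[:, i])                   # (b, d)
            k = self.k_proj(x[:, i])                   # (b, d)
            v = self.v_proj(x[:, i])                   # (b, d)
            # Self-supervised inner loss: reconstruct v from k through W.
            err = torch.bmm(W, k.unsqueeze(-1)).squeeze(-1) - v
            # One gradient step on W for 0.5 * ||W k - v||^2 (outer-product rule).
            W = W - self.inner_lr * torch.bmm(err.unsqueeze(-1), k.unsqueeze(1))
            # Output token: the updated inner model applied to the query view.
            outputs.append(torch.bmm(W, q.unsqueeze(-1)).squeeze(-1))
        return torch.stack(outputs, dim=1)             # (b, t, d)


if __name__ == "__main__":
    layer = TTTLinearSketch(dim=16)
    tokens = torch.randn(2, 8, 16)
    print(layer(tokens).shape)                         # torch.Size([2, 8, 16])
```

Unlike self-attention, the per-token cost of this update is independent of sequence length, which is the motivation the abstract gives for using TTT layers in long-sequence (here, dense patch-token) modeling.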

Citation

Xing, B., Yuan, K., Yu, Z., Liu, X., Kälviäinen, H. (2025). AU-TTT: Vision Test-Time Training model for Facial Action Unit Detection. In: 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025. pp. 1-6. DOI: 10.1109/ICME59968.2025.11209184

Collections
  • Scientific publications [1670]