Design and implementation of a short video recommendation system based on deep sequential models with multimodal features
Zhang, Yanli (2026)
Bachelor's thesis
School of Engineering Science, Computational Engineering
All rights reserved.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe2026051948582
Abstract
Existing matching-based and graph-based short-video recommendation models mainly rely on static user-item interaction information, so they have limited ability to capture the sequential changes of user interests. In addition, item ID embeddings alone cannot fully represent the visual content of short videos. To solve these problems, this study proposes a short-video recommendation model based on SASRec with VideoMAE feature enhancement.
The proposed model uses SASRec to model user behaviour sequences and introduces VideoMAE features to enhance item representations. It also applies dual-interest modelling, sequence-level contrastive learning, and frequency-based filtering to capture short-term and long-term user preferences and reduce noisy low-frequency items. Experiments on the MicroLens-100K dataset show that the proposed model achieves the best results, with a Hit@10 of 0.10432 and an NDCG@10 of 0.05850.
A React and FastAPI-based demo system was also implemented to display browsing history, recommendation results, and fallback recommendations for unknown users.
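For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how Hit@10 and NDCG@10 are commonly computed. It assumes one held-out target item per user and a ranked list of candidate item IDs per user; the abstract does not state the exact evaluation protocol, so this may differ from the one used in the thesis. The function names (`hit_at_k`, `ndcg_at_k`, `evaluate`) are illustrative and are not taken from the thesis.

```python
import math


def hit_at_k(ranked_items, target, k=10):
    """Return 1.0 if the held-out target is among the top-k items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k=10):
    """NDCG@k for a single relevant item.

    With one relevant item the ideal DCG is 1, so the score reduces to
    1 / log2(rank + 1), where rank is the 1-based position of the target.
    """
    for position, item in enumerate(ranked_items[:k]):
        if item == target:
            return 1.0 / math.log2(position + 2)  # position is 0-based
    return 0.0


def evaluate(rankings, targets, k=10):
    """Average Hit@k and NDCG@k over all users (assumed: one target per user)."""
    n = len(targets)
    hit = sum(hit_at_k(r, t, k) for r, t in zip(rankings, targets)) / n
    ndcg = sum(ndcg_at_k(r, t, k) for r, t in zip(rankings, targets)) / n
    return hit, ndcg


if __name__ == "__main__":
    # Toy data: the first user's target is ranked second, the second user's is missed.
    rankings = [[3, 1, 2], [5, 6, 7]]
    targets = [1, 9]
    print(evaluate(rankings, targets, k=10))
```

Under these assumptions, a Hit@10 of 0.10432 would mean that the held-out item appears in the top ten recommendations for roughly 10.4% of the evaluated users. NDCG@10 additionally rewards placing that item closer to the top of the list.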
