An improved lightweight multi-granularity feature fusion method for waste image classification
Feng, Fan (2026)
Bachelor's thesis
School of Engineering Science, Information Technology
All rights reserved.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2026052352408
Abstract
Lightweight waste image classification models often have limited feature representation ability, especially when distinguishing visually similar waste categories. In contrast, many high-accuracy deep learning models require large numbers of parameters and high computational costs, which limits their use on resource-constrained platforms such as smart waste bins, mobile devices, and embedded recycling terminals. To address this problem, this thesis proposes an improved Lightweight Multi-Granularity Feature Fusion Network (LWMF-Net) for waste image classification.
LWMF-Net uses MobileNetV3-Large as the backbone and introduces a multi-granularity feature fusion module to combine low-level texture features, middle-level structural features, and high-level semantic features from different network stages. This design strengthens the representation ability of the lightweight backbone while maintaining a compact structure. Generalized mean pooling is also adopted to improve spatial feature aggregation and help the model focus on discriminative image regions.
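To illustrate the pooling step mentioned above, the following is a minimal sketch of generalized mean (GeM) pooling over a single channel, written in plain Python. It is not taken from the thesis: the function name `gem_pool`, the exponent value `p=3.0`, and the clamp value `eps` are assumptions made for illustration. In practice the exponent is often treated as a learnable parameter.

```python
def gem_pool(channel, p=3.0, eps=1e-6):
    """Generalized mean pooling over one channel's spatial activations.

    channel: 2D list (H x W) of activations for a single feature channel.
    p: pooling exponent (assumed value); p = 1 is average pooling and
       larger p moves the result toward max pooling.
    eps: lower clamp so that non-integer powers stay well defined (assumed).
    """
    # Flatten the spatial grid and clamp activations from below.
    values = [max(v, eps) for row in channel for v in row]
    # Mean of the p-th powers, followed by the p-th root.
    mean_pow = sum(v ** p for v in values) / len(values)
    return mean_pow ** (1.0 / p)


# Toy 2 x 2 feature map with one strongly activated location.
feature_map = [[0.1, 0.2], [0.3, 2.0]]

avg_like = gem_pool(feature_map, p=1.0)    # identical to average pooling
gem_3 = gem_pool(feature_map, p=3.0)       # weights the strong response more
peak_like = gem_pool(feature_map, p=50.0)  # close to max pooling
```

As the exponent grows, the pooled value is pulled toward the strongest activation, which is one way to read the claim that GeM pooling emphasizes discriminative regions.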
The proposed model is evaluated through a quantitative experimental approach using a fixed training-validation-test split, ablation experiments, and controlled comparison with representative models. The dataset includes six waste classes: cardboard, glass, metal, paper, plastic, and trash. Compared with ResNet101, VGG16, ViT-B/16, Inception-V3, ShuffleNetV2-x1.0, and SqueezeNet1.1, LWMF-Net achieves the best overall performance, with 94.21% accuracy and a 93.35% macro F1-score on the test set. With only 4.17 million parameters and a model size of 16.13 MB, LWMF-Net provides a good balance between classification accuracy and deployment efficiency.
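For readers unfamiliar with the macro F1-score reported above, the sketch below shows how it can be computed: the F1-score is calculated per class and then averaged with equal weight, so rare classes count as much as common ones. The function name `macro_f1` and the toy labels are hypothetical and serve only as an illustration. They are not the thesis data or its evaluation code.

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1-scores."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# The six classes named in the abstract; the labels below are toy values.
classes = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]
y_true = ["cardboard", "glass", "metal", "paper", "plastic", "trash", "glass", "plastic"]
y_pred = ["cardboard", "glass", "metal", "paper", "glass", "trash", "glass", "plastic"]

score = macro_f1(y_true, y_pred, classes)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case a single plastic item misclassified as glass lowers the F1-score of both classes, so the macro F1-score and the accuracy differ even though only one prediction is wrong.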
