The RG Block in Mamba-YOLO: an ablation study of its impact on small-target detection
Liu, Zeyu (2026)
Bachelor's thesis
School of Engineering Science, Information Technology
All rights reserved.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe2026052857805
Abstract
Small-object detection in UAV imagery remains difficult because many targets occupy only a few pixels, appear close to one another, and are easily confused with road surfaces, trees, shadows and other background structures. This thesis studies this problem through the RGBlock module in Mamba-YOLO-T. Instead of proposing a new detector, the work focuses on whether RGBlock and selected internal components make a measurable contribution when Mamba-YOLO-T is trained and evaluated on VisDrone2019-DET.
The study uses an ablation-based experimental design. The complete Mamba-YOLO-T model with Full RGBlock is compared with three modified variants: w/o RGBlock, w/o Gate and w/o DWConv Residual. YOLOv8n is also included as a lightweight external baseline. All Mamba-YOLO-T variants are evaluated on the VisDrone validation set using Precision, Recall, mAP50 and mAP50-95, with two random seeds reported for the main ablation experiments.
The results show that Full RGBlock gives the best overall performance among the tested Mamba-YOLO-T variants. It achieves an average mAP50 of 0.36169 and an average mAP50-95 of 0.21232. Removing the whole RGBlock branch causes the largest decline, while removing the gate or the DWConv residual connection leads to smaller but still observable decreases. These findings suggest that RGBlock is useful for the tested Mamba-YOLO-T setting on VisDrone, and that its benefit comes from the combined effect of local convolutional refinement, residual feature preservation and gated feature interaction.
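To make the three ingredients named above concrete (local convolutional refinement, residual feature preservation and gated feature interaction), the following is a minimal PyTorch sketch of a gated block with two ablation switches. It is an illustration under stated assumptions, not the thesis code: the class name `GatedBlockSketch`, the flags `use_gate` and `use_dw_residual`, the GELU activation and the layer sizes are all hypothetical, and the actual RGBlock in Mamba-YOLO-T may differ in detail. The w/o RGBlock variant, which removes the whole branch, is not modelled here.

```python
import torch
import torch.nn as nn


class GatedBlockSketch(nn.Module):
    """Hypothetical gated block with ablation switches (not the thesis code)."""

    def __init__(self, channels, hidden, use_gate=True, use_dw_residual=True):
        super().__init__()
        self.use_gate = use_gate
        self.use_dw_residual = use_dw_residual
        # 1x1 projection that produces two branches: features and gate values.
        self.fc1 = nn.Conv2d(channels, hidden * 2, kernel_size=1)
        # Depthwise 3x3 convolution for local refinement (one filter per channel).
        self.dwconv = nn.Conv2d(hidden, hidden, kernel_size=3, padding=1, groups=hidden)
        self.act = nn.GELU()  # assumed activation
        # 1x1 projection back to the input channel count.
        self.fc2 = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, x):
        # Split the projected tensor into a feature branch and a gate branch.
        feat, gate = self.fc1(x).chunk(2, dim=1)
        local = self.dwconv(feat)
        if self.use_dw_residual:
            # Residual path: keep the unrefined features next to the refined ones.
            local = local + feat
        out = self.act(local)
        if self.use_gate:
            # Gating: element-wise modulation by the second branch.
            out = out * gate
        return self.fc2(out)


if __name__ == "__main__":
    x = torch.randn(1, 8, 16, 16)
    full = GatedBlockSketch(8, 16)
    no_gate = GatedBlockSketch(8, 16, use_gate=False)
    no_residual = GatedBlockSketch(8, 16, use_dw_residual=False)
    for name, block in [("full", full), ("w/o gate", no_gate), ("w/o residual", no_residual)]:
        print(name, tuple(block(x).shape))
```

In this sketch every variant keeps the same input and output shape, so a switch can be flipped without touching the rest of the network. This is the property that lets an ablation isolate one component at a time.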
