A clustering system for high-dimensional gene expression data
Ping, Jingfeng (2025)
Bachelor's thesis
School of Engineering Science, Information Technology
All rights reserved.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi-fe2025050536015
Abstract
The main task of this thesis is to optimize scDFN (Liu et al., 2024), a novel deep learning algorithm for single-cell clustering, which provides rich information and data resources for bioinformatics and precision medicine.
To improve the computational efficiency of the deep fusion network in scDFN, this study relies mainly on model architecture optimization: introducing an attention mechanism, upgrading the graph neural network, and adding a gated fusion mechanism. In addition, smaller adjustments such as hyperparameter tuning help to improve the running efficiency and clustering accuracy of the algorithm.
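The abstract does not specify how the gated fusion mechanism is formulated. As a rough illustration only, the sketch below shows one common generic form of gated fusion, in which a sigmoid gate computed from two embeddings of the same cell decides, per dimension, how much of each embedding to keep. The function name `gated_fusion`, the inputs `h_a` and `h_b`, and the parameters `weights` and `bias` are hypothetical and are not taken from the thesis or from scDFN.

```python
from math import exp


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    z = exp(x)
    return z / (1.0 + z)


def gated_fusion(h_a, h_b, weights, bias):
    """Fuse two equal-length embeddings with an element-wise gate.

    Assumed generic form (not necessarily the one used in the thesis):
        g     = sigmoid(W [h_a ; h_b] + b)
        fused = g * h_a + (1 - g) * h_b

    weights: one row per output dimension, each of length 2 * len(h_a)
    bias:    one value per output dimension
    """
    if len(h_a) != len(h_b):
        raise ValueError("embeddings must have the same length")
    concat = list(h_a) + list(h_b)
    fused = []
    for i, (a, b) in enumerate(zip(h_a, h_b)):
        # Gate logit for dimension i, computed from both embeddings.
        logit = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(logit)
        # Convex combination: g near 1 keeps h_a, g near 0 keeps h_b.
        fused.append(g * a + (1.0 - g) * b)
    return fused
```

With all-zero weights and biases the gate equals 0.5 everywhere, so the fused vector is the plain average of the two inputs. In a trained model the gate parameters would be learned together with the rest of the network.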
On the source dataset, the optimized framework improves two commonly used clustering evaluation metrics: Normalized Mutual Information (NMI, from 0.8877 to 0.9184) and Adjusted Rand Index (ARI, from 0.9097 to 0.9633). These results indicate that the optimized algorithm provides a more efficient solution for single-cell data analysis.
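For readers unfamiliar with the two metrics, the following is a minimal pure-Python sketch of how NMI and ARI can be computed from ground-truth and predicted cluster labels. It is not the evaluation code of the thesis. The function names are illustrative, and the NMI normalization (arithmetic mean of the two entropies) is an assumption, since the abstract does not state which variant was used.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index: pair-counting agreement, corrected for chance."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label lists must have the same length")
    if n < 2:
        return 1.0
    pairs = Counter(zip(labels_true, labels_pred))  # contingency table
    rows, cols = Counter(labels_true), Counter(labels_pred)
    sum_pairs = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (sum_pairs - expected) / (max_index - expected)


def normalized_mutual_information(labels_true, labels_pred):
    """NMI, normalized by the arithmetic mean of the entropies (assumed)."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label lists must have the same length")
    pairs = Counter(zip(labels_true, labels_pred))
    rows, cols = Counter(labels_true), Counter(labels_pred)
    # Mutual information between the two labelings.
    mi = sum(
        (c / n) * log(c * n / (rows[t] * cols[p]))
        for (t, p), c in pairs.items()
    )
    h_true = -sum((c / n) * log(c / n) for c in rows.values())
    h_pred = -sum((c / n) * log(c / n) for c in cols.values())
    denom = (h_true + h_pred) / 2
    if denom == 0:
        return 1.0  # both labelings consist of a single cluster
    return mi / denom
```

Both metrics ignore the actual label names: a prediction that matches the true partition up to a renaming of clusters, such as `[1, 1, 0, 0]` against `[0, 0, 1, 1]`, scores 1 on both.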
