Breast cancer diagnostic using machine learning : applying supervised learning techniques to Coimbra and Wisconsin datasets
Kushwaha, Vikas (2023)
Master's thesis
School of Engineering Science, Computational Engineering
All rights reserved.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2023061555053
Abstract
Breast cancer poses a significant global health concern, with approximately 2.2 million new cases and 700,000 deaths reported in 2020. Traditional diagnostic approaches, which depend predominantly on expert judgement, have been associated with substantial variability in accuracy. Machine learning (ML) models can help bridge this gap. The present research investigates the potential of four machine learning algorithms (Decision Trees, K-Nearest Neighbors, Support Vector Machines, and Logistic Regression), with the overarching objective of improving early detection and enhancing the precision of breast cancer diagnosis. The study uses the Breast Cancer Coimbra Dataset and the Wisconsin Diagnostic Breast Cancer Dataset for model training and evaluation. A comprehensive comparative analysis of these models is conducted, with a focus on optimizing hyperparameters and distance measures to identify the most effective configurations. Further, the influence of feature selection methods and Principal Component Analysis on model performance is explored.
Logistic Regression and Support Vector Machine models demonstrated remarkable performance, surpassing the predictive accuracy of models reported in the current literature, with accuracies reaching up to 99.42%. This research could serve as a foundation for future studies applying machine learning models in breast cancer diagnostics, emphasizing the potential of machine learning as a robust tool in medical diagnostics.
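The comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis code and does not reproduce the reported results. It assumes scikit-learn and uses its bundled copy of the Wisconsin Diagnostic Breast Cancer data; the Coimbra dataset is not bundled and is left out. The split ratio, the number of principal components, and the hyperparameter grids are hypothetical choices made only for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Wisconsin Diagnostic Breast Cancer data as shipped with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The four model families named in the abstract.
# The grids below are illustrative assumptions, not the thesis settings.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"clf__max_depth": [3, 5, None]},
    ),
    "knn": (
        KNeighborsClassifier(),
        {"clf__n_neighbors": [3, 5, 7],
         "clf__metric": ["euclidean", "manhattan"]},  # distance measures
    ),
    "svm": (
        SVC(),
        {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]},
    ),
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"clf__C": [0.1, 1, 10]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Standardize, project onto 10 principal components (assumed), classify.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("pca", PCA(n_components=10)),
        ("clf", model),
    ])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # Accuracy of the best configuration on the held-out test split.
    results[name] = search.score(X_test, y_test)
    print(f"{name}: test accuracy {results[name]:.4f}, best {search.best_params_}")
```

Scaling and PCA sit inside the pipeline so that they are refit on each cross-validation fold, which keeps information from the validation folds out of the preprocessing step.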
