Improving Statistical Learning Methods Via Features Selection without Replacement Sampling and Random Projection
Abstract
Cancer is fundamentally a genetic disease characterized by genetic and epigenetic alterations that disrupt normal gene expression, leading to uncontrolled cell growth and metastasis. High-dimensional microarray datasets are difficult for classification models because of the "small n, large p" problem: the number of genes (p) far exceeds the number of samples (n), which makes overfitting likely. This study makes three key contributions: 1) We propose a machine learning-based approach that integrates the Feature Selection Without Replacement (FSWOR) technique with a projection method to improve classification accuracy.
Keywords
Brain cancer, gene expression, machine learning, SVM, NB, LR, DT, KNN, dimension reduction, PCA, LDA, GRP, SRP
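The abstract names FSWOR and a projection method without giving their details. As a rough illustration only, the sketch below assumes that FSWOR amounts to drawing a subset of gene columns without replacement, and it uses Gaussian random projection (the GRP listed in the keywords) as the projection step. The function names, the toy data, and the sizes (50 genes, 20 selected, 5 components) are hypothetical and are not taken from the study.

```python
import math
import random


def sample_features_without_replacement(n_features, subset_size, rng):
    """Draw subset_size distinct column indices; no index can appear twice."""
    return sorted(rng.sample(range(n_features), subset_size))


def gaussian_random_projection(rows, n_components, rng):
    """Project each row onto n_components random Gaussian directions."""
    n_features = len(rows[0])
    # Entries are drawn from N(0, 1/n_components), a common GRP scaling.
    std = 1.0 / math.sqrt(n_components)
    matrix = [[rng.gauss(0.0, std) for _ in range(n_components)]
              for _ in range(n_features)]
    return [[sum(x * matrix[j][k] for j, x in enumerate(row))
             for k in range(n_components)]
            for row in rows]


rng = random.Random(0)

# Toy stand-in for a microarray matrix: 6 samples, 50 genes (n much smaller than p).
X = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(6)]

# Step 1 (assumed reading of FSWOR): keep 20 distinct gene columns.
selected = sample_features_without_replacement(50, 20, rng)
X_selected = [[row[j] for j in selected] for row in X]

# Step 2: reduce the 20 selected columns to 5 projected features.
X_projected = gaussian_random_projection(X_selected, 5, rng)
```

The projected matrix would then be passed to one of the classifiers listed in the keywords (for example SVM or KNN). Sampling without replacement guarantees that no gene is duplicated within a subset, which is the only property of FSWOR this sketch relies on.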