Reduction of variables for predicting breast cancer survivability using principal component analysis
Was this content written or created while at IBA?
Yes
Document Type
Conference Paper
Publication Date
1-1-2015
Conference Name
2015 IEEE 28th International Symposium on Computer-Based Medical Systems (CBMS)
Conference Location
Sao Carlos, Brazil
Conference Dates
22-25 June 2015
ISBN/ISSN
84944218114 (Scopus)
First Page
131
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Abstract / Description
This research uses breast cancer data from the Surveillance, Epidemiology, and End Results (SEER) dataset (1973-2010), which contains 684,394 records. The data are cleaned using several pre-processing techniques. Survivability is predicted using two different methods. In the first method, 14 variables are used, as suggested by Delen et al. [1]. In the second method, these 14 variables are reduced to 5 variables (principal components) using a statistical technique called Principal Component Analysis (PCA), retaining 98% of the total variance. Both methods yield almost the same level of accuracy, showing that the number of variables taken into account for the analysis can be reduced.
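The variable-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data below are synthetic (14 columns generated from 5 hidden factors, not SEER records), the function name `pca_reduce` and the parameter `variance_target` are hypothetical, and standardizing the columns before PCA is an assumption the abstract does not confirm. The sketch keeps the smallest number of principal components whose cumulative explained variance reaches the target.

```python
import numpy as np


def pca_reduce(X, variance_target=0.98):
    """Project X onto the fewest principal components reaching variance_target."""
    X = np.asarray(X, dtype=float)
    # Standardize each column so that no variable dominates purely by scale
    # (an assumption; the abstract does not state the scaling used).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: the rows of Vt are the principal directions,
    # and the squared singular values are proportional to the variance captured.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    cumulative = np.cumsum(s**2 / np.sum(s**2))
    # Smallest k whose cumulative explained variance reaches the target.
    k = min(int(np.searchsorted(cumulative, variance_target)) + 1, len(s))
    scores = Z @ Vt[:k].T  # component scores, shape (n_samples, k)
    return scores, cumulative[:k], k


# Synthetic stand-in data: 500 rows, 14 observed variables driven by
# 5 hidden factors plus a small amount of noise. Not SEER data.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 14))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 14))

scores, cumulative, k = pca_reduce(X, variance_target=0.98)
print("components kept:", k)
print("cumulative explained variance:", np.round(cumulative, 4))
```

The resulting `scores` matrix would then replace the original 14 columns as classifier input, which is the comparison the abstract reports between the two methods.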
DOI
https://doi.org/10.1109/CBMS.2015.62
Recommended Citation
Hussain, S., Quazilbash, N. Z., Bai, S., & Khoja, S. A. (2015). Reduction of variables for predicting breast cancer survivability using principal component analysis. 2015 IEEE 28th International Symposium on Computer-Based Medical Systems (CBMS), 131. https://doi.org/10.1109/CBMS.2015.62