SEUIR Repository

A novel filter-wrapper based feature selection approach for cancer data classification

dc.contributor.author Mohamed Mufassirin, M.M.
dc.contributor.author Ragel, Roshan G.
dc.date.accessioned 2019-03-26T06:53:23Z
dc.date.available 2019-03-26T06:53:23Z
dc.date.issued 2018-12-21
dc.identifier.citation M. M. M. Mufassirin and R. G. Ragel, A novel filter-wrapper based feature selection approach for cancer data classification, IEEE International Conference on Information and Automation for Sustainability (ICIAfS), 2018, pp. 1–6. en_US
dc.identifier.other 1570494780
dc.identifier.uri http://ir.lib.seu.ac.lk/handle/123456789/3513
dc.description.abstract Advances in DNA microarray technology have attracted the interest of many researchers, and the technology holds considerable promise for cancer data classification. However, DNA microarray data usually contain thousands of irrelevant and redundant genes, which need to be eliminated to improve classification accuracy. To select the relevant genes from cancer data, this study proposes a novel feature selection technique based on a filter-wrapper approach using machine learning methods. Wrappers search over candidate feature subsets, use a learning algorithm to evaluate which features are useful, and return the most informative subset, which increases classifier accuracy, whereas filter methods select features from the data without any learning involved. However, compared to filters, the computational demand of wrappers is high when applied to cancer data. Hence, in the proposed work, the wrapper is applied after the filter in order to reduce the computational cost of the wrapper. The datasets were first pre-processed using the Gain Ratio filter with the Ranker search method, and the resulting gene subsets were then evaluated using the Wrapper Subset Evaluator with a best-first forward selection search strategy in the WEKA machine learning workbench. The gene subset selected by the wrapper was then used to classify the cancer microarray data using the following machine learning classifiers: Decision Tree (J48), Naïve Bayes, Sequential Minimal Optimization (SMO), Deep Learning and Bayes Net. The proposed approach was tested on five cancer microarray datasets. Accuracies of 89.69%, 95.16% and 97.04% were obtained for the Breast, Colon and Lung cancer datasets respectively, while the Leukaemia and Ovarian cancer datasets scored 100%. According to the findings of this study, the proposed method is capable of accurately classifying the datasets based on a few informative genes, which is more efficient than existing classification models. en_US
dc.language.iso en_US en_US
dc.publisher IEEE en_US
dc.subject DNA microarray en_US
dc.subject Machine learning en_US
dc.subject Feature selection en_US
dc.subject Classification en_US
dc.title A novel filter-wrapper based feature selection approach for cancer data classification en_US
dc.type Article en_US
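
The filter-then-wrapper idea summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' WEKA workflow: it is plain Python, it assumes gene expression values have already been discretised, it uses simple greedy forward selection in place of WEKA's best-first search, and it uses a leave-one-out 1-nearest-neighbour classifier as a stand-in for the classifiers named in the abstract. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(column, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(column)
    if split_info == 0:  # constant feature carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info


def filter_top_k(X, y, k):
    """Filter step: rank features by gain ratio and keep the k best indices."""
    n_features = len(X[0])
    scores = [gain_ratio([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier using Hamming distance.

    Stand-in learner (an assumption), not one of the paper's classifiers.
    """
    correct = 0
    for i in range(len(X)):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum(X[i][f] != X[j][f] for f in features),
        )
        correct += int(y[nearest] == y[i])
    return correct / len(X)


def wrapper_forward_select(X, y, candidates):
    """Wrapper step: greedy forward selection over the filtered candidates."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved, choice = False, None
        for f in candidates:
            if f in selected:
                continue
            score = loo_accuracy(X, y, selected + [f])
            if score > best:  # keep only strict improvements
                best, choice, improved = score, f, True
        if improved:
            selected.append(choice)
    return selected, best


# Made-up toy data: 6 samples x 4 discretised "genes"; gene 0 tracks the label.
X = [
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
]
y = ["normal", "normal", "normal", "tumour", "tumour", "tumour"]

top = filter_top_k(X, y, k=2)                       # cheap filter pass
selected, accuracy = wrapper_forward_select(X, y, top)  # costly wrapper pass
print(top, selected, accuracy)
```

Running the wrapper only on the features that survive the filter is what keeps the number of classifier evaluations small: each forward-selection round trains and scores the learner once per remaining candidate, so shrinking the candidate list from thousands of genes to a short ranked list reduces the cost accordingly.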


This item appears in the following Collection(s)

  • Research Articles [911]
    THESE ARE RESEARCH ARTICLES OF ACADEMIC STAFF, PUBLISHED IN JOURNALS AND PROCEEDINGS ELSEWHERE
