SEUIR Repository

An efficient way of using wrappers in big data classification


dc.contributor.author Fajila, M.N.F.
dc.date.accessioned 2018-12-07T08:41:29Z
dc.date.available 2018-12-07T08:41:29Z
dc.date.issued 2017-11-28
dc.identifier.isbn 9789556271232
dc.identifier.uri http://ir.lib.seu.ac.lk/handle/123456789/3279
dc.description.abstract The volume of data grows dramatically over time, and the value of that data pushes scientists to find patterns that make high-dimensional data usable efficiently. Dimensionality reduction is an essential technique in data science when handling big data. Although new techniques are continually introduced, applying the right technique at the right stage remains challenging. One such technique is the wrapper approach in machine learning. Feature selection plays a major role in the classification of big data. A feature can be more informative in the presence of another feature; thus, no feature should be removed without being assessed. Wrappers evaluate the possible combinations of feature subsets and return the most informative subset, which classifies the data with higher accuracy. Compared to filters, however, wrappers are much slower and consume a large amount of time when applied to big data. Therefore, in the proposed approach, the wrapper is applied after a filter in order to reduce the computational cost. The approach uses the gain ratio filter followed by the classifier subset evaluator, the wrapper, for feature subset selection. The proposed technique is validated and evaluated on two high-dimensional microarray data sets: a lung cancer data set and a breast cancer data set. It achieved 97.10% accuracy (with only two misclassifications) and 78.78% accuracy on the lung cancer and breast cancer data sets, respectively. The results indicate that the proposed approach is efficient in terms of both accuracy and computation time. en_US
dc.language.iso en_US en_US
dc.publisher Faculty of Applied Science, South Eastern University of Sri Lanka en_US
dc.subject Big data en_US
dc.subject Classification en_US
dc.subject Dimensionality reduction en_US
dc.subject Microarray en_US
dc.subject Wrapper en_US
dc.title An efficient way of using wrappers in big data classification en_US
dc.type Article en_US
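
The two-stage idea described in the abstract (a cheap filter first, then a wrapper over the surviving features only) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the median binarization used to compute gain ratio, the 1-nearest-neighbour classifier, the leave-one-out scoring, the greedy forward search, the `keep` parameter, and the synthetic data are all hypothetical choices made here. The function names are likewise invented for the example.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(column, labels):
    """Gain ratio of one numeric feature, binarized at its median (an assumption)."""
    median = sorted(column)[len(column) // 2]
    bins = [v >= median for v in column]
    split_info = entropy(bins)
    if split_info == 0:  # constant feature: no information to gain
        return 0.0
    conditional = 0.0
    for b in (False, True):
        subset = [lab for flag, lab in zip(bins, labels) if flag == b]
        if subset:
            conditional += len(subset) / len(labels) * entropy(subset)
    return (entropy(labels) - conditional) / split_info

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `features`."""
    correct = 0
    for i, row in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((row[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def filter_then_wrap(X, y, keep=10):
    """Stage 1 (filter): keep the `keep` features with the highest gain ratio.
    Stage 2 (wrapper): greedy forward search over those survivors only."""
    ranked = sorted(
        range(len(X[0])),
        key=lambda f: gain_ratio([row[f] for row in X], y),
        reverse=True,
    )
    candidates = ranked[:keep]
    selected, best = [], 0.0
    while candidates:
        # Score every one-feature extension of the current subset with the classifier.
        score, feat = max((loo_accuracy(X, y, selected + [f]), f) for f in candidates)
        if score <= best:  # stop when no extension improves accuracy
            break
        selected.append(feat)
        candidates.remove(feat)
        best = score
    return selected, best

# Synthetic stand-in for a microarray: 40 samples, 200 features, 2 informative.
random.seed(0)
y = [i % 2 for i in range(40)]
X = [[random.gauss(0, 1) for _ in range(200)] for _ in y]
for row, label in zip(X, y):
    row[3] += 6 * label  # informative features planted at columns 3 and 7
    row[7] -= 6 * label

selected, accuracy = filter_then_wrap(X, y, keep=10)
print(selected, accuracy)
```

The filter scores each feature once, whereas the wrapper retrains and rescores the classifier for every candidate subset, so shrinking the candidate pool before the wrapper runs is where the time saving comes from. The greedy search used here examines far fewer subsets than an exhaustive search over all combinations would.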

