Abstract: In the paper, we present an empirical evaluation of five feature selection methods: ReliefF, random forest feature selector, sequential forward selection, sequential backward selection, and Gini index. Among the evaluated methods, the random forest feature selector has not yet been widely compared to the other methods. In our evaluation, we test how the implemented feature selection can affect (i.e. improve) the accuracy of six different classifiers by performing feature selection. The results show that ReliefF and random forest enabled the classifiers to achieve the highest increase in classification accuracy on the average while reducing the number of unnecessary attributes. The…achieved conclusions can advise the machine learning users which classifier and feature selection method to use to optimize the classification accuracy, which may be important especially in risk-sensitive applications of Machine Learning (e.g. medicine, business decisions, control applications) as well as in the aim to reduce costs of collecting, processing and storage of unnecessary data.
Show more
Keywords: Feature selection, ReliefF, random forest feature selector, sequential forward selection, sequential backward selection, Gini index