Learning cost-sensitive Bayesian networks via direct and indirect methods

Abstract

Cost-sensitive learning has become an increasingly important area, recognizing that real-world classification problems must take the costs of misclassification, as well as accuracy, into account. Much work has been done on cost-sensitive decision tree learning, but although Bayesian networks have been studied extensively, relatively little research has addressed learning cost-sensitive Bayesian networks. Hence, this paper explores whether it is possible to develop algorithms that learn cost-sensitive Bayesian networks by taking (i) an indirect approach that changes the data distribution to reflect the costs of misclassification, and (ii) a direct approach that amends an existing accuracy-based algorithm for learning Bayesian networks. An empirical comparison of the new approaches with cost-sensitive decision tree learning algorithms on 33 data sets shows that the new algorithms perform better in terms of misclassification cost while maintaining accuracy.
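The indirect approach mentioned above can be illustrated by cost-proportionate rejection sampling, in the spirit of cost-proportionate example weighting: each training example is retained with probability proportional to the misclassification cost of its class, so cheaper classes are thinned out before an ordinary (accuracy-based) learner is applied. The sketch below is illustrative only; the class labels, cost values, and function name are assumptions, not taken from the paper.

```python
import random

def cost_proportionate_sample(examples, labels, costs, seed=0):
    """Resample a dataset so that each example is kept with probability
    proportional to the misclassification cost of its class
    (rejection sampling; a sketch of the 'indirect' idea)."""
    rng = random.Random(seed)
    max_cost = max(costs.values())
    resampled = []
    for x, y in zip(examples, labels):
        # Keep this example with probability cost(y) / max_cost.
        if rng.random() < costs[y] / max_cost:
            resampled.append((x, y))
    return resampled

# Hypothetical data: a minority "pos" class whose errors cost 4x more.
examples = list(range(100))
labels = ["pos" if i < 10 else "neg" for i in examples]
costs = {"pos": 4.0, "neg": 1.0}  # illustrative cost values
sample = cost_proportionate_sample(examples, labels, costs)
```

After resampling, the class distribution approximately reflects the cost ratio, so any standard learner trained on the result is implicitly biased toward the costly class.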
