Affiliations:
Department of Computer Science, Systems and Communications, University of Milano - Bicocca, Milan, Italy
Correspondence:
[*] Corresponding author: Lorenzo Olearo, Department of Computer Science, Systems and Communications, University of Milano - Bicocca, Milan, Italy. E-mail: [email protected].
Note: [1] This research is supported by the Fondazione CARIPLO “AMPEL: Artificial intelligence facing Multidimensional Poverty in ELderly” (CUP H45F20000840007, Ref. 2020-0232).
Abstract: Despite the rapid development in recent years of Artificial Intelligence models for predicting poverty risk, the problem remains an open challenge, especially from a multidimensional perspective. A main obstacle is the scarcity of labelled, high-quality training data, coupled with the lack of a general reference model on which to build good predictors. As a result, a variety of approaches tailored to specific contexts have been proposed. This paper presents our approach to multidimensional poverty prediction starting from an unlabelled dataset. We focus on a fragile population, older adults; the approach is nonetheless flexible and can be adapted to other scenarios. First, starting from expert knowledge, we apply a stochastic method to estimate the probability that an individual is poor, and we use this probability to identify three levels of risk. Then, we train an XGBoost classification model and exploit its tree structure to rank features by relevance. This ranking is used to create a new set of aggregated features representative of different poverty dimensions. A novel explainable Naive Bayes model is then trained to predict individuals’ deprivation level in our domain. The capacity to identify which variables are predominantly associated with poverty among older adults offers valuable insights for policymakers and decision-makers to address poverty effectively.
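As a rough illustration of two steps in this pipeline (binning an estimated poverty probability into three risk levels, then fitting a Naive Bayes classifier on aggregated features), a minimal sketch might look as follows. It is not the method used in the paper: the cut-offs `LOW_CUT` and `HIGH_CUT`, the class `CategoricalNB`, the two aggregated features and all data values are hypothetical, and the stochastic estimation and the XGBoost feature ranking are omitted entirely.

```python
import math
from collections import Counter, defaultdict

# Hypothetical cut-offs: the abstract does not state the thresholds used.
LOW_CUT, HIGH_CUT = 0.33, 0.66


def risk_level(p_poor):
    """Map an estimated poverty probability to one of three risk levels."""
    if not 0.0 <= p_poor <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_poor < LOW_CUT:
        return "low"
    if p_poor < HIGH_CUT:
        return "medium"
    return "high"


class CategoricalNB:
    """Tiny categorical Naive Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, rows, labels, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter(labels)
        # counts[(feature index, class)][value] -> number of occurrences
        self.counts = defaultdict(Counter)
        # distinct values observed for each feature
        self.values = [set() for _ in range(len(rows[0]))]
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def log_posteriors(self, row):
        """Unnormalised log posterior of each class for one individual."""
        total = sum(self.class_counts.values())
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)  # log prior
            for j, v in enumerate(row):
                num = self.counts[(j, c)][v] + self.alpha
                den = n_c + self.alpha * len(self.values[j])
                score += math.log(num / den)  # smoothed log likelihood
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Made-up probabilities standing in for the output of the stochastic step.
probs = [0.10, 0.20, 0.50, 0.55, 0.80, 0.90]
labels = [risk_level(p) for p in probs]

# Made-up aggregated features: (economic dimension, health dimension).
rows = [
    ("ok", "ok"), ("ok", "ok"),
    ("ok", "deprived"), ("deprived", "ok"),
    ("deprived", "deprived"), ("deprived", "deprived"),
]

model = CategoricalNB().fit(rows, labels)
prediction = model.predict(("deprived", "deprived"))
```

Because each feature contributes a separate additive term to the log posterior, the per-feature terms in `log_posteriors` can be inspected directly, which is one reason Naive Bayes models are often regarded as explainable.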