Affiliations: Centro Universitário Campo Limpo Paulista, PMCC, 13231-230 C. Limpo Paulista, SP, Brazil
Corresponding author: M.C. Nicoletti, Centro Universitário Campo Limpo Paulista, PMCC, Rua Guatemala 167, 13231-230 C. Limpo Paulista, SP, Brazil. E-mail: [email protected].
Abstract: Instance-Based Learning (IBL) is a machine learning research area focused on supervised algorithms that use the training set itself as the expression of the learned concept. Training instances are usually described by a vector of attribute values and an associated class. In an instance-based algorithm, generalization takes place during the classification phase, when a class must be assigned to a new instance of unknown class. The attributes that describe instances can be of different types, depending on the values they represent; usually they are either discrete or continuous. A subtype of the discrete type is known as nominal: a nominal attribute usually represents categories, and there is no order among its possible values. This paper proposes and investigates an alternative strategy for dealing with nominal attributes during the classification phase of the well-known instance-based algorithm NN (Nearest Neighbor). The proposed strategy is based on the concept of the typicality of an instance, which can serve as a tiebreaker when the new instance to be classified is equidistant from more than one nearest neighbor. Experiments comparing the proposed strategy with the default random strategy used by conventional NN show that a typicality-based strategy can be a convenient choice for improving accuracy when nominal attributes are among those that describe the data instances.
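The tie-breaking idea summarized in the abstract can be illustrated with a minimal Python sketch. Two points are assumptions made here for illustration only, since the abstract does not fix them: the distance between instances is taken to be the overlap distance (the number of nominal attributes on which they differ), and the typicality of an instance is taken to be the ratio of its mean similarity to instances of its own class over its mean similarity to instances of other classes. The function names and the toy data are hypothetical.

```python
def overlap_distance(a, b):
    """Count the nominal attributes on which two instances differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def typicality(index, instances, labels):
    """Assumed definition: mean intra-class similarity / mean inter-class similarity."""
    target = instances[index]
    n_attrs = len(target)
    intra, inter = [], []
    for j, (other, label) in enumerate(zip(instances, labels)):
        if j == index:
            continue
        similarity = 1 - overlap_distance(target, other) / n_attrs
        (intra if label == labels[index] else inter).append(similarity)
    intra_mean = sum(intra) / len(intra) if intra else 0.0
    inter_mean = sum(inter) / len(inter) if inter else 0.0
    # An instance with no similarity to other classes is treated as maximally typical.
    return intra_mean / inter_mean if inter_mean > 0 else float("inf")


def classify(query, instances, labels):
    """1-NN classification; ties in distance are broken by the highest typicality."""
    distances = [overlap_distance(query, inst) for inst in instances]
    best = min(distances)
    tied = [i for i, d in enumerate(distances) if d == best]
    winner = max(tied, key=lambda i: typicality(i, instances, labels))
    return labels[winner]


# Hypothetical toy data: three nominal attributes (color, shape, size).
train = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("blue", "square", "large"),
    ("red", "square", "large"),
]
classes = ["A", "A", "B", "B"]

# The query is at distance 1 from both train[0] (class A) and train[3] (class B).
print(classify(("red", "square", "small"), train, classes))
```

In this toy example the two tied neighbors belong to different classes, so conventional NN would pick one at random; under the assumed definition, the class A neighbor has the higher typicality (4 versus 4/3) and determines the prediction.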