Improving the performance of an incremental algorithm driven by error margins

del Campo-Ávila, José; Ramos-Jiménez, Gonzalo; Gama, João; Morales-Bueno, Rafael

doi:10.3233/IDA-2008-12305

Issue title: Knowledge Discovery from Data Streams

Guest editors: João Gama^x, Jesus Aguilar-Ruiz^y and Ralf Klinkenberg^z

Article type: Research Article

Authors: del Campo-Ávila, José^{a; *} | Ramos-Jiménez, Gonzalo^a | Gama, João^b | Morales-Bueno, Rafael^a

Affiliations: [a] Department of Languages and Computer Science, Universidad de Málaga, E.T.S.I. Informática, Campus de Teatinos, 29071 Málaga, Spain | [b] Laboratory of Artificial Intelligence and Computer Science and Faculty of Economics, University of Porto, Rua de Ceuta, 118, 6, 4150-190 Porto, Portugal | [x] LIAAD-University of Porto, Porto, Portugal | [y] Polytechnic Pablo de Olavide University, Seville, Spain | [z] University of Dortmund, Dortmund, Germany

Correspondence: [*] Corresponding author. Tel.: +34 95 213 28 63; Fax: +34 95 213 13 97; E-mail: [email protected].

Abstract: Classification is a highly relevant task within the field of data analysis. It is not a trivial task, and different difficulties can arise depending on the nature of the problem. These difficulties can become worse when the datasets are too large or when new information can arrive at any time. Incremental learning is an approach that can be used to deal with the classification task in these cases. It must alleviate, or solve, the problem of limited time and memory resources. One emerging approach uses concentration bounds to ensure that decisions are made only when enough information supports them. IADEM is one of the most recent algorithms that use this approach. The aim of this paper is to improve the performance of this algorithm in several ways: reducing the complexity of the induced models, adding the ability to deal with continuous data, improving the detection of noise, selecting new criteria for evolving the model, and including more powerful prediction techniques, among others. Besides these new properties, the new system, IADEM-2, preserves the ability to achieve performance similar to that of standard learning algorithms independently of dataset size, and it can incorporate new information as the basic algorithm does, using a short time per example.
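
As an illustration of the concentration-bound idea mentioned in the abstract, the following minimal sketch computes the generic one-sided Hoeffding margin and uses it to decide whether an observed gap between two candidates is supported by enough examples. It is not taken from the paper: the function names `hoeffding_epsilon` and `enough_evidence`, the parameters, and the decision rule itself are assumptions chosen for illustration, and the criteria actually used by IADEM or IADEM-2 may differ.

```python
import math


def hoeffding_epsilon(value_range: float, delta: float, n: int) -> float:
    """One-sided Hoeffding margin (generic textbook form, not IADEM-specific).

    For n independent observations of a variable whose values span
    `value_range`, the true mean is at least (observed mean - epsilon)
    with probability >= 1 - delta.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of examples")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def enough_evidence(best: float, second: float,
                    value_range: float, delta: float, n: int) -> bool:
    """Hypothetical decision rule: act only when the observed gap between
    the two best candidates exceeds the Hoeffding margin."""
    return (best - second) > hoeffding_epsilon(value_range, delta, n)


if __name__ == "__main__":
    # The margin shrinks as more examples arrive, so the same observed
    # gap can become sufficient evidence later in the stream.
    for n in (100, 1000, 10000):
        eps = hoeffding_epsilon(1.0, 0.05, n)
        print(n, round(eps, 4), enough_evidence(0.30, 0.25, 1.0, 0.05, n))
```

Because the margin depends only on the number of examples seen, the value range, and the confidence parameter, a check of this kind needs no stored examples, which is consistent with the "no example memory" keyword.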

Keywords: Data mining, incremental learning, no example memory, Chernoff and Hoeffding bounds, decision trees, continuous attributes, functional leaves

Journal: Intelligent Data Analysis, vol. 12, no. 3, pp. 305-318, 2008

Published: 30 May 2008
