Issue title: Ubiquitous Knowledge Discovery
Guest editors: João Gama [x] and Michael May [y]
Article type: Research Article
Authors: Gama, João [a; d; *] | Rodrigues, Pedro Pereira [a; b; c] | Lopes, Luís [b; e]
Affiliations: [a] LIAAD, University of Porto, Porto, Portugal | [b] Faculty of Sciences, University of Porto, Porto, Portugal | [c] Faculty of Medicine, University of Porto, Porto, Portugal | [d] Faculty of Economics, University of Porto, Porto, Portugal | [e] CRACS – INESC, Porto, Portugal | [x] LIAAD, University of Porto, Porto, Portugal | [y] Fraunhofer IAIS, Sankt Augustin, Germany
Correspondence: [*] Corresponding author: João Gama, LIAAD – INESC Porto L.A. Rua de Ceuta, 118-6 andar, 4050-190, Porto, Portugal. E-mail: [email protected].
Abstract: Nowadays, applications produce infinite streams of data distributed across wide sensor networks. In this work we study the problem of continuously maintaining a cluster structure over the data points generated by the entire network. Usual techniques operate by forwarding and concentrating all the data in a central server, processing it as a multivariate stream. In this paper, we propose DGClust, a new distributed algorithm which reduces both the dimensionality and the communication burdens by allowing each local sensor to keep an online discretization of its data stream, which operates with constant update time and (almost) fixed space. Each new data point triggers a cell in this univariate grid, reflecting the current state of the data stream at the local site. Whenever a local site changes its state, it notifies the central server of its new state. This way, at each point in time, the central site has the global multivariate state of the entire network. To avoid monitoring all possible states, whose number is exponential in the number of sensors, the central site keeps a small list of counters of the most frequent global states. Finally, a simple adaptive partitional clustering algorithm is applied to the central points of the frequent states in order to provide an anytime definition of the cluster centers. The approach is evaluated in the context of distributed sensor networks, focusing on three outcomes: loss to real centroids, communication prevention, and processing reduction. The experimental work on synthetic data supports our proposal, showing robustness to a high number of sensors, and the application to real data from physiological sensors demonstrates the aforementioned advantages of the system.
Keywords: Online adaptive clustering, distributed data streams, sensor networks, incremental discretization, frequent items monitoring
DOI: 10.3233/IDA-2010-0453
Journal: Intelligent Data Analysis, vol. 15, no. 1, pp. 3-28, 2011
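
The data flow described in the abstract (local discretization, notification on state change, and a bounded list of counters for frequent global states) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the discretization method or the frequent-items algorithm, so the sketch assumes a fixed equal-width grid at each sensor and a Space-Saving-style counter replacement at the central site. All names (`LocalSite`, `CentralSite`, `run`, `max_counters`) are hypothetical, and the final clustering step over the frequent states is omitted.

```python
class LocalSite:
    """One sensor. Assumption: a fixed equal-width grid over [low, high]
    stands in for the online discretization mentioned in the abstract."""

    def __init__(self, low, high, n_cells):
        self.low, self.high, self.n_cells = low, high, n_cells
        self.state = None  # index of the last triggered cell

    def cell_of(self, x):
        width = (self.high - self.low) / self.n_cells
        idx = int((x - self.low) / width)
        return min(max(idx, 0), self.n_cells - 1)  # clamp out-of-range values

    def observe(self, x):
        """Return the new cell index if the state changed, else None."""
        cell = self.cell_of(x)
        if cell != self.state:
            self.state = cell
            return cell
        return None


class CentralSite:
    """Tracks the global state and a bounded set of frequent-state counters."""

    def __init__(self, n_sensors, max_counters):
        self.global_state = [None] * n_sensors
        self.max_counters = max_counters
        self.counters = {}   # global state (tuple of cells) -> approximate count
        self.messages = 0    # notifications received from local sites

    def notify(self, sensor_id, cell):
        self.global_state[sensor_id] = cell
        self.messages += 1

    def tick(self):
        """Count the current global state once per time step.
        Assumption: Space-Saving-style replacement when the list is full."""
        if None in self.global_state:
            return  # not every sensor has reported yet
        key = tuple(self.global_state)
        if key in self.counters:
            self.counters[key] += 1
        elif len(self.counters) < self.max_counters:
            self.counters[key] = 1
        else:
            victim = min(self.counters, key=self.counters.get)
            self.counters[key] = self.counters.pop(victim) + 1


def run(streams, low, high, n_cells, max_counters):
    """Feed aligned sensor streams through the local sites and the central site."""
    sites = [LocalSite(low, high, n_cells) for _ in streams]
    central = CentralSite(len(streams), max_counters)
    for readings in zip(*streams):
        for i, (site, x) in enumerate(zip(sites, readings)):
            cell = site.observe(x)
            if cell is not None:  # communicate only on state change
                central.notify(i, cell)
        central.tick()
    return central
```

For example, `run([[0.1, 0.2, 0.9, 0.95], [0.5, 0.55, 0.5, 0.1]], 0.0, 1.0, 4, 2)` processes eight readings while sending only four notifications, because a sensor stays silent as long as its readings fall in the same cell.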