Issue title: Special Issue on Web Intelligence, Mining and Semantics
Guest editors: Costin Badica, Mirjana Ivanovic, Yannis Manolopoulos, Riccardo Rosati and Paolo Torroni
Article type: Research Article
Authors: Flisar, Jernej* | Podgorelec, Vili
Affiliations: Koroška cesta 46, Maribor, Maribor, Slovenia. [email protected], [email protected]
Correspondence: [*] Address for correspondence: UM FERI, Koroška cesta 46, Maribor, Maribor, Slovenia
Abstract: With the emergence of social networks and micro-blogs, a huge number of short text documents is generated on a daily basis, and effective tools are needed to organize and classify them. These short text documents have an extremely sparse representation, which is the main cause of poor classification performance. We propose a new approach in which we identify relevant concepts in short text documents using the DBpedia Spotlight framework and enrich the text with information derived from the DBpedia ontology, which reduces the sparseness. We developed six variants of text enrichment methods and tested them on four short text datasets using seven classification algorithms. The obtained results were compared with those of the baseline approach, with each other, and with some state-of-the-art text classification methods. Besides classification performance, we also evaluated the influence of the concept similarity threshold and the size of the training data. The results show that the proposed text enrichment approach significantly improves the classification of short texts and is robust with respect to different input sources, domains, and sizes of available training data. The proposed text enrichment methods proved beneficial for the classification of short text documents, especially when only a small number of documents is available for training.
Keywords: short text classification, DBpedia, ontology, text enrichment
DOI: 10.3233/FI-2020-1905
Journal: Fundamenta Informaticae, vol. 172, no. 3, pp. 261-297, 2020
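The enrichment idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce any of their six variants: it assumes one simple reading in which ontology type labels of confidently detected concepts are appended to the text. The `TOY_ANNOTATIONS` table is a hypothetical in-memory stand-in for the output of a concept annotator such as DBpedia Spotlight, and the similarity scores, the `annotate` and `enrich` helpers, and the default threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Annotation:
    """One detected concept (hypothetical structure, for illustration only)."""
    surface_form: str
    uri: str
    similarity: float        # made-up score; a real annotator would supply it
    types: Tuple[str, ...]   # ontology class labels attached to the concept


# Toy lookup table standing in for a real concept annotator.
TOY_ANNOTATIONS: Dict[str, Annotation] = {
    "apple": Annotation("apple", "http://dbpedia.org/resource/Apple_Inc.",
                        0.92, ("Company", "Organisation")),
    "iphone": Annotation("iphone", "http://dbpedia.org/resource/IPhone",
                         0.88, ("Device",)),
    "jobs": Annotation("jobs", "http://dbpedia.org/resource/Steve_Jobs",
                       0.35, ("Person",)),
}


def annotate(text: str) -> List[Annotation]:
    """Return toy annotations for whitespace-separated tokens found in the table."""
    return [TOY_ANNOTATIONS[t] for t in text.lower().split() if t in TOY_ANNOTATIONS]


def enrich(text: str, threshold: float = 0.5) -> str:
    """Append ontology type labels of concepts whose similarity meets the threshold."""
    extra: List[str] = []
    for ann in annotate(text):
        if ann.similarity >= threshold:
            for label in ann.types:
                if label not in extra:   # keep each label once, in first-seen order
                    extra.append(label)
    return text if not extra else text + " " + " ".join(extra)


if __name__ == "__main__":
    print(enrich("apple unveils new iphone"))
    print(enrich("jobs", threshold=0.3))
```

Appending the labels gives a bag-of-words classifier extra features that are shared across documents mentioning different entities of the same kind, which is one way the sparseness described in the abstract could be reduced. Lowering the threshold admits less certain concepts, the trade-off the abstract refers to when it mentions the concept similarity threshold.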