Neural Network Based Document Clustering Using WordNet Ontologies

Hung, Chihli; Wermter, Stefan

doi:10.3233/HIS-2004-13-402

Article type: Research Article

Authors: Hung, Chihli^a | Wermter, Stefan^b

Affiliations: [a] De Lin Institute of Technology, Taiwan | [b] Centre for Hybrid Intelligent Systems, School of Computing and Technology, University of Sunderland, UK

Abstract: Three novel text vector representation approaches for neural network based document clustering are proposed: the extended significance vector model (ESVM), the hypernym significance vector model (HSVM) and the hybrid vector space model (HyM). ESVM extracts the relationship between words and their preferred classified labels. HSVM exploits a semantic relationship from the WordNet ontology: a more general term, the hypernym, substitutes for terms with similar concepts, and this hypernym relationship supplements the neural model in document clustering. HyM combines a TFxIDF vector with a hypernym significance vector, retaining the advantages and reducing the disadvantages of both unsupervised and supervised vector representation approaches. According to our experiments, the self-organising map (SOM) model based on the HyM text vector representation improves classification accuracy and reduces the average quantization error (AQE) on 10,000 full-text articles.
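As a rough illustration of two ideas named in the abstract, the sketch below shows hypernym substitution and the usual definition of AQE (the mean distance from each input vector to its closest map unit). It is not the authors' implementation: the `TOY_HYPERNYMS` table, the function names and the sample vectors are all assumptions made for this example, and a real system would look hypernyms up in WordNet rather than in a hand-written dictionary.

```python
import math

# Hypothetical toy table; the paper derives hypernyms from the WordNet ontology.
TOY_HYPERNYMS = {
    "dollar": "currency",
    "yen": "currency",
    "wheat": "grain",
    "corn": "grain",
}


def to_hypernyms(tokens):
    """Replace each token with its more general term when one is known."""
    return [TOY_HYPERNYMS.get(token, token) for token in tokens]


def average_quantization_error(vectors, units):
    """Mean Euclidean distance from each input vector to its closest map unit."""
    total = 0.0
    for vector in vectors:
        total += min(math.dist(vector, unit) for unit in units)
    return total / len(vectors)


# Terms with similar concepts collapse onto one shared, more general term.
tokens = to_hypernyms(["dollar", "yen", "wheat", "prices"])

# Two made-up document vectors and two made-up SOM unit weight vectors.
docs = [(0.0, 0.0), (3.0, 4.0)]
units = [(0.0, 0.0), (3.0, 0.0)]
aqe = average_quantization_error(docs, units)
```

A lower AQE means the map units sit closer to the document vectors they represent, which is the sense in which the abstract reports a reduction.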

Keywords: document clustering, neural news classification, WordNet, self-organising map (SOM)

Journal: International Journal of Hybrid Intelligent Systems, vol. 1, no. 3-4, pp. 127-142, 2004

Published: 19 January 2005
