Affiliations: Department of Math and Computer Science, Saint Mary's
University, Halifax, Nova Scotia, Canada, B3H 3C3 | Department of Computer Science and Engineering,
Faculty of Electrical Engineering, Czech Technical University, Karlovo Nam. 13,
121 35 Prague 2, Czech Republic
Abstract: Web usage mining involves the application of data mining techniques
to discover usage patterns from web data. Clustering is one of the important
functions in web usage mining. The likelihood of bad or incomplete web usage
data is higher than in conventional applications. The clusters and
associations in web usage mining do not necessarily have crisp boundaries.
Researchers have studied the possibility of using fuzzy sets in web mining
clustering applications. Recent attempts have adapted the K-means clustering
algorithm as well as genetic algorithms based on rough sets to find interval
sets of clusters. Clustering based on genetic algorithms may not be able to
handle large amounts of data. The K-means algorithm does not lend itself well
to adaptive clustering. This paper proposes an adaptation of Kohonen
self-organizing maps, based on the properties of rough sets, to find
interval sets of clusters. Experiments are used to create interval set
representations of clusters of visitors to three educational web sites. The
proposed approach has wider applications in other areas of web mining as well
as data mining.