Abstract: This paper discusses the applications of rough ensemble classifier [27] in two emerging problems of web mining, the categorization of web services and the topic specific web crawling. Both applications, discussed here, consist of two major steps: (1) split of feature space based on internal tag structure of web services and hypertext to represent in a tensor space model, and (2) combining classifications obtained on different tensor components using rough ensemble classifier. In the first application we have discussed the classification of web services. Two step improvement on the existing classification results of web services has been shown here. In the first step we achieve better classification results over existing, by using tensor space model. In the second step further improvement of the results has been obtained by using Rough set based ensemble classifier. In the second application we have discussed the focused crawling using rough ensemble prediction. Our experiment regarding this application has provided better Harvest rate and better Target recall for focused crawling.
Keywords: Rough ensemble classifier, web service categorization, WSDL tag structure, focused crawling, URL prediction