Abstract: Corporations are extremely sensitive to issues such as brand stewardship and product reputation. Traditional brand image and reputation tracking is limited to news wires and contact centre analysis. However, with the emergence of the web, Consumer Generated Media (CGM), such as blogs, news forums, message boards, and web pages and sites, is rapidly becoming the “voice of the people”. Effectively leveraging this massive amount of readily available CGM content for brand and reputation insights can be extremely valuable to corporations. Yet existing solutions in this space are commonly limited to keyword-search based technologies, which often return an excessive amount of information that users must digest manually. Some text-mining based technologies exist for mining the web, but they often target very narrow problems, such as web page classification, instead of addressing the overall stack of technologies required for web mining. This paper describes COBRA (COrporate Brand and Reputation Analysis), a holistically integrated brand and reputation analysis solution that mines CGM content for insights. COBRA contains a flexible ETL (Extract, Transform, and Load) engine that processes diverse sets of structured and unstructured information, a suite of analytical capabilities that mine CGM content to extract semantic entities and insights from the data, and an alerting mechanism that uses a technique called “orthogonal filtering” to generate accurate brand and reputation alerts by filtering through orthogonal dimensions of information. We use a set of real-world case studies to demonstrate the effectiveness of our overall approach. We show that applying such analytics techniques improves the accuracy of alert identification by a factor of 10 to 200 compared with the keyword-search based techniques that are popular in today's alerting systems.
Keywords: Web mining, text analytics, alerting services, orthogonal filtering