Affiliations: Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom | Hypios Research, 187 rue du Temple, 75003 Paris, France | STIH, University Paris-Sorbonne, 28 rue de Serpente, 75006 Paris, France | E-mail: [email protected]
Abstract: Microblogging platforms such as Twitter now provide web users with an on-demand service to share and consume fragments of information. Such fragments often refer to real-world events (e.g., shows, conferences), and frequently to a particular component of an event (such as a specific talk), providing a bridge between the real and virtual worlds. Tweets allow companies and organisations to quickly gauge feedback about their services, and give event organisers insight into how participants feel about their event. However, the scale of the Web and the sheer number of tweets published every hour make it difficult to identify event-related tweets manually. In this paper we present an automated approach to align tweets with the events they refer to. We aim to provide alignments at the sub-event level of granularity. We test two machine learning techniques: proximity-based clustering and classification using Naive Bayes. We evaluate the performance of our approach using a dataset of tweets collected from the Extended Semantic Web Conference 2010. The best F0.2 scores obtained in our experiments were 0.544 for proximity-based clustering and 0.728 for Naive Bayes.
Keywords: Social Web, Semantic Web, machine learning, Twitter