Affiliations: Institute for Information Systems, Technische Universität Braunschweig, Braunschweig, Germany. E-mail: {silviu,balke}@ifis.cs.tu-bs.de
Note: Corresponding author.
Abstract: The Web has become the primary source of information, offering both structured and unstructured data. A good example is e-commerce, where products are usually described by technical specifications (structured data) and textual user reviews (unstructured data). The two sources complement each other, covering quantifiable as well as perceived aspects of each product. In fact, for most searches users have more or less abstract concepts in mind rather than clear-cut categorical information. In this paper, we develop a novel approach that reveals implicit product features for querying by combining structured product data with natural-language product reviews. Using a self-supervised learning technique, we progressively build a query-aware representation of the product domain under consideration. This representation can then be used effectively for intuitive querying. Extensive experiments on real-world product data confirm the effectiveness of our approach. In particular, our evaluations show vastly improved precision and recall compared with the respective IR techniques.