Literally better: Analyzing and improving the quality of literals

Beek, Wouter; Ilievski, Filip; Debattista, Jeremy; Schlobach, Stefan; Wielemaker, Jan

doi:10.3233/SW-170288

Issue title: Quality management of Semantic Web assets (data, services and systems)

Guest editors: Amrapali Zaveri, Dimitris Kontokostas, Sebastian Hellmann and Jürgen Umbrich

Article type: Research Article

Authors: Beek, Wouter^{a; *} | Ilievski, Filip^a | Debattista, Jeremy^b | Schlobach, Stefan^a | Wielemaker, Jan^a

Affiliations: [a] VU University Amsterdam, The Netherlands | [b] University of Bonn & Fraunhofer IAIS, Germany

Correspondence: [*] Corresponding author.

Abstract: Quality is a complicated and multifarious topic in contemporary Linked Data research. The aspect of literal quality in particular has not yet been rigorously studied. Nevertheless, analyzing and improving the quality of literals is important since literals form a substantial (one in seven statements) and crucial part of the Semantic Web. Specifically, literals allow infinite value spaces to be expressed and they provide the linguistic entry point to the LOD Cloud. We present a toolchain that builds on the LOD Laundromat data cleaning and republishing infrastructure and that allows us to analyze the quality of literals on a very large scale, using a collection of quality criteria we specify in a systematic way. We illustrate the viability of our approach by lifting out two particular aspects in which the current LOD Cloud can be immediately improved by automated means: value canonization and language tagging. Since not all quality aspects can be addressed algorithmically, we also give an overview of other problems that can be used to guide future endeavors in tooling, training, and best practice formulation.

Keywords: Data quality, data observatory, quality assessment, quality improvement, linked data

Journal: Semantic Web, vol. 9, no. 1, pp. 131-150, 2018

Published: 30 November 2017
