The journal Semantic Web – Interoperability, Usability, Applicability is an international and interdisciplinary journal bringing together researchers from various fields that share the vision of, and need for, more effective and meaningful ways to share information across agents and services on the future Internet and elsewhere.
As such, Semantic Web technologies shall support the seamless integration of data, on-the-fly composition and interoperation of Web services, as well as more intuitive search engines. The semantics – or meaning – of information, however, cannot be defined without a context, which makes personalization, trust and provenance core topics for Semantic Web research.
New retrieval paradigms, user interfaces and visualization techniques must unleash the power of the Semantic Web while hiding its complexity from the user. Based on this vision, the journal welcomes contributions ranging from theoretical and foundational research, through methods and tools, to descriptions of concrete ontologies and applications in all areas. Papers that add a social, spatial or temporal dimension to Semantic Web research, as well as application-oriented papers making use of formal semantics, are especially welcome.
The journal is co-published by the Akademische Verlagsgesellschaft AKA.
Authors: Liartis, Jason | Dervakos, Edmund | Menis-Mastromichalakis, Orfeas | Chortaras, Alexandros | Stamou, Giorgos
Article Type: Research Article
Abstract: Deep learning models have achieved impressive performance in various tasks, but they are usually opaque with regard to their complex inner operation, obfuscating the reasons for which they make decisions. This opacity raises ethical and legal concerns regarding the real-life use of such models, especially in critical domains such as medicine, and has led to the emergence of the eXplainable Artificial Intelligence (XAI) field of research, which aims to make the operation of opaque AI systems more comprehensible to humans. The problem of explaining a black-box classifier is often approached by feeding it data and observing its behaviour. In this work, we feed the classifier with data that are part of a knowledge graph, and describe its behaviour with rules that are expressed in the terminology of the knowledge graph, which is understandable by humans. We first theoretically investigate the problem to provide guarantees for the extracted rules, and then investigate the relation of “explanation rules for a specific class” to “semantic queries collecting from the knowledge graph the instances classified by the black-box classifier to this specific class”. Thus, we approach the problem of extracting explanation rules as a semantic query reverse engineering problem. We develop algorithms for solving this inverse problem as a heuristic search in the space of semantic queries, evaluate the proposed algorithms on four simulated use cases, and discuss the results.
(An illustrative sketch of the rule-as-query idea follows this entry.)
Keywords: Explainable AI (XAI), opaque machine learning classifiers, knowledge graphs, description logics, semantic query answering, reverse query answering, post-hoc explainability, explanation rules
DOI: 10.3233/SW-233469
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-42, 2023
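A minimal sketch of the central idea in the entry above: an explanation rule for a class behaves like a semantic query whose answer set coincides with the instances the black box assigns to that class. Everything here (the ex: vocabulary, the toy patients, the candidate rule) is invented for illustration and is not taken from the paper.

```python
# Hypothetical sketch of rule extraction as semantic query reverse
# engineering; terminology (ex:Patient, ex:hasSymptom) is invented.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/")

g = Graph()
g.bind("ex", EX)
# Toy knowledge graph: two patients and their symptoms.
g.add((EX.p1, RDF.type, EX.Patient))
g.add((EX.p1, EX.hasSymptom, EX.Fever))
g.add((EX.p2, RDF.type, EX.Patient))
g.add((EX.p2, EX.hasSymptom, EX.Cough))

# Instances the black-box classifier assigned to the target class.
positives = {EX.p1}

# Candidate explanation rule, expressed as a semantic query:
# "patients with a fever are classified to the target class".
query = """
SELECT ?x WHERE {
    ?x a ex:Patient ;
       ex:hasSymptom ex:Fever .
}
"""
answers = {row.x for row in g.query(query, initNs={"ex": EX})}

# A query whose answer set coincides with the classifier's positives
# is a candidate explanation rule for the class; a heuristic search
# would generate and test many such queries.
print("rule matches classifier output:", answers == positives)
```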
Authors: Adamski, Dariusz Max | Potoniec, Jędrzej
Article Type: Research Article
Abstract: We present a novel approach for learning embeddings of ALC knowledge base concepts. The embeddings reflect the semantics of the concepts in such a way that it is possible to compute an embedding of a complex concept from the embeddings of its parts by using appropriate neural constructors. Embeddings for different knowledge bases are vectors in a shared vector space, shaped in such a way that approximate subsumption checking for arbitrarily complex concepts can be done by the same neural network, called a reasoner head, for all the knowledge bases. To underline this unique property of enabling reasoning directly on embeddings, we call them reason-able embeddings. We report the results of an experimental evaluation showing that the difference in reasoning performance between training a separate reasoner head for each ontology and using a shared reasoner head is negligible.
Keywords: Neural-symbolic integration, deep deductive reasoning, embeddings, transfer learning, deep learning
DOI: 10.3233/SW-233355
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-33, 2023
Authors: Cima, Gianluca | Croce, Federico | Lenzerini, Maurizio
Article Type: Research Article
Abstract: Given two datasets, i.e., two sets of tuples of constants, representing positive and negative examples, logical separability is the reasoning task of finding a formula in a certain target query language that separates them. As already pointed out in previous works, this task turns out to be relevant in several application scenarios such as concept learning and generating referring expressions. Besides, if we think of the input datasets of positive and negative examples as composed of tuples of constants classified, respectively, positively and negatively by a black-box model, then the separating formula can be used to provide global post-hoc explanations of such a model. In this paper, we study the separability task in the context of Ontology-based Data Management (OBDM), in which a domain ontology provides a high-level, logic-based specification of a domain of interest, semantically linked through suitable mapping assertions to the data source layer of an information system. Since a formula that properly separates (proper separation) two input datasets does not always exist, our first contribution is to propose (best) approximations of the proper separation, called (minimally) complete and (maximally) sound separations. We do this by presenting a general framework for separability in OBDM. Then, in a scenario that uses by far the most popular languages for the OBDM paradigm, our second contribution is a comprehensive study of three natural computational problems associated with the framework, namely Verification (check whether a given formula is a proper, complete, or sound separation of two given datasets), Existence (check whether a proper, or best approximated, separation of two given datasets exists at all), and Computation (compute any proper, or any best approximated, separation of two given datasets).
Keywords: Ontology-based Data Management, Separability, Explainable Artificial Intelligence, Semantic Technologies
DOI: 10.3233/SW-233391
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-36, 2023
Authors: Rivas, Ariam | Collarana, Diego | Torrente, Maria | Vidal, Maria-Esther
Article Type: Research Article
Abstract: Neuro-Symbolic Artificial Intelligence (AI) focuses on integrating symbolic and sub-symbolic systems to enhance the performance and explainability of predictive models. Symbolic and sub-symbolic approaches differ fundamentally in how they represent data and make use of data features to reach conclusions. Neuro-symbolic systems have recently received significant attention in the scientific community. However, despite efforts in neural-symbolic integration, symbolic processing can still be better exploited, mainly when these hybrid approaches are defined on top of knowledge graphs. This work is built on the statement that knowledge graphs can naturally represent the convergence between data and their contextual meaning (i.e., knowledge). We propose a hybrid system that resorts to symbolic reasoning, expressed as a deductive database, to augment the contextual meaning of entities in a knowledge graph, thus improving the performance of link prediction implemented using knowledge graph embedding (KGE) models. An entity context is defined as the ego network of the entity in a knowledge graph. Given a link prediction task, the proposed approach deduces new RDF triples in the ego networks of the entities corresponding to the heads and tails of the prediction task on the knowledge graph (KG). Since knowledge graphs may be incomplete and sparse, the facts deduced by the symbolic system not only reduce sparsity but also make explicit meaningful relations among the entities that compose an entity ego network. As a proof of concept, our approach is applied over a KG for lung cancer to predict treatment effectiveness. The empirical results put the deduction power of deductive databases into perspective. They indicate that making deduced relationships explicit in the ego networks empowers all the studied KGE models to generate more accurate links.
Keywords: Neuro-symbolic artificial intelligence, deductive systems, knowledge graph embeddings, drug-drug interactions
DOI: 10.3233/SW-233324
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2023
Authors: Badenes-Olmedo, Carlos | Corcho, Oscar
Article Type: Research Article
Abstract: There are two main limitations in most existing Knowledge Graph Question Answering (KGQA) algorithms. First, the approaches depend heavily on the structure of the KG and cannot be easily adapted to other KGs. Second, the availability and amount of additional domain-specific data in structured or unstructured formats has also proven to be critical in many of these systems. Such dependencies limit the applicability of KGQA systems and make their adoption difficult. A novel algorithm is proposed, MuHeQA, that alleviates both limitations by retrieving the answer from textual content automatically generated from KGs instead of queries over them. This new approach (1) works on one or several KGs simultaneously, (2) does not require training data, which makes it domain-independent, (3) enables the combination of knowledge graphs with unstructured information sources to build the answer, and (4) reduces the dependency on the underlying schema, since it does not navigate through structured content but only reads property values. MuHeQA extracts answers from textual summaries created by combining information related to the question from multiple knowledge bases, be they structured or not. Experiments over Wikidata and DBpedia show that our approach achieves performance comparable to other approaches on single-fact questions while being domain- and KG-independent. Results raise important questions for future work about how the textual content that can be created from knowledge graphs enables answer extraction.
(A hedged sketch of the summarise-then-extract idea follows this entry.)
Keywords: Question answering, natural language processing, knowledge graphs
DOI: 10.3233/SW-233379
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-15, 2023
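A hedged sketch of the general idea described above: read property values from a KG, verbalise them into a textual summary, then run extractive question answering over the summary. The entity, the naive verbalisation template, and the QA model choice are assumptions for illustration, not MuHeQA's actual components.

```python
# Illustrative summarise-then-extract pipeline over DBpedia.
from SPARQLWrapper import SPARQLWrapper, JSON
from transformers import pipeline

# 1. Read property values for an entity (no graph navigation needed).
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
SELECT ?p ?o WHERE {
    <http://dbpedia.org/resource/Berlin> ?p ?o .
    FILTER(isLiteral(?o) && lang(?o) = "en")
} LIMIT 20
""")
sparql.setReturnFormat(JSON)
rows = sparql.query().convert()["results"]["bindings"]

# 2. Naive verbalisation: "property: value" sentences joined into a summary.
summary = ". ".join(
    f"{r['p']['value'].rsplit('/', 1)[-1]}: {r['o']['value']}" for r in rows
)

# 3. An extractive reading-comprehension model pulls the answer span
# from the generated summary (default QA model used as a stand-in).
qa = pipeline("question-answering")
print(qa(question="What is Berlin?", context=summary))
```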
Authors: Stüber, Moritz | Frey, Georg
Article Type: Research Article
Abstract: Modelling and Simulation (M&S) are core tools for designing, analysing and operating today’s industrial systems. They often also represent both a valuable asset and a significant investment. Typically, their use is constrained to a software environment intended to be used by engineers on a single computer. However, the knowledge relevant to a task involving modelling and simulation is in general distributed in nature, even across organizational boundaries, and may be large in volume. Therefore, it is desirable to increase the FAIRness (Findability, Accessibility, Interoperability, and Reuse) of M&S capabilities; to enable their use in loosely coupled systems of systems; and to support their composition and execution by intelligent software agents. In this contribution, the suitability of Semantic Web technologies to achieve these goals is investigated and an open-source proof-of-concept implementation based on the Functional Mock-up Interface (FMI) standard is presented. Specifically, models, model instances, and simulation results are exposed through a hypermedia API, and an implementation of the Pragmatic Proof Algorithm (PPA) is used to successfully demonstrate the API’s use by a generic software agent. The solution shows an increased degree of FAIRness and fully supports its use in loosely coupled systems. The FAIRness could be further improved by providing more “rich” (meta)data.
Keywords: Models and Simulation as a Service, FMI, hypermedia API, Pragmatic Proof Algorithm, FAIR principles
DOI: 10.3233/SW-233359
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-36, 2023
Authors: Flores, Javier | Rabbani, Kashif | Nadal, Sergi | Gómez, Cristina | Romero, Oscar | Jamin, Emmanuel | Dasiopoulou, Stamatia
Article Type: Research Article
Abstract: Virtual data integration is the approach of choice for data wrangling in data-driven decision-making. In this paper, we focus on automating schema integration, which extracts a homogenised representation of the data source schemata and integrates them into a global schema to enable virtual data integration. Schema integration requires a set of well-known constructs: the data source schemata and wrappers, a global integrated schema and the mappings between them. Based on them, virtual data integration systems enable fast and on-demand data exploration via query rewriting. Unfortunately, the generation of such constructs is currently performed in a largely manual manner, hindering its feasibility in real scenarios. This becomes aggravated when dealing with heterogeneous and evolving data sources. To overcome these issues, we propose a fully-fledged semi-automatic and incremental approach grounded on knowledge graphs to generate the required schema integration constructs in four main steps: bootstrapping, schema matching, schema integration, and generation of system-specific constructs. We also present NextiaDI, a tool implementing our approach. Finally, a comprehensive evaluation is presented to scrutinize our approach.
Keywords: Schema integration, bootstrapping, virtual data integration
DOI: 10.3233/SW-233347
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-38, 2023
Authors: Umbrico, Alessandro | Cesta, Amedeo | Orlandini, Andrea
Article Type: Research Article
Abstract: The diffusion of Human-Robot Collaborative cells is prevented by several barriers. Classical control approaches do not yet seem fully suitable for facing the variability conveyed by the presence of human operators beside robots. The capability to represent heterogeneous knowledge and perform abstract reasoning is crucial to enhance the flexibility of control solutions. To this aim, the ontology SOHO (Sharework Ontology for Human-Robot Collaboration) has been specifically designed for representing Human-Robot Collaboration scenarios, following a context-based approach. This work brings several contributions. This paper proposes an extension of SOHO to better characterize behavioral constraints of collaborative tasks. Furthermore, this work shows a knowledge extraction procedure designed to automate the synthesis of Artificial Intelligence plan-based controllers for realizing flexible coordination of human and robot behaviors in collaborative tasks. The generality of the ontological model and the developed representation capabilities, as well as the validity of the synthesized planning domains, are evaluated on a number of realistic industrial scenarios where collaborative robots are actually deployed.
Keywords: Ontology, knowledge representation and reasoning, Human-Robot Collaboration, automated planning and scheduling, Artificial Intelligence
DOI: 10.3233/SW-233394
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-40, 2023
Authors: Bareedu, Yashoda Saisree | Frühwirth, Thomas | Niedermeier, Christoph | Sabou, Marta | Steindl, Gernot | Thuluva, Aparna Saisree | Tsaneva, Stefani | Tufek Ozkaya, Nilay
Article Type: Research Article
Abstract: Industrial standards provide guidelines for data modeling to ensure interoperability between stakeholders of an industry branch (e.g., robotics). Most frequently, such guidelines are provided in an unstructured format (e.g., pdf documents), which hampers the automated validation of information objects (e.g., data models) that rely on such standards in terms of their compliance with the modeling constraints prescribed by the guidelines. This raises the risk of costly interoperability errors induced by the incorrect use of the standards. There is, therefore, an increased interest in automatic semantic validation of information objects based on industrial standards. In this paper we focus on an approach to semantic validation by formally representing the modeling constraints from unstructured documents as explicit, machine-actionable rules (to be then used for semantic validation) and (semi-)automatically extracting such rules from pdf documents. While our approach aims to be generically applicable, we exemplify an adaptation of the approach in the concrete context of the OPC UA industrial standard, given its large-scale adoption among important industrial stakeholders and the OPC UA internal efforts towards semantic validation. We conclude that (i) it is feasible to represent modeling constraints from the standard specifications as rules, which can be organized in a taxonomy and represented using Semantic Web technologies such as OWL and SPARQL; (ii) we could automatically identify modeling constraints in the specification documents by inspecting the tables (P = 87%) and text of these documents (F1 up to 94%); (iii) the translation of the modeling constraints into formal rules could be fully automated when constraints were extracted from tables, and required a human-in-the-loop approach for constraints extracted from text.
Keywords: Semantic validation, information extraction, natural language processing, human-in-the-loop, OPC UA
DOI: 10.3233/SW-233342
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-38, 2023
Authors: Yahya, Muhammad | Ali, Aabid | Mehmood, Qaiser | Yang, Lan | Breslin, John G. | Ali, Muhammad Intizar
Article Type: Research Article
Abstract: Industry 4.0 (I4.0) is a new era in the industrial revolution that emphasizes machine connectivity, automation, and data analytics. The I4.0 pillars, such as autonomous robots, cloud computing, horizontal and vertical system integration, and the industrial internet of things, have increased the performance and efficiency of production lines in the manufacturing industry. Over the past years, efforts have been made to propose semantic models to represent manufacturing domain knowledge; one such model is the Reference Generalized Ontological Model (RGOM, https://w3id.org/rgom). However, its adaptability, like that of other models, is not ensured due to the lack of manufacturing data. In this paper, we aim to develop a benchmark dataset for knowledge graph generation in Industry 4.0 production lines and to show the benefits of using ontologies and semantic annotations of data, to showcase how the I4.0 industry can benefit from KGs and semantic datasets. This work is the result of a collaboration with the production line managers, supervisors, and engineers in the football industry to acquire realistic production line data (https://github.com/MuhammadYahta/ManufacturingProductionLineDataSetGeneration-Football, https://zenodo.org/record/7779522). Knowledge Graphs (KGs) have emerged as a significant technology to store the semantics of domain entities, and have been used in a variety of industries, including banking, the automobile industry, oil and gas, pharmaceuticals and health care, publishing, media, etc. The data is mapped and populated to the RGOM classes and relationships using an automated solution based on the Jena API, producing an I4.0 KG. It contains more than 2.5 million axioms and about 1 million instances. This KG enables us to demonstrate the adaptability and usefulness of the RGOM. Our research helps production line staff to take timely decisions by exploiting the information embedded in the KG. In relation to this, the RGOM’s adaptability is demonstrated with the help of a use case scenario to discover required information such as the current temperature at a particular time, the status of the motor, tools deployed on the machine, etc.
(A hedged Python analogue of the KG-population step follows this entry.)
Keywords: Industry 4.0, production line, Knowledge Graphs, Industry 4.0 Knowledge Graph
DOI: 10.3233/SW-233431
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-19, 2023
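The paper populates RGOM with a Jena-based (Java) pipeline; the sketch below is a Python/rdflib analogue of that mapping step only. The RGOM class and property names used here are placeholders that have not been verified against https://w3id.org/rgom.

```python
# Hypothetical mapping of production-line rows into an RGOM-style KG.
from rdflib import Graph, Literal, Namespace, RDF

RGOM = Namespace("https://w3id.org/rgom#")          # prefix only; terms below are invented
EX = Namespace("http://example.org/plant#")

g = Graph()
g.bind("rgom", RGOM)

# Each sensor-reading row becomes an individual typed with an RGOM-style
# class and linked to its machine, timestamp, and measured value.
rows = [("machine_7", "2023-05-01T10:00:00", "74.2")]  # stand-in for CSV data
for machine, timestamp, temperature in rows:
    reading = EX[f"{machine}_{timestamp}"]
    g.add((reading, RDF.type, RGOM.SensorReading))     # placeholder class
    g.add((reading, RGOM.observedBy, EX[machine]))     # placeholder property
    g.add((reading, RGOM.hasTimestamp, Literal(timestamp)))
    g.add((reading, RGOM.hasValue, Literal(float(temperature))))

print(g.serialize(format="turtle"))
```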
Authors: Teze, Juan Carlos L. | Paredes, Jose Nicolas | Martinez, Maria Vanina | Simari, Gerardo Ignacio
Article Type: Research Article
Abstract: The role of explanations in intelligent systems has in the last few years entered the spotlight as AI-based solutions appear in an ever-growing set of applications. Though data-driven (or machine learning) techniques are often used as examples of how opaque (also called black-box) approaches can lead to problems such as bias and a general lack of explainability and interpretability, in reality these features are difficult to tame in general, even for approaches that are based on tools typically considered to be more amenable, like knowledge-based formalisms. In this paper, we continue a line of research and development towards building tools that facilitate the implementation of explainable and interpretable hybrid intelligent socio-technical systems, focusing on features that users can leverage to build explanations for their queries. In particular, we present the implementation of a recently-proposed application framework (and make available its source code) for developing such systems, and explore user-centered mechanisms for building explanations based both on the kinds of explanations required (such as counterfactual, contextual, etc.) and the inputs used for building them (coming from various sources, such as the knowledge base and lower-level data-driven modules). In order to validate our approach, we develop two use cases, one as a running example for detecting hate speech on social platforms and the other as an extension that also contemplates cyberbullying scenarios.
Keywords: Ontological languages, socio-technical systems, Explainable Artificial Intelligence, hate speech in social platforms
DOI: 10.3233/SW-233297
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-30, 2023
Authors: Faria, Daniel | Santos, Emanuel | Balasubramani, Booma Sowkarthiga | Silva, Marta C. | Couto, Francisco M. | Pesquita, Catia
Article Type: Research Article
Abstract: Ontology matching establishes correspondences between entities of related ontologies, with applications ranging from enabling semantic interoperability to supporting ontology and knowledge graph development. Its demand within the Semantic Web community is on the rise, as the popularity of information systems and artificial intelligence applications supported by knowledge graphs continues to increase. In this article, we showcase AgreementMakerLight (AML), an ontology matching system in continuous development since 2013, with demonstrated performance over nine editions of the Ontology Alignment Evaluation Initiative (OAEI) and a history of real-world applications across a variety of domains. We overview AML’s architecture and algorithms, its user interfaces and functionalities, its performance, and its impact. AML has participated in more OAEI tracks since 2013 than any other matching system, has had a median rank by F-measure between 1 and 2 across all tracks in every year since 2014, and a rank by run time between 3 and 4. Thus, it offers a combination of range, quality and efficiency that few matching systems can rival. Moreover, AML’s impact can be gauged by the 263 (non-self) publications that cite one or more of its papers, among which we count 34 real-world applications.
Keywords: Ontology matching, instance matching, tool
DOI: 10.3233/SW-233304
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-13, 2023
Authors: Brewster, Christopher | Kalatzis, Nikos | Nouwt, Barry | Kruiger, Han | Verhoosel, Jack
Article Type: Research Article
Abstract: The agrifood system faces a great many economic, social and environmental challenges. One of the biggest practical challenges has been to achieve greater data sharing throughout the agrifood system and the supply chain, both to inform other stakeholders about a product and equally to incentivise greater environmental sustainability. In this paper, a data sharing architecture is described built on three principles: (a) reuse of existing semantic standards; (b) integration with legacy systems; and (c) a distributed architecture where stakeholders control access to their own data. The system has been developed based on the requirements of commercial users and is designed to allow queries across a federated network of agrifood stakeholders. The Ploutos semantic model is built on an integration of existing ontologies. The Ploutos architecture is built on a discovery directory and interoperability enablers, which use graph query patterns to traverse the network and collect the requisite data to be shared. The system is exemplified in the context of a pilot involving commercial stakeholders in the processed fruit sector. The data sharing approach is highly extensible, with considerable potential for capturing sustainability related data.
Keywords: Data sharing, supply chain, agrifood, graph pattern, ontology, Farm Management Systems
DOI: 10.3233/SW-233287
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-31, 2023
Authors: Păiș, Vasile | Mitrofan, Maria | Gasan, Carol Luca | Ianov, Alexandru | Ghiță, Corvin | Coneschi, Vlad Silviu | Onuț, Andrei
Article Type: Research Article
Abstract: LegalNERo is a manually annotated corpus for named entity recognition in the Romanian legal domain. It provides gold annotations for organizations, locations, persons, time expressions and legal resources mentioned in legal documents. Furthermore, GeoNames identifiers are provided. The resource is available in multiple formats, including span-based, token-based and RDF. The Linked Open Data version is available for both download and querying using SPARQL.
Keywords: Named entity recognition, linguistic linked data, Romanian language, corpus
DOI: 10.3233/SW-233351
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-14, 2023
Authors: Erenrich, Daniel
Article Type: Research Article
Abstract: Despite its size, Wikidata remains incomplete and inaccurate in many areas. Hundreds of thousands of articles on English Wikipedia have zero or limited meaningful structure on Wikidata. Much work has been done in the literature to partially or fully automate the process of completing knowledge graphs, but little of it has been practically applied to Wikidata. This paper presents two interconnected practical approaches to speeding up the Wikidata completion task. The first is Wwwyzzerdd, a browser extension that allows users to quickly import statements from Wikipedia to Wikidata. Wwwyzzerdd has been used to make over 100,000 edits to Wikidata. The second is Psychiq, a new model for predicting instance and subclass statements based on English Wikipedia articles. Psychiq’s performance and characteristics make it well suited to solving a variety of problems for the Wikidata community. One initial use is integrating the Psychiq model into the Wwwyzzerdd browser extension.
Keywords: Wikidata, Wikipedia, browser extension, knowledge graph completion
DOI: 10.3233/SW-233450
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-14, 2023
Authors: Amaral, Gabriel | Rodrigues, Odinaldo | Simperl, Elena
Article Type: Research Article
Abstract: Knowledge Graphs are repositories of information that gather data from a multitude of domains and sources in the form of semantic triples, serving as a source of structured data for various crucial applications in the modern web landscape, from Wikipedia infoboxes to search engines. Such graphs mainly serve as secondary sources of information and depend on well-documented and verifiable provenance to ensure their trustworthiness and usability. However, their ability to systematically assess and assure the quality of this provenance, most crucially whether it properly supports the graph’s information, relies mainly on manual processes that do not scale with size. ProVe aims at remedying this, consisting of a pipelined approach that automatically verifies whether a Knowledge Graph triple is supported by text extracted from its documented provenance. ProVe is intended to assist information curators and consists of four main steps involving rule-based methods and machine learning models: text extraction, triple verbalisation, sentence selection, and claim verification. ProVe is evaluated on a Wikidata dataset, achieving promising results overall and excellent performance on the binary classification task of detecting support from provenance, with 87.5% accuracy and 82.9% F1-macro on text-rich sources. The evaluation data and scripts used in this paper are available on GitHub and Figshare.
(A skeleton of the four-stage pipeline follows this entry.)
Keywords: Fact verification, data verbalisation, knowledge graphs
DOI: 10.3233/SW-233467
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-34, 2023
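A skeleton of the four stages named in the abstract (text extraction, triple verbalisation, sentence selection, claim verification). The function bodies are illustrative stand-ins, not ProVe's actual rule sets or trained models.

```python
# Illustrative four-stage provenance-verification pipeline.
from dataclasses import dataclass

@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str

def extract_text(reference_url: str) -> list[str]:
    # 1. Fetch the documented provenance page and split it into sentences.
    return ["Douglas Adams was born in Cambridge."]  # placeholder content

def verbalise(triple: Triple) -> str:
    # 2. Turn the KG triple into a natural-language claim.
    return f"{triple.subject} {triple.predicate} {triple.obj}."

def select_sentences(claim: str, sentences: list[str], k: int = 5) -> list[str]:
    # 3. Rank passage sentences by relevance to the claim; a real system
    # would use a trained ranker rather than word overlap.
    overlap = lambda s: len(set(claim.lower().split()) & set(s.lower().split()))
    return sorted(sentences, key=overlap, reverse=True)[:k]

def verify(claim: str, evidence: list[str]) -> str:
    # 4. A claim-verification model would label SUPPORTS / REFUTES /
    # NOT ENOUGH INFO; hardcoded here for illustration.
    return "SUPPORTS"

triple = Triple("Douglas Adams", "place of birth", "Cambridge")
claim = verbalise(triple)
evidence = select_sentences(claim, extract_text("https://example.org/ref"))
print(verify(claim, evidence))
```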
Authors: Yaman, Beyza | Thompson, Kevin | Fahey, Fergus | Brennan, Rob
Article Type: Research Article
Abstract: This work describes the application of semantic web standards to data quality governance of data production pipelines in the architectural, engineering, and construction (AEC) domain for Ordnance Survey Ireland (OSi). It illustrates a new approach to data quality governance based on establishing a unified knowledge graph for data quality measurements across a complex, heterogeneous, quality-centric data production pipeline. It provides the first comprehensive formal mappings between semantic models of data quality dimensions defined by the four International Organization for Standardization (ISO) and World Wide Web Consortium (W3C) data quality standards applied by different tools and stakeholders. It provides an approach to uplift rule-based data quality reports into quality metrics suitable for aggregation and end-to-end analysis. Current industrial practice tends towards stove-piped, vendor-specific and domain-dependent tools to process data quality observations; however, there is a lack of open techniques and methodologies for combining quality measurements derived from different data quality standards to provide end-to-end data quality reporting, root cause analysis or visualisation. This work demonstrated that it is effective to use a knowledge graph and semantic web standards to unify distributed data quality monitoring in an organisation and to present the results in an end-to-end data dashboard in a data-quality-standards-agnostic fashion for the Ordnance Survey Ireland data publishing pipeline.
Keywords: Geospatial Linked Data, data quality, data governance
DOI: 10.3233/SW-233293
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2023
Authors: Zeginis, Dimitris | Kalampokis, Evangelos | Palma, Raul | Atkinson, Rob | Tarabanis, Konstantinos
Article Type: Research Article
Abstract: In the agriculture and livestock farming domains, a large amount of data is produced through numerous heterogeneous sources, including sensor data, weather/climate data, statistical and government data, drone/satellite imagery, video, and maps. This plethora of data can be used in precision agriculture and precision livestock farming to provide predictive insights into farming operations, drive real-time operational decisions, redesign business processes and support policy-making. The predictive power of the data can be further boosted if data from diverse sources are integrated and processed together, thus providing more unexplored insights. However, the exploitation and integration of data used in precision agriculture is not straightforward, since the data: i) cannot be easily discovered across the numerous heterogeneous sources and ii) use different structural and naming conventions, hindering their interoperability. The aim of this paper is to: i) study the characteristics of data used in precision agriculture and livestock farming, ii) study the user requirements related to data modeling and processing from nine real cases in the agriculture, livestock farming and aquaculture domains, and iii) propose a semantic meta-model based on W3C standards (DCAT, PROV-O and the QB vocabulary) that enables the definition of metadata facilitating the discovery, exploration, integration and accessing of data in the domain.
Keywords: Semantic model, metadata, data integration, precision agriculture, precision livestock farming, DCAT
DOI: 10.3233/SW-233156
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-29, 2023
Authors: Thornton, Katherine | Seals-Nutt, Kenneth | Matsuzaki, Mika | Dooley, Damion
Article Type: Research Article
Abstract: We describe our work to integrate the FoodOn ontology with our knowledge base of food composition data, WikiFCD. WikiFCD is a knowledge base of structured data related to food composition and food items. With a goal to reuse FoodOn identifiers for food items, we imported a subset of the FoodOn ontology into the WikiFCD knowledge base. We aligned the import via a shared use of NCBI taxon identifiers for the taxon names of the plants from which the food items are derived. Reusing FoodOn benefits WikiFCD by allowing us to leverage the food item groupings that FoodOn contains. This integration also has potential future benefits for the FoodOn community, because WikiFCD provides food composition data at the food item level, is mapped to Wikidata, and offers a SPARQL endpoint that supports federated queries. Federated queries across WikiFCD and Wikidata allow us to ask questions about food items that benefit from the cross-domain information of Wikidata, greatly increasing the breadth of possible data combinations.
(An illustrative federated query follows this entry.)
Keywords: Food composition data, Wikibase, FoodOn
DOI: 10.3233/SW-233207
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-12, 2023
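The shape of a federated query of the kind the entry above describes, combining local composition facts with cross-domain context from Wikidata via a SPARQL SERVICE clause. The WikiFCD endpoint URL and the wfp: properties are placeholders; only the Wikidata service URL and wdt:P495 ("country of origin") are real.

```python
# Illustrative WikiFCD-to-Wikidata federated query.
from SPARQLWrapper import SPARQLWrapper, JSON

FEDERATED_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX wfp:  <https://wikifcd.example.org/prop/>   # placeholder prefix

SELECT ?food ?protein ?countryLabel WHERE {
  ?food wfp:proteinPer100g ?protein ;             # placeholder property
        wfp:wikidataItem ?item .                  # placeholder mapping
  SERVICE <https://query.wikidata.org/sparql> {
    ?item wdt:P495 ?country .                     # country of origin
    ?country rdfs:label ?countryLabel .
    FILTER(lang(?countryLabel) = "en")
  }
}
"""

endpoint = SPARQLWrapper("https://wikifcd.example.org/sparql")  # placeholder URL
endpoint.setQuery(FEDERATED_QUERY)
endpoint.setReturnFormat(JSON)
# results = endpoint.query().convert()  # would execute against a live endpoint
```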
Authors: Woods, Caitlin | Selway, Matt | Bikaun, Tyler | Stumptner, Markus | Hodkiewicz, Melinda
Article Type: Research Article
Abstract: Maintenance of assets is a multi-million dollar cost each year for asset-intensive organisations in the defence, manufacturing, resource and infrastructure sectors. These costs are tracked through maintenance work order (MWO) records. MWO records contain structured data for dates, costs, and asset identification, and unstructured text describing the work required, for example ‘replace leaking pump’. Our focus in this paper is on data quality for maintenance activity terms in MWO records (e.g. replace, repair, adjust and inspect). We present two contributions in this paper. First, we propose a reference ontology for maintenance activity terms. We use natural language processing to identify seven core maintenance activity terms and their synonyms from 800,000 MWOs. We provide elucidations for these seven terms. Second, we demonstrate use of the reference ontology in an application-level ontology using an industrial use case. The end-to-end NLP-ontology pipeline identifies data quality issues with 55% of the MWO records for a centrifugal pump over 8 years. For the 33% of records where a verb was not provided in the unstructured text, the ontology can infer a relevant activity class. The selection of the maintenance activity terms is informed by the ISO 14224 and ISO 15926-4 standards and conforms to ISO/IEC 21838-2 Basic Formal Ontology (BFO). The reference and application ontologies presented here provide an example of how industrial organisations can augment their maintenance work management processes with ontological workflows to improve data quality.
Keywords: Maintenance work order, ontology, natural language processing, centrifugal pump
DOI: 10.3233/SW-233299
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-34, 2023
Authors: Li, Juan | Chen, Xiangnan | Yu, Hongtao | Chen, Jiaoyan | Zhang, Wen
Article Type: Research Article
Abstract: Knowledge graph reasoning (KGR) aims to infer new knowledge or detect noise, which is essential for improving the quality of knowledge graphs. Recently, various KGR techniques, such as symbolic- and embedding-based methods, have been proposed and have shown strong reasoning ability. Symbolic-based reasoning methods infer missing triples according to predefined rules or ontologies. Although rules and axioms have proven effective, it is difficult to obtain them. Embedding-based reasoning methods represent entities and relations as vectors, and complete KGs via vector computation. However, they mainly rely on structural information and ignore implicit axiom information that is not predefined in KGs but can be reflected in data. That is, each correct triple is also a logically consistent triple and satisfies all axioms. In this paper, we propose a novel Neural Axiom Network (NeuRAN) framework that combines explicit structural and implicit axiom information without introducing additional ontologies. Specifically, the framework consists of a KG embedding module that preserves the semantics of triples and five axiom modules that encode five kinds of implicit axioms. These axioms correspond to five typical object property expression axioms defined in OWL2, namely ObjectPropertyDomain, ObjectPropertyRange, DisjointObjectProperties, IrreflexiveObjectProperty and AsymmetricObjectProperty. The KG embedding module and the axiom modules compute the scores that a triple conforms to the semantics and to the corresponding axioms, respectively. Compared with KG embedding models and CKRL, our method achieves comparable performance on noise detection and triple classification, and significant gains on link prediction. Compared with TransE and TransH, our method improves link prediction performance on the Hits@1 metric by 22.0% and 20.8% on the WN18RR-10% dataset, respectively.
(A toy sketch of the score-combination idea follows this entry.)
Keywords: Knowledge graph reasoning, knowledge graph embedding, noise detection, triple classification, link prediction
DOI: 10.3233/SW-233276
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-16, 2023
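A toy sketch of combining a structural embedding score with axiom-conformance scores. TransE is shown as a simple stand-in for the KG embedding module; in NeuRAN the five axiom modules are learned networks, so the hard-coded irreflexivity check and the additive combination below are assumptions for illustration.

```python
# Structural energy plus axiom penalties over random toy embeddings.
import numpy as np

rng = np.random.default_rng(0)
dim = 50
entity_emb = {e: rng.normal(size=dim) for e in ("alice", "bob", "paris")}
relation_emb = {"born_in": rng.normal(size=dim)}

def transe_energy(h: str, r: str, t: str) -> float:
    # TransE: a correct triple satisfies h + r ≈ t, so lower energy
    # means a more plausible triple.
    return float(np.linalg.norm(entity_emb[h] + relation_emb[r] - entity_emb[t]))

def axiom_penalty(h: str, r: str, t: str) -> float:
    # Stand-in for the five modules (domain, range, disjointness,
    # irreflexivity, asymmetry): only irreflexivity is modelled here,
    # penalising triples of the form (x, born_in, x).
    return 10.0 if h == t else 0.0

def energy(h: str, r: str, t: str) -> float:
    return transe_energy(h, r, t) + axiom_penalty(h, r, t)

print(energy("alice", "born_in", "paris"))   # plausible candidate
print(energy("alice", "born_in", "alice"))   # axiom violation raises energy
```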
Authors: Chari, Shruthi | Seneviratne, Oshani | Ghalwash, Mohamed | Shirai, Sola | Gruen, Daniel M. | Meyer, Pablo | Chakraborty, Prithwish | McGuinness, Deborah L.
Article Type: Research Article
Abstract: In the past decade, trustworthy Artificial Intelligence (AI) has emerged as a focus for the AI community to ensure better adoption of AI models, and explainable AI is a cornerstone in this area. Over the years, the focus has shifted from building transparent AI methods to making recommendations on how to make black-box or opaque machine learning models and their results more understandable by experts and non-expert users. In our previous work, to address the goal of supporting user-centered explanations that make model recommendations more explainable, we developed an Explanation Ontology (EO). The EO is a general-purpose representation that was designed to help system designers connect explanations to their underlying data and knowledge. This paper addresses the apparent need for improved interoperability to support a wider range of use cases. We expand the EO, mainly in the system attributes contributing to explanations, by introducing new classes and properties to support a broader range of state-of-the-art explainer models. We present the expanded ontology model, highlighting the classes and properties that are important to model a larger set of fifteen literature-backed explanation types that are supported within the expanded EO. We build on these explanation type descriptions to show how to utilize the EO model to represent explanations in five use cases spanning the domains of finance, food, and healthcare. We include competency questions that evaluate the EO’s capabilities to provide guidance for system designers on how to apply our ontology to their own use cases. This guidance includes allowing system designers to query the EO directly and providing them exemplar queries to explore content in the EO represented use cases. We have released this significantly expanded version of the Explanation Ontology at https://purl.org/heals/eo and updated our resource website, https://tetherless-world.github.io/explanation-ontology, with supporting documentation. Overall, through the EO model, we aim to help system designers be better informed about explanations and support these explanations that can be composed, given their systems’ outputs from various AI models, including a mix of machine learning, logical and explainer models, and different types of data and knowledge available to their systems.
Keywords: Explainable AI, semantic representation of explanations, Explanation Ontology, modeling explanation types – AI method outputs and knowledge, supporting patterns for explanation types
DOI: 10.3233/SW-233282
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-31, 2023
Authors: Glauer, Martin | Memariani, Adel | Neuhaus, Fabian | Mossakowski, Till | Hastings, Janna
Article Type: Research Article
Abstract: Reference ontologies provide a shared vocabulary and knowledge resource for their domain. Manual construction and annotation enables them to maintain high quality, allowing them to be widely accepted across their community. However, the manual ontology development process does not scale for large domains. We present a new methodology for automatic ontology extension for domains in which the ontology classes have associated graph-structured annotations, and apply it to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We train Transformer-based deep learning models on the leaf node structures from the ChEBI ontology and the classes to which they belong. The models are then able to automatically classify previously unseen chemical structures, resulting in automated ontology extension. The proposed models achieved overall F1 scores of 0.80 and above, an improvement of at least 6 percentage points over our previous results on the same dataset. In addition, the models are interpretable: we illustrate that visualizing the model’s attention weights can help to explain the results by providing insight into how the model made its decisions. We also analyse the performance for molecules that have not been part of the ontology and evaluate the logical correctness of the resulting extension.
(A hedged sketch of the multi-label setup follows this entry.)
Keywords: Ontology extension, ontology learning, chemical ontology, Transformers, automated classification, transfer learning, multi-label classification
DOI: 10.3233/SW-233183
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-22, 2023
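A hedged sketch of the general setup described above: a Transformer classifying a chemical structure (here encoded as a SMILES string) into ontology classes as a multi-label problem, with each predicted class becoming a candidate superclass for the new leaf. The base model, label set, and threshold are placeholders, not the authors' trained models.

```python
# Multi-label chemical classification sketch with Hugging Face Transformers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["chebi:alkaloid", "chebi:steroid", "chebi:flavonoid"]  # toy subset
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # stand-in model
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(labels),
    problem_type="multi_label_classification",
)

smiles = "CN1CCC[C@H]1c1cccnc1"  # nicotine, as a SMILES string
inputs = tokenizer(smiles, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.sigmoid(logits)[0]

# Every class whose probability clears the threshold would become a
# superclass assertion for the new leaf, extending the ontology.
# (This untrained stand-in produces arbitrary probabilities.)
predicted = [label for label, p in zip(labels, probs) if p > 0.5]
print(predicted)
```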
Authors: Daga, Enrico | Groth, Paul
Article Type: Research Article
Abstract: Artificial intelligence systems are not simply built on a single dataset or trained model. Instead, they are made by complex data science workflows involving multiple datasets, models, preparation scripts, and algorithms. Given this complexity, in order to understand these AI systems, we need to provide explanations of their functioning at higher levels of abstraction. To tackle this problem, we focus on the extraction and representation of data journeys from these workflows. A data journey is a multi-layered semantic representation of data processing activity linked to data science code and assets. We propose an ontology to capture the essential elements of a data journey and an approach to extract such data journeys. Using a corpus of Python notebooks from Kaggle, we show that we are able to capture high-level semantic data flow that is more compact than using the code structure itself. Furthermore, we show that introducing an intermediate knowledge graph representation outperforms models that rely only on the code itself. Finally, we report on a user survey to reflect on the challenges and opportunities presented by computational data journeys for explainable AI.
Keywords: Data science analysis, XAI, transparency, explainability, data provenance, workflows
DOI: 10.3233/SW-233407
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2023
Authors: Vega-Gorgojo, Guillermo
Article Type: Research Article
Abstract: LOD4Culture is a web application that exploits Cultural Heritage Linked Open Data for tourism and education purposes. Since target users are not fluent in Semantic Web technologies, the user interface is designed to hide the intricacies of RDF and SPARQL. An interactive map is provided for exploring world-wide Cultural Heritage sites, which can be filtered by type and which uses cluster markers to adapt the view to different zoom levels. LOD4Culture also includes a Cultural Heritage entity browser that builds comprehensive visualizations of sites, artists, and artworks. All data exchanges are facilitated through the use of a generator of REST APIs over Linked Open Data that translates API calls into SPARQL queries across multiple sources, including Wikidata and DBpedia. Since March 2022, more than 1.7K users have employed LOD4Culture. The application has been mentioned many times in social media and has been featured in the DBpedia Newsletter, in the list of Wikidata tools for visualizing data, and in the open data applications list of datos.gob.es.
(A sketch of the REST-to-SPARQL translation idea follows this entry.)
Keywords: Cultural Heritage, Linked Open Data, data access, REST API, map visualizations, user interfaces
DOI: 10.3233/SW-233358
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-30, 2023
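The gist of an API-calls-to-SPARQL translation layer of the kind described above. The route shape, the query template, and the use of DBpedia's public endpoint are illustrative choices, not LOD4Culture's actual API.

```python
# Rewrite a REST-style call like GET /sites/{resource} into a SPARQL query.
from string import Template
from SPARQLWrapper import SPARQLWrapper, JSON

SITE_QUERY = Template("""
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label ?abstract WHERE {
  <$uri> rdfs:label ?label ; dbo:abstract ?abstract .
  FILTER(lang(?label) = "en" && lang(?abstract) = "en")
}
""")

def get_site(resource: str) -> dict:
    """Handle the hypothetical route GET /sites/{resource} by filling the
    SPARQL template and querying DBpedia on the caller's behalf."""
    q = SITE_QUERY.substitute(uri=f"http://dbpedia.org/resource/{resource}")
    endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
    endpoint.setQuery(q)
    endpoint.setReturnFormat(JSON)
    bindings = endpoint.query().convert()["results"]["bindings"]
    if not bindings:
        return {}
    return {var: bindings[0][var]["value"] for var in ("label", "abstract")}

print(get_site("Alhambra"))
```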
Authors: Steenwinckel, Bram | De Turck, Filip | Ongenae, Femke
Article Type: Research Article
Abstract: Semantic rule mining can be used for deriving both task-agnostic and task-specific information within a Knowledge Graph (KG). Underlying logical inferences that summarise the KG, or fully interpretable binary classifiers predicting future events, are common results of such a rule mining process. Current methods for task-agnostic and task-specific semantic rule mining operate, however, on completely different KG representations, making them less suitable for performing both tasks or incorporating each other’s optimizations. This also results in the need to master multiple techniques for both exploring and mining rules within KGs, as well as time and resources lost when converting one KG format into another. In this paper, we use INK, a KG representation based on neighbourhood nodes of interest, to mine rules for improved decision support. By selecting one or two sets of nodes of interest, the rule miner created on top of the INK representation will mine either task-agnostic or task-specific rules. In both subfields, the INK miner is competitive with the current state-of-the-art semantic rule miners on 14 different benchmark datasets within multiple domains.
Keywords: Knowledge representation, semantic rule mining
DOI: 10.3233/SW-233495
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-22, 2023
Authors: Lambrix, Patrick | Armiento, Rickard | Li, Huanyu | Hartig, Olaf | Abd Nikooie Pour, Mina | Li, Ying
Article Type: Research Article
Abstract: In the materials design domain, much of the data from materials calculations is stored in different heterogeneous databases with different data and access models. Therefore, accessing and integrating data from different sources is challenging. As ontology-based access and integration alleviates these issues, in this paper we address data access and interoperability for computational materials databases by developing the Materials Design Ontology. This ontology is inspired and guided by the OPTIMADE effort, which aims to make materials databases interoperable and includes many of the data providers in computational materials science. In this paper, first, we describe the development and the content of the Materials Design Ontology. Then, we use a topic model-based approach to propose additional candidate concepts for the ontology. Finally, we show the use of the Materials Design Ontology by a proof-of-concept implementation of a data access and integration system for materials databases based on the ontology. (This paper extends a paper in The Semantic Web – ISWC 2020 – 19th International Semantic Web Conference, Proceedings, Part II (2020) 212–227, Springer, with results from the ESWC Workshop on Domain Ontologies for Research Data Management in Industry Commons of Materials and Manufacturing (2021) 1–11 and previously unpublished results regarding an application using the ontology.)
Keywords: Ontology, ontology development, data access, data integration, materials science, Materials Design Ontology
DOI: 10.3233/SW-233340
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-35, 2023
Authors: Bushati, Geni | Rasmusen, Sven Carsten | Kurteva, Anelia | Vats, Anurag | Nako, Petraq | Fensel, Anna
Article Type: Research Article
Abstract: The General Data Protection Regulation (GDPR) has imposed strict requirements for data sharing, one of which is informed consent. A common way to request consent online is via cookies. However, users commonly accept online cookies while being unaware of the meaning of the given consent and its implications. Once consent is given, the cookie “disappears”, and one forgets that consent was given in the first place. Retrieving cookies and consent logs becomes challenging, as most information is stored in the specific Internet browser’s logs. To make users aware of the data sharing implied by cookie consent, and to support transparency and traceability within systems, we present a knowledge graph (KG) based tool for personalised cookie consent information visualisation. The KG is based on the OntoCookie ontology, which models cookies in a machine-readable format and supports data interpretability across domains. Evaluation results confirm that users’ comprehension of the data shared through cookies is vague and insufficient. Furthermore, our work has resulted in an increase of 47.5% in users’ willingness to be cautious when viewing cookie banners before giving consent. These and other evaluation results confirm that our cookie data visualisation approach and tool help to increase users’ awareness of cookies and data sharing.
Keywords: Cookies, consent, GDPR, ontology, knowledge graph, data sharing, comprehension
DOI: 10.3233/SW-233435
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-17, 2023
Authors: Giustozzi, Franco | Saunier, Julien | Zanni-Merk, Cecilia
Article Type: Research Article
Abstract: In Industry 4.0, factory assets and machines are equipped with sensors that collect data for effective condition monitoring. This is a difficult task, since it requires the integration and processing of heterogeneous data from different sources, with different temporal resolutions and underlying meanings. Ontologies have emerged as a pertinent method to deal with data integration and to represent manufacturing knowledge in a machine-interpretable way through the construction of semantic models. Ontologies are used to structure knowledge in knowledge bases, which also contain instances and information about these data. Thus, a knowledge base provides a sort of virtual representation of the different elements involved in a manufacturing process. Moreover, the monitoring of industrial processes depends on the dynamic context of their execution. Under these circumstances, the semantic model must provide a way to represent this evolution, in order to capture which situation(s) a resource is in during the execution of its tasks and so support decision making. This paper proposes a semantic framework to address the evolution of knowledge bases for condition monitoring in Industry 4.0. To this end, we first propose a semantic model (the COInd4 ontology) for the manufacturing domain that represents the resources and processes that are part of a factory, with special emphasis on the context of these resources and processes. Relevant situations that combine sensor observations with domain knowledge are also represented in the model. Second, an approach that uses stream reasoning to detect situations that lead to potential failures is introduced. This approach enriches data collected from sensors with contextual information, using the proposed semantic model. The use of stream reasoning facilitates the integration of data from different sources with different temporal resolutions, as well as the processing of these data in real time. This allows high-level situations to be derived from lower-level context and sensor information. Detecting situations can trigger actions to adapt the process behavior; in turn, this change in behavior can lead to the generation of new contexts, leading to new situations. These situations can have different levels of severity and can be nested in different ways. Dealing with the rich relations among situations requires an efficient approach to organize them. Therefore, we propose a method to build a lattice ordering those situations according to the constraints they rely on. This lattice represents a road-map of all the situations, normal or abnormal, that can be reached from a given one. This helps in decision support by allowing the identification of the actions that can be taken to correct an abnormality, thereby avoiding the interruption of the manufacturing processes. Finally, an industrial application scenario for the proposed approach is described.
(A toy illustration of windowed situation detection follows this entry.)
Keywords: Semantic technologies, ontology, context modeling, stream reasoning, condition monitoring, Industry 4.0
DOI: 10.3233/SW-233481
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-29, 2023
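A toy illustration of the situation-detection idea: sensor observations are enriched with static context and a window of recent values is checked against a situation constraint. The COInd4 terms, thresholds, and window size are placeholders; a real deployment would run continuous queries in a stream reasoner over RDF rather than plain Python.

```python
# Windowed situation detection over a simulated sensor stream.
from collections import deque
from typing import Optional

# Static context from the knowledge base: which process the machine
# runs and its temperature limit (invented values).
context = {"machine_7": {"process": "milling", "max_temp": 80.0}}
window: deque = deque(maxlen=5)  # sliding window of recent observations

def observe(machine: str, temperature: float) -> Optional[str]:
    window.append((machine, temperature))
    limit = context[machine]["max_temp"]
    # Situation constraint: three consecutive readings above the limit
    # indicate an "Overheating" situation rather than a transient spike.
    recent = [t for m, t in window if m == machine][-3:]
    if len(recent) == 3 and all(t > limit for t in recent):
        return "Overheating"
    return None

for temp in [78.0, 81.5, 82.0, 83.1]:
    situation = observe("machine_7", temp)
    if situation:
        print(f"situation detected: {situation} during "
              f"{context['machine_7']['process']}")
```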
Authors: Donkers, Alex | de Vries, Bauke | Yang, Dujuan
Article Type: Research Article
Abstract: Occupant feedback enables building managers to improve occupants’ health, comfort, and satisfaction. However, acquiring continuous occupant feedback and integrating this feedback with other building information is challenging. This paper presents a scalable method to acquire continuous occupant feedback and directly integrate it with other building information. Semantic web technologies were applied to solve data interoperability issues. The Occupant Feedback Ontology was developed to describe feedback semantically. In addition, a smartwatch app – Mintal – was developed to acquire continuous feedback on indoor environmental quality. The app gathers location, medical information, and answers to short micro surveys. Mintal applied the Occupant Feedback Ontology to directly integrate the feedback with linked building data. A case study was performed to evaluate this method. A semantic digital twin was created by integrating linked building data, sensor data, and occupant feedback. Results from SPARQL queries gave more insight into an occupant’s perceived comfort levels in the Open Flat. The case study shows how integrating feedback with building information allows for more occupant-centric decision support tools. The approach presented in this paper can be used in a wide range of use cases, both within and beyond the architecture, building, and construction domain.
Keywords: Digital twin, Occupant Feedback Ontology, smartwatch, semantic web, linked building data
DOI: 10.3233/SW-223254
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-26, 2022
Authors: De Giorgis, Stefano | Gangemi, Aldo | Gromann, Dagmar
Article Type: Research Article
Abstract: Commonsense knowledge is a broad and challenging area of research which investigates our understanding of the world as well as human assumptions about reality. Deriving directly from the subjective perception of the external world, it is intrinsically intertwined with embodied cognition. Commonsense reasoning is linked to human sense-making, pattern recognition and knowledge framing abilities. This work presents a new resource that formalizes the cognitive theory of image schemas. Image schemas are dynamic conceptual building blocks originating from our sensorimotor interactions with the physical world, and they enable our sense-making cognitive activity to assign coherence and structure to the entities, events and situations we experience every day. ImageSchemaNet is an ontology that aligns pre-existing resources, such as FrameNet, VerbNet, WordNet and MetaNet from the Framester hub, to image schema theory. This article describes an empirical application of ImageSchemaNet, combined with semantic parsers, to the task of annotating natural language sentences with image schemas.
Keywords: Image schemas, cognitive semantics, frame semantics, commonsense reasoning
DOI: 10.3233/SW-223084
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2022
Authors: Dooley, Damion | Weber, Magalie | Ibanescu, Liliana | Lange, Matthew | Chan, Lauren | Soldatova, Larisa | Yang, Chen | Warren, Robert | Shimizu, Cogan | McGinty, Hande K. | Hsiao, William
Article Type: Research Article
Abstract: People often value the sensual, celebratory, and health aspects of food, but behind this experience exist many other value-laden agricultural production, distribution, manufacturing, and physiological processes that support or undermine a healthy population and a sustainable future. The complexity of such processes is evident both in everyday food preparation from recipes and in industrial food manufacturing, packaging, and storage, each of which depends critically on human or machine agents, chemical or organismal ingredient references, and the explicit instructions and implicit procedures held in formulations or recipes. An integrated ontology landscape does not yet exist to cover all the entities at work in this farm-to-fork journey. It seems necessary to construct such a vision by reusing expert-curated, fit-to-purpose ontology subdomains and their relationship, material, and more abstract organization and role entities. The challenge is to make this merger, by analogy, one language rather than nouns and verbs from a dozen or more dialects, which cannot be used directly in statements about some aspect of the farm-to-fork journey without expensive translation or substantial dialect education. This work focuses on the ontology components – object and data properties and annotations – needed to model food processes, or more general process modelling, within the context of the Open Biological and Biomedical Ontology Foundry and congruent ontologies. Ideally these components can be brought together in a general process ontology that can be specialized not only for the food domain but for carrying out other protocols as well. Many operations involved in food identification, preparation, transportation, and storage – shaking, boiling, mixing, freezing, labeling, shipping – are in fact common to activities ranging from manufacturing and laboratory work to local or home food preparation.
Keywords: Ontology, food processing, recipe, process modelling, OBO Foundry
DOI: 10.3233/SW-223096
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-32, 2022
Authors: Spahiu, Blerina | Palmonari, Matteo | Alva Principe, Renzo Arturo | Rula, Anisa
Article Type: Research Article
Abstract: While there has been a trend in the last decades towards publishing large-scale and highly-interconnected Knowledge Graphs (KGs), their users often get overwhelmed by the task of understanding their content as a result of their size and complexity. Data profiling approaches have been proposed to summarize large KGs into concise and meaningful representations, so that they can be better explored, processed, and managed. Profiles based on schema patterns represent each triple in a KG with its schema-level counterpart, thus covering the entire KG with profiles of considerable size. In this paper, we provide empirical evidence that profiles based on schema patterns, if explored with suitable mechanisms, can help users understand the content of big and complex KGs. ABSTAT provides concise pattern-based profiles and comes with faceted interfaces for profile exploration. Using this tool, we present a user study based on query completion tasks. We demonstrate that users who look at ABSTAT profiles formulate their queries better and faster than users browsing the ontology of the KGs. The latter is a strong baseline, considering that many KGs do not even come with a specific ontology to be explored by the users. To the best of our knowledge, this is the first attempt to investigate the impact of profiling techniques on tasks related to knowledge graph understanding with a user study.
Keywords: Data understanding, data profiling, summarization, RDF, knowledge graph
DOI: 10.3233/SW-223181
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2023
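A rough illustration of the schema-pattern profiles described above: each triple (s, p, o) is abstracted to (type(s), p, type(o)) and the resulting patterns are counted. ABSTAT additionally computes minimal types and richer statistics, which this sketch omits; kg.ttl is a placeholder file name.

from collections import Counter
from rdflib import Graph
from rdflib.namespace import RDF

g = Graph()
g.parse("kg.ttl")  # placeholder: any RDF file

# Map each entity to its asserted types.
types = {}
for s, _, o in g.triples((None, RDF.type, None)):
    types.setdefault(s, set()).add(o)

# Abstract every non-typing triple to its schema-level pattern.
patterns = Counter()
for s, p, o in g:
    if p == RDF.type:
        continue
    for ts in types.get(s, {"?"}):
        for to in types.get(o, {"?"}):
            patterns[(ts, p, to)] += 1

# The most frequent patterns form the core of the profile.
for pattern, freq in patterns.most_common(10):
    print(freq, pattern)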
Authors: Compagno, Francesco | Borgo, Stefano
Article Type: Research Article
Abstract: In both applied ontology and engineering, functionality is a well-researched topic, since it is through teleological causal reasoning that domain experts build mental models of engineering systems, giving birth to functions. These mental models are important throughout the whole lifecycle of any product, being used from the design phase up to diagnosis activities. Though a vast amount of work to model functions has already been carried out, the literature has not settled on a shared and well-defined approach, due to the variety of concepts involved and the modeling tasks that functional descriptions should satisfy. The work in this paper lays the basis for and takes some crucial steps towards a rich ontological description of functions and related concepts, such as behaviour, capability, and capacity. A conceptual analysis of these notions is carried out using the top-level ontology DOLCE as a framework, and the ensuing logical theory is formally described in first-order logic and OWL, showing how ontological concepts can model major aspects of engineering products in applications. In particular, it is shown how functions can be distinguished from the implementation methods that realize them, how one can differentiate between capabilities and capacities of a product, and how these are related to engineering functions.
Keywords: Ontology, function, behaviour, capability, DOLCE
DOI: 10.3233/SW-223188
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-34, 2023
Authors: Portisch, Jan | Hladik, Michael | Paulheim, Heiko
Article Type: Research Article
Abstract: Ontology matching is an integral part of establishing semantic interoperability. One of the main challenges within the ontology matching operation is semantic heterogeneity, i.e. modeling differences between the two ontologies that are to be integrated. The semantics within most ontologies or schemas are, however, typically incomplete, because they are designed within a certain context which is not explicitly modeled. Therefore, external background knowledge plays a major role in the task of (semi-)automated ontology and schema matching. In this survey, we introduce the reader to the general ontology matching problem. We review the background knowledge sources as well as the approaches applied to make use of external knowledge. Our survey covers all ontology matching systems that have been presented within the years 2004–2021 at a well-known ontology matching competition, together with systematically selected publications in the research field. We present a classification system for external background knowledge, concept linking strategies, as well as background knowledge exploitation approaches. We provide extensive examples and classify all ontology matching systems under review in a resource/strategy matrix obtained by coalescing the two classification systems. Lastly, we outline interesting and yet underexplored research directions for applying external knowledge within the ontology matching process.
Keywords: Ontology matching, schema matching, background knowledge, data integration, semantic integration, knowledge graphs, ontologies
DOI: 10.3233/SW-223085
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-55, 2022
Authors: Nguyen, Phuc | Kertkeidkachorn, Natthawut | Ichise, Ryutaro | Takeda, Hideaki
Article Type: Research Article
Abstract: Semantic annotation of tabular data is the process of matching table elements with knowledge graphs. As a result, the table contents can be interpreted or inferred using knowledge graph concepts, enabling them to be useful in downstream applications such as data analytics and management. Nevertheless, semantic annotation tasks are challenging due to insufficient tabular data descriptions, heterogeneous schemas, and vocabulary issues. This paper presents an automatic semantic annotation system for tabular data, called MTab4D, that generates annotations with DBpedia for three annotation tasks: 1) matching table cells to entities, 2) matching columns to entity types, and 3) matching pairs of columns to properties. In particular, we propose an annotation pipeline that combines multiple matching signals from different table elements to address schema heterogeneity, data ambiguity, and noisiness. Additionally, this paper provides insightful analysis and extra resources on benchmarking semantic annotation with knowledge graphs. Experimental results on the original and adapted datasets of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab 2019) show that our system achieves impressive performance on the three annotation tasks. MTab4D’s repository is publicly available at https://github.com/phucty/mtab4dbpedia.
Keywords: Table annotation, knowledge graph, DBpedia, semantic table interpretation
DOI: 10.3233/SW-223098
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2022
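To give a flavor of what combining matching signals means in practice, here is a toy sketch of the first task, cell-to-entity matching: a lexical score between the cell and a candidate's label, boosted when other cells of the row match facts about the candidate. The candidate store and facts are hypothetical; MTab4D's actual pipeline queries DBpedia lookup services and combines many more signals (types, properties, numerical context).

from difflib import SequenceMatcher

# Toy candidate store standing in for DBpedia entity lookup results:
# cell text -> [(entity URI, label, related surface forms)].
candidates = {
    "J. Cash": [
        ("dbr:Johnny_Cash", "Johnny Cash", {"Ring of Fire", "country"}),
        ("dbr:Cash", "Cash", {"currency", "money"}),
    ],
}

def lexical(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def annotate_cell(cell, row_context=()):
    scored = []
    for uri, label, facts in candidates.get(cell, []):
        score = lexical(cell, label)
        # Bonus when other cells of the row match facts about the
        # candidate (a crude stand-in for property-based signals).
        score += 0.3 * sum(
            max((lexical(c, f) for f in facts), default=0) > 0.8
            for c in row_context)
        scored.append((score, uri))
    return max(scored, default=(0.0, None))

print(annotate_cell("J. Cash", row_context=("Ring of Fire",)))
# -> dbr:Johnny_Cash wins once the row context is taken into account.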
Authors: Hamilton, Kyle | Nayak, Aparna | Božić, Bojan | Longo, Luca
Article Type: Research Article
Abstract: Advocates of Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP) would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering the question of whether NeSy is indeed meeting its promises: reasoning, out-of-distribution generalization, interpretability, learning and reasoning from small data, and transferability to new domains. We examine the impact of knowledge representation, such as rules and semantic networks, language structure and relational structure, and whether implicit or explicit reasoning contributes to higher promise scores. We find that systems where logic is compiled into the neural network lead to the most NeSy goals being satisfied, while other factors, such as knowledge representation or type of neural architecture, do not exhibit a clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human-level reasoning, which impact decisions about model architectures and drive conclusions that are not always consistent across studies. Hence we advocate for a more methodical approach to the application of theories of human reasoning as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We make our data and code available on GitHub for further analysis: https://github.com/kyleiwaniec/neuro-symbolic-ai-systematic-review
Keywords: Neuro-symbolic artificial intelligence, natural language processing, deep learning, knowledge representation & reasoning, structured review
DOI: 10.3233/SW-223228
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-42, 2022
Authors: Khan, M. Jaleed | Breslin, John G. | Curry, Edward
Article Type: Research Article
Abstract: Exploring the potential of neuro-symbolic hybrid approaches offers promising avenues for seamless high-level understanding of, and reasoning about, visual scenes. Scene Graph Generation (SGG) is a symbolic image representation approach based on deep neural networks (DNNs) that involves predicting objects, their attributes, and pairwise visual relationships in images to create scene graphs, which are utilized in downstream visual reasoning. The crowdsourced training datasets used in SGG are highly imbalanced, which results in biased SGG results. The vast number of possible triplets makes it challenging to collect sufficient training samples for every visual concept or relationship. To address these challenges, we propose augmenting the typical data-driven SGG approach with common sense knowledge to enhance the expressiveness and autonomy of visual understanding and reasoning. We present a loosely-coupled neuro-symbolic visual understanding and reasoning framework that employs a DNN-based pipeline for object detection and multi-modal pairwise relationship prediction for scene graph generation, and leverages common sense knowledge in heterogeneous knowledge graphs to enrich scene graphs for improved downstream reasoning. A comprehensive evaluation is performed on multiple standard datasets, including Visual Genome and Microsoft COCO, in which the proposed approach outperformed the state-of-the-art SGG methods in terms of relationship recall scores, i.e. Recall@K and mean Recall@K, as well as the state-of-the-art scene graph-based image captioning methods in terms of SPICE and CIDEr scores, with comparable BLEU, ROUGE and METEOR scores. As a result of enrichment, the qualitative results showed improved expressiveness of scene graphs, resulting in more intuitive and meaningful caption generation using scene graphs. Our results validate the effectiveness of enriching scene graphs with common sense knowledge using heterogeneous knowledge graphs. This work provides a baseline for future research in knowledge-enhanced visual understanding and reasoning. The source code is available at https://github.com/jaleedkhan/neusire.
Keywords: Scene graph, image representation, common sense knowledge, knowledge enrichment, visual reasoning, image captioning
DOI: 10.3233/SW-233510
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2023
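The enrichment step described above can be pictured with a toy sketch: for each node in a scene graph triple, commonsense facts about that node are pulled from a knowledge store and appended to the graph. The facts and relation names below are hypothetical, ConceptNet-style stand-ins for the heterogeneous knowledge graphs the paper actually queries.

# Toy commonsense store (ConceptNet-style facts; hypothetical).
commonsense = {
    "dog":  [("CapableOf", "bark"), ("IsA", "animal")],
    "ball": [("UsedFor", "playing")],
}

# A scene graph produced by the DNN-based pipeline (here: one triple).
scene_graph = [("dog", "chasing", "ball")]

def enrich(sg):
    """Append commonsense facts about every node in the scene graph."""
    enriched = list(sg)
    for s, _, o in sg:
        for node in (s, o):
            enriched += [(node, rel, tail)
                         for rel, tail in commonsense.get(node, [])]
    return enriched

for triple in enrich(scene_graph):
    print(triple)
# Downstream captioning now also sees e.g. ('dog', 'CapableOf', 'bark').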
Authors: Ilievski, Filip | Shenoy, Kartik | Chalupsky, Hans | Klein, Nicholas | Szekely, Pedro
Article Type: Research Article
Abstract: Robust estimation of concept similarity is crucial for applications of AI in the commercial, biomedical, and publishing domains, among others. While the related task of word similarity has been extensively studied, resulting in a wide range of methods, estimating concept similarity between nodes in Wikidata has not been considered so far. In light of the adoption of Wikidata for increasingly complex tasks that rely on similarity, and its unique size, breadth, and crowdsourced nature, we propose that conceptual similarity should be revisited for the case of Wikidata. In this paper, we study a wide range of representative similarity methods for Wikidata, organized into three categories, and leverage background information for knowledge injection via retrofitting. We measure the impact of retrofitting with different weighted subsets from Wikidata and ProBase. Experiments on three benchmarks show that the best performance is achieved by pairing language models with rich information, whereas the impact of injecting knowledge is most positive on methods that originally do not consider comprehensive information. The performance of retrofitting is conditioned on the selection of high-quality similarity knowledge. A key limitation of this study, shared with prior work, lies in the limited size and scope of the similarity benchmarks. While Wikidata provides an unprecedented possibility for a representative evaluation of concept similarity, effectively doing so remains a key challenge.
Keywords: Similarity, Wikidata, retrofitting, knowledge graphs, embeddings
DOI: 10.3233/SW-233520
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-20, 2024
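Retrofitting, used above for knowledge injection, iteratively pulls each concept's embedding toward the embeddings of its neighbors in a similarity graph. A minimal sketch in the style of Faruqui et al.'s retrofitting algorithm; the vectors, identifiers, and neighbor graph are hypothetical, and the paper's weighted Wikidata/ProBase subsets are reduced here to a plain adjacency list with uniform weights.

import numpy as np

def retrofit(vecs, neighbors, alpha=1.0, beta=1.0, iters=10):
    """Blend each original vector with the current vectors of its
    neighbors; alpha weights the original, beta each neighbor."""
    new = {w: v.copy() for w, v in vecs.items()}
    for _ in range(iters):
        for w, nbrs in neighbors.items():
            nbrs = [n for n in nbrs if n in new]
            if not nbrs:
                continue
            total = alpha * vecs[w] + beta * sum(new[n] for n in nbrs)
            new[w] = total / (alpha + beta * len(nbrs))
    return new

vecs = {"Q_car":  np.array([1.0, 0.0]),   # hypothetical node IDs
        "Q_auto": np.array([0.0, 1.0])}
similar = {"Q_car": ["Q_auto"], "Q_auto": ["Q_car"]}
print(retrofit(vecs, similar)["Q_car"])   # pulled toward its neighbor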
Authors: Li, Huanyu | Hartig, Olaf | Armiento, Rickard | Lambrix, Patrick
Article Type: Research Article
Abstract: In a GraphQL Web API, a so-called GraphQL schema defines the types of data objects that can be queried, and so-called resolver functions are responsible for fetching the relevant data from underlying data sources. Thus, we can expect to use GraphQL not only for data access but also for data integration, if the GraphQL schema reflects the semantics of data from multiple data sources and the resolver functions can obtain data from these data sources and structure the data according to the schema. However, there does not exist a semantics-aware approach to employing GraphQL for data integration. Furthermore, there are no formal methods for defining a GraphQL API based on an ontology. In this work, we introduce a framework for using GraphQL in which a global domain ontology informs the generation of a GraphQL server that answers requests by querying heterogeneous data sources. The core of this framework consists of an algorithm to generate a GraphQL schema based on an ontology and a generic resolver function based on semantic mappings. We provide a prototype, OBG-gen, of this framework, and we evaluate our approach in a real-world data integration scenario in the materials design domain and in two synthetic benchmark scenarios (Linköping GraphQL Benchmark and GTFS-Madrid-Bench). The experimental results of our evaluation indicate that: (i) our approach can generate GraphQL servers for data access and integration over heterogeneous data sources, thus avoiding manual construction of GraphQL servers, and (ii) our data access and integration approach is general and applicable to different domains where data is shared or queried in different ways.
Keywords: Data integration, ontology, GraphQL
DOI: 10.3233/SW-233550
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-37, 2024
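The schema-generation half of the framework can be sketched as a direct mapping from ontology classes and properties to GraphQL types and fields. The ontology fragment below is a hypothetical, hard-coded stand-in for what OBG-gen derives from an OWL ontology; the materials-science names are illustrative only, and the real algorithm handles much more (interfaces, filter arguments, and the mappings that drive the generic resolver).

# Hypothetical ontology fragment: class -> {property: GraphQL type}.
ontology = {
    "Material": {"name": "String", "bandGap": "Float",
                 "hasCalculation": "[Calculation]"},
    "Calculation": {"method": "String"},
}

def to_graphql_schema(onto):
    lines = []
    for cls, props in onto.items():
        lines.append(f"type {cls} {{")
        lines.append("  id: ID!")
        for prop, gql_type in props.items():
            lines.append(f"  {prop}: {gql_type}")
        lines.append("}")
    # One list-valued query field per class as a generic entry point;
    # a generic resolver would answer it from the semantic mappings.
    lines.append("type Query {")
    for cls in onto:
        lines.append(f"  {cls}List: [{cls}]")
    lines.append("}")
    return "\n".join(lines)

print(to_graphql_schema(ontology))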
Authors: Merkle, Nicole | Mikut, Ralf
Article Type: Research Article
Abstract: Computational agents support humans in many areas of life and are therefore found in heterogeneous contexts. This means that agents operate in rapidly changing environments and can be confronted with huge state and action spaces. In order to perform services and carry out activities satisfactorily, i.e. in a goal-oriented manner, agents require prior knowledge and therefore have to develop and pursue context-dependent policies. The problem here is that prescribing policies in advance is limited and inflexible, especially in dynamically changing environments. Moreover, the context (i.e. the external and internal state) of an agent determines its choice of actions. Since the environments in which agents operate can be stochastic and complex in terms of the number of states and feasible actions, activities are usually modelled in a simplified way by Markov decision processes, so that, for example, agents with reinforcement learning are able to learn policies, i.e. state-action pairs, that help to capture the context and act accordingly to optimally perform activities. However, training policies for all possible contexts using reinforcement learning is time-consuming. A requirement and challenge for agents is to learn strategies quickly and respond immediately in cross-context environments and applications, e.g., the Internet, service robotics, and cyber-physical systems. In this work, we propose a novel simulation-based approach that enables a) the representation of heterogeneous contexts through knowledge graphs and entity embeddings and b) the context-aware composition of policies on demand by ensembles of agents running in parallel. The evaluation we conducted with the “Virtual Home” dataset indicates that agents that need to switch seamlessly between different contexts, e.g. in a home environment, can request on-demand composed policies that lead to the successful completion of context-appropriate activities without having to learn these policies in lengthy training steps and episodes, in contrast to agents that use reinforcement learning. The presented approach enables both context-aware and cross-context applicability of untrained computational agents. Furthermore, the source code of the approach, as well as the generated data, i.e. the trained embeddings and the semantic representation of domestic activities, is open source and openly accessible on GitHub and Figshare.
Keywords: Knowledge graphs, word embeddings, web platform, reinforcement learning, computational agents
DOI: 10.3233/SW-233531
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-29, 2024
Authors: Zhao, Yingshen | Sarkar, Arkopaul | Elmhadhbi, Linda | Karray, Mohamed Hedi | Fillatreau, Philippe | Archimède, Bernard
Article Type: Research Article
Abstract: Thanks to the advent of robotics in shopfloor and warehouse environments, control rooms need to seamlessly exchange information regarding the dynamically changing 3D environment to facilitate task and path planning for the robots. Adding to the complexity, this type of environment is heterogeneous, as it includes both free space and various types of rigid bodies (equipment, materials, humans, etc.). At the same time, 3D environment-related information is also required by virtual applications (e.g., VR techniques) for the behavioral study of CAD-based product models or the simulation of CNC operations. In past research, information models for such heterogeneous 3D environments have often been built without ensuring connection among the different levels of abstraction required for different applications. To address such multiple points of view and modelling requirements for 3D objects and environments, this paper proposes an ontology model that integrates the contextual, topologic, and geometric information of both the rigid bodies and the free space. The ontology provides an evolvable knowledge model that can support simulated task-related information in general. This ontology aims to greatly improve interoperability, as a path planning system (e.g., a robot) will be able to deal with different applications by simply updating the contextual semantics related to the targeted application, while keeping the geometric and topological models intact by leveraging the semantic links among the models.
Keywords: Path planning, joint task and path planning, ontology, simulated task-related knowledge
DOI: 10.3233/SW-233460
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-28, 2023
Authors: Hosseini Beghaeiraveri, Seyed Amir | Labra Gayo, Jose Emilio | Waagmeester, Andra | Ammar, Ammar | Gonzalez, Carolina | Slenter, Denise | Ul-Hasan, Sabah | Willighagen, Egon | McNeill, Fiona | Gray, Alasdair J.G.
Article Type: Research Article
Abstract: Wikidata is a massive Knowledge Graph (KG) including more than 100 million data items and nearly 1.5 billion statements, covering a wide range of topics such as geography, history, scholarly articles, and life science data. The large volume of Wikidata is difficult to handle for research purposes; many researchers cannot afford the costs of hosting 100 GB of data. While Wikidata provides a public SPARQL endpoint, it can only be used for short-running queries. Often, researchers only require a limited range of data from Wikidata, focusing on a particular topic for their use case. Subsetting is the process of defining and extracting the required range of data from the KG; this process has received increasing attention in recent years. Several approaches and specific tools have been developed for subsetting, but they have not yet been evaluated. In this paper, we survey the available subsetting approaches, introduce their general strengths and weaknesses, and evaluate four practical tools specific to Wikidata subsetting – WDSub, KGTK, WDumper, and WDF – in terms of execution performance, extraction accuracy, and flexibility in defining the subsets. Results show that all four tools have a minimum of 99.96% accuracy in extracting defined items and 99.25% in extracting statements. The fastest tool in extraction is WDF, while the most flexible tool is WDSub. During the experiments, multiple subset use cases were defined and the extracted subsets analyzed, yielding valuable information about the variety and quality of Wikidata, which would otherwise not be obtainable through the public Wikidata SPARQL endpoint.
Keywords: Knowledge graph, Wikidata, subsetting, big data, accuracy, performance
DOI: 10.3233/SW-233491
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2023
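At its core, subsetting boils down to streaming the entity-per-line Wikidata JSON dump and keeping the entities that match a definition. The sketch below keeps items whose P31 (instance of) claim points to one chosen class; it illustrates what the evaluated tools do while omitting their subset-definition languages (e.g. ShEx) and performance optimizations. The file name is a placeholder.

import gzip
import json

TARGET = "Q11173"  # e.g. 'chemical compound'; any class of interest

def subset(dump_path):
    """Yield dump entities whose P31 claim points to TARGET."""
    with gzip.open(dump_path, "rt") as f:
        for line in f:
            line = line.rstrip(",\n")
            if not line.startswith("{"):   # skip the enclosing [ ]
                continue
            entity = json.loads(line)
            for claim in entity.get("claims", {}).get("P31", []):
                value = claim["mainsnak"].get("datavalue", {})
                if value.get("value", {}).get("id") == TARGET:
                    yield entity
                    break

# Usage (with a locally downloaded dump):
# for entity in subset("latest-all.json.gz"):
#     print(entity["id"])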
Authors: Dooley, Damion | Andrés-Hernández, Liliana | Bordea, Georgeta | Carmody, Leigh | Cavalieri, Duccio | Chan, Lauren | Castellano-Escuder, Pol | Lachat, Carl | Mougin, Fleur | Vitali, Francesco | Yang, Chen | Weber, Magalie | Kucuk McGinty, Hande | Lange, Matthew
Article Type: Research Article
Abstract: Since its creation in 2016, the FoodOn food ontology has become an interconnected partner in various academic and government projects that span agricultural and public health domains. This paper examines recent data interoperability capabilities arising from food-related ontologies belonging to, or compatible with, the encyclopedic Open Biological and Biomedical Ontology Foundry (OBO) ontology platform, and how research organizations and industry might utilize them for their own projects or for data exchange. Projects are seeking standardized vocabulary across many food supply activities, ranging from agricultural production, harvesting, preparation, food processing, marketing, distribution and consumption to more indirect health, economic, food security and sustainability analysis and reporting tools. Satisfying this demand for controlled vocabulary requires establishing domain-specific ontologies whose curators coordinate closely to produce recommended patterns for food system vocabulary.
Keywords: Ontology, data harmonization, OBO Foundry, food systems, public health, epidemiology, multi-ontology framework, One Health
DOI: 10.3233/SW-233458
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-20, 2024
Authors: Werbrouck, Jeroen | Pauwels, Pieter | Beetz, Jakob | Verborgh, Ruben | Mannens, Erik
Article Type: Research Article
Abstract: In many industries, multiple parties collaborate on a larger project. At the same time, each of those stakeholders participates in multiple independent projects simultaneously. A double patchwork can thus be identified, with a many-to-many relationship between actors and collaborative projects. One key example is the construction industry, where every project is unique, involving specialists for many subdomains, ranging from architectural design through technical installations to geospatial information, governmental regulation and sometimes even historical research. A digital representation of this process and its outcomes requires semantic interoperability between these subdomains, which, however, often work with heterogeneous and unstructured data. In this paper we propose to address this double patchwork via a decentralized ecosystem for multi-stakeholder, multi-industry collaborations dealing with heterogeneous information snippets. At its core, this ecosystem, called ConSolid, builds upon the Solid specifications for Web decentralization, but extends them both on a (meta)data-pattern level and on a microservice level. To increase the robustness of data allocation and filtering, we identify the need to go beyond Solid’s current LDP-inspired interfaces to a Solid Pod and introduce the concept of metadata-generated ‘virtual views’, to be generated using an access-controlled SPARQL interface to a Pod. A recursive, scalable way to discover multi-vault aggregations is proposed, along with data patterns for connecting and aligning heterogeneous (RDF and non-RDF) resources across vaults in a mediatype-agnostic fashion. We demonstrate the use and benefits of the ecosystem using minimal running examples, concluding with the setup of an example use case from the Architecture, Engineering, Construction and Operations (AECO) industry.
Keywords: Solid, DCAT, interdisciplinary collaboration, Common Data Environment, semantic enrichment
DOI: 10.3233/SW-233396
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-32, 2024
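One way to picture the 'virtual views' mentioned above: an access-controlled SPARQL interface to a vault answers a CONSTRUCT query over its metadata, and the resulting graph is served as a project-specific view. The sketch below runs such a query locally with rdflib over a placeholder DCAT catalog; the project IRI is hypothetical, and this is an illustration of the idea rather than ConSolid's actual API.

from rdflib import Graph

# A CONSTRUCT query defining the view: all datasets (and their
# distributions) that the catalog links to one project.
VIEW_QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct:  <http://purl.org/dc/terms/>
CONSTRUCT { ?ds a dcat:Dataset ; dcat:distribution ?dist . }
WHERE {
  ?ds a dcat:Dataset ;
      dct:isPartOf <https://example.org/projects/hospital> ;
      dcat:distribution ?dist .
}
"""

vault = Graph().parse("vault-catalog.ttl")  # placeholder catalog
view = vault.query(VIEW_QUERY).graph        # the materialized view
print(view.serialize(format="turtle"))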
Authors: Usbeck, Ricardo | Yan, Xi | Perevalov, Aleksandr | Jiang, Longquan | Schulz, Julius | Kraft, Angelie | Möller, Cedric | Huang, Junbo | Reineke, Jan | Ngonga Ngomo, Axel-Cyrille | Saleem, Muhammad | Both, Andreas
Article Type: Research Article
Abstract: Knowledge Graph Question Answering (KGQA) has gained attention from both industry and academia over the past decade. Researchers have proposed a substantial number of benchmarking datasets with different properties, pushing the development in this field forward. Many of these benchmarks depend on Freebase, DBpedia, or Wikidata. However, KGQA benchmarks that depend on Freebase and DBpedia are gradually less studied and used, because Freebase is defunct and DBpedia lacks the structural validity of Wikidata. Therefore, research is gravitating toward Wikidata-based benchmarks. That is, new KGQA benchmarks are created on the basis of Wikidata and existing ones are migrated. We present a new, multilingual, complex KGQA benchmarking dataset as the 10th part of the Question Answering over Linked Data (QALD) benchmark series. This corpus formerly depended on DBpedia. Since QALD serves as a base for many machine-generated benchmarks, we increased the size and adjusted the benchmark to Wikidata and its ranking mechanism for properties. These measures foster novel KGQA developments through more demanding benchmarks. Creating a benchmark from scratch or migrating it from DBpedia to Wikidata is non-trivial due to the complexity of the Wikidata knowledge graph, mapping issues between different languages, and the ranking mechanism of properties using qualifiers. We present our creation strategy and the challenges we faced, which will assist other researchers in their future work. Our case study, in the form of a conference challenge, is accompanied by an in-depth analysis of the created benchmark.
Keywords: Knowledge graph question answering, benchmark, challenge, query analysis
DOI: 10.3233/SW-233471
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-15, 2023
Authors: Buil-Aranda, Carlos | Lobo, Jorge | Olmedo, Federico
Article Type: Research Article
Abstract: Differential privacy is a framework that provides formal tools to develop algorithms to access databases and answer statistical queries with quantifiable accuracy and privacy guarantees. The notions of differential privacy are defined independently of the data model and the query language at stake. Most differential privacy results have been obtained on aggregation queries, such as counting or finding maximum or average values, and on grouping queries over aggregations, such as the creation of histograms. So far, the data model used by the framework research has typically been the relational model, with SQL as the query language. However, effective realizations of differential privacy for SQL queries that require joins have been limited. This has imposed severe restrictions on applying differential privacy to RDF knowledge graphs and SPARQL queries: by the very nature of RDF data, most useful queries accessing RDF graphs require intensive use of joins. Recently, new differential privacy techniques have been developed that can be applied to many types of joins in SQL with reasonable results. This opened the question of whether these new results carry over to RDF and SPARQL. In this paper we provide a positive answer to this question by presenting an algorithm that can answer counting queries over a large class of SPARQL queries with a differential privacy guarantee, provided that the RDF graph is accompanied by semantic information about its structure. We have implemented our algorithm and conducted several experiments, showing the feasibility of our approach for large graph databases. Our aim has been to present an approach that can be used as a stepping stone towards extensions and other realizations of differential privacy for SPARQL and RDF.
Keywords: Differential privacy, SPARQL
DOI: 10.3233/SW-233474
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-29, 2023
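The baseline mechanism behind differentially private counting, before any of the paper's join-sensitivity machinery, is the Laplace mechanism: answer the COUNT query exactly, then add noise with scale sensitivity/ε. A minimal sketch with a hypothetical endpoint and class IRI follows; the paper's actual algorithm additionally bounds the sensitivity introduced by joins, which this sketch ignores.

import numpy as np
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://example.org/sparql")  # hypothetical
endpoint.setQuery("""
SELECT (COUNT(?p) AS ?n)
WHERE { ?p a <https://example.org/Patient> }
""")
endpoint.setReturnFormat(JSON)

bindings = endpoint.query().convert()["results"]["bindings"]
true_count = int(bindings[0]["n"]["value"])

# Laplace mechanism: a count over a single triple pattern has
# sensitivity 1 (adding/removing one triple changes it by at most 1).
epsilon, sensitivity = 1.0, 1.0
noisy_count = true_count + np.random.laplace(scale=sensitivity / epsilon)
print(round(noisy_count))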
Authors: Portisch, Jan | Paulheim, Heiko
Article Type: Research Article
Abstract: Knowledge graph embeddings represent a group of machine learning techniques which project entities and relations of a knowledge graph to continuous vector spaces. RDF2vec is a scalable embedding approach rooted in the combination of random walks with a language model. It has been successfully used in various applications. Recently, multiple variants of the RDF2vec approach have been proposed, introducing variations both on the walk generation and on the language modeling side. The combination of those different approaches has led to a growing family of RDF2vec variants. In this paper, we evaluate a total of twelve RDF2vec variants on a comprehensive set of benchmarks, and compare them to seven existing knowledge graph embedding methods from the family of link prediction approaches. Besides the established GEval benchmark, which introduces various downstream machine learning tasks on the DBpedia knowledge graph, we also use the new DLCC (Description Logic Class Constructors) benchmark consisting of two gold standards, one based on DBpedia and one based on synthetically generated graphs. The latter allows for analyzing which ontological patterns in a knowledge graph can actually be learned by different embeddings. With this evaluation, we observe that certain tailored RDF2vec variants can lead to improved performance on different downstream tasks, given the nature of the underlying problem, and that they, in particular, behave differently in modeling similarity and relatedness. The findings can be used to provide guidance in selecting a particular RDF2vec method for a given task.
Keywords: RDF2vec, knowledge graph embedding, representation learning, embedding evaluation
DOI: 10.3233/SW-233514
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-32, 2024
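The RDF2vec recipe summarized above (random walks fed into a language model) fits in a few lines: generate walks over the graph, treat each walk as a sentence, and train word2vec on the corpus. A minimal sketch with rdflib and gensim; the file name and hyperparameters are placeholders, and the evaluated variants modify exactly these two stages (walk generation and language model).

import random
from rdflib import Graph
from gensim.models import Word2Vec

g = Graph().parse("kg.ttl")  # placeholder: any RDF file

def random_walk(graph, start, depth=4):
    """One random walk: alternating entity and predicate labels."""
    walk, node = [str(start)], start
    for _ in range(depth):
        out = list(graph.predicate_objects(node))
        if not out:
            break
        p, o = random.choice(out)
        walk += [str(p), str(o)]
        node = o
    return walk

entities = set(g.subjects())
walks = [random_walk(g, e) for e in entities for _ in range(10)]

# Each walk is a 'sentence'; skip-gram word2vec learns the embeddings.
model = Word2Vec(walks, vector_size=100, window=5, min_count=1, sg=1)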
Authors: Thiéblin, Elodie | Sousa, Guilherme | Haemmerlé, Ollivier | Trojahn, Cássia
Article Type: Research Article
Abstract: Ontology matching aims at making ontologies interoperable. While the field has developed considerably in recent years, most approaches are still limited to the generation of simple correspondences. More expressiveness is, however, required to better address the different kinds of ontology heterogeneities. This paper presents CANARD (Complex Alignment Need and A-box based Relation Discovery), an approach for generating expressive correspondences that relies on the notion of competency questions for alignment (CQAs). A CQA expresses the user's knowledge needs in terms of alignment and aims at reducing the alignment space. The approach takes as input a set of CQAs as SPARQL queries over the source ontology. The generation of correspondences is performed by matching the subgraph from the source CQA to the similar surroundings of the instances from the target ontology. Evaluation is carried out on both synthetic and real-world datasets, and the impact of several parameters of the approach is discussed. Experiments have shown that CANARD performs better overall on CQA coverage than on precision, and that using existing sameAs links between the instances of the source and target ontologies gives better results than exact matches of their labels. The use of CQAs also improved both CQA coverage and precision with respect to using automatically generated queries. Reassessing counter-examples significantly increased precision, to the detriment of runtime. Finally, experiments on large datasets showed that CANARD is one of the few systems that can operate on large knowledge bases, although it depends on regularly populated knowledge bases and on the quality of instance links.
Keywords: Ontology matching, complex alignment, competency question for alignment, user needs
DOI: 10.3233/SW-233521
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-33, 2024
Authors: Serderidis, Konstantinos | Konstantinidis, Ioannis | Meditskos, Georgios | Peristeras, Vassilios | Bassiliades, Nick
Article Type: Research Article
Abstract: A crucial element in implementing Open Governance is the efficient use of the large amounts of open data produced in the public domain. Public administration is a rich source of data and potentially new knowledge. It is a data-intensive sector producing vast amounts of information encoded in government decisions and acts, published nowadays on the World Wide Web. The knowledge shared on the Web is mostly made available via semi-structured documents written in natural language. To exploit this knowledge, technologies such as Natural Language Processing, Information Extraction, Data Mining and the Semantic Web can be used, embedding into documents explicit semantics based on formal knowledge representations such as ontologies. Knowledge representation can be made possible by the deployment of Knowledge Graphs, collections of interlinked representations of entities, events or concepts based on underlying ontologies. This can assist data analysts in achieving a higher level of situational awareness, facilitating automated reasoning towards different objectives, such as knowledge management, data maintenance, transparency and cybersecurity. This paper presents a new ontology, d2kg [d(iavgeia) 2(to) k(nowledge) g(raph)], integrating in a unique way standard EU ontologies, core and controlled vocabularies to enable exploitation of publicly available data from government decisions and acts published on the Greek platform Diavgeia, with the aim of facilitating data sharing, re-usability and interoperability. It demonstrates a characteristic example of a Knowledge Graph-based representation of government decisions and acts, highlighting its added value in responding to real practical use cases for the promotion of transparency, accountability and public awareness. The d2kg ontology in OWL is accessible at http://w3id.org/d2kg and documented at http://w3id.org/d2kg/documentation.
Keywords: Semantic Web, Linked Open Data, ontologies, Knowledge Graphs, government decisions and acts, Diavgeia
DOI: 10.3233/SW-243535
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-23, 2024