The journal Data Science is an interdisciplinary journal that aims to publish novel and effective methods for using scientific data in a principled, well-defined, and reproducible fashion, concrete tools based on these methods, and applications thereof. The ultimate goal is to unleash the power of scientific data to deepen our understanding of physical, biological, and digital systems, to gain insight into human social and economic behavior, and to design new solutions for the future.
The rising importance of scientific data, both big and small, brings with it a wealth of challenges in combining structured but often siloed data with messy, incomplete, and unstructured data from text, audio, and visual content, as well as sensor and weblog data. New methods to extract, transport, pool, refine, store, analyze, and visualize data are needed to unleash their power, while simultaneously making tools and workflows easier for the public at large to use.
The journal invites contributions ranging from theoretical and foundational research to platforms, methods, applications, and tools in all areas. We welcome papers that add a social, geographical, or temporal dimension to Data Science research, as well as application-oriented papers that prepare and use data in discovery research.
This journal focuses on methods, infrastructure, and applications around the following core topics:
- scientific data mining, machine learning, and Big Data analytics
- data management, network analysis, and scientific knowledge discovery
- scholarly communication and (semantic) publishing
- research data publication, indexing, quality, and discovery
- data wrangling, integration, provenance
- trend analysis, prediction, and visualization
- crowdsourcing and collaboration
- corroboration, validation, trust, and reproducibility
- scalable computing, analysis, and learning
- smart and semantic web services, executable workflows
- analytics, intelligence, and real time decision making
- socio-technical systems
- social impacts of data science
Semantic publishing has been defined as anything that enhances the meaning of a published journal article, facilitates its automated discovery, enables its linking to semantically related articles, provides access to data within the article in actionable form, or facilitates integration of data between papers. Towards the goal of genuine semantic publishing, where a work may be published with its content and metadata represented in a machine-interpretable semantic notation, this journal will work with a global set of partners to develop standardized methods to ensure that our publications can be seen as a machine-accessible store of knowledge.
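To make the idea of machine-interpretable article metadata concrete, the following sketch builds a minimal JSON-LD description of a journal article using schema.org vocabulary. This is only an illustration: the article title, author, and DOI are invented, and this is not necessarily the notation the journal and its partners will standardize on.

```python
import json

# A minimal, hypothetical JSON-LD record describing a journal article
# with schema.org vocabulary; all names and identifiers are invented.
article_metadata = {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",
    "name": "Example Article on FAIR Data",
    "author": {"@type": "Person", "name": "Jane Researcher"},
    "isPartOf": {"@type": "Periodical", "name": "Data Science"},
    "identifier": "https://doi.org/10.xxxx/example",
    "keywords": ["FAIR", "semantic publishing"],
}

# Serializing to JSON yields a document that both humans and machines
# (e.g. crawlers and semantic search engines) can interpret.
serialized = json.dumps(article_metadata, indent=2)
print(serialized)
```

Because JSON-LD is plain JSON with a declared `@context`, the same record can be consumed by generic JSON tooling or loaded into an RDF graph for linking with semantically related articles.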
An important goal of the journal is to promote an environment in which annotated data are produced and shared with the wider research community. The development and use of data and metadata standards are critical to achieving this goal. Authors should ensure that any data used or produced in the study are represented with community-based data formats and metadata standards.
Rapid, Open, Transparent, and Attributed Reviews
The Data Science journal relies on an open and transparent review process. Submitted manuscripts are posted on the journal’s website and are publicly available. In addition to solicited reviews from reviewers selected by members of the editorial board, public reviews and comments from any researcher are welcome and can be uploaded via the journal website. All reviews and author responses are posted on the journal homepage, and all participating reviewers and editors are acknowledged in the final published version. While we strongly encourage reviewers to take part in the open and transparent review process, it is still possible to submit anonymous reviews. The names of editors and non-anonymous reviewers are included in all published articles. The journal aims to complete reviews within two to four weeks of submission.
The journal will provide editor and reviewer profiles and metrics (links to ORCID, Google Scholar, etc.).
Abstract: The FAIR principles outline key attributes that make digital resources more Findable, Accessible, Interoperable, and Reusable. Now that the principles are globally endorsed and widely adopted, there is a pressing need to establish an Internet of FAIR Data and Services, to demonstrate how these can be used to generate new insights, and to assess the overall value of FAIR across different sectors. Realizing the value of the FAIR principles will require a combination of scientific, technical, social, legal, and ethical advances for the production, sharing, discovery, assessment, and reuse of data. This special issue highlights research and software that are making FAIR data and services a reality.
Abstract: FAIR data require unique and persistent identifiers. Persistent Uniform Resource Locators (PURLs) are one common solution, introducing a mapping layer from the permanent identifier to a target URL that can change over time. Maintaining a PURL system requires long-term commitment and resources, which can present a challenge for open projects that rely heavily on volunteers and donated resources. When the PURL system used by the Open Biological and Biomedical Ontologies (OBO) community suffered major technical problems in 2015, OBO developers had to migrate quickly to a new system. We describe that migration, the new OBO PURL system that we built, and the key factors behind our design. The OBO PURL system is low-cost and low-maintenance, built on well-established open-source software, customized to the needs of the OBO community, and shows how key FAIR principles can be supported on a tight budget.
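The mapping layer that a PURL service introduces can be sketched in a few lines: a table maps each persistent identifier path to its current target URL, and resolving a PURL returns an HTTP redirect to that target. This is a hypothetical minimal sketch, not the actual OBO PURL implementation; the paths and URLs below are invented for illustration.

```python
# Hypothetical mapping layer behind a PURL service: persistent
# identifier paths point to target URLs that may change over time.
purl_table = {
    "/obo/example.owl": "https://example.org/current/example.owl",
}

def resolve(purl_path: str) -> tuple[int, str]:
    """Return an HTTP (status, Location) pair for a known PURL,
    or a 404 with no location when the identifier is unregistered."""
    target = purl_table.get(purl_path)
    if target is None:
        return 404, ""
    return 302, target

# When the underlying resource moves, only the table entry changes;
# the persistent identifier that people cite stays stable.
purl_table["/obo/example.owl"] = "https://mirror.example.org/example.owl"
print(resolve("/obo/example.owl"))
```

A production PURL system typically expresses the same table as web-server redirect rules, but the design question is identical: keep the cited identifier fixed and update only the mapping.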
Abstract: A growing number of solutions aim to meet the steadily increasing requirements and requests for Linked Data at scale. This plethora of solutions creates a need for objective means of selecting adequate solutions for particular use cases. We hence present Hobbit, a distributed benchmarking platform designed for the unified execution of benchmarks for Linked Data solutions. The Hobbit benchmarking platform is based on the FAIR principles and is the first benchmarking platform able to scale up to real-world scenarios for Big Linked Data solutions. Our online instance of the platform has more than 300 registered users and offers more than 40 benchmarks. It has been used in eleven benchmarking challenges and for more than 13,000 experiments. We give an overview of the results achieved during two of these challenges and point to some of the novel insights gained from the platform’s results. Hobbit is open source and available at http://github.com/hobbit-project .
Abstract: The FAIR Guiding Principles, published in 2016, aim to improve the findability, accessibility, interoperability, and reusability of digital research objects for both humans and machines. Until now, the FAIR principles have mostly been applied to research data. The ideas behind these principles are, however, also directly relevant to research software, so there is a distinct need to explore how the FAIR principles can be applied to software. In this work, we summarize the current status of the debate around FAIR and software, as a basis for the future development of community-agreed principles for FAIR research software. We discuss what makes software different from data with regard to the application of the FAIR principles, and which desired characteristics of research software go beyond FAIR. We then present an analysis of where the existing principles can be applied to software directly, where they need to be adapted or reinterpreted, and where additional principles are required. Interoperability has proven to be the most challenging principle, calling for particular attention in future discussions. Finally, we outline next steps on the way towards definitive FAIR principles for research software.
Keywords: FAIR, research software, software sustainability, reproducible research