One place for all scholarly literature: An Open Science Prize proposal

Openly accessible scholarly literature is referred to as “the fabric and the substance of Open Science” in the present small grant proposal, submitted to the Open Science Prize contest and published in the Research Ideas and Outcomes (RIO) open access journal. However, the scholarly literature is currently quite chaotically dispersed across thousands of different websites and disconnected from its context.

To tackle this issue, authors Marcin Wojnarski, Paperity, Poland, and Debra Hanken Kurtz, DuraSpace, USA, build on the existing prototype Paperity, the first open access aggregator of scholarly journals. They propose the first global, universal catalog of open access scientific literature, which is to bring together all publications by automatically harvesting both “gold” and “green” ones.

Called Paperity Central, it is also to incorporate many helpful functionalities and features, such as a wiki-type one meant to allow registered users to manually improve, curate and extend the catalog in a collaborative, community-controlled way.

“Manual curation will be particularly important for “green” metadata, which frequently contain missing or incorrect information; and for cataloguing those publications that are inaccessible for automatic harvesting, like the articles posted on author homepages only,” further explain the authors.

To improve on its predecessor, the planned catalog is to seamlessly add “green” publications from across repositories to the already available articles indexed from gold and hybrid journals. Paperity Central is to draw its initial “green” content from DSpace, the most popular repository platform worldwide, developed and stewarded by DuraSpace, and powering over 1,500 academic repositories around the world.

All items available from Paperity Central are to be assigned globally unique permanent identifiers, thus reconnecting them to their primary source of origin. Moreover, all the different types of Open Science resources related to a publication, such as author profiles, institutions, funders, grants, datasets, protocols, reviews and cited/citing works, are to be semantically linked in order to ensure that none of them is disconnected from its context.

Furthermore, the catalog is to perform deduplication of each entry in the same systematic and consistent way. Then, these corrections and expansions are to be transferred back to the source repositories in a feedback loop via open application programming interfaces (APIs). However, being developed from scratch, its code will possess many distinct features setting it apart from existing wiki-type platforms such as Wikipedia.

“Every entry will consist of structured data, unlike Wikipedia pages which are basically text documents,” explain the scientists. “The catalog itself will possess internal structure, with every item being assigned to higher-level objects: journals, repositories, collections – unlike Wikipedia, where the corpus is a flat list of articles.”
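The structured entries, the higher-level objects they belong to, and the deduplication step could be modeled roughly as follows. This is only a minimal sketch: the class names, the field names, the `pc:` identifier format and the matching rule (DOI first, then normalized title plus year) are all assumptions made for illustration and are not taken from the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    """One catalog record, held as structured fields rather than free text."""
    identifier: str              # hypothetical catalog-wide permanent identifier
    title: str
    authors: List[str]
    year: int
    doi: Optional[str] = None    # often missing in "green" metadata


@dataclass
class Collection:
    """A higher-level object (journal, repository or collection) that owns entries."""
    name: str
    kind: str                    # "journal", "repository" or "collection"
    entries: List[Entry] = field(default_factory=list)


def dedup_key(entry: Entry) -> Tuple[str, str]:
    """Assumed matching rule: prefer the DOI, else fall back to title and year."""
    if entry.doi:
        return ("doi", entry.doi.strip().lower())
    normalized_title = " ".join(entry.title.lower().split())
    return ("title-year", f"{normalized_title}|{entry.year}")


def deduplicate(entries: List[Entry]) -> List[Entry]:
    """Keep the first entry seen for each key, preserving input order."""
    seen: Dict[Tuple[str, str], Entry] = {}
    for entry in entries:
        seen.setdefault(dedup_key(entry), entry)
    return list(seen.values())


def search(collections: List[Collection], year: int, keyword: str) -> List[Entry]:
    """Return entries from the given year whose title contains the keyword."""
    keyword = keyword.lower()
    return [
        entry
        for collection in collections
        for entry in collection.entries
        if entry.year == year and keyword in entry.title.lower()
    ]
```

Because every record carries typed fields and sits inside a named collection, a query such as `search(collections, 2015, "zika")` can filter on the year and title directly, which would not be possible over unstructured page text.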

In order to guarantee the correctness of the catalog, Paperity Central is to be fully transparent, meaning the history of changes is to be made public. Meanwhile, edits are to be moderated by peers, preferably journal editors or institutional repository admins overseeing the items assigned to their collections.

In their proposal, the authors note that the present development plan is only the first phase of their project. They outline the areas where the catalog is planned to be further enhanced in the future. Among others, these include the involvement of more repositories and platforms, fully developed custom APIs and an expansion of the scholarly output types to be included in the catalog.

“If we are serious about opening up the system of scientific research, we must plant it on the foundation of open literature and make sure that this literature is properly organized and maintained: accessible for all in one central location, easily discoverable, available within its full context, annotated and semantically linked with related objects,” explain the scientists.

“Assume we want to find all articles on Zika published in 2015,” they give as an example. “We can find some of them today using services like Google Scholar or PubMed Central, but how do we know that no other exist? Or that we have not missed any important piece of literature? With the existing tools, which have incomplete and undefined coverage, we do not know and will never know for sure.”

In the spirit of their principles of openness, the authors give assurance that, once funded, Paperity Central will release its code as open source under an open license.

###

Original source:

Wojnarski M, Hanken Kurtz D (2016) Paperity Central: An Open Catalog of All Scholarly Literature. Research Ideas and Outcomes 2: e8462. doi: 10.3897/rio.2.e8462

An Open Science plan: Wikidata for Research

Wikidata is to databases what Wikipedia is to encyclopedias – the free version that anyone can edit. Both aim to share “the sum of all human knowledge” across the world in a multitude of languages, and while Wikidata is younger and has a smaller community, it attracts the collaboration of more than 16,000 volunteer contributors globally each month (up from 14,000 a year ago).

Meanwhile, recent years have witnessed a constantly increasing demand and support for Open Access and Open Science across professional research communities and citizen scientists. Therefore, a Horizon 2020 project plan was put together by a team of six European partners led by the Museum für Naturkunde Berlin to integrate research workflows with Wikidata into a new virtual research environment (VRE) for Open Science, called Wiki4R. The plan combined approaches to make Wikidata useful for researchers both across disciplines and for several specific use cases, e.g. chemistry.

The cross-disciplinary aspects included standard ways for handling scholarly references in Wikidata and for asking questions of Wikidata, whereas the chemical part focused on how to describe Wikidata entries for chemical topics like molecules, solvents or reactions and pathways, how to link this information to scholarly databases and publications, and how to ask chemical questions of Wikidata. These technical parts of the proposal were complemented by parts on how to bring Wikidata together with citizen science projects, on what the value proposition of openness is for institutions, and on training activities.

The grant proposal was submitted in January and ultimately rejected, but its drafters believe it contains a range of ideas that may still be worth pursuing. In fact, efforts to handle scholarly references through Wikidata are ongoing, and Wikidata can now be queried for things like a list of countries ordered by the number of their cities with a female mayor.
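A query of the kind mentioned above can be sent to the public Wikidata SPARQL endpoint. The sketch below only builds the request URL and does not contact the service. The query text is one plausible way of phrasing the question, not necessarily the form the proposal's authors had in mind, and the helper name `build_query_url` is hypothetical.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Countries ordered by the number of their cities whose current head of
# government is female. An illustrative phrasing of the question.
FEMALE_MAYOR_QUERY = """
SELECT ?country ?countryLabel (COUNT(DISTINCT ?city) AS ?cities)
WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 .               # instance of (a subclass of) city
  ?city p:P6 ?statement .                         # head-of-government statement
  ?statement ps:P6 ?mayor .
  ?mayor wdt:P21 wd:Q6581072 .                    # sex or gender: female
  FILTER NOT EXISTS { ?statement pq:P582 ?end }   # no end time, so still in office
  ?city wdt:P17 ?country .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?country ?countryLabel
ORDER BY DESC(?cities)
"""


def build_query_url(query: str, endpoint: str = WIKIDATA_SPARQL_ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query.strip(), "format": "json"})
```

The resulting URL can be opened with any HTTP client. The same query text can also be pasted into the web interface of the Wikidata Query Service.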

“The idea of a closer integration between Wikidata and research workflows is not itself rejected, and we believe that it is useful for both the research and Wikimedia communities to continue to explore the opportunities here, to pilot them and to keep talking to funders and other stakeholders about the value that such infrastructure would provide to society, so they can consider making the necessary resources available,” comments Dr. Daniel Mietchen, who spearheaded the effort.

In order to stimulate such activities, the Wiki4R proposal is among the first ones published via the new open-access journal Research Ideas & Outcomes (RIO). The innovative platform accepts submissions of scholarly works from the entire research life-cycle, including research ideas and proposals that are deemed to be valuable to scholarly research and its future.

“Our proposal focuses on the needs of open science and empowering researchers to work together across disciplines in an open environment,” explains Dr. Daniel Mietchen. “The concept of open science is central to this proposal. Open science is highly inclusive, inviting collaboration from professional peers as well as other interested parties, including citizen scientists. It is also open with respect to the process, providing access to research as it unfolds, allowing anyone to engage with it right away.”

###

Original source:

Mietchen D, Hagedorn G, Willighagen E, Rico M, Gómez-Pérez A, Aibar E, Rafes K, Germain C, Dunning A, Pintscher L, Kinzler D (2015) Enabling Open Science: Wikidata for Research (Wiki4R). Research Ideas and Outcomes 1: e7573. doi: 10.3897/rio.1.e7573

Additional Information:

The mission of RIO is to catalyse change in research communication by publishing ideas, proposals and outcomes in order to increase transparency, trust and efficiency of the whole research ecosystem. Its scope encompasses all areas of academic research, including science, technology, the humanities and the social sciences.

The journal harnesses the full value of investment in the academic system by registering, reviewing, publishing and permanently archiving a wider variety of research outputs than those traditionally made public: project proposals, data, methods, workflows, software, project reports and research articles together on a single collaborative platform offering one of the most transparent, open and public peer-review processes.