Openly published Open Science Prize Grant Proposal builds on ContentMine and Hypothes.is to bridge scientists and facts

Public health emergencies such as the currently spreading Zika disease may well be making the case for open access to the available biomedical research and its underlying data. Yet the difficulty of filtering the right information, so that it lands in the hands of the right people, is what keeps professionals from putting adequate measures in place.

Submitted to the Open Science Prize contest, the present grant proposal, prepared jointly by scientists affiliated with Hypothes.is, ContentMine, University of Cambridge, Cottage Labs LLP and Imperial College London, proposes a new scholarly assistant system called amanuens.is, based on the existing ContentMine and Hypothes.is prototypes. Its aim is to combine machines and humans so that mining critically important facts and making them available to the world becomes not only significantly faster, but also less costly. By publishing the proposal in the open access journal Research Ideas and Outcomes (RIO), the scientists, who are also well-known open access and open data proponents, are looking for further support, feedback and collaborations.

While Hypothes.is is a mixture of software and communities, which together annotate the available literature, ContentMine are building an open source pipeline to extract facts from scientific documents, thus making the literature review process cheaper, more rigorous, continuous and transparent. amanuens.is is meant to bring these two systems together.

As a result, Hypothes.is is to display ContentMine facts as annotations on the online document, thereby increasing their visibility. In turn, the large Hypothes.is community, whose users range from devoted and experienced Wikipedia editors to dedicated citizen scientists, would be able to add their own annotations manually, which could then be fed back into the ContentMine facts store.

“Facts are important – but science is performed by people – so ContentMine are partnering with Hypothes.is to bring communities together around facts in the scholarly literature,” sums up Dr Peter Murray-Rust. “Through combining machines and humans in a tight, iterating, loop, amanuens.is will be able to mine critically important facts and make them available to the world.”

In their proposal, the authors give a hypothetical, yet foreseeable, example of a Hypothes.is community centered on research and discussions regarding a bacterium already proven to restrain some mosquitoes from transmitting various viruses, and its potential use against Zika. There, amanuens.is downloads all open access papers on Zika from a multitude of sources within 3 minutes. In a matter of a couple of seconds, a total of 123 files are downloaded. Then, amanuens.is delivers a table of the extracted data, including species, human genes, DNA primers and top word frequencies.
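One column of such a table, the top word frequencies, can be pictured with a minimal sketch. The Python below is not the ContentMine pipeline. It only illustrates, under stated assumptions, how the most frequent words in one paper's text might be counted. The function name `top_word_frequencies`, the stop-word list and the sample sentence are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would likely use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "by", "that"}


def top_word_frequencies(text, n=3):
    """Return the n most frequent non-stop-words in text as (word, count) pairs."""
    # Lower-case the text and keep runs of ASCII letters only.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Illustrative sample text, not taken from any real paper.
sample = "Zika virus is transmitted by mosquitoes. Mosquitoes carry the virus."
print(top_word_frequencies(sample, n=2))
```

In this sketch, "virus" and "mosquitoes" each occur twice in the sample sentence and so come out on top. Extracting species, genes or primers would need dedicated dictionaries or patterns, which are not shown here.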

Within the community, and thanks to the literature made available via ContentMine, the users would be able to collaborate and build on existing research outcomes. As a result, it could take only fifteen minutes and a brief proposal to mobilise the related scholarly resources and test for Zika resistance in mosquitoes infected with the virus.

“Finding facts to finding people took 15 minutes and this is how modern collaborative science should work,” Prof Peter Murray-Rust says about the given example. “The people then create knowledge from the facts. The knowledge creates communities. The communities explore science- and people-based solutions.”

In conclusion, the proposal states that, like the content and software provided by ContentMine and Hypothes.is, the outputs produced by amanuens.is will also be openly available. All of its data and annotations are to be released into the public domain under a CC0 waiver.

###

Original source:

Martone M, Murray-Rust P, Molloy J, Arrow T, MacGillivray M, Kittel C, Kasberger S, Steel G, Oppenheim C, Ranganathan A, Tennant J, Udell J (2016) ContentMine/Hypothes.is Proposal. Research Ideas and Outcomes 2: e8424. doi: 10.3897/rio.2.e8424