While more researchers are adopting open access, open data, open peer review and open projects, some significant barriers are hindering progress.
Twenty years ago the debate surrounding open science focused on access to journals. By 2020 around 25% of all chemistry papers published were open access, and now most of the major publishers of chemistry journals offer some version of open access. But more researchers are starting to realise that other elements of open science are ripe for development. There is a tantalising future where chemists share their data in ways that allow easy reuse, awakening a new era of innovation.
One of the biggest culprits slowing this down is the humble PDF file, often the format for supplementary data submitted to journals. ‘Google and all those internet indexes have trouble reading PDFs and understanding what’s in them,’ says chemist Simon Coles from the University of Southampton. ‘Discoverability of data is really hampered by the fact that this is the way we operate.’ Coles is director of the UK Physical Sciences Data-science Service (PSDS), which is working to create an interconnected lake of data from UK physical science research.
The urgency to do this is in part linked to the recent explosion in machine learning methods. ‘There’s no hope of us even getting out of the starting blocks with all these fancy new technologies if we don’t have the right data to train the algorithms,’ says Coles. ‘We’re like a Porsche with shopping trolley wheels on – we can’t get anywhere because of the foundations on which we’re building.’