FORCE11 – report from Data in the Scholarly Communications Life Cycle workshop
By Patrycja A Barczynska, on 20 August 2018
This summer saw the second FORCE11 Scholarly Communication Institute (FSCI) held at the University of California, San Diego, which I had an amazing opportunity to attend. Participants of the FORCE11 summer camp selected three courses from an extensive course list; morning classes ran through the whole week, while afternoon ones took place over two days.
In the mornings I attended the Data in the Scholarly Communications Life Cycle workshops. The class was expertly and entertainingly* run by Natasha Simmons, Program Leader, Skills Policy and Resources at the Australian National Data Service (ANDS). The course was structured on the 23 (research data) Things, a self-directed learning programme developed by ANDS and suitable for everyone, regardless of their skills and prior knowledge. The programme is full of resources and fascinating data; have a look yourself here.
We started with an introduction to research data (of course!) and discussed data in the scholarly communications lifecycle – this offers a framework for understanding research processes, and a good (interactive) example is available here. We also talked about data sharing models, and challenges around data sharing.
For Tuesday’s session Natasha invited Stephanie Simms from California Digital Library, who presented an introduction to data management plans and the DMPonline tool. We also heard some open data stories: one from the Australia Telescope National Facility, which makes images of the sky collected at the facility openly available, and one from The PetaJakarta Data Sharing Project, which uses social media (in this case Twitter) to gather information about flooding in Jakarta.
On Wednesday Reid Otsuji from UC San Diego talked about the Open Science Framework and The Carpentries – a way of acquiring new coding and data skills for researchers and librarians. We also talked about making research data FAIR (Findable, Accessible, Interoperable, Reusable), and used a FAIR data assessment tool to look at some openly available data. This provoked a discussion on how easy it is to make research data FAIR (not that easy!) and how institutions could provide the infrastructure and support that are required.
The following morning we had guest speakers from UC Berkeley, Rachael Samberg and Maria Gould, who presented on licensing research data. This extremely interesting talk covered copyright and licensing of data both generated and used by researchers. Later in class we discussed issues around personal and sensitive data. On Friday Gustavo Durand introduced Dataverse – an open source platform developed at Harvard that allows researchers to publish, cite and archive their research data. At the end of the workshop Natasha introduced persistent identifiers and their use in data citation, and we explored different citation styles.
Hands-on exercises throughout the course allowed me to experience working with research data, and to see issues around data management from a researcher’s perspective. Guest speakers provided me with an opportunity to gain expert insight into many aspects of research data management, and the course structure allowed for numerous discussions and debates. This in turn made me reflect on how nuanced managing research data can be, and not only when it comes to copyright and licensing.
*I learnt a lot about Australian wildlife too!