Love Data Week Research Data Case studies: The British Sign Language (BSL) Corpus
By ucyldva, on 12 February 2018
This week is Love Data Week, an international event ‘to raise awareness and build a community to engage on topics related to research data management, sharing, preservation, reuse, and library-based research data services.’ As part of Love Data Week, a number of free events are taking place across Bloomsbury. We will also be publishing short research data case studies throughout the week. These case studies cover a range of disciplines and types of data generated across UCL.
The first case study looks at the British Sign Language (BSL) Corpus.
The British Sign Language (BSL) Corpus
Research Area: Deafness Cognition and Language Research
Dr Kearsy Cormier is a Reader in Sign Language Linguistics at DCAL and is affiliated with the UCL Linguistics research department. Dr Cormier is interested in the linguistic structure of sign languages, especially British Sign Language (BSL), and in visual aspects of language more generally.
About the project
The British Sign Language (BSL) Corpus is a collection of video clips showing Deaf people using BSL, together with background information about the signers and written descriptions of the signing.
The video clips were collected as part of the original BSL Corpus Project, funded between 2008 and 2011 by the Economic and Social Research Council.
The data are all video clips (and associated metadata) showing 249 deaf people from all over the UK using British Sign Language.
During the project, the data were stored on a secure server within the Deafness Cognition and Language Research Centre at UCL.
The data are archived with UCL Digital Collections under the British Sign Language Corpus Project (CAVA Repository). There are two points of entry for working with the BSL Corpus data: one for researchers (via UCL CAVA) and a more user-friendly one for casual visitors (intended for the Deaf community). This approach offers an example of how data can be shared effectively with different audiences and made more accessible to the communities represented in the data.
The data held in the CAVA repository have been used extensively for teaching Master's students.
There were a number of challenges to the project. These include challenges related to data collection and to participants disclosing names in the interview data. Because of these potential disclosures of names, access to the data had to be restricted to registered researchers who sign a user licence before accessing the data.