Chapter 2: Social media for digital humanities and community engagement

Chapter Overview

Beyond studying users and trying to design resources better suited to their needs, digital humanists can also integrate user-generated content into digital resources, using social media and crowdsourcing techniques, as Claire Ross shows in Chapter 2. Social media has attracted millions of users, many of whom have integrated these sites into their daily work practices. Although this is sometimes seen as an ephemeral leisure activity – being on Facebook as a distraction from real work – social media is increasingly attracting the attention of academic researchers, who are intrigued by its affordances and reach. Social networks, blogs, podcasts and crowdsourcing are now central to our work in digital humanities. Because of their ease of use, they offer an opportunity for powerful information sharing, collaboration, participation and community engagement. Yet, we know too little about who is accessing and using social media and crowdsourcing applications and for what purpose, in an academic or cultural heritage context. This chapter discusses the use of social media in digital humanities research, highlights the projects at the heart of UCLDH and stresses the opportunities and challenges of utilizing such techniques, both in an academic context and to enhance community engagement.

Case Study: Transcribe Bentham: crowdsourcing in practice

T. Causer, Faculty of Laws, University College London, J. Tonra, Department of English, University of Virginia, and V. Wallace, Center for History and Economics, Harvard University

The Bentham Project at University College London has harnessed the power of crowdsourcing to facilitate the transcription of the manuscript papers of Jeremy Bentham (1748–1832), the great philosopher and reformer. UCL Library houses 60,000 of Bentham’s manuscripts, the majority of which have never been properly studied. The purpose of the Bentham Project is to produce a new authoritative edition of Bentham’s works – the Collected Works of Jeremy Bentham – based partly on transcripts of these papers. The Project has been active since 1959, and, since then, around 20,000 of Bentham’s papers have been transcribed and 28 volumes of his works have been published. The Project aims to produce around 70 volumes in total, and around 40,000 papers remain untranscribed.

The Bentham Papers Transcription Initiative, or Transcribe Bentham for short, was established in 2010 to quicken the pace of transcription, speed up publication, widen access to Bentham’s papers, raise awareness of Bentham’s ideas and contribute to the long-term preservation of this priceless collection in UCL Library’s digital repository. Transcribe Bentham outsources manuscript transcription – a task originally performed by skilled researchers – to members of the public, who require no special training or background knowledge to log on and participate. Transcribe Bentham was funded by the Arts and Humanities Research Council for one year.

In order to begin the process of crowdsourcing the transcription of Bentham’s manuscripts, two components were vital: high-resolution digital images of the manuscripts, which were photographed by UCL Learning and Media Services, and a tool to allow users to transcribe the text. The transcription tool was developed with simplicity in mind. Users type their submissions into a plain-text box, with the option of adding some basic formatting to their transcriptions. By highlighting a piece of text or a position in the text, and by clicking a button on the transcription toolbar, users are able to identify particular characteristics of the manuscripts. These include spatial and organizational features, such as line breaks, page breaks, headings and paragraphs; linguistic features, like notes, unusual spellings and foreign-language text; compositional features, such as additions and deletions; and interpretive decisions about questionable readings and illegible text. This TEI (Text Encoding Initiative) XML encoding adds a further layer of depth and complexity to the transcripts, helping to render them searchable in a thorough and categorical fashion.
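
To make the encoding described above more concrete, the following minimal Python sketch builds a short TEI-style fragment. The element names (`del`, `add`, `lb`, `unclear`, `foreign`, `gap`) are standard TEI elements for the kinds of features listed; the sample sentence is invented for illustration and is not taken from a Bentham manuscript, and the exact markup produced by the Transcribe Bentham toolbar may differ.

```python
import xml.etree.ElementTree as ET

# Build one paragraph containing a few features a transcriber might mark.
# The text is invented for illustration; it is not a real Bentham transcript.
p = ET.Element("p")
p.text = "The greatest "

deleted = ET.SubElement(p, "del")      # a word struck out in the manuscript
deleted.text = "good"
deleted.tail = " "

added = ET.SubElement(p, "add")        # a word added, e.g. between the lines
added.text = "happiness"
added.tail = " of the"

lb = ET.SubElement(p, "lb")            # a line break in the manuscript (empty element)
lb.tail = "greatest "

unclear = ET.SubElement(p, "unclear")  # a questionable reading
unclear.text = "number"
unclear.tail = ", "

foreign = ET.SubElement(p, "foreign")  # foreign-language text
foreign.text = "mutatis mutandis"
foreign.tail = " "

ET.SubElement(p, "gap")                # illegible text that could not be read

tei_fragment = ET.tostring(p, encoding="unicode")
print(tei_fragment)

# Because features are tagged, they can be searched by category,
# for example by listing every deletion in the fragment.
deletions = [el.text for el in p.iter("del")]
print(deletions)
```

Tagging features in this way is what allows a transcript to be searched by category, for instance retrieving only deleted or only foreign-language text, rather than by plain words alone.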

When a user completes a transcription and submits it for moderation, it is checked for textual accuracy and encoding consistency by a member of the project staff. If the transcript is deemed to be completed to a satisfactory degree, the transcript is locked to prevent further editing (though the formatted transcript remains available to view). If the moderator decides that the submitted transcript is incomplete, and could be improved with further editing from users, it remains available for editing on the Transcription Desk. Completed transcripts are uploaded to the digital collection of Bentham’s papers maintained by UCL Library and are viewable alongside the respective manuscript images; they will also, eventually, form the basis of printed volumes of Bentham’s collected works.
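
The moderation workflow described above can be sketched as a small state model. This is only an illustration under stated assumptions: the names `Status`, `Transcript`, `edit`, `submit` and `moderate` are hypothetical and do not come from the project's software, and the rule that a transcript may still be edited while awaiting moderation is an assumption, since the text does not say either way.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open for editing"
    SUBMITTED = "awaiting moderation"
    LOCKED = "complete and locked"


@dataclass
class Transcript:
    manuscript_id: str          # hypothetical identifier
    text: str = ""
    status: Status = Status.OPEN

    def edit(self, new_text: str) -> None:
        # Locked transcripts remain viewable but can no longer be changed.
        if self.status is Status.LOCKED:
            raise PermissionError("transcript is locked")
        self.text = new_text

    def submit(self) -> None:
        # A volunteer hands the transcript to project staff for checking.
        if self.status is Status.LOCKED:
            raise PermissionError("transcript is locked")
        self.status = Status.SUBMITTED

    def moderate(self, satisfactory: bool) -> None:
        # Staff either lock the transcript or return it for further editing.
        if self.status is not Status.SUBMITTED:
            raise ValueError("only submitted transcripts can be moderated")
        self.status = Status.LOCKED if satisfactory else Status.OPEN


t = Transcript("hypothetical-001")
t.edit("first pass at the text")
t.submit()
t.moderate(satisfactory=False)   # judged incomplete: stays available for editing
print(t.status.value)
t.edit("improved text")
t.submit()
t.moderate(satisfactory=True)    # judged complete: locked against further edits
print(t.status.value)
```

The point of the model is that a transcript has only two outcomes after moderation: it is locked, or it returns to the pool where any volunteer can improve it.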

As manuscript transcription, particularly the transcription of Bentham’s difficult handwriting, is a complex task, the project team aimed to create a cohesive and dedicated community of mutually supportive and loyal transcribers, rather than a crowd of one-time users. The strategy to build a dedicated user community was twofold. First, the team devised a far-reaching publicity campaign to raise awareness of the project and to recruit transcribers; second, the team designed a user-friendly, easily navigable interface, in order to retain users, while facilitating communication between users and staff. The interface which hosts the manuscript images and transcription tool is a customized MediaWiki installation. It not only provides the means of integrating the essential components of the Transcription Desk, but also allows for the inclusion of guidelines for users, project documentation, a discussion forum and social media features that enable interaction and discussion. A reward system and progress bars help to sustain user motivation.

During its six-month testing period, Transcribe Bentham attracted 1207 registered users (excluding administration and project staff and seven blocked spam accounts), who cumulatively transcribed 1009 manuscripts, of which 569 – or 56% – were deemed to be complete and, thus, locked to prevent further editing. Progress has continued since the end of the testing period, and, as of 3 June 2011, 1273 volunteers have registered with the project. In total, 1417 manuscripts have been transcribed, of which 1179 – or 83% – are complete; the proportion of completed transcripts has risen, partly due to the growing experience of volunteers and partly due to project staff working through and signing off on previously incomplete transcripts.

During the six-month testing period, the Transcription Desk received a total of 15,354 visits from 7441 unique visitors, or an average of 84 visits from 41 unique visitors per day. The publication of a feature article in the New York Times (NYT) on 27 December 2010 had a vital and enduring impact upon Transcribe Bentham. It is helpful, therefore, to consider the project’s testing period as having two distinct parts: period one, or the pre-NYT period, covering 8 September 2010 to 26 December 2010 (110 days); and period two, or the post-NYT period, covering 27 December 2010 to 8 March 2011 (72 days). Remarkably, 30% of all visits to the Transcription Desk during the six-month testing period came between 27 December 2010 and 4 January 2011.
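
The period lengths and daily averages quoted above can be checked with a few lines of Python. This is a verification sketch only; the helper name `days_inclusive` is introduced here for illustration, and all figures are those given in the text.

```python
from datetime import date


def days_inclusive(start: date, end: date) -> int:
    """Number of days in a period, counting both the first and last day."""
    return (end - start).days + 1


period_one = days_inclusive(date(2010, 9, 8), date(2010, 12, 26))   # pre-NYT
period_two = days_inclusive(date(2010, 12, 27), date(2011, 3, 8))   # post-NYT
total_days = period_one + period_two

visits = 15_354
unique_visitors = 7_441

print(period_one, period_two, total_days)    # 110 72 182
print(round(visits / total_days))            # 84
print(round(unique_visitors / total_days))   # 41
```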

Over the six-month testing period as a whole, volunteers transcribed an average of 35 manuscripts each week. It is estimated that if this rate were maintained, around 1800 transcripts could be produced by Transcribe Bentham volunteers in 12 months. These figures might seem unremarkable when compared to the results of other crowdsourcing initiatives, such as Galaxy Zoo, which has successfully built up a community of 250,000 users, who have classified over 100 million galaxies. However, transcribing Bentham’s papers is complex and time-consuming. Volunteers are asked to transcribe and encode manuscripts which are usually several hundred – and, occasionally, several thousand – words in length, in which text is frequently at various angles and which can be complicated further by deletions, marginalia, interlinear additions and so on. During the six-month testing period, Transcribe Bentham’s volunteers produced a number of transcripts equal to around 5% of the 20,000 manuscripts transcribed by Bentham Project staff over 50 years; assuming that the average length of a manuscript is 250 words, volunteers transcribed an estimated 250,000 words. As of 3 June 2011, volunteers have – on the same estimation – transcribed about 355,000 words.
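
The estimates in this section follow from simple arithmetic on figures already given in the case study, as the short Python sketch below shows. The 250-word average is the assumption stated in the text; the use of 52 weeks per year for the projection is an assumption made here.

```python
WEEKS_PER_YEAR = 52               # assumption used for the 12-month projection
weekly_rate = 35                  # manuscripts transcribed per week (from the text)
projected_per_year = weekly_rate * WEEKS_PER_YEAR
print(projected_per_year)         # 1820, i.e. "around 1800"

transcribed_test_period = 1009    # manuscripts transcribed in the testing period
complete_test_period = 569
transcribed_june_2011 = 1417      # manuscripts transcribed as of 3 June 2011
complete_june_2011 = 1179

# Proportion of transcripts judged complete at each point.
print(round(100 * complete_test_period / transcribed_test_period))   # 56
print(round(100 * complete_june_2011 / transcribed_june_2011))       # 83

# Volunteer output as a share of the 20,000 manuscripts transcribed by staff.
staff_transcribed = 20_000
print(round(100 * transcribed_test_period / staff_transcribed))      # 5

# Word counts, assuming an average of 250 words per manuscript.
AVERAGE_WORDS = 250
words_test_period = transcribed_test_period * AVERAGE_WORDS
words_june_2011 = transcribed_june_2011 * AVERAGE_WORDS
print(words_test_period, words_june_2011)   # 252250 354250
```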

Transcribe Bentham’s long-term future is secure, and the Transcription Desk will remain available for the foreseeable future. The project will continue, therefore, to have a significant impact in several fields. It has raised awareness of Bentham’s life and thought and produced transcripts which will contribute to the editorial work of the Bentham Project, including publication of printed editions of important Bentham texts and the creation of an invaluable, fully searchable digital collection, freely available on the web. The transcription tool behind the project will be released as a package on an open-source basis for other projects to customize. Transcribe Bentham has also publicized collaborative manuscript transcription widely, garnering a great deal of attention from the media and blogs, and has recently been honoured with an Award of Distinction in the prestigious Prix Ars Electronica.

Case Study: Babble or backchannel: conference tweeting in practice

Claire Ross, Melissa Terras, Claire Warwick, and Anne Welsh, UCL Department of Information Studies

Microblogging, particularly on Twitter.com¹ – the best-known microblogging service – is increasingly used as a means of extending commentary and discussion during academic conferences. This digital ‘backchannel’ communication (non-verbal, real-time communication, which does not interrupt a presenter or event: Yngve, 1970) is becoming more prevalent at academic conferences, in educational use and in organizational settings, as it allows for the ‘spontaneous co-construction of digital artefacts’ (Costa et al., 2008, p.1). Such communication usually involves notetaking, sharing resources and individuals’ real-time reactions to events. The study of digital humanities conference tweets provides an insight into the digital humanities community of practice and into precisely how academics use Twitter in a conference-based setting.

Formal conference presentations still mainly occur in a traditional setting: a divided space with a ‘front’ area for the speaker and a larger ‘back’ area for the audience, implying a single focus of attention. There is a growing body of literature describing problems with a traditional conference setting: lack of feedback, nervousness about asking questions and a single-speaker paradigm (Anderson et al., 2003; Reinhardt et al., 2009). The use of a digital backchannel, such as Twitter, positioned in contrast with the formal or official conference programme, can address these problems by providing an irregular or unofficial means of communication (McCarthy and Boyd, 2005), which changes the dynamics of the room from a one-to-many transmission to a many-to-many interaction, without disrupting the main channel communication.

Digital humanists have, historically, been quick to adopt emergent media to aid their own tasks. This study analysed the use of Twitter as a backchannel for digital humanities conferences, focusing on three physical conferences held from June to September 2009 (Digital Humanities 2009, THATCamp 2009 and Digital Resources in the Arts and Humanities 2009). During the conferences, unofficial Twitter backchannels were established, using conference-specific hashtags (#dh09, #thatcamp, and both #drha09 and #drha2009)² to enable visible commentary and discussion. The resulting corpus of individual ‘tweets’ provides a rich dataset, allowing analysis of the use of Twitter in an academic setting.

It is possible to gain an insight into the user intentions of the digital humanities Twitter community through open-coded content analysis. To understand the interactions and user intentions of Twitter backchannel users, it was necessary to categorize the tweets. Tweets were manually labelled into seven categories: asking organizational questions; comments on presentations; discussions and conversations; establishing an online presence; jotting down notes; sharing resources; and unknown. The majority of tweets in the corpus fell into the category of jotting down notes, triggered predominantly by the front-channel presentation, suggesting that participants are sharing experiences and, to a degree, coconstructing knowledge. What is surprising is the lack of direct commentary on presentations. Although Reinhardt et al. (2009) argue that Twitter enables thematic debates and offers a digital backchannel for further discussion and commentary, the tweet data suggest that this did not happen to a significant extent at the digital humanities conferences. This raises the question of whether a Twitter-enabled backchannel promotes more of an opportunity for users to establish an online presence and enhance their digital identity, rather than encouraging a participatory conference culture. Nevertheless, jotting down notes can be considered an active contribution to the community, enabling the expansion of communication and participation in the event.
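
Once tweets have been labelled by hand, the share of the corpus falling into each category is a simple tally, as the Python sketch below illustrates. The seven category names are those listed above; the `labels` list is invented sample data, not the study's corpus, and the function name `category_shares` is hypothetical.

```python
from collections import Counter

CATEGORIES = (
    "asking organizational questions",
    "comments on presentations",
    "discussions and conversations",
    "establishing an online presence",
    "jotting down notes",
    "sharing resources",
    "unknown",
)

# Hypothetical hand-assigned labels, one per tweet; not the study's data.
labels = (
    ["jotting down notes"] * 5
    + ["sharing resources"] * 2
    + ["discussions and conversations"] * 2
    + ["comments on presentations"]
)


def category_shares(tweet_labels):
    """Return the fraction of tweets assigned to each of the seven categories."""
    unexpected = set(tweet_labels) - set(CATEGORIES)
    if unexpected:
        raise ValueError(f"labels outside the coding scheme: {sorted(unexpected)}")
    if not tweet_labels:
        return {category: 0.0 for category in CATEGORIES}
    counts = Counter(tweet_labels)
    return {category: counts[category] / len(tweet_labels) for category in CATEGORIES}


shares = category_shares(labels)
largest = max(shares, key=shares.get)
print(largest, shares[largest])
```

Rejecting labels outside the scheme is a design choice in this sketch: it guards against typing errors during manual coding, which would otherwise silently create an eighth category.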

Participation inequality has been observed in other collaborative online environments for more than a decade (Nielsen, 2006; Anderson, 2008; Nonnecke and Preece, 2000; Kiesler et al., 1984) and would seem to apply to Twitter. A large number of users produced only one tweet during the three conferences, which lends support to the notion of a 90:9:1 rule (Nielsen, 2006) for new social media, where 90% of users are lurkers, 9% of users contribute from time to time and 1% participate a lot and account for the majority of contributions. The fact that this is demonstrated in the corpus suggests that, despite its close-knit nature, the fairly small digital humanities researcher community may also be somewhat intimidating for those new to the field, conference or Twitter itself.
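
Participation inequality of this kind can be measured directly from a list of tweet authors, as in the Python sketch below. The author names and counts are invented for illustration and are not drawn from the study. Note also that lurkers, who read without posting, leave no trace in a tweet corpus, so only the contributing side of the 90:9:1 pattern can be examined this way.

```python
from collections import Counter

# Hypothetical author of each tweet in a small corpus; not the study's data.
authors = (
    ["ann"] * 12
    + ["bob"] * 5
    + ["cat"] * 2
    + ["dan", "eve", "fay", "gus", "hal"]
)

tweets_per_user = Counter(authors)
total_tweets = sum(tweets_per_user.values())
total_users = len(tweets_per_user)

# Share of contributors who posted exactly once.
single_tweet_users = sum(1 for n in tweets_per_user.values() if n == 1)
share_single = single_tweet_users / total_users

# Share of all tweets produced by the most active contributor.
top_user, top_count = tweets_per_user.most_common(1)[0]
top_share = top_count / total_tweets

print(total_users, total_tweets)      # 8 24
print(share_single)                   # 0.625
print(top_user, top_share)            # ann 0.5
```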

When looking at the corpus of tweets, one striking characteristic of the content is that conference hashtagged Twitter activity does not constitute a single distributed conversation, but, rather, multiple monologues and a few intermittent, discontinuous, loosely joined dialogues, which users enter and exit at will. It is possible to suggest that beyond being a tool for writing and communicating, microblogging platforms may serve as foundations for building or enhancing a community of practice. Digital technology is often suggested as a tool to support communities of practice (see Wenger, White and Smith, 2009; Yardi, 2008; Adams, Blandford and Lunt, 2005). Microblogging as a digital backchannel may be such a tool, providing a forum for community-related discussion and resulting in high levels of reflection, discourse, deep content knowledge (Yardi, 2006) and distributed expertise throughout the community. Such collective interaction and learning improves the knowledge of each individual in the community and contributes to the development of knowledge within the domain. For this reason, this method can be regarded as promising for academic environments, in facilitating informal communication, learning and the coconstruction of knowledge.

The use of Twitter as a platform for conference backchannels enables the community to expand communication and participation in events amongst its members. This enhanced participation allows the digital humanities community to cocreate knowledge, ensuring that the ‘collaborative knowledge of the community is greater than any individual knowledge’ (Johnson, 2001, 31). The Twitter-enabled backchannel constitutes a complex multidirectional discursive space, in which the conference participants make notes, share resources, hold discussions and ask questions, as well as establishing a clear, individual online presence. The predominance of notetaking suggests that the digital humanities community could be classed as social reporters, commenting on the conference presentations for outsiders, rather than collaborating during the conference. There was also a tendency for a small group of users to produce the majority of tweets, interacting with each other about other matters. This reflects the small, friendly nature of the digital humanities researcher community, but such a close-knit group may also be somewhat intimidating for those new to the field or conference.

With the increasing prevalence of Twitter in academic conference environments, it is possible to present digital backchannel communication as a viable tool for the coconstruction of knowledge within a community of practice. However, this argument is by no means complete or definitive. Those who participate in digital backchannel communication at conferences, whether organizers, speakers or attendees, must understand and confront their visibility, issues of user awareness and potential negative factors, in order to influence the use of the Twitter-enabled backchannel as an effective conference tool which fully encourages a participatory conference culture. The Twitter-enabled backchannel thus raises some interesting questions about the nature of conference participation and whether it is helped or hindered by a digital backchannel. Rather than pointless babble, the Twitter record produced at each conference provides important evidence regarding how the digital humanities – as a community of practice – functions and interacts.

Case Study: QRator Project: enhancing co-creation of content in practice

C. Ross, UCL Department of Information Studies, and S. Gray, UCL Centre for Advanced Spatial Analysis

Emergent mobile technologies and the proliferation of social media tools offer museum professionals new ways of engaging visitors with their collections. Museum audiences are no longer ‘passive recipients of wisdom from on high, but want to participate, to question, to take part as equals, and to receive as high standard of service as would be offered at any other type of leisure site’.³ UCL’s QRator project is exploring how handheld mobile devices, social media software and interactive digital labels can create new models for public engagement, personal meaning-making and the construction of narrative opportunities inside museum spaces.

The QRator project is located within the emerging technical and cultural phenomenon known as ‘The Internet of Things’: the shift that is anticipated as society moves to a ubiquitous form of computing, in which every device is ‘on’ and connected, in some way, to the internet. The project is based around technology developed at the Centre for Advanced Spatial Analysis, UCL, and is an extension of the ‘Tales of Things’ project, which has developed a ‘method for cataloguing physical objects online which could make museums and galleries a more interactive experience’ (Giles, 2010), by means of QR tags.

QRator provides the opportunity to move the discussion of objects from the museum label onto users’ mobile phones, allowing the creation of a sustainable, world-leading model for two-way public interaction in museum spaces. UCL’s Grant Museum of Zoology houses one of the country’s oldest and most important natural history collections. The Grant Museum has a strong history as a teaching collection, but also functions as a key gateway for the public to engage with academic issues in innovative ways. The project aims to genuinely empower members of the public within the Grant Museum, by allowing them to become the ‘curators’. QRator is an iPad-based system that allows visitors and academic researchers to share their views on an exhibition and discuss provocative questions about the ways museums operate and the role of science in society. The iPads are linked to an online database, allowing the public to view ‘curated’ information and, most notably, to send back their own interpretation and views, via an iPad application. Unique to the UCL technology is the ability to ‘write’ back to the QR codes. This allows members of the public to type in their thoughts and interpretations of the object and click ‘send’. Contributing is similar in nature to sending a text message or a tweet. Using low-cost, readily available technology, the system will enable the Grant Museum to become a true forum for academic-public debate, in which the public can collaborate and discuss object interpretation with museum curators and academic researchers. QRator encourages visitors to tackle big questions in the life sciences and engage with the way museums work. Questions include: ‘Should human and animal remains be treated any differently?’ and ‘Every medicinal drug you have ever taken was tested on animals. Is this a necessary evil?’ Visitors can examine museum specimens, before leaving their interpretation on an iPad to create a digital ‘living’ label that other visitors can read and respond to. Visitor narratives subsequently become part of the museum objects’ history and, ultimately, the display itself, via the interactive label system, allowing the display of comments and information directly next to the museum objects.

Many visitors expect, or want, to engage with a subject physically, as well as personally (Adams, Luke and Moussouri, 2004; Falk and Dierking, 2000). Visitors see interactive technology as an important stimulus for learning and engagement (Falk et al., 2002; Black, 2005), empowering users to construct their own narratives, in response to museum exhibits. Beyond expected content synthesis, these immersive activities can stimulate learning. Engaged within this immersive environment, museum objects become rich sources of innovation and personal growth (Fisher and Twiss-Garrity, 2007). When visitors experience a museum which actively encourages individual narrative construction, their activity is directed not towards the acquisition or receipt of the information being communicated by the museum, but rather towards the construction of a very personal interpretation of museum objects and collections. The unpredictability of multiple narrative forms, created by the use of mobile devices and interactive labels, introduces new considerations to the process by which museums convey object and collection interpretation and opens up museums to become a more engaging experience.

Participation in collaborative narrative creation, centred on museum objects, can provoke creative, independent analysis, promoting a personal connection with museum exhibition subject matter that is unparalleled in more traditional and passive approaches (Silverman, 1995; Roberts, 1997; Hooper-Greenhill, 2000; Fisher and Twiss-Garrity, 2007).

References

• Adams, A., Blandford, A. and Lunt, P. (2005) Social Empowerment and Exclusion: a case study on digital libraries, ACM Transactions on CHI, 12 (2), 174–200

• Anderson, C. (2008) The Long Tail, 2nd edn, Hyperion.

• Anderson, R. J., Anderson, R., Vandegrift, T., Wolfman, S. and Yasuhara, K. (2003) Promoting Interaction in Large Classes with Computer-mediated Feedback. In Wasson, B., Ludvigsen, S. and Hoppe, U. (eds), Designing for Change in Networked Learning Environments: proceedings of the International Conference on Computer Support for Collaborative Learning, Dordrecht, Netherlands: Kluwer Academic Publishers, 119–123.

• Becta (2008) Analysis of Emerging Trends Affecting the Use of Technology in Education, Becta Research Report.

• Black, G. (2005) The Engaging Museum: developing museums for visitor involvement, Routledge.

• Costa, C., Beham, G., Reinhardt, W. and Sillaots, M. (2008) Microblogging in Technology Enhanced Learning: a use-case inspection of PPE Summer School 2008. In Vuorikari, R., Drachsler, H., Manouselis, N. and Koper, R. (eds), Proceedings of the Workshop on Social Information Retrieval in Technology Enhanced Learning, Maastricht, The Netherlands.

• Crook, C., Cummings, J., Fisher, T., Graber, R., Harrison, C. and Lewin, C. (2008) Web 2.0 Technologies for Learning: the current landscape – opportunities, challenges and tensions, http://partners.becta.org.uk/uploaddir/downloads/page_documents/research/web2_technologies_learning.pdf

• Falk, J. H. and Dierking, L. D. (2000) Learning from the Museum: visitor experiences and making meaning, AltaMira.

• Falk, J. H., Cohen Jones, M., Dierking, L. D., Heimlich, J., Scott, C. and Rennie, L. (2002) A Multi-Institutional Study of Exhibition Interactives in Science Centers and Museums, Institute for Learning Innovation.

• Fisher, M. and Twiss-Garrity, B. A. (2007) Remixing Exhibits: constructing participatory narratives with on-line tools to augment museum experiences. In Trant, J. and Bearman, D. (eds), Museums and the Web 2007: proceedings. Toronto: archives & museum informatics, www.archimuse.com/mw2007/papers/fisher/fisher.html

• Giles, J. (2010) Barcodes Help Objects Tell their Stories, New Scientist, April, http://www.newscientist.com/article/dn18766-barcodes-help-objects-tell-their-stories.html.

• Hooper-Greenhill, E. (2000) Museums and the Interpretation of Visual Culture, Routledge.

• Johnson, L., Witchey, H., Smith, R., Levine, A. and Haywood, K. (2010) The 2010 Horizon Report: Museum Edition, The New Media Consortium.

• McCarthy, J. F. and Boyd, D. (2005) Digital Backchannels in Shared Physical Spaces: experiences at an academic conference. In van der Veer, G. C. and Gale, C. (eds), Proceedings of the 2005 Conference on Human Factors in Computing Systems, CHI 2005, held on April 2–7, 2005, Portland, Oregon, USA, ACM.

• Millen, D. R., Fontaine, M. A. and Muller, M. J. (2002) Understanding the Benefits and Costs of Community of Practice, Communications of the ACM: Special Issue on Online Communities, 45 (4), 69–73.

• Reinhardt, W., Ebner, M., Beham, G. and Costa, C. (2009) How People are using Twitter during Conferences. In Hornung-Prähauser, V. (ed.), Creativity and Innovation Competencies on the Web: proceedings of 5. EduMedia conference held on May 4th and 5th, 2009, organized by Salzburg Research Forschungsgesellschaft and St. Virgil Salzburg, Salzburg, St. Virgil Salzburg.

• Roberts, L. C. (1997) From Knowledge to Narrative: educators and the changing, Smithsonian Institution Press, London, Washington, D.C.

• Silverman, L. H. (1995) Visitor Meaning Making in Museums for a New Age, Curator, 38 (3), 161–9.

• Wenger, E., White, N. and Smith, J. (2009) Digital Habitats: stewarding technology for communities, CPsquare.

• Yardi, S. (2006) The Role of the Backchannel in Collaborative Learning Environments. In Proceedings of the 7th International Conference of Learning Sciences, held on June 27–July 1, 2006, Indiana University, IN, USA, International Society of the Learning Sciences.

• Yngve, V. (1970) On Getting a Word in Edgewise. In Campbell, M. A. (ed.), Papers from the Sixth Regional Meeting of the Chicago Linguistic Society, held on April 1-18, 1970, Chicago: Chicago Linguistic Society, 567–78.

Web Resources

• www.ucl.ac.uk/transcribe-bentham
• www.qrator.org
• www.wikipedia.org/wiki/Social_media
• www.slideshare.net/
• www.scribd.com/
• www.nature.com/nature/peerreview/debate/nature05535
• mediacommons.futureofthebook.org/mcpress/ShakespeareQuarterly_NewMedia/2010/02/13/scanningcurrent-practices-of-scholarly-peer-review/
• www.nla.gov.au/ndp/
• www.galaxyzoo.org/
• www.zooniverse.org/
• www.casa.ucl.ac.uk/
• www.ucl.ac.uk/museums/
• www.ucl.ac.uk/public-engagement/