Two years on, and progress update, 1 to 7 September 2012

By Tim Causer, on 7 September 2012

On 7 September 2010, Transcribe Bentham was officially launched upon an unsuspecting public, with the aim of recruiting volunteers from around the world, whatever their background, to help transcribe the unpublished manuscripts of Jeremy Bentham. Our initial progress was steady, if not necessarily spectacular: by the end of our testing period on 8 March 2011 (funded by the Arts and Humanities Research Council), 1,009 manuscripts had been transcribed or partially transcribed, of which 569 (56%) were complete. The end of April 2011 also saw the end of our twelve-month AHRC grant, and the project’s future did not look particularly promising.

A recent Chronicle of Higher Education article (citing a paper by TB staff published in Literary and Linguistic Computing earlier this year) notes our finding that had the two full-time members of staff employed for Transcribe Bentham instead been devoted to transcription alone for twelve months, then they could have produced around two-and-a-half times as many transcripts as volunteers would have done in the same period. This finding is certainly true of the state of play at the end of the testing period, but due to the vagaries of academic publishing schedules, it by no means represents what TB and its volunteers have done since, nor the current state of progress. After the testing period ended, Transcribe Bentham won a major international award, its ‘Transcription Desk’ software (developed by the University of London Computer Centre) was released on an open source basis for others to reuse and customise, and (we hope) the project has helped to promote scholarly crowdsourcing and Bentham studies over the past two years.

Most impressive of all, however, has been the sheer volume of work carried out by (an admittedly small core of) TB volunteers, despite the project being run on minimal funding for the past eighteen months, and having rather less staff time devoted to it than is ideal. As of today, 4,255 manuscripts have been transcribed or partially transcribed, which is an increase of 87 on last week’s total. 4,033 (94%) of these transcripts—up 84 on last week—are now complete. This means that they are of the required standard for uploading to UCL’s digital repository, and for use, ultimately, in the production of future volumes of The Collected Works of Jeremy Bentham.

This means that volunteer transcribers have achieved yet another landmark, having completed over 4,000 transcripts. There are often worries about the quality of the products of crowdsourced tasks, but we are delighted to report that despite having to deal with Bentham’s handwriting, syntax, style, and occasional habit of almost obliterating a page with deletions, marginalia and interlineal additions, the work of transcribers is of an extremely high standard. The amount of effort, care and attention that TB volunteers put in to ensure that their work is accurate is second to none. They have transcribed over 2.1 million words, plus extensive XML markup. They are also transcribing at a faster rate than one full-time member of staff could manage if she or he were devoted solely to transcribing; an unlikely scenario, given our multifarious duties! We are very lucky that they have chosen to participate, and that they continue to do so.

The more detailed state of progress is as follows:

  • Box 2: 336 manuscripts transcribed of 532 (63%)
  • Box 27: 348 of 350 (99%)
  • Box 35: 266 of 439 (60%)
  • Box 50: 60 of 92 (65%)
  • Box 51: 203 of 940 (21%)
  • Box 62: 53 of 565 (9%)
  • Box 70: 172 of 250 (49%)
  • Box 71: 651 of 665 (97%)
  • Box 72: 606 of 664 (91%)
  • Box 73: 151 of 151 (100%)
  • Box 79: 149 of 199 (74%)
  • Box 95: 80 of 147 (54%)
  • Box 96: 526 of 539 (97%)
  • Box 97: 32 of 288 (10%)
  • Box 98: 10 of 499 (2%)
  • Box 100: 93 of 433 (21%)
  • Box 115: 263 of 307 (80%)
  • Box 116: 218 of 864 (25%)
  • Box 139: 38 of 38 (100%)
  • Overall: 52% of the 8,068 manuscripts uploaded to the website have been transcribed thus far.
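
For anyone who would like to check the headline figures against the list above, here is a minimal Python sketch. It simply adds up the per-box transcribed counts and compares the result with the 8,068 manuscripts uploaded. The variable names are our own invention for illustration, and rounding the percentage down to a whole number is an assumption about how the overall figure is quoted.

```python
# Transcribed counts per box, copied from the list above (box number -> count).
transcribed_by_box = {
    2: 336, 27: 348, 35: 266, 50: 60, 51: 203, 62: 53, 70: 172,
    71: 651, 72: 606, 73: 151, 79: 149, 95: 80, 96: 526, 97: 32,
    98: 10, 100: 93, 115: 263, 116: 218, 139: 38,
}

# Total number of manuscripts uploaded to the website, as stated in the "Overall" line.
UPLOADED = 8068

# Sum of all manuscripts transcribed or partially transcribed.
total_transcribed = sum(transcribed_by_box.values())

# Whole-number percentage, rounded down (an assumption, see the note above).
overall_percent = total_transcribed * 100 // UPLOADED

print(f"{total_transcribed:,} of {UPLOADED:,} manuscripts ({overall_percent}%)")
```

The total produced this way should agree with the figure for manuscripts transcribed or partially transcribed quoted earlier in this post.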

So, where next? Transcribe Bentham will be at the heart of the Consolidated Bentham Papers Repository, a two-year project funded by the Andrew W. Mellon Foundation, which begins officially on 1 October. This work will see much of the UCL Bentham Papers digitised and made available for transcription and research, as well as all of the Bentham material held by our new partner institution, the British Library. Significant improvements will be made to the transcription interface to make the process more straightforward for volunteers, and we will be consulting with transcribers as to what kind of improvements they would find useful (this software will, again, be released on an open source basis for other projects to utilise). We hope that this new interface will prove to be more attractive to current and potential volunteers, and further increase the rate of transcription. We have also recently received some further exciting TB-related news, and look forward to talking about that in the coming months.

As ever, we end by thanking our volunteers, as without them there would be no Transcribe Bentham. We remain hugely grateful for their time and efforts, and think that the work they do should be celebrated.