Transcription: then and now

By Justin Tonra, on 13 May 2010

Though scholars often balk at the perceived connotations of the digital part of Digital Humanities, one of the discipline’s most interesting and useful outcomes is the way in which it sheds new light on centuries-old practices. In so doing, Digital Humanities encourages us to examine afresh the basic assumptions, conventions, and practices of an activity like transcription, strengthening our understanding of the process and allowing us to better imagine how technology might play a role therein.

At Transcribe Bentham, one particular challenge is reconciling the use of wiki technology to transcribe Jeremy Bentham’s manuscripts with the transcription methods used for the same purpose in the decades since the first volumes of the new Bentham Collected Works were published in 1968. In recent years, Bentham’s manuscripts have been transcribed directly into Microsoft Word, with another member of the Bentham Project then checking a hard copy of the transcript and making corrections in pencil.

Scholars at the Bentham Project, like Professor Philip Schofield, have developed specific methods for transcribing various features of Bentham’s manuscripts, such as additions, deletions, marginal notes and summaries, and illegible text. For example, where Bentham provides an alternative reading above a word in the text, it is recorded between forward slashes:

[ . . . ] in this way /manner/ it is possible [ . . . ]

This amounts to a shorthand notation system, where forward slashes are used to identify alternative text. Indeed, in Microsoft Word, it would be possible to transcribe the appearance of the manuscript by formatting the text to make the word ‘manner’ appear in superscript. However, neither of these methods identifies the meaning or function of the word ‘manner’ – instead, they provide a visual key for someone who is familiar with the transcription methods of the Bentham Project. By moving these transcriptions into TEI-compliant XML, the meaning of this authorial operation can be explicitly encoded into the text:

in this way <add type="alternative">manner</add> it is possible
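To make the step from shorthand to markup concrete, here is a minimal Python sketch that rewrites the slash notation as the encoded form shown above. It is an illustration only, not the Bentham Project's actual workflow: the function name `slashes_to_tei` and the regular expression are assumptions, and the pattern presumes that forward slashes appear in a transcript only as markers of an alternative reading.

```python
import re

# Hypothetical pattern: a reading wrapped in forward slashes, e.g. "/manner/".
# Assumes slashes are used only to mark alternative readings.
ALTERNATIVE = re.compile(r"/([^/\s][^/]*?)/")


def slashes_to_tei(text: str) -> str:
    """Replace each /reading/ with <add type="alternative">reading</add>."""
    return ALTERNATIVE.sub(r'<add type="alternative">\1</add>', text)


print(slashes_to_tei("in this way /manner/ it is possible"))
```

The visual cue (the slashes) is swapped for an element that states what the word is, so software no longer needs to know the project's conventions to interpret it.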

The purpose of the Bentham transcripts which have already been completed (some 20,000 of about 60,000) was to aid editors in the preparation of printed volumes for the Collected Works; thus, much deleted text which would not have appeared in the printed volumes was left out of the transcriptions. It will be the aim of Transcribe Bentham to reinstate this deleted text (which may be of interest to some scholars) into the transcriptions, and encode it as such. Once completed, this type of encoding will facilitate much more refined searching than a simple full-text search: for instance, a user may choose to see every instance in which Bentham deleted the word ‘panopticon’.
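As a rough illustration of what such a search might look like, the Python sketch below uses the standard library's `xml.etree.ElementTree` to list only those occurrences of a word that fall inside a deletion. Several things here are assumptions: deletions are taken to be encoded with the TEI `<del>` element, the sample passage is invented for the example rather than taken from a manuscript, the function name `deleted_instances` is hypothetical, and the TEI namespace that a full document would declare is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Invented sample passage (not Bentham's text), without the TEI namespace.
SAMPLE = """<p>the <del>panopticon</del> inspection-house is <del>a mill</del> a plan
for the <add type="alternative">panopticon</add> prison</p>"""


def deleted_instances(xml_text: str, word: str) -> list[str]:
    """Return the text of every <del> element that contains the given word."""
    root = ET.fromstring(xml_text)
    hits = []
    for deletion in root.iter("del"):
        # itertext() also gathers text from any elements nested in the deletion.
        content = "".join(deletion.itertext())
        if word.lower() in content.lower():
            hits.append(content)
    return hits


print(deleted_instances(SAMPLE, "panopticon"))
```

A plain full-text search of the same passage would match ‘panopticon’ twice; the encoded search returns only the occurrence that was struck out.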

Eventually, these TEI-encoded transcriptions of Bentham’s manuscripts will increase the utility of the already-invaluable Bentham Papers Database by providing facsimile images and encoded transcriptions of the manuscripts that it currently lists. In this way, it will become an important complementary resource to the printed volumes of the Collected Works.