By Louise Seaward, on 9 November 2016
Since 2010, Transcribe Bentham has collected a huge amount of data in the form of transcripts submitted by volunteers. A research seminar taking place in December at the UCL Centre for Digital Humanities will explore how this data can be analysed and searched to help people find out more about the content of Bentham’s writings. The details are as follows:
Visualizing the Transcribe Bentham corpus
6 December 2016, 5:30pm
G31, Foster Court, UCL, Malet Place, London, WC1E 7JG
How can we gain an overview of the 17,000 pages of Bentham’s manuscripts made available by Transcribe Bentham? Methods that provide an overview of the corpus may help domain experts find the areas of the corpus relevant to their research. In this work, we have applied computational techniques to visualize the corpus, providing a general view of its content.
First, lexical extraction was performed to select the terms used to model the corpus. Then, term clusters were created based on the similarity between the terms’ contexts and visualized as corpus maps. The maps provide an overview of the corpus as a whole, as well as of the terms that are more prominent in different periods of the corpus. The issue of evaluating these corpus maps will also be discussed.
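For readers curious about what clustering terms by the similarity of their contexts can look like in practice, here is a minimal Python sketch. It is not the speakers' method, which the abstract does not specify. It assumes whitespace tokenisation, a tiny hand-picked stopword list, a co-occurrence window of two words, cosine similarity, and a simple greedy grouping. The function names, the toy sentences and the threshold are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Assumption: a tiny illustrative stopword list, not a real lexical resource.
STOPWORDS = frozenset({"the", "a", "an", "of", "and"})


def context_vectors(docs, terms, window=2):
    """For each chosen term, count the words found within `window` tokens of it."""
    term_set = set(terms)
    vectors = defaultdict(Counter)
    for doc in docs:
        # Naive tokenisation: lowercase, split on whitespace, drop stopwords.
        tokens = [t for t in doc.lower().split() if t not in STOPWORDS]
        for i, token in enumerate(tokens):
            if token not in term_set:
                continue
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[token][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_terms(vectors, terms, threshold=0.5):
    """Greedy grouping: a term joins the first cluster whose first member
    has a similar enough context, otherwise it starts a new cluster."""
    clusters = []
    for term in terms:
        for cluster in clusters:
            if cosine(vectors[term], vectors[cluster[0]]) >= threshold:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters


# Toy sentences invented for this example; they are not from the Bentham corpus.
docs = [
    "the judge heard the case",
    "the magistrate heard the case",
    "the prison held the inmate",
    "the panopticon held the inmate",
]
terms = ["judge", "magistrate", "prison", "panopticon"]
clusters = cluster_terms(context_vectors(docs, terms), terms)
print(clusters)
```

On a corpus of real size, one would likely swap in a proper tokeniser, a term extraction step and an established clustering algorithm, and then project the clusters into two dimensions to draw a map.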
All are welcome, and there will be drinks and discussion after the talk. Please note that registration is required.