Chapter 3: Digitization and digital resources in the humanities
In Chapter 3, Melissa Terras discusses digitization – the conversion of an analogue signal or code into a digital signal or code. This is the bedrock of both digital library holdings and digital humanities research. It is now commonplace for most memory institutions to create and deliver digital representations of cultural and historical documents, artefacts and images to improve access to, and foster greater understanding of, the material they hold. This chapter focuses on the developing role of digitization to provide resources for research within the digital humanities, highlighting issues of cost, purpose, longevity and use and providing a round-up of sources for guidelines and standards. The recent interest from, and investment by, commercial information providers is juxtaposed with institutional concerns about the creation of digital resources for the humanities.
Case study: Europeana Travel
Lesley Pitman, Librarian and Director of Information Services, UCL SSEES Library at University College London
In May 2009, UCL Library Services became one of the lead partners in Europeana Travel – a two-year digitization project funded by the Commission of the European Union, through their eContentplus programme, following a call the previous year for cultural content in the area of digital libraries.
The European Digital Library Foundation proposed a number of themes for possible projects for this call, including cities, crime and punishment, and travel and tourism. These were all intended to be of broad general interest and to expand the range of high quality cultural resources available through Europeana, which was then in its early development stage. In response, a partnership of national and university libraries from across Europe was brought together, united by two things: strong collections on the theme of travel and tourism and considerable experience in high quality digitization. This partnership, which became the Europeana Travel Project, was significant in bringing together university and national libraries. It was supported by the European Digital Library (EDL) and two founding members of the EDL Foundation: CENL, representing European national libraries; and LIBER, representing a broader range of research libraries across Europe, including many university libraries.
The range of libraries and countries represented in the project was impressive and allowed for an extraordinarily wide range of digitized content, within the overall theme of travel and tourism. The partners were the following: the National Libraries of Estonia (the project co-ordinator), Finland, Latvia, Poland, Austria, Slovakia, the Netherlands and Wales; the National and University Library of Slovenia; the University Libraries of Lund, Regensburg, Innsbruck, Trinity College Dublin and UCL; the State and University Library of Lower Saxony; the University and National Library of Debrecen; and the EDL Foundation, as the provider of services to end-users through Europeana. Finally, Eremo srl were appointed as scientific managers.
UCL’s role was a significant one, both as content provider and as leader of two distinct work packages; library staff, therefore, found themselves deeply committed to the project from the very beginning. The Commission provided 50% of the funding, and the rest was to be contributed by partner institutions.
Material to be digitized by project partners covered numerous different subjects, languages and formats, including historic maps, postcards, travel tales, diaries, photograph albums, folk songs, rare books and archives. All content was accepted, provided that it was free of intellectual property rights issues. UCL contributed three categories of content from the collections of the UCL School of Slavonic and East European Studies Library: a collection of almost 300 rare travel books, dating from 1557 to 1860, with a focus on travels in Eastern and Central Europe and Russia; a collection of 200 maps of the same region and period; and some archive materials relating to Sir Arthur Evans, the archaeologist, who had bequeathed his writings on south-eastern Europe to the School on his death in 1941.
Europeana Travel had five principal aims: to digitize library content on the theme of travel and tourism for use in Europeana; to establish an aggregator through which those libraries of the Ligue des Bibliothèques Européennes de Recherche – Association of European Research Libraries (LIBER) that require such a service can provide content to Europeana, and to seek a sustainable basis for the aggregator’s continuing functioning; to deepen collaboration between the Conference of European National Librarians (CENL) and LIBER, in support of Europeana; to mobilize the efforts of the research libraries in support of Europeana; and to provide examples of best practice in digitization methods and processes, constituting a learning opportunity for all libraries wishing to supply digitized material to Europeana.
To achieve its aims, the project was divided into a number of work packages. UCL led the first work package, which was concentrated in the first six months of the project and had as its objective the planning of the digitization processes and the mechanisms for sharing good practice. The processes and mechanisms put in place in that work package were then used to monitor progress in the second work package, which was concerned with the actual digitization in partner libraries and was led by the Slovenian National Library. An important feature of the project, as with all Europeana projects, is that the content is digitized locally and the digital images are held on local systems, with only the metadata and thumbnails harvested by Europeana. Both work packages were greatly facilitated by the high level of experience and expertise among project partners.
UCL’s other major role was as leader of the third work package, which was tasked with creating an aggregator for the content coming from the university libraries, to parallel the service provided by the European Library for the national libraries. This was important both to encourage further participation in Europeana by university libraries and to support Europeana itself, as its processes relied on dealing with aggregators, rather than numerous, individual, contributing libraries. The close work on aggregation between UCL (on behalf of LIBER) and the European Library, throughout Europeana Travel, has allowed for further projects to be developed to bring the two aggregators together: the Europeana Libraries Project has since been developed, with substantial input from UCL Library Services. Finally, two further work packages dealt with dissemination to a professional audience and the wider public, and with project management. The public deliverables of the project can all be found on the project website (www.europeanatravel.eu). They include reports on aggregation issues for university libraries, digital preservation and minimum technical standards required for participation in Europeana, including the report Best Practice Examples in Library Digitisation, March 2011 (Sotošek, 2011), which is useful for others involved in large-scale digitization projects, as it documents current best practice in the areas of image capture and equipment, handling of originals, metadata generation, OCR, access, workflow, quality assurance and cost modelling for digital preservation and user evaluation – all as practised by the project partners.
UCL Library Services has benefited hugely from its participation in this project. It has had the opportunity to work with, and learn from, some of Europe’s leading experts in digitization and to contribute its own expertise in project planning and in digital repositories. The funding has allowed some of its rarest and most interesting special collections to be made easily accessible, via its own digital library and to the wider community that uses Europeana, helping to support UCL’s work in research, teaching and public engagement.
Case study: Assyrian Empire Builders: governors, diplomats and soldiers in the service of Sargon II, King of Assyria
Karen Radner, Professor in Ancient Near Eastern History, Department of History, University College London
With 1200 original texts in the form of clay tablets inscribed in the cuneiform script, the correspondence between Sargon II, King of Assyria (721–705 BC), and his governors is the largest corpus of state letters known from the ancient world. It provides exceptional insight into the mechanisms of communication between the top levels of authority and the delegation of power in an ancient empire that stretched from the Mediterranean coast to Western Iran and from Anatolia to the Persian Gulf. Until now, this exciting correspondence has been known only to specialist researchers. The website Assyrian Empire Builders aims to change that, making available Sargon’s state letters, together with resources for their study and materials on their historical and cultural context.
This web resource is part of the dissemination strategy for the research project ‘Mechanisms of Communication in an Ancient Empire: the correspondence between the King of Assyria and his magnates in the 8th century BC’. Funded by the Arts and Humanities Research Council (2008–12), the project has its home in UCL’s History Department, where Ancient Near Eastern history has been taught since the 1930s. The website’s aim is twofold: to provide reliable and easily accessible information on Assyria in the second half of the 8th century BC and to promote the use of Sargon’s letters as a source for political, administrative, cultural and military history. The site addresses a broad audience and, while requiring no previous knowledge, it has much to offer, even for specialist researchers.
There are three key sections of the site: ‘About the Project’ (providing information on scope, funding and copyright), ‘Essentials’ (providing highlights and overviews to attract and appeal to a broad audience without any previous expertise) and ‘Royal Correspondence’ (providing access to the text corpus).
All 1200 letters are available in transliteration (renderings of the cuneiform text in Latin script) and English translation, adapted from the editions in four volumes of the State Archives of Assyria series (published 1987–2003) and reproduced with the authors’ permission. The series’ editor, Professor Simo Parpola of the University of Helsinki, kindly supplied the ASCII heritage dataset used to create these publications. This data was converted into XHTML files encoded in UTF-8 Unicode and used to generate a fully searchable online text corpus that is linked to the interactive glossaries and indices distilled from it. This has created a powerful research tool that far exceeds the possibilities of the print editions. All letters in the Assyrian language can also be displayed in cuneiform script, using the font CuneiformNA, which can be downloaded from the site; this is especially useful for teaching.
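The conversion step described above, from an ASCII dataset to UTF-8 XHTML, can be illustrated with a minimal sketch. The text does not describe the encoding of the heritage dataset or the project's actual tooling, so everything here is an assumption: the digraph table is loosely modelled on ASCII conventions common in Assyriology, and the function names `to_unicode` and `to_xhtml` are hypothetical.

```python
from html import escape

# ASSUMPTION: an illustrative ASCII-to-Unicode table. The real dataset's
# conventions are not given in the text and may well differ.
ASCII_TO_UNICODE = {
    "sz": "š",
    "s,": "ṣ",
    "t,": "ṭ",
}


def to_unicode(ascii_text):
    """Replace each ASCII digraph with the Unicode character it stands for."""
    result = ascii_text
    for digraph, char in ASCII_TO_UNICODE.items():
        result = result.replace(digraph, char)
    return result


def to_xhtml(text_id, ascii_lines):
    """Wrap the converted lines in a minimal XHTML document declared as UTF-8."""
    items = "\n".join(
        f"      <li>{escape(to_unicode(line))}</li>" for line in ascii_lines
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        f"  <head><title>{escape(text_id)}</title></head>\n"
        "  <body>\n"
        "    <ol>\n"
        f"{items}\n"
        "    </ol>\n"
        "  </body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical identifier and sample line, for demonstration only.
    document = to_xhtml("demo-letter", ["a-na szarri"])
    with open("demo-letter.xhtml", "w", encoding="utf-8") as handle:
        handle.write(document)
```

A plain string replacement like this ignores upper-case forms and would also convert an ordinary comma that happens to follow "s" or "t", so a real conversion would need rules tied to the dataset's documented conventions.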
The text corpus was created in collaboration with Professor Steve Tinney of the University of Pennsylvania, Philadelphia, who designed the underlying programming. The corpus is part of the Open Richly Annotated Cuneiform Corpus (ORACC) – a workspace and toolkit for the development of a complete corpus of cuneiform that is quickly finding worldwide acceptance as the discipline standard for the online presentation and management of cuneiform sources. The Sargon Letters data can be directly harvested and utilized by all other projects operating within the ORACC umbrella, e.g. the site Cuneiform Texts Mentioning Israelites, Judeans, and Related Population Groups, maintained by scholars at the University of Tel Aviv.
While this part of the site is the most attractive for researchers, the section ‘Essentials’ is designed to appeal to a more general audience. Containing introductions to the political and cultural history of the second half of the 8th century BC, the content is grouped into the subdivisions: Kings, Governors, Diplomats, Soldiers, Countries, Cities and Archives. Each essay is about 1200 to 1500 words and divided into three to four parts, each of which is illustrated by a photo, drawing or map. Suggestions for further reading are provided; the bibliography features editions of primary sources and key studies of the last two decades in English, German, French and Italian, grouped in thematic subdivisions. When available, links to online publications, Google Books and related websites are given. In addition, a number of colleagues have given permission to make their work available.
Thirty-seven of Sargon’s letters are presented in a ‘Highlights Section’. The clay tablets resemble a mobile phone in size and shape and are inscribed in cuneiform, using either the Assyrian or the Babylonian language. Our high resolution photographs of text samples in the British Museum are composites that show the front, back, top, bottom and sides in a single image, with measurements and scales. The individual photographs were taken over a few days in autumn 2008 in the Students’ Room of the Middle East Department and the resulting .jpg images combined into composites using Photoshop. The tablets were chosen for their excellent state of preservation and their content, and provide a good starting point for studying the original sources.
The website provides interactive lists of all words used in the translations that are not self-explanatory and all proper names mentioned in Sargon’s letters, giving etymologies for all personal names and mini biographies for all individuals. Wherever possible, place names are identified with their modern equivalents and linked to Google Earth and Google Maps.
The site template was created by web designer Ruth Horry. The content is written and maintained by the UCL project team (Karen Radner, Mikko Luukko and Silvie Zamazalova), who regularly add pages for the ‘Essentials’ section and update the bibliography. The site went online on 23 December 2009. Google Analytics is used to monitor access rates and user interests. So far, the site has attracted 23,980 visits by 14,994 visitors from 143 countries, whose average time on the site is 2:17 minutes (20 October 2011) [Update: So far, the site has attracted 46,415 visits by 30,991 visitors from 157 countries, whose average time on the site is 1:59 minutes (1 October 2012)]. These numbers signal a dramatic change in the letters’ accessibility, especially if one bears in mind that the print run of the four State Archives of Assyria volumes was 1000 copies each. The site is among Google’s first ten search results for all obvious search terms, such as ‘Assyrian Empire’ and ‘Sargon II’.
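The comparison between site traffic and print run can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above; the function names are hypothetical, and setting visitors against the print run of a single volume is a simplifying assumption made for illustration, not a measure used by the project.

```python
# Google Analytics figures as quoted in the text.
SNAPSHOTS = {
    "2011-10-20": {"visits": 23_980, "visitors": 14_994},
    "2012-10-01": {"visits": 46_415, "visitors": 30_991},
}

# Copies printed of each State Archives of Assyria volume, as quoted.
PRINT_RUN_PER_VOLUME = 1_000


def visits_per_visitor(snapshot):
    """Average number of visits made by each visitor."""
    return snapshot["visits"] / snapshot["visitors"]


def visitors_per_printed_copy(snapshot, print_run=PRINT_RUN_PER_VOLUME):
    """ASSUMPTION: compares visitors with the print run of one volume."""
    return snapshot["visitors"] / print_run


if __name__ == "__main__":
    for date, snapshot in SNAPSHOTS.items():
        print(
            date,
            f"visits per visitor: {visits_per_visitor(snapshot):.2f};",
            f"visitors per printed copy: {visitors_per_printed_copy(snapshot):.1f}",
        )
```

On these figures each visitor made roughly 1.6 visits by October 2011 and roughly 1.5 by October 2012, while the number of visitors was about fifteen and then about thirty-one times the print run of a single volume.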
Assyrian Empire Builders: governors, diplomats and soldiers is, therefore, increasing access to a resource previously known only to specialist researchers, utilizing internet technologies and digitized content to encourage academic debate and novel research, whilst both fostering teaching with, and raising the profile of, this unique collection.
• Sotošek, K. S. (2011) Best Practice Examples in Library Digitisation, Europeana Travel, http://europeanatravel.eu/downloads/ETravelD2%202final.pdf
• Introduction to Imaging: issues in constructing an image database (Besser and Trant, 1995). This was updated in 2003 and remains a useful, if short, introduction to digitization basics.
• Moving Theory into Practice: digital imaging for libraries and archives (Kenney and Rieger, 2000). This provides an overview of digitization projects and the issues which confront them.
• Digital Imaging: a practical handbook (Lee, 2002). This mostly deals with managerial issues, rather than the implementation of specific technologies, but is an excellent reference to aid those in charge of making institutional decisions regarding digitization.
• Digitizing Collections: strategic issues for the information manager (Hughes, 2004). This details both practical and strategic issues that staff in memory institutions need to understand, before making cultural material available online.
• Digital Heritage: applying digital imaging to cultural heritage (Macdonald, 2006). This contains various chapters written by those working on important, well established projects in digitization across all aspects of culture and heritage.
• Digital Images for the Information Professional (Terras, 2008) focuses on the use of digital images across the library and archive sector, including all aspects of image based digitization.
• Digitisation Perspectives (Rikowski, 2010). This is an edited collection, discussing various managerial and practical aspects of digitization.
• Preparing Collections for Digitization (Bülow and Ahmon, 2011). This is a useful source of guidance on preparatory activities that are a necessary part of any digitization project.
• Evaluating and Measuring the Value, Use and Impact of Digital Collections (Hughes, 2011). This is an excellent overview of the issues surrounding how to measure the impact of digitization.
• Cornell University Library’s online tutorial, Moving Theory Into Practice, Digital Imaging Tutorial (2003). This provides an illustrated overview of the digitization process (www.library.cornell.edu/preservation/tutorial/contents.html)
• The JISC Digital Media Service. This is an online resource and remains the first port of call for up-to-date, well explained, authoritative information and guidance for those wishing to undertake a digitization programme (www.jiscdigitalmedia.ac.uk).