Archive for the 'Uncategorized' Category

This week: UCL Laptop Orchestra (UCLOrk) at the UCL Festival of the Arts

By Nicolas E Gold, on 18 May 2015

The UCL Laptop Orchestra (UCLOrk) is performing this week on Wednesday 20th May at 1pm in the Quad Events Space as part of the UCL Festival of the Arts.  The one-hour lunchtime session will comprise a lecture/recital on the history and practice of laptop orchestras, combined with performances of three pieces written by members of the ensemble.  Tickets are free and available here.

Digital Classicist London seminars 2015

By Simon Mahony, on 12 May 2015


The programme for the Digital Classicist London & Institute of Classical Studies Summer 2015 seminar series is now published.

Meetings are on Fridays at 16:30 in room G21A (except where otherwise specified), Senate House, Malet Street, London WC1E 7HU.

Full listings together with abstracts are available on the Digital Classicist seminar page.

All are welcome and no registration is needed.

Bluclobber, or: Enabling complex analysis of large scale digital collections

By Melissa M Terras, on 7 May 2015

Recently, Jisc announced their Research Data Spring programme, which is providing funding to a variety of pilot projects in order to find “new technical tools, software and service solutions, which will improve researchers’ workflows and the use and management of their data”. We’re delighted that our pitch, ‘Enabling Complex Analysis of Large Scale Digital Collections’, in collaboration with the British Library Digital Research team and UCL Research Computing, is one of the funded projects. The idea that we pitched is this:

The British Library (BL) has numerous digital datasets, but not the processing power for users to run advanced queries against or analyse them. We will use UCL’s world leading Research Computing to open up this digital data, investigating the needs and requirements of a service that will allow researchers to undertake complex searching of the BL’s digital content. [more]

Over a three month period, we’ll be exploring how to get the BL’s dataset of 65,000 digitised, out of copyright books onto UCL’s High Performance Computing facilities. We’ll then work with a range of researchers from across the Arts and Humanities in running “easy”, then more complex, then really quite tricky searches across this corpus (which represents 4% of the British National Bibliography), in order both to aid those researchers and to work out how we can help facilitate access to this type of computing for a much wider research audience. James Baker, a Curator of Digital Research, has already blogged about the overview of the project, so, a month into a three month project, how are we getting on?

The first thing, of course, was to get the data across to UCL from the BL. We are just across the road, really, and still, as is the nature with big datasets, it’s easier to run that over on physical media than over the network. The first part of the project is really about ingestion, mounting the data, and understanding its structure, and we’re very lucky to have James Hetherington from UCL Research Computing on the case. The data itself comprises 224Gb of compressed ALTO XML representing 60k+ 17th, 18th, and 19th century books, and one of the interesting features of the data is that each word is expressed as a single line including where it is on the page (from top and from left) in pixels. We are therefore not just dealing with vanilla OCR, but can reconstitute the layout of the page: this means we also have to reconstitute the text before doing any text mining. For example, the word “plaza” in one of the books is encoded thus with its place on the page:

<String ID="P56_ST00016" HPOS="367" VPOS="152" WIDTH="76" HEIGHT="35" CONTENT="plaza" WC="0.92" CC="03000"/>

– this is exciting as we can do some interesting things with placement of text and images further down the line, but first comes the reconstituting, and getting things ready to process. There are other quirks – the data for each individual book is provided in one zip file, which means we’ve got 65,000 of the blighters. We need to do things like restructure the data so we can use parallelism in the reads, and so on.
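To make the reconstituting step a little more concrete, here is a minimal sketch of how word elements like the “plaza” example could be turned back into lines of text. It is not the project’s actual code: the function name `reconstitute_lines`, the `line_tolerance` threshold, and the two neighbouring words in the sample are all made up for illustration. Real ALTO files also nest each String inside a TextLine element, which could be used instead; this sketch relies only on the HPOS and VPOS attributes described above.

```python
import xml.etree.ElementTree as ET


def reconstitute_lines(alto_xml, line_tolerance=10):
    """Rebuild lines of text from ALTO <String> elements.

    Words whose VPOS lies within `line_tolerance` pixels of the first word
    on a line are treated as the same line (an assumed heuristic), and the
    words on each line are then ordered left to right by HPOS.
    """
    root = ET.fromstring(alto_xml)
    words = []
    for elem in root.iter():
        # Compare the local tag name so namespaced ALTO files also match.
        if elem.tag.rsplit("}", 1)[-1] == "String":
            words.append(
                (float(elem.get("VPOS")), float(elem.get("HPOS")), elem.get("CONTENT"))
            )
    words.sort()  # top to bottom, then left to right

    lines = []  # each entry is [vpos_of_first_word, [(hpos, content), ...]]
    for vpos, hpos, content in words:
        if lines and vpos - lines[-1][0] <= line_tolerance:
            lines[-1][1].append((hpos, content))
        else:
            lines.append([vpos, [(hpos, content)]])
    return [" ".join(c for _, c in sorted(ws)) for _, ws in lines]


# The "plaza" element is from the post; the other two words are invented.
sample = """<Page>
  <String ID="P56_ST00016" HPOS="367" VPOS="152" WIDTH="76" HEIGHT="35" CONTENT="plaza" WC="0.92" CC="03000"/>
  <String ID="X1" HPOS="300" VPOS="154" WIDTH="50" HEIGHT="35" CONTENT="the"/>
  <String ID="X2" HPOS="300" VPOS="200" WIDTH="60" HEIGHT="35" CONTENT="was"/>
</Page>"""

for line in reconstitute_lines(sample):
    print(line)
```

Grouping by vertical position is a crude heuristic that would struggle with skewed scans or multi-column pages, which is part of why the layout information is worth keeping around.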

Now, UCL has fantastic infrastructure for research computing, but we’re the first project that has used it for the Arts and Humanities (woohooo, that has been an ambition of mine for a few years). Legion, our centrally funded resource for running complex and large computational queries across a large number of cores, is normally used across the sciences, and from a technical point of view it’s not been set up for this type of data, so time has had to be taken to install XML libraries on Legion, given the scientists had no requirements for them previously. Choices follow – what language should we use to query? We’ve chosen Python so we can express queries in a language comprehensible to domain experts. We need to develop efficient mathematical models for querying, to find out how many cores are needed for processing: in the first tests we’ve analysed 1/150th of the corpus across one node of 16 cores, and our early tests show that 100 machines should be able to analyse the whole corpus in a few seconds, but we need to test that further.
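As a back-of-the-envelope illustration of the kind of model involved, the sketch below scales a timing measured on one node up to the whole corpus, assuming perfectly linear scaling with no I/O or scheduling overhead (which real runs will not achieve). The function names and the 2-second sample timing are hypothetical; the only figures taken from the tests described above are the 1/150th sample and the 100 machines.

```python
import math


def estimate_full_seconds(sample_seconds, sample_parts, nodes):
    """Ideal wall-clock time for the whole corpus.

    `sample_parts` is the denominator of the sample fraction, so 150 means
    the timed sample was 1/150th of the corpus on a single node.
    """
    return sample_seconds * sample_parts / nodes


def nodes_needed(sample_seconds, sample_parts, target_seconds):
    """Smallest number of nodes that meets a target time under linear scaling."""
    return math.ceil(sample_seconds * sample_parts / target_seconds)


# Hypothetical: suppose the 1/150th sample took 2 seconds on one node.
print(estimate_full_seconds(2.0, 150, 100))
print(nodes_needed(2.0, 150, 5.0))
```

In practice the estimate is only a lower bound, since reading the data and collecting results rarely parallelise perfectly.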


And here we have a screenshot of the text of the book, reconstituted from the XML. What you are seeing here is a paragraph of a text, and below, we can repaint the exact placement of words on the page (although it looks a bit like a ransom note letter cut from magazines at the moment, but hey, it works!). We’re now placed to be running some one word searches across the corpus, testing efficiency, and proving that we can do global queries across the corpus successfully. We have plenty of choices to make about what information we can and need to report back to those who requested the search, and what format is most useful. We’ve got the data up and running, now comes refinement, and at our next team meeting in a couple of weeks, we should have results from a couple of real queries, keeping note of technical issues we face along the way. We have two other members of the core team: UCLDH’s David Beavan is keeping us on track with the computational linguistics element of the project, given his previous research in this area, and UCL CASA’s Martin Austwick is helping with data visualisation of the search results, drawing on his own background in that field.
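For a flavour of what a one word search might look like, here is a small sketch that counts a word in every book and reports back only the books that contain it. It assumes one zip of ALTO XML pages per book, as described above; the function names, the `.xml` file filter, and the use of a thread pool on a single machine are illustrative assumptions rather than how the queries actually run on Legion.

```python
import string
import xml.etree.ElementTree as ET
import zipfile
from concurrent.futures import ThreadPoolExecutor


def count_word_in_book(zip_path, word):
    """Count case-insensitive occurrences of `word` in one book's zip file."""
    target = word.lower()
    hits = 0
    with zipfile.ZipFile(zip_path) as book:
        for name in book.namelist():
            if not name.endswith(".xml"):
                continue
            root = ET.fromstring(book.read(name))
            for elem in root.iter():
                content = elem.get("CONTENT")
                # Strip surrounding punctuation so "plaza," still matches.
                if content and content.strip(string.punctuation).lower() == target:
                    hits += 1
    return zip_path, hits


def search_corpus(zip_paths, word, workers=8):
    """Search all books concurrently; return {zip_path: hit_count} for matches."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda path: count_word_in_book(path, word), zip_paths)
        return {path: hits for path, hits in results if hits}
```

Threads mainly help while waiting on file reads; since XML parsing is CPU-bound, a real run would more likely spread the books across processes or cluster nodes. What to report back (counts, page references, surrounding text) is exactly the kind of choice still to be made.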

Alongside the technical aspects of getting the data and infrastructure to a stage where we can do what we promised (run searches across all of the books!) we’ve also identified four early career researchers who have detailed queries to do, and we’ll be undertaking two days of workshops in June with them, learning more about their needs, and what we need to do to make this useful for Arts and Humanities researchers. But discussing those searches is for another blogpost! I’ll leave it here with a note on the title of this blogpost: the long title is the formal name of the project, but James H, in the screenshot above, decided to call the project Bluclobber: British Library UCL Open Books something something something (he forgets) … and well, the Twitter handle was free, so it may well stick.

Our next meeting is at the British Library in a few weeks’ time, and we’ll have more to report back then. A final word – we need to pitch for the next round of funding, and if you wanted to vote our idea up the ranks a bit for second round funding, feel free to upvote us here, thanks! We have big hopes for rolling this out to a much, much wider audience in the future…








We turn five!

By Melissa M Terras, on 23 March 2015


Five years ago, we were in the final run up to the official launch of UCL Centre for Digital Humanities, at an event that happened in May 2010. And such a lot has happened since then! We’ve worked on a range of projects: from helping sort out linguistic issues with domain names to working out how best to provide access to UK Census data, from peering into the Slade Archive to looking back at the history of Digital Humanities as a discipline, and from looking at cultural aspects of European identity through large volumes of textual materials to reading archival materials using Handwriting Recognition Technology. These, and more, can be seen over at our projects page. We’ve even won awards for our work on Transcribe Bentham, the Great Parchment Book, and the Day of Digital Humanities (2012). What a five years we have had! Time for a party, don’t you think?

But how shall we celebrate? We wanted to launch something that was forward facing, as well as having a “well done us” event that is backward looking. But what to do?

One of the things our field lacks as a discipline, given it is so new, is named lectures: annual invited series where esteemed academics can have a public platform to speak about their work. So we decided that, for our fifth birthday, we would launch one! And who else to name it after but Professor Susan Hockey? Susan Hockey is the Emeritus Professor of Library and Information Studies at UCL, and was Director of what was then the School of Library, Archive and Information Studies (now the Department of Information Studies) from 2001 to 2004. Before coming to UCL, she held positions at Oxford University, where she is an Emeritus Fellow of St Cross College, at Rutgers University and at the University of Alberta. As chair of the Association for Literary and Linguistic Computing from 1984 to 1997 she founded the journal Literary and Linguistic Computing, now the Journal of Digital Scholarship in the Humanities. She has published widely on text analysis applications, markup, teaching computing in the humanities and the role of libraries in managing digital resources. In 2004 she became the third holder of the Busa Award, a lifetime achievement award for the application of information and communications technologies to humanities research. We are absolutely delighted that, to mark #UCLDH5, Susan Hockey will be giving the inaugural lecture of the annual Susan Hockey Lecture in Digital Humanities.

The lecture, Digital Humanities: Perspectives on Past, Present and Future, will take place on Wednesday 27 May 2015, at 6pm (exactly five years after UCLDH was formally launched!) in the Sir Ambrose Fleming Lecture Theatre, Roberts Building, UCL. Everyone is welcome, but spaces are limited so please do register to attend.

There will be cake! And we hope you like our 5th birthday graphics, done (as ever) in our inimitable house style, by our designer at large, Rudolf Ammann. Can’t have a UCLDH5 party without a logo! We hope to see you there.



PanoptiCam launched

By Rudolf Ammann, on 23 March 2015


PanoptiCam is a new project running a surveillance camera on Jeremy Bentham’s cabinet in the university’s South Cloisters:

Seeing Jeremy Bentham’s auto-icon can evoke a wide array of emotions from surprise and shock to mirth. PanoptiCam captures people’s reaction using a webcam mounted above the auto-icon, with the camera feed posted to our website in real time, and time lapse photography generating days in the life of Jeremy Bentham’s current, yet eternal, viewpoint.

We’ve rounded up a few captures and provided a bit of background on the project logo.

There is a Twitter feed for project news and another for hourly scheduled captures.

PanoptiCam has been noted in a few articles from around the world:

The project is a collaboration between UCL Centre for Advanced Spatial Analysis, UCL Centre for Digital Humanities, UCL Public and Cultural Engagement, and UCL’s Bentham Project.

Audiences & Cultural Experiences in the Digital Age – workshop, Thursday 7 May

By Sarah Davenport, on 13 March 2015

A team of DIS/UCLDH PhD students has been awarded a grant by the Joint Faculty Institute of Graduate Studies to hold a participatory workshop on Audiences and Cultural Experiences in the Digital Age. The workshop will be run in collaboration with PhD students from the City University London and Middlesex University. It will take place on Thursday 7 May and will bring together scholars and students from the broader area of Arts & Humanities and Social Sciences.

Further details and registration:

For more information, please contact Christina Kamposiori directly at​


By Chris J Dillon, on 3 March 2015

Much interest in Universal Acceptance at ICANN‘s recent meeting in Singapore.
Universal Acceptance means getting all domain names to work, however long they are and whatever script they’re in.
Traditionally most Top Level Domains have been ASCII, with three letters for generic names (e.g. .com) and two letters for country codes (e.g. .uk). Now more or less anything goes and there will be extensions such as .versicherung and .삼성 (Samsung). And so the title of this article could be an email address: Chris.Dillon@handphone.Samsung. “Handphone” is Konglish (Korean English) for “mobile phone”.
Incidentally, if your browser does not display the hangul in this article correctly, you have a Universal Acceptance issue.
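For the technically curious, here is a rough sketch of what is going on underneath. A non-ASCII domain is converted to an ASCII-compatible “xn--” form before it goes over the wire, and software that still assumes short ASCII extensions will reject the new names. The helper names and the legacy check are invented for illustration. Python’s built-in `idna` codec implements the older IDNA 2003 rules, so production code would more likely use a library that follows IDNA 2008.

```python
import re

# A deliberately old-fashioned rule: the extension must be 2-3 ASCII letters.
LEGACY_TLD = re.compile(r"^[A-Za-z]{2,3}$")


def passes_legacy_check(domain):
    """Return True if the last label looks like a 'traditional' TLD."""
    return bool(LEGACY_TLD.match(domain.rsplit(".", 1)[-1]))


def to_ascii_domain(domain):
    """Convert each label of a domain to its ASCII-compatible form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(ascii_domain):
    """Convert an ASCII-compatible domain back to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")


def email_for_wire(address):
    """Convert only the domain part of an email address to ASCII form."""
    local, _, domain = address.rpartition("@")
    return f"{local}@{to_ascii_domain(domain)}"


for name in ["example.com", "example.versicherung", "handphone.삼성"]:
    print(name, passes_legacy_check(name), to_ascii_domain(name))
```

The two newer names fail the legacy check even though both are perfectly valid, which is the Universal Acceptance problem in miniature.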

UCLDH5: The First Susan Hockey Lecture in Digital Humanities

By Sarah Davenport, on 25 February 2015

The UCL Centre for Digital Humanities was founded in 2010, and to celebrate the achievements of the centre over the last five years we are launching a named lecture series, The Susan Hockey Lecture in Digital Humanities. We are especially pleased to announce that Professor Susan Hockey will be giving the inaugural lecture.

Digital Humanities: Perspectives on Past, Present and Future

Wednesday 27 May, 6pm

Sir Ambrose Fleming Lecture Theatre, Roberts Building, UCL

In our first installment of the series, Professor Susan Hockey discusses the trajectory of digital humanities from its many years on the fringes to its current position at the centre of the humanities scholarly arena, and its future challenges. Today, conferences, courses and publications in digital humanities abound, and jobs are advertised almost every week. The advent of the World Wide Web shifted emphasis from analytical software to communication and publication tools bringing humanities resources to a much wider audience in classrooms and in the home. The groundwork for many of these new ways of working is in place now. Some time in the future, humanities information sources will be mainly digital. What are the implications of a much broader user community for these resources, and for libraries and archives, the traditional custodians of humanities information sources? How can research in digital humanities contribute to future developments? And what should our students learn in order to build successfully on what has already been achieved?

All welcome. The lecture will be followed by a drinks reception in the Roberts Foyer. Please note that registration is required.


Translation and Transliteration of Contact Information initial report open to public comment

By Chris J Dillon, on 17 December 2014

For the last year I’ve been co-chairing an ICANN working group on Translation and Transliteration of Contact Information,
an issue which will arise once the current ASCII-based Whois directory of domain name contacts is replaced by a system
allowing domain name holders to input data in their own languages and scripts.

Yesterday was the big day, as the WG’s initial report was opened to
public comment and will stay open till February 1.

The UCL Laptop Orchestra (UCLOrk)

By Nicolas E Gold, on 17 November 2014

UCLOrk Speaker Array

At the UCL DigiFest 2014, the Music Systems Engineering Research Team led by Dr Nicolas Gold (UCL Computer Science/UCLDH) ran a 1.5hr session for people interested in digital music. Participants learned about building digital instruments using the Pure Data programming language, explored the sonic possibilities of synthesised sounds, and then came together as a laptop orchestra to perform a new work composed by one of the team.

Laptop orchestras are an emerging type of ensemble (beginning about ten years ago with Princeton’s PLOrk) characterised by novel musical controllers, custom-built digital instruments, and hemispherical speakers that give each instrumentalist a strong sense of their own instrument and how it fits within the orchestra.  In keeping with this tradition, the UCL Laptop Orchestra (UCLOrk) uses custom-built hemispheres that were constructed by the Computer Science Department’s Technical Support Group.

We are hoping to make the ensemble a more permanent group as a platform for artistic work and performance research. Anyone interested in future participation is warmly invited to contact Nicolas Gold for more information.