



Archive for the 'Uncategorized' Category

Hangeul and hanja in domain names

By Chris J Dillon, on 17 June 2015

King Sejong

King Sejong stands in the middle of Sejong St in front of the Sejong Cultural Center in Seoul. Obviously a key historical figure, then. He was behind the creation of the Korean “hangeul” script in the mid-15th century. It was designed scientifically: many consonants are pictures of where they are articulated, and similar consonants are systematically related. So ㄱ, g, is related to ㅋ, k, by the additional dash, and to ㄲ, kk, by duplication. ㄲ, incidentally, represents a tense k sound which may be unique to the Korean language. The vowels are a system of dots and dashes, and so ㅣ is i and ㅏ is a.
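This compositional design is visible in Unicode, too: a precomposed hangeul syllable block decomposes canonically into its component letters (jamo). A small illustration in Python, using only the standard library:

```python
import unicodedata

# A precomposed hangeul syllable decomposes (NFD) into its jamo:
# initial consonant, vowel, and optional final consonant.
syllable = "한"  # "han", as in hangeul
jamo = unicodedata.normalize("NFD", syllable)

for ch in jamo:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# U+1112 HANGUL CHOSEONG HIEUH   (initial ㅎ)
# U+1161 HANGUL JUNGSEONG A      (vowel ㅏ)
# U+11AB HANGUL JONGSEONG NIEUN  (final ㄴ)
```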

I was in Seoul for the ICANN Variant Issues Project meeting on the Chinese, Japanese and Korean Label Generation Rules (i.e. which characters should be allowed in Top Level labels). China made huge progress early on and Japan has caught up. Korea has not used Chinese characters, “hanja”, much for several decades but is keen to be able to use them for business reasons – .現代 “Hyundai” would be a lot more widely understood in East Asia than the hangeul way of writing it: .현대.
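In the DNS itself, a label like 현대 travels as an ASCII-compatible “punycode” form. A minimal sketch using Python’s built-in idna codec (which implements the older IDNA 2003 rules; real registries apply newer rulesets, so this is purely illustrative):

```python
# Encode the hangeul label 현대 (Hyundai) into the ASCII-Compatible
# Encoding ("xn--...") used on the wire, and decode it back.
# Python's built-in "idna" codec is used here for illustration only.
hangeul_label = "현대"

ace = hangeul_label.encode("idna")   # punycode form, prefixed "xn--"
print(ace)

roundtrip = ace.decode("idna")
assert roundtrip == hangeul_label    # lossless round trip
```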

Just before the meeting I had decided to read through the proposed Japanese table and found various things I shan’t trouble you with – I learnt at an early age that linguists’ ideas of “interesting” don’t necessarily correspond with those of non-linguists. Then I found something of rather greater interest: 卍 and its reverse form. Currently there is no mechanism to stop the use of the reverse form in a domain name. However, the matter has been referred for a policy decision. The solution may be that the character is removed from the table, or that an evaluation panel takes on this sort of role.
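At its simplest, a Label Generation Ruleset amounts to a repertoire of permitted code points: a label is valid only if every character appears in the table. The real format (RFC 7940) also handles variants and contextual rules, but the core check can be sketched like this (the tiny repertoire below is invented for illustration):

```python
# Toy sketch of an LGR-style repertoire check. The real rules also cover
# variant mappings and context; this hypothetical allowed-set only shows
# the basic whitelist idea.
allowed = set("日本語現代ひらがなカタカナ")  # invented mini-repertoire

def label_ok(label: str) -> bool:
    """A label passes only if every code point is in the repertoire."""
    return all(ch in allowed for ch in label)

print(label_ok("日本"))    # True: both characters are in the table
print(label_ok("日本卍"))  # False: U+534D is not in the table
```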

I’m back from Seoul keen to do a Korean project and have set up a Korean wiki with a colleague at SOAS: Korean Wiki

If you speak or are learning Korean and are based at UCL, I’d love to hear from you.

Digital Classicist London 2015 seminar series

By Simon Mahony, on 4 June 2015

The Digital Classicist London summer seminar series starts this Friday with a PhD student from UCL Ancient History.

As in previous years, the seminars will be recorded with video, audio and slides made available on the DC seminar pages. See last year’s programme. In addition the video recordings are also uploaded to the Digital Classicist London YouTube channel.

This year the seminars will also be live streamed and the link will be available on the programme page.

Seminar: From lost archives to digital databases

Friday June 5 at 16:30 in room G21A, Senate House, Malet Street, London, WC1E 7HU

Jen Hicks (UCL)
From lost archives to digital databases

Of the leather documents used by the administration and individuals of the Seleukid empire (ca. 312–63 BC), all that remains are the small pieces of clay that were used to seal them; these, however, survive in their tens of thousands in Mesopotamia and the Levant. In this paper I will consider the potential and limitations of using these lumps of mud, through the construction of digital databases and statistical analysis, to reconstruct these lost archives and to understand the imperial structures of Seleukid power.

The full abstract is available on the programme page.
The seminar will be followed by wine and refreshments.


The full 2015 programme is available on the Digital Classicist London seminar page.

5 things we’ve learned about Digital Humanities in the last 5 years

By Melissa M Terras, on 24 May 2015


At the end of May 2015, it will be exactly five years since the formal launch of UCL Centre for Digital Humanities. Our mission is “to champion, catalyse, promote, facilitate, undertake, advise and publicise activities in Digital Humanities (with as wide an interpretation of that phrase as possible) throughout the founding Faculties and UCL, in all areas of teaching, research, enabling, and public engagement”. We’ve covered a phenomenal amount of ground in the past five years, most notably with the establishment of our ground-breaking MA/MSc in Digital Humanities, and the building up of a list of (often prize-winning) research projects with associated funding that runs into the tens of millions of pounds. 5 years! It’s all at once no time at all, and a chance to pause and consider what we’ve achieved in that time, whilst planning ahead for the future. In a fast-paced research environment such as UCL’s, what have we learnt about Digital Humanities itself during that time?

1. DH is about the Digital as much as the Humanities

When UCLDH was founded, we were a research centre in a department (Information Studies) in the UCL Arts and Humanities Faculty. Although we always had encouragement from the UCL Faculty of Engineering, it became quite clear that our remit did, and should, extend well beyond this one Arts and Humanities department, and that our constituency wasn’t just the Arts and Humanities: to do this properly we needed core support and buy-in (not just financial) from the Engineering Sciences at UCL. In the last year we’ve formalized that agreement, and I believe our DH centre is rare in that it reports to, and is supported by, both the Engineering Faculty and the Arts and Humanities at UCL, equally.

This isn’t lip service. We have equal membership across computer science and information studies, and input from, say, Civil and Geomatic Engineering as well as Dutch, and work closely with Medical Physics as well as Information Science at UCL. We’re a bona fide cross-faculty hub now: the place you go to find project partners, to get advice, to start on the interdisciplinary path, and to present work in our (often sold-out) research-in-progress seminars. Our core team members are developing computational projects in the Arts and Humanities that benefit our understanding of human culture and society, as well as bringing gnarly cultural and heritage problems to the computational sciences, developing new approaches and techniques there. This duality is also reflected in our MA/MSc programme: students sign up to a core set of courses, but graduate with either a Master of Arts or a Master of Science, depending on the optional courses and dissertation study we tailor for their background and aspirations. We also have a range of doctoral students operating across this disciplinary divide quite successfully. DH does, and should, operate across the computational sciences and humanities space, and we’re doing something special at UCLDH by being equally present with a foot in both camps.

2. Starting a DH centre? Start local!

As UCLDH has become better known over the past few years, I’ve been asked to give a range of talks about how to set up a Digital Humanities centre, and my main take-home message is this: start local, with a project appropriate and important to your institution, which shows people what DH is in the doing, rather than the telling. One of the first major successes we had with UCLDH (working with the Laws Faculty and the Library, and others) was the Transcribe Bentham crowdsourcing project, which, as well as being a successful research project, received a lot of coverage and has won various prizes. However, it turned out to also be an institutionally strategic move: Jeremy Bentham is such a totemic figure to UCL that people “got” what we were trying to do with Digital Humanities whilst avoiding the “what is Digital Humanities anyway?” conversation. “Oh, you’re the people who do stuff with digitization, and platform building, and playing, and support, and infrastructure, and the research is all that… oh I get it!” Institutional support and goodwill is incredibly important when building such a cross-faculty research centre, and Transcribe Bentham gave us an immediately understandable project on which to hang the “we do that sort of thing” tag. My advice is therefore: start local. What can your institution do that no one else can? What do you have in your collections, your library, your architecture, your vicinity? Build your first DH project around that, so people get what you are trying to do, and the space you inhabit. It means your centre will be trusted with the next mad-cap idea which perhaps isn’t so obvious (such as the Panopticam, showing the world what Bentham sees, which is our latest wheeze…)

3. DH is about supporting the micro as well as the macro

D’you want big infrastructural projects that will change the world? We’ve got ‘em! UCLDH are leading the development and coordination for rolling out non-English character URLs across the whole Internet. We’re leading the building of infrastructure to allow access to historical and modern census datasets in the UK, and supporting the use of the census to study migration issues. We’re involved in large-scale, long-term European projects to provide Handwritten Text Recognition technology for libraries and archives and to explore cultural aspects of European identity using text mining. But building up a capacity for DH within an institution is, aside from these projects which reflect the research aims of the academics involved, also about the small scale, and it is necessary to undertake building-block activities to help boost the research culture in the space of using computing technologies in the humanities across the institution. As well as the big projects, we’re equally proud of the small ones, such as the Slade Archive Project, a pilot project which has explored how to use digital technologies to explore the archive of a world-leading art school at UCL, and support an area – Art History – which has historically avoided using digital research methods. Or the projects which are undertaken by individual scholars, such as Julianne Nyhan’s Hidden Histories project, documenting the early scholars and teams working in Humanities Computing. This gig isn’t all about the biggies; it’s also about supporting and encouraging the research environment, and individual scholars in their research interests. It’s incredibly important to do both, to establish the research centre as a serious concern, whilst also encouraging a change of culture across college that supports and encourages this type of activity.
In that regard, one of the things I’m most proud of doing in the past few years is the creation of the UCL Multi-Modal Digitisation Suite, providing a facility for research in and teaching of digitisation technologies, supporting a range of activities at UCL and resulting in prize winning outputs such as the Great Parchment Book project. Watch this space, there’s more about to come that has resulted from building up that capacity…

4. DH is about digitally walking the digital talk, or, it ain’t what you do, it’s how you do it

When talking to UCLDH’s Designer at Large about organising #UCLDH5, he remarked that I shouldn’t be printing up mugs and producing geegaws and tchotchkes and stuff… we are digital, and that should speak for itself. And it’s true: if we want to be taken seriously in the Digital Humanities space we have to show that we know digital and we do digital and we understand digital and that we are delivering high quality digital products, in the way we promote and present ourselves, and everything that we do. We take design matters very seriously at UCLDH (often contributing to UCL’s webspace along the way – in the internal style guide we are listed as a centre to look to for exemplary web presence, and we’re now involved in helping the institution roll out its next CMS). We like to think we are showing how playful and creative digital design can be. Look at the gorgeous designs for UCLDH5, produced by Rudolf, and see how they hark back to our own culture of design, such as our logos, and in our related facilities and projects… We’re taking our digital identity seriously, as we should. Design and digital identity is an often overlooked aspect of the Digital Humanities – but not at UCLDH.

5. There ain’t no party like a DH partaaaay

Much of the work we do in DH is linking people: people to things, people to institutions, people to funding, publishing, and presentation opportunities. To do so, you need to know enough people – but UCLDH operates in an institution where most people live a couple of hours’ commute away, and there isn’t a culture of hanging around to see what is happening of an evening. Our planned social programme is incredibly important to our success as a centre, as we provide the space for people to hear more about DH, whilst meeting others, and discussing DH projects (such as in our seminar series) or seeing where DH – or humanities, or computing – happens, in our visits to like-minded places around college. Our “Friends of DH” list – people around UCL who want to hear what we are up to – now numbers over 300, and as the community keeps changing, it’s important to keep that broad understanding of the who, the what, and the why around college, so you can pounce on opportunities as they arise. “Oh, you should talk to…” and “Does anyone know anyone who works on X?” are the two most common phrases spoken around here. If you want to be part of a hub, you have to make the hub happen.

With that in mind, we wanted to throw a party for our #UCLDH5 celebrations, but we had a choice to make: either do something to show off our achievements, or do something a bit more nuanced that would look to both the past and where we came from, and plan in the future for a large event every year showcasing the best in DH (not just from UCL). We’re looking forward to establishing the Susan Hockey Lecture in Digital Humanities as an annual event in which to come together to discuss and share ideas about DH, and build up the DH community across London – and beyond – further.

6. With DH, there’s always more than you thought…

Yeah yeah, I said 5 things for 5 years, and here’s a sixth. But that’s the thing with DH: there are always opportunities flying past on the wind, and room for cramming in just one more experimental project, one more meeting, one more research paper. The area is evolving and changing rapidly, and you have to be agile, respond to things as much as possible, and expect the unexpected. Who knew, for example, when we started out with Transcribe Bentham, that we would end up using the crowdsourced transcripts to develop holistic Handwritten Text Recognition (HTR) technology? You gotta roll with the punches. The environment we are working in is much changed from the one in which UCLDH was founded, and as we confidently say we’ve done everything we set out to do 5 years ago – and then some! – we now need to look to the future, and see how UCLDH can contribute to different initiatives on our horizon such as UCL East, the Alan Turing Institute, and new possibilities of collaboration with the UCL Institute of Education. This place, and this subject, doesn’t stand still… and there’s always room for one more thing…

So there we have it. 5 – nay, 6! – things we have learnt about DH at UCLDH in the past 5 years. These hints and tips are very local to us, showing how we work within the institutional context at UCL. But that’s where we are: what are the things you’ve learnt about the Digital Humanities in the last five years, from where you stand? We’d be interested in hearing! Do share, either online, or in person, at #UCLDH5.

With thanks to @profserious for the idea of what to write about when writing about #UCLDH5.

This week: UCL Laptop Orchestra (UCLOrk) at the UCL Festival of the Arts

By ucacngo, on 18 May 2015

The UCL Laptop Orchestra (UCLOrk) is performing this week on Wednesday 20th May at 1pm in the Quad Events Space as part of the UCL Festival of the Arts.  The one-hour lunchtime session will comprise a lecture/recital on the history and practice of laptop orchestras, combined with performances of three pieces written by members of the ensemble.  Tickets are free and available here.

Digital Classicist London seminars 2015

By Simon Mahony, on 12 May 2015


The programme for the Digital Classicist London & Institute of Classical Studies Summer 2015 seminar series is now published.

Meetings are on Fridays at 16:30 in room G21A (except where otherwise specified), Senate House, Malet Street, London WC1E 7HU.

Full listings together with abstracts are available on the Digital Classicist seminar page.

All are welcome and no registration is needed.

Bluclobber, or: Enabling complex analysis of large scale digital collections

By Melissa M Terras, on 7 May 2015

Recently, Jisc announced their Research Data Spring programme, which is providing funding to a variety of pilot projects in order to find “new technical tools, software and service solutions, which will improve researchers’ workflows and the use and management of their data”. We’re delighted that our pitch, ‘Enabling Complex Analysis of Large Scale Digital Collections‘, in collaboration with the British Library Digital Research team and UCL Research Computing, is one of the funded projects. The idea that we pitched is that

The British Library (BL) has numerous digital datasets, but not the processing power for users to run advanced queries against or analyse them. We will use UCL’s world leading Research Computing to open up this digital data, investigating the needs and requirements of a service that will allow researchers to undertake complex searching of the BL’s digital content. [more]

Over a three-month period, we’ll be exploring how to get the BL’s dataset of 65,000 digitised, out-of-copyright books onto UCL’s High Performance Computing facilities, and we’ll then work with a range of researchers from across the Arts and Humanities in running “easy”, then more complex, then really quite tricky searches across this corpus (which represents 4% of the British National Bibliography), both to aid those researchers and to work out how we can help facilitate access to this type of computing for a much wider research audience. James Baker, a Curator of Digital Research, has already blogged an overview of the project. So, a month into a three-month project, how are we getting on?

The first thing, of course, was to get the data across to UCL from the BL. We are just across the road, really, and still, as is the nature with big datasets, it’s easier to move them on physical media than over the network. The first part of the project is really about ingesting and mounting the data and understanding its structure, and we’re very lucky to have James Hetherington from UCL Research Computing on the case. The data itself comprises 224GB of compressed ALTO XML representing 60,000+ 17th-, 18th- and 19th-century books, and one of the interesting features of the data is that each word is expressed on a single line including where it sits on the page (from the top and from the left) in pixels. We are therefore not just dealing with vanilla OCR, but can reconstitute the layout of the page: this means we also have to reconstitute the text before doing any text mining. For example, the word “plaza” in one of the books is encoded thus, with its place on the page:

<String ID="P56_ST00016" HPOS="367" VPOS="152" WIDTH="76" HEIGHT="35" CONTENT="plaza" WC="0.92" CC="03000"/>

– this is exciting, as we can do some interesting things with the placement of text and images further down the line, but first comes the reconstituting, and getting things ready to process. There are other quirks: the data for each individual book is provided in one zip file, which means we’ve got 65,000 of the blighters. We need to do things like restructure the data so we can use parallelism in the reads, and so on.
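The reconstitution step itself is straightforward once the positional attributes are read off each element. A minimal sketch in Python (the project’s chosen language), sorting words top-to-bottom and left-to-right; note that real ALTO files use XML namespaces and nest String elements inside TextLine and TextBlock, which this toy fragment omits:

```python
import xml.etree.ElementTree as ET

# A simplified two-word ALTO-style fragment, invented for illustration.
alto_fragment = """
<Page>
  <String ID="P56_ST00015" HPOS="250" VPOS="152" WIDTH="90" HEIGHT="35" CONTENT="the"/>
  <String ID="P56_ST00016" HPOS="367" VPOS="152" WIDTH="76" HEIGHT="35" CONTENT="plaza"/>
</Page>
"""

page = ET.fromstring(alto_fragment)

# Sort words by vertical then horizontal position to recover reading order.
words = sorted(
    page.iter("String"),
    key=lambda s: (int(s.get("VPOS")), int(s.get("HPOS"))),
)
text = " ".join(w.get("CONTENT") for w in words)
print(text)  # -> "the plaza"
```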

Now, UCL has fantastic infrastructure for research computing, but we’re the first project to use it for the Arts and Humanities (woohoo, that has been an ambition of mine for a few years). Legion, our centrally funded resource for running complex and large computational queries across a large number of cores, is normally used across the sciences, and from a technical point of view it had not been set up for this type of data, so time has had to be taken to install XML libraries on Legion, given the scientists had no requirement for them previously. Choices follow: what language should we use to query? We’ve chosen Python so we can express queries in a language comprehensible to domain experts. We need to develop efficient mathematical models for querying, to find out how many cores are needed for processing: in the first tests we’ve analysed 1/150th of the corpus on one node of 16 cores, and our early tests suggest that 100 machines should be able to analyse the whole corpus in a few seconds, but we need to test that further.
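The querying pattern boils down to a map over the books followed by a reduce over the per-book results. A minimal sketch of that shape (the three-string mini-corpus and the query term are invented; on Legion the map step would be distributed across many cores and nodes rather than local threads):

```python
from concurrent.futures import ThreadPoolExecutor

# Invented mini-corpus standing in for the reconstituted book texts.
books = [
    "the plaza was crowded",
    "no plaza here",
    "plaza plaza plaza",
]

def count_term(text, term="plaza"):
    """Map step: count exact occurrences of the term in one book."""
    return text.split().count(term)

# Run the map step in parallel, then reduce the per-book counts.
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(count_term, books))

print(counts, sum(counts))  # per-book counts, then the corpus total
```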


Here we have a screenshot of the text of a book, reconstituted from the XML. What you are seeing is a paragraph of text and, below, the exact placement of words repainted on the page (it looks a bit like a ransom note cut from magazines at the moment, but hey, it works!). We’re now placed to run some one-word searches across the corpus, testing efficiency and proving that we can do global queries across the corpus successfully. We have plenty of choices to make about what information we can and need to report back to those who requested the search, and what format is most useful. We’ve got the data up and running; now comes refinement, and at our next team meeting in a couple of weeks we should have results from a couple of real queries, keeping note of the technical issues we face along the way. We have two other members of the core team: UCLDH’s David Beavan is keeping us on track with the computational linguistics element of the project, given his previous research background in this area, and UCL CASA‘s Martin Austwick is helping with data visualisation of the search results, given his background in that area.

Alongside the technical aspects of getting the data and infrastructure to a stage where we can do what we promised (run searches across all of the books!), we’ve also identified four early career researchers who have detailed queries to run, and we’ll be undertaking two days of workshops with them in June, learning more about their needs and what we need to do to make this useful for Arts and Humanities researchers. But discussing those searches is for another blogpost! I’ll leave it here with a note on the title of this blogpost: the long title is the formal name of the project, but James H, in the screenshot above, decided to call the project Bluclobber: British Library UCL Open Books something something something (he forgets)… and, well, the Twitter handle was free, so it may well stick.

Our next meeting is at the British Library in a few weeks’ time, and we’ll have more to report back then. A final word: we need to pitch for the next round of funding, and if you want to vote our idea up the ranks a bit for second-round funding, feel free to upvote us here – thanks! We have big hopes for rolling this out to a much, much wider audience in the future…








We turn five!

By Melissa M Terras, on 23 March 2015


Five years ago, we were in the final run-up to the official launch of UCL Centre for Digital Humanities, at an event that happened in May 2010. And such a lot has happened since then! We’ve worked on a range of projects: from helping sort out linguistic issues with domain names to working out how best to provide access to UK Census data, from peering into the Slade Archive to looking back at the history of Digital Humanities as a discipline, and from exploring cultural aspects of European identity through large volumes of textual material to reading archival materials using Handwriting Recognition Technology. These, and more, can be seen over at our projects page. We’ve even won awards for our work on Transcribe Bentham, the Great Parchment Book, and the Day of Digital Humanities (2012). What a five years we have had! Time for a party, don’t you think?

But how shall we celebrate? We wanted to launch something that was forward facing, as well as having a “well done us” event that is backward looking. But what to do?

One of the things our field as a discipline – given it is so new – lacks is named lectures: annual invited series where esteemed academics have a public platform to speak about their work. So we decided that, for our fifth birthday, we would launch one! And who else to name it after but Professor Susan Hockey? Susan Hockey is Emeritus Professor of Library and Information Studies at UCL, and was Director of what was then the School of Library, Archive and Information Studies (now the Department of Information Studies) from 2001 to 2004. Before coming to UCL, she held positions at Oxford University, where she is an Emeritus Fellow of St Cross College, at Rutgers University and at the University of Alberta. As chair of the Association for Literary and Linguistic Computing from 1984 to 1997 she founded the journal Literary and Linguistic Computing, now Digital Scholarship in the Humanities. She has published widely on text analysis applications, markup, teaching computing in the humanities and the role of libraries in managing digital resources. In 2004 she became the third holder of the Busa Award, a lifetime achievement award for the application of information and communications technologies to humanities research. We are absolutely delighted that, to mark #UCLDH5, Susan Hockey will be giving the inaugural lecture of the annual Susan Hockey Lecture in Digital Humanities.

The lecture, Digital Humanities: Perspectives on Past, Present and Future,  will take place on Wednesday 27 May 2015, at 6pm (exactly five years after UCLDH was formally launched!) in the Sir Ambrose Fleming Lecture Theatre, Roberts Building, UCL. Everyone is welcome, but spaces are limited so please do register to attend.

There will be cake! And we hope you like our 5th birthday graphics, done (as ever) in our inimitable house style by our designer at large, Rudolf Ammann. Can’t have a UCLDH5 party without a logo! We hope to see you there.



PanoptiCam launched

By Rudolf Ammann, on 23 March 2015


PanoptiCam is a new project running a surveillance camera on Jeremy Bentham’s cabinet in the university’s South Cloisters:

Seeing Jeremy Bentham’s auto-icon can evoke a wide array of emotions from surprise and shock to mirth. PanoptiCam captures people’s reaction using a webcam mounted above the auto-icon, with the camera feed posted to our website in real time, and time lapse photography generating days in the life of Jeremy Bentham’s current, yet eternal, viewpoint.

We’ve rounded up a few captures and provided a bit of background on the project logo.

There is a Twitter feed for project news and another for hourly scheduled captures.

PanoptiCam has been noted in a few articles from around the world.

The project is a collaboration between UCL Centre for Advanced Spatial Analysis, UCL Centre for Digital Humanities, UCL Public and Cultural Engagement, and UCL’s Bentham Project.

Audiences & Cultural Experiences in the Digital Age – workshop, Thursday 7 May

By Sarah Davenport, on 13 March 2015

A team of DIS/UCLDH PhD students has been awarded a grant by the Joint Faculty Institute of Graduate Studies to hold a participatory workshop on Audiences and Cultural Experiences in the Digital Age. The workshop will be run in collaboration with PhD students from the City University London and Middlesex University. It will take place on Thursday 7 May and will bring together scholars and students from the broader area of Arts & Humanities and Social Sciences.

Further details and registration:  https://www.eventbrite.com/e/audiences-cultural-experiences-in-the-digital-age-tickets-16381350059

For more information, please contact Christina Kamposiori directly at christina.kamposiori.11@ucl.ac.uk.


By Chris J Dillon, on 3 March 2015

Much interest in Universal Acceptance at ICANN‘s recent meeting in Singapore.
Universal Acceptance means getting all domain names to work, however long they are and whatever script they’re in.
Traditionally, most Top Level Domains have been ASCII: three letters for generic names (e.g. .com) and two letters for country codes (e.g. .uk). Now more or less anything goes and there will be extensions such as .versicherung and .삼성 (Samsung). And so the title of this article could be an email address: Chris.Dillon@handphone.Samsung. “Handphone” is Konglish (Korean English) for “mobile phone”.
Incidentally, if your browser does not display the hangeul in this article correctly, you have a Universal Acceptance issue.
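A common source of Universal Acceptance failures is hard-coded assumptions about what a domain looks like. The hypothetical validator below, of the sort still found in many web forms, assumes a top-level domain is 2–3 ASCII letters, so it wrongly rejects perfectly valid modern addresses:

```python
import re

# A naive email pattern of the kind that causes Universal Acceptance
# problems: it assumes the TLD is exactly 2-3 ASCII letters.
naive_tld = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)*\.[A-Za-z]{2,3}")

def naive_ok(address: str) -> bool:
    return naive_tld.fullmatch(address) is not None

print(naive_ok("chris@example.com"))               # accepted
print(naive_ok("Chris.Dillon@handphone.samsung"))  # rejected: TLD too long
print(naive_ok("chris@example.삼성"))               # rejected: non-ASCII TLD
```

Both rejections are Universal Acceptance bugs: .samsung and .삼성 are real top-level domains, and validators should not encode assumptions about TLD length or script.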