Transcribe Bentham

A Participatory Initiative

Archive for the 'Users' Category

Project Update – gamifying the transcription of Bentham’s writings

By uczwlse, on 28 February 2019

In the second volume of his ‘Writings on the Poor Laws’, as part of a discussion of pauper education, Bentham described a game of marbles as ‘little more than a contrivance for killing time’. I’m sure he would have been much more impressed by the inventive digital games devised at our latest event…

Over the weekend of 23-24 February we collaborated with The National Archives to deliver Hacking the Past: An Archives Game Jam. The event was kindly hosted by UCL Innovation and Enterprise in their fantastic Base KX space.

For those not familiar with the term, a Game Jam is an event where people work in teams to design and create video games over a short space of time.

As with our 2017 Bentham Hackathon, which was organised with the technology company IBM, this event was focused on experiments with digital technology and historical documents. But this time, the key objective was the gamification of the task of transcription.

The participants were challenged to:

  • Create an imaginative game that makes the task of transcribing historical documents fun and efficient
  • Base the game on handwritten documents from The National Archives and the Bentham Collection

The idea was to encourage attendees to invent Games With a Purpose (GWAP) that would combine computer processing with human expertise to make transcription as easy, accessible and entertaining as possible.

We provided document images, metadata and transcriptions generated by both human transcribers and machine learning models via the Transkribus platform for Handwritten Text Recognition.

The gamification objective is particularly relevant for Transcribe Bentham because the work undertaken by our volunteers is quite difficult and time-consuming. If transcribing Bentham became more fun, it is likely that more people would take part and our transcription rate would increase. This would contribute significantly to the Bentham Project’s wider mission of publishing the entirety of Bentham’s Collected Works.

On the morning of Saturday 23 February we welcomed about 40 attendees, along with supporting staff from The National Archives and UCL. We were also lucky to have two Transcribe Bentham volunteers (Gill Hague and Annie Brindle) taking part.

Introductory presentations covered the rationale for the challenge, the work of UCL Centre for Digital Humanities and a summary of the available documents from The National Archives and the Bentham Collection (held at UCL Special Collections and The British Library).

Then it was time for the hacking to begin! Participants joined together in teams and spread out across the venue. Fuelled by Coca-Cola, cookies and the occasional piece of fruit, they worked for more than 13 hours to design and develop their games. Staff from The National Archives offered inspiration to the teams with a workshop on palaeography and the Twine storytelling software.

The Game Jam culminated in a showcase where each team presented their game. I was privileged to sit on the judging panel alongside Mark Bell from The National Archives and Rob Miles from The British Library. We had a tough job judging the creative results!

Ultimately, the winning team was The Chase. This team created a game for PC and mobile where players are on the run from the police and have to transcribe documents from The National Archives to solve crimes from the past. The game was a reworking of the 1980s arcade game Frogger, mixed with the TV game show ‘The Chase’. Handwritten words float across the screen and players have to jump onto each word to transcribe it, whilst avoiding the oncoming police officers. The game had difficulty levels, a scoring system and a workflow for using transcripts as data for machine learning. This game was imaginative, useful and fun and captured the attention of all of us judges.

Image credit: Team The Chase

The other teams presented the following inventive games…

Sébastien Brisbois

Participating and presenting live on Skype from Strasbourg, Sébastien designed a game to train people to read Bentham’s handwriting by asking them to transcribe individual letters. He used Phaser software to create real game play based on the transcription of one word.

Image credit: Sébastien Brisbois

Mercy

As a player of this game you become a black cat living in a Victorian prison, transcribing and tagging prisoner records from The National Archives. This team created a WordPress site for their game and platform game play using flowlab and the Transkribus web interface. Players improve automated transcriptions and create useful metadata for the archives.

Image credit: Team Mercy

Aspretto

This team constructed a sophisticated proof-of-concept for a mobile game called ‘Bentham’s Kitchen’. It was described as a mix of the dating app Tinder and the language learning program Duolingo – a curious combination! Players try to escape Bentham’s Panopticon by impressing the prison guard with a delicious meal. They collect ingredients and recipes from Bentham’s Prison Cookbook as rewards for their transcriptions. The app had various difficulty levels that involved validation and correction of automated transcriptions from Transkribus, as well as transcription from scratch.

Image credit: Team Aspretto

Supertechs

This team was a group of young people from a coding club in Essex. Their game ‘Scribe Kingdom’ tasked players with killing evil demons who were holding onto historical documents.  Players collected coins by completing transcriptions and could open up access to mini games like hangman or a crossword. The team worked with p5.js, HTML and JSON.

Team Supertechs presenting their game. Image credit: Louise Seaward

Image credit: Team Supertechs

Annie Brindle

Annie presented her idea of ‘Transcription Ready’, drawing on her experience as a Transcribe Bentham volunteer. In this game transcription was linked to quiz questions drawn from the content of Bentham’s manuscripts, which could be created by both subject experts and the player community.  She also created a stop-motion video of a play dough Bentham envisaged as a little reward for transcribers.

Video credit: Annie Brindle

Jorge Worje

Jorge Worje is a forgetful wizard trying to protect his castle and he needs help to read his spells in order to fight off monsters. This team created a game using Unity where players corrected machine-generated transcriptions of documents from The National Archives, whilst being bombarded with incoming monsters. Survive as long as you can!

Image credit: Team Jorge Worje

Canterbury

This team came up with an inventive game called ‘Trackscribe’, where players received a short and sweet pay-off for their transcription work.  After four words have been transcribed, the player gets the chance to ‘ride’ the loops of an image of a handwritten word, with a bike icon. A demo built in Scratch showed how satisfying this reward could be!

Image credit: Team Canterbury

My fellow judges and I agreed that everyone had put an enormous amount of effort and thought into their games. The Chase team went away with a lot of pride and a swag bag of books and goodies from The National Archives.

The Chase are the winners! Image credit: Louise Seaward

As with our 2017 Hackathon, this event provided a lot of food for thought for the future of Transcribe Bentham. Now that we have migrated our online Transcription Desk to UCL servers, we are in a good position to consider funding applications that will help us make transcribing Bentham more enjoyable and productive.

If you’re interested in playing some of these games, The National Archives have made a showcase of all of the entries:

You can also catch up with the event on Twitter – check out #TranscribeGames

A big thank you goes out to everyone who took part in our first Game Jam.  I would also like to thank my co-organiser, Francesca Mackenzie at The National Archives, as well as the staff from The National Archives and UCL who supported us.

Project update – master Bentham’s handwriting with Transkribus Learn

By uczwlse, on 18 January 2019

Over the years, our volunteers have developed an enviable expertise in deciphering Bentham’s decidedly difficult handwriting.  They can even transcribe pages like this! By contrast, many newcomers are understandably daunted by Bentham’s scrawl – they may start transcribing a page one day but then never return.

A new e-learning website, produced by the University of Innsbruck as part of the READ project, promises to help anyone and everyone get to grips with all kinds of historical handwriting. Transkribus Learn does not replace systematic palaeography training but it allows users to practise reading and transcribing individual words, learning as they go.

To try it out:

Transkribus Learn has two transcription modes – ‘Study’ and ‘Test’.

In the former, users can guess and then reveal the transcription of individual words in a manuscript.  In the latter, users will be prompted to transcribe the missing word in a series of examples. At the end, you receive your score and a list of correct and incorrect answers. You can keep studying and testing yourself, as often as you like.

There are two Bentham collections on the site – categorised as ‘easier Bentham’ (containing writing by Bentham and his secretaries) and ‘difficult Bentham’. Both collections are an ideal training ground for new volunteers, offering the opportunity to practise transcribing different words in rapid succession.

We recommend that new volunteers start with the ‘easier Bentham’ and move onto the more difficult pages once they feel ready.  I hope some of our long-standing volunteers might also have a play and challenge themselves to read some of Bentham’s nastiest handwriting!

What does that say?? Practise transcribing Bentham and more with Transkribus Learn

Access the Bentham material in Transkribus Learn:

As one of the partners in the READ project, the Bentham Project helped to develop Transkribus Learn. But there’s so much more than Bentham to discover on the site. The site currently contains scripts from the 12th to the 19th centuries in a range of languages. Users can also upload their own documents to the platform as a training exercise for students or volunteers.

The Transkribus team look forward to helping a broad range of people learn valuable new transcription skills!  They welcome any feedback or questions (learn@transkribus.eu).

Project Update – Searching Bentham’s manuscripts with Keyword Spotting!

By uczwlse, on 15 October 2018

The Bentham Project has been experimenting with the Handwritten Text Recognition (HTR) of Bentham’s manuscripts for the past five years, first as a partner in the tranScriptorium project and now as part of READ.

Read about our progress with HTR and the Transkribus platform in blog posts from June 2017 and February 2018.

Keyword Spotting

Our results have thus far been impressive, especially considering the immense difficulty of Bentham’s own handwriting.  But automated transcription is not yet at a point where it is sufficiently accurate to be used by Bentham Project researchers as a basis for scholarly editing.  It would be too time-consuming (and probably too irritating!) for us to correct the errors in the computer-generated transcripts of papers written in Bentham’s hand.

However, the current state of the technology is strong enough for keyword searching!  And thanks to a collaboration with the PRHLT research center at the Universitat Politècnica de València (another partner in the READ project) we have some exciting new results to report.  It is now possible to search over 90,000 digital images of the central collections of Bentham’s manuscripts, which are held at UCL Special Collections and The British Library.

A Keyword Spotting search for the word ‘pleasure’

Appeal for volunteers!

I have prepared a Google sheet with some suggested search terms in 5 different spreadsheet tabs (Bentham’s neologisms, concepts, people, places and other).

It would be fantastic if people filled in the spreadsheet to record some of their searches, using my suggested search terms and some of their own.  Transcribers could search for subjects they are interested in and then cross-reference to material on the Transcription Desk that they might like to transcribe.

Who knows what we might find??  I hope to share some of these results in my upcoming presentation at the Transkribus User Conference in November 2018.  Thanks in advance for your participation.

Background

The PRHLT team have processed the Bentham papers with cutting-edge HTR and probabilistic word indexing technologies.  This sophisticated form of searching is often called Keyword Spotting.  It is more powerful than a conventional full-text search because it uses statistical models trained for text recognition to search through probability values assigned to character sequences (words), considering most possible readings of each word on a page.
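As a rough illustration of this idea, and not the PRHLT team's actual implementation, a probabilistic index can be pictured as a table in which each word image keeps several candidate readings, each with a probability. In the sketch below the page identifiers, candidate words, probabilities, threshold and the `spot` function are all invented for illustration.

```python
from typing import List, NamedTuple


class Spot(NamedTuple):
    page: str           # hypothetical page identifier
    position: int       # index of the word image on that page
    probability: float  # probability that the image reads as the keyword


# Hypothetical index: for each word image, several candidate readings
# with the probability a recognition model might assign to each one.
INDEX = {
    ("page_001", 0): {"pleasure": 0.82, "measure": 0.11, "treasure": 0.04},
    ("page_001", 1): {"and": 0.97},
    ("page_002", 0): {"pleasant": 0.40, "pleasure": 0.35, "pleasing": 0.15},
    ("page_003", 0): {"pain": 0.91, "gain": 0.06},
}


def spot(keyword: str, threshold: float = 0.3) -> List[Spot]:
    """Return word images whose probability of reading `keyword` meets the threshold."""
    keyword = keyword.lower()
    hits = [
        Spot(page, position, readings[keyword])
        for (page, position), readings in INDEX.items()
        if readings.get(keyword, 0.0) >= threshold
    ]
    # Most confident spots first.
    return sorted(hits, key=lambda s: s.probability, reverse=True)


for hit in spot("pleasure"):
    print(hit)
```

In this toy index the best single reading of the word on `page_002` is ‘pleasant’, so a full-text search over a one-best transcript would miss it. Because the index keeps the alternative reading ‘pleasure’ with its own probability, the search can still surface that page, and lowering or raising the threshold trades more finds against more false hits.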

We delivered thousands of images and transcripts to the team in Valencia and gave them access to the data we had already used to train HTR models in Transkribus.  After cleaning our data and using Transkribus technology to divide the images into lines, the team in Valencia trained neural network algorithms to recognise and index the collection.

The result is that this vast collection of Bentham’s papers can be efficiently searched, including those papers that have not yet been transcribed!  The accuracy rates are impressive.  The spots suggest around 84-94% accuracy (6-16% Character Error Rate) when compared with manual transcriptions of Bentham’s manuscripts.  More precisely, laboratory tests show that the average word search precision ranges from 79% to 94%.  This means that, out of 100 average search results, as few as 6 and at most around 21 may fail to be the word searched for.  The accuracy of spotted words depends on the difficulty of Bentham’s handwriting – although it is possible to find useful results in Bentham’s scrawl!  There could be as many as 25 million words waiting to be found.
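The arithmetic behind these figures can be checked with a short sketch. It rests on two simplifying assumptions: that accuracy is taken as 100 minus the Character Error Rate, and that precision is the share of returned results that really are the word searched for. The function names are illustrative only.

```python
def accuracy_from_cer(cer_percent: float) -> float:
    """Character accuracy, assumed here to be 100 minus the Character Error Rate."""
    return 100.0 - cer_percent


def expected_false_hits(precision_percent: float, results: int = 100) -> float:
    """Number of results expected NOT to be the word searched for."""
    return results * (100.0 - precision_percent) / 100.0


# Figures quoted in the post: 6-16% CER and 79-94% search precision.
for cer in (6, 16):
    print(f"CER {cer}% -> accuracy {accuracy_from_cer(cer):.0f}%")

for precision in (94, 79):
    wrong = expected_false_hits(precision)
    print(f"precision {precision}% -> about {wrong:.0f} wrong results per 100")
```

On these assumptions, the best case of 94% precision corresponds to about 6 wrong results in every 100, and the worst case of 79% to about 21.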

Use cases

This fantastic site will be invaluable to anyone interested in Bentham’s philosophy.  It will help Bentham Project researchers to find previously unknown references in pages that have not yet been transcribed.  It will allow researchers to quickly investigate Bentham’s concepts and correspondents.  I hope that it will also help volunteer transcribers to find interesting material.

This interface is a prototype beta version.  In the future we want to increase the power of this research tool by connecting it to other digital resources, allowing users to quickly search the manuscripts at the UCL library repository, the Bentham papers database and the Transcription Desk and linking these images to our rich existing metadata.

Similar Keyword Spotting technology (based on research by the CITlab team at the University of Rostock, another one of the READ project partners) is currently available to all users of the Transkribus platform.  Find out more at the READ project website.

I welcome any feedback on our new search functionality at: transcribe.bentham@ucl.ac.uk

My thanks go to the PRHLT research center, the University of Innsbruck and Chris Riley, Transcription Assistant at the Bentham Project, for their support and assistance.

The Transcription Desk is open again! Migration completed successfully.

By uczwlse, on 10 October 2018

I am pleased to announce that the Transcribe Bentham Transcription Desk is open for business once more.

You can now access the site at a new web address: http://transcribe-bentham.ucl.ac.uk/

Please update your bookmarks with this new link!

The old site will have a redirect in place to direct users to the new site.

Huge thanks go to Tom Couch and his team at UCL Research IT Services, who have successfully migrated the platform onto a new server at UCL and made various updates and fixes to the MediaWiki.  We also thank Co-Sector, University of London for supporting the platform since 2010.

For our volunteers, the site will hopefully not have changed much.  You can log in and continue transcribing as normal.  All previous edits should have been preserved.

Important changes:

One noticeable difference is that the JavaScript viewer is now the default mode of viewing images on the platform.  The Flash viewer is due to become obsolete by 2020 and is already blocked by default on most web browsers.  At the moment, it is still possible to view the manuscripts in the Flash viewer by clicking the options at the bottom left of an image.

Known issues:

Please bear in mind there are a few known issues with the new site.  We are working to fix the following bugs as soon as possible:

  • Preview option when transcribing

The option to view a ‘clean’ version of your transcript, without visible TEI tags, is currently broken.

  • Emails from TB_Editor

Any email notifications from TB_Editor are currently marked as being sent by an account called ‘ccea038’.  This is an admin account which is linked to the server.

I would like to thank our volunteers for their continued patience.  Good luck exploring the new site and please get in touch if you have any feedback, issues or questions (transcribe.bentham@ucl.ac.uk).

Now that the site has been migrated, we can start to daydream about further improvements that we hope to make to the platform in the future too!

Update on migration of Transcription Desk website

By uczwlse, on 24 September 2018

Thanks to the hard work of UCL Research IT Services and Co-Sector, University of London, we are almost ready to migrate the Transcribe Bentham Transcription Desk to a new server at UCL.

The planned date of the migration is 9 October 2018. 

I hope that the migration process will not affect volunteers too much.  There will be a day or two of downtime, after which point volunteers will be able to continue transcribing as normal.

UCL Research IT Services will update the MediaWiki and eliminate any bugs.  Having the platform hosted at UCL should also put us in a better position to make further technical improvements to it in the future.

Migration Timeline

8 October

  • Volunteers can carry on working up until midnight on 8 October.
  • All edits made by volunteers up until the end of 8 October will be preserved.

9 October

  • The site will go into maintenance mode.  There will be a notice to this effect on the current site.
  • Data will be migrated and tests carried out.
  • Volunteers should not make any edits to the current or new site at this time – any changes made after 8 October will not be saved.

10 October

I am very much looking forward to working with you all on the updated site.  If you have any questions about the migration, please get in touch.