Archive for the 'Transcription' Category

Progress update, 18 to 24 July 2015

By Tim Causer, on 24 July 2015

Welcome to the progress update for the period 18 to 24 July 2015, during which time further excellent progress has been made by Transcribe Bentham volunteers.

13,392 manuscripts have now been transcribed or partially-transcribed, which is an increase of 41 on this time last week. Of these transcripts, 12,532 (93%) have been checked and approved by Transcribe Bentham staff. (This week, we had a look through the list of partially-transcribed manuscripts and were able to check and approve 174 of them. If you would like to work on a partially-complete transcript, please do consult the big list.)

The more detailed state of progress is as follows:

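| Box | No. of manuscripts worked on | No. of manuscripts in box | Completion |
|---|---|---|---|
| Box 1 | 254 | 794 | 32% |
| Box 2 | 474 | 753 | 62% |
| Box 4 | 2 | 694 | 1% |
| Box 5 | 200 | 290 | 68% |
| Box 7 | 3 | 167 | 1% |
| Box 8 | 6 | 284 | 2% |
| Box 9 | 41 | 266 | 15% |
| Box 15 | 79 | 914 | 10% |
| Box 18 | 4 | 192 | 2% |
| Box 27 | 350 | 350 | COMPLETE |
| Box 29 | 22 | 122 | 18% |
| Box 30 | 2 | 193 | 1% |
| Box 31 | 19 | 302 | 6% |
| Box 34 | 40 | 398 | 10% |
| Box 35 | 286 | 439 | 65% |
| Box 36 | 34 | 418 | 8% |
| Box 37 | 32 | 487 | 6% |
| Box 38 | 125 | 424 | 30% |
| Box 39 | 12 | 282 | 4% |
| Box 41 | 87 | 528 | 14% |
| Box 42 | 89 | 910 | 9% |
| Box 44 | 53 | 201 | 26% |
| Box 50 | 171 | 198 | 85% |
| Box 51 | 385 | 940 | 40% |
| Box 57 | 18 | 420 | 4% |
| Box 62 | 72 | 565 | 12% |
| Box 63 | 155 | 345 | 44% |
| Box 70 | 306 | 350 | 87% |
| Box 71 | 663 | 663 | COMPLETE |
| Box 72 | 614 | 664 | 92% |
| Box 73 | 151 | 151 | COMPLETE |
| Box 79 | 199 | 199 | COMPLETE |
| Box 95 | 126 | 147 | 85% |
| Box 96 | 534 | 539 | 99% |
| Box 97 | 139 | 296 | 46% |
| Box 98 | 224 | 499 | 44% |
| Box 100 | 194 | 422 | 43% |
| Box 106 | 132 | 581 | 22% |
| Box 107 | 502 | 538 | 93% |
| Box 110 | 7 | 671 | 1% |
| Box 115 | 276 | 307 | 89% |
| Box 116 | 507 | 864 | 58% |
| Box 117 | 365 | 853 | 42% |
| Box 118 | 255 | 880 | 28% |
| Box 119 | 535 | 990 | 54% |
| Box 120 | 29 | 686 | 4% |
| Box 121 | 134 | 526 | 24% |
| Box 122 | 304 | 717 | 41% |
| Box 123 | 42 | 443 | 9% |
| Box 124 | 15 | 383 | 3% |
| Box 139 | 40 | 40 | COMPLETE |
| Box 141 | 4 | 381 | 1% |
| Box 149 | 5 | 581 | 1% |
| Box 150 | 972 | 972 | COMPLETE |
| Box 169 | 193 | 728 | 26% |
| Add MS 537 | 730 | 744 | 98% |
| Add MS 538 | 774 | 858 | 90% |
| Add MS 539 | 834 | 948 | 87% |
| Add MS 540 | 176 | 1012 | 17% |
| Add MS 541 | 396 | 1258 | 31% |
| Overall | 13,392 | 31,560 | 42% |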

The big news this week is that the transcription of Box 150 is now complete! Since the turn of the year, volunteers have transcribed around 500 manuscripts from this box alone, assisting our colleague Dr Michael Quinn in producing the next volume of Bentham’s Collected Works, and making some terrific discoveries along the way. Expect a round-up update in due course from Dr Quinn about these manuscripts.

We hope, in the coming days, to add fresh material to the website for transcription, and will let you know once it’s up and running.

And finally, Bentham made an appearance on The Guardian website yesterday, in an article which asks “What does the panopticon mean in the age of digital surveillance?”

Thank you, as always, to everyone who has contributed to Transcribe Bentham during the last seven days. It remains as greatly appreciated as ever by us all.

Progress update, 11 to 17 July 2015

By Tim Causer, on 17 July 2015

Welcome to the progress update for the period 11 to 17 July 2015, during which time further steady progress has been made by Transcribe Bentham volunteers.

13,351 manuscripts have now been transcribed or partially transcribed, an increase of 28 on this time last week. Of these transcripts, 12,358 (92%) have been checked and approved by Transcribe Bentham staff. A substantial number of partially-transcribed manuscripts have been submitted, so once we have been through these the approval percentage should increase somewhat.

The more detailed state of progress is as follows:

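| Box | No. of manuscripts worked on | No. of manuscripts in box | Completion |
|---|---|---|---|
| Box 1 | 254 | 794 | 32% |
| Box 2 | 474 | 753 | 62% |
| Box 4 | 2 | 694 | 1% |
| Box 5 | 200 | 290 | 68% |
| Box 7 | 3 | 167 | 1% |
| Box 8 | 6 | 284 | 2% |
| Box 9 | 41 | 266 | 15% |
| Box 15 | 79 | 914 | 10% |
| Box 18 | 4 | 192 | 2% |
| Box 27 | 350 | 350 | COMPLETE |
| Box 29 | 22 | 122 | 18% |
| Box 30 | 2 | 193 | 1% |
| Box 31 | 19 | 302 | 6% |
| Box 34 | 40 | 398 | 10% |
| Box 35 | 286 | 439 | 65% |
| Box 36 | 34 | 418 | 8% |
| Box 37 | 32 | 487 | 6% |
| Box 38 | 125 | 424 | 30% |
| Box 39 | 12 | 282 | 4% |
| Box 41 | 87 | 528 | 14% |
| Box 42 | 89 | 910 | 9% |
| Box 44 | 53 | 201 | 26% |
| Box 50 | 171 | 198 | 85% |
| Box 51 | 385 | 940 | 40% |
| Box 57 | 18 | 420 | 4% |
| Box 62 | 72 | 565 | 12% |
| Box 63 | 155 | 345 | 44% |
| Box 70 | 306 | 350 | 87% |
| Box 71 | 663 | 663 | COMPLETE |
| Box 72 | 614 | 664 | 92% |
| Box 73 | 151 | 151 | COMPLETE |
| Box 79 | 199 | 199 | COMPLETE |
| Box 95 | 126 | 147 | 85% |
| Box 96 | 534 | 539 | 99% |
| Box 97 | 139 | 296 | 46% |
| Box 98 | 224 | 499 | 44% |
| Box 100 | 194 | 422 | 43% |
| Box 106 | 130 | 581 | 22% |
| Box 107 | 502 | 538 | 93% |
| Box 110 | 7 | 671 | 1% |
| Box 115 | 276 | 307 | 89% |
| Box 116 | 507 | 864 | 58% |
| Box 117 | 365 | 853 | 42% |
| Box 118 | 255 | 880 | 28% |
| Box 119 | 535 | 990 | 54% |
| Box 120 | 29 | 686 | 4% |
| Box 121 | 134 | 526 | 24% |
| Box 122 | 304 | 717 | 41% |
| Box 123 | 42 | 443 | 9% |
| Box 124 | 15 | 383 | 3% |
| Box 139 | 40 | 40 | COMPLETE |
| Box 141 | 4 | 381 | 1% |
| Box 149 | 5 | 581 | 1% |
| Box 150 | 961 | 972 | 98% |
| Box 169 | 193 | 728 | 26% |
| Add MS 537 | 730 | 744 | 98% |
| Add MS 538 | 774 | 858 | 90% |
| Add MS 539 | 834 | 948 | 87% |
| Add MS 540 | 153 | 1012 | 15% |
| Add MS 541 | 391 | 1258 | 31% |
| Overall | 13,351 | 31,560 | 42% |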

Thank you, as always, to everyone who has contributed to Transcribe Bentham during the last seven days. It remains as greatly appreciated as ever by all of us here at Bentham HQ.

Progress update, 4 to 10 July 2015

By Tim Causer, on 10 July 2015

Welcome along to the progress update for the period 4 to 10 July 2015, during which time further steady progress has been made by Transcribe Bentham volunteers.

13,323 manuscripts have now been transcribed or partially-transcribed, which is an increase of 37 on this time last week. Of these transcripts, 12,346 (92%) have been checked and approved by TB staff.

The more detailed state of progress is as follows:

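| Box | No. of manuscripts worked on | No. of manuscripts in box | Completion |
|---|---|---|---|
| Box 1 | 254 | 794 | 32% |
| Box 2 | 472 | 753 | 62% |
| Box 4 | 2 | 694 | 1% |
| Box 5 | 200 | 290 | 68% |
| Box 7 | 3 | 167 | 1% |
| Box 8 | 6 | 284 | 2% |
| Box 9 | 41 | 266 | 15% |
| Box 15 | 79 | 914 | 10% |
| Box 18 | 4 | 192 | 2% |
| Box 27 | 350 | 350 | COMPLETE |
| Box 29 | 22 | 122 | 18% |
| Box 30 | 2 | 193 | 1% |
| Box 31 | 19 | 302 | 6% |
| Box 34 | 40 | 398 | 10% |
| Box 35 | 286 | 439 | 65% |
| Box 36 | 34 | 418 | 8% |
| Box 37 | 32 | 487 | 6% |
| Box 38 | 125 | 424 | 30% |
| Box 39 | 12 | 282 | 4% |
| Box 41 | 87 | 528 | 14% |
| Box 42 | 89 | 910 | 9% |
| Box 44 | 53 | 201 | 26% |
| Box 50 | 171 | 198 | 85% |
| Box 51 | 385 | 940 | 40% |
| Box 57 | 18 | 420 | 4% |
| Box 62 | 72 | 565 | 12% |
| Box 63 | 155 | 345 | 44% |
| Box 70 | 306 | 350 | 87% |
| Box 71 | 663 | 663 | COMPLETE |
| Box 72 | 614 | 664 | 92% |
| Box 73 | 151 | 151 | COMPLETE |
| Box 79 | 199 | 199 | COMPLETE |
| Box 95 | 126 | 147 | 85% |
| Box 96 | 534 | 539 | 99% |
| Box 97 | 139 | 296 | 46% |
| Box 98 | 224 | 499 | 44% |
| Box 100 | 194 | 422 | 43% |
| Box 106 | 126 | 581 | 21% |
| Box 107 | 502 | 538 | 93% |
| Box 110 | 7 | 671 | 1% |
| Box 115 | 276 | 307 | 89% |
| Box 116 | 507 | 864 | 58% |
| Box 117 | 365 | 853 | 42% |
| Box 118 | 255 | 880 | 28% |
| Box 119 | 535 | 990 | 54% |
| Box 120 | 29 | 686 | 4% |
| Box 121 | 134 | 526 | 24% |
| Box 122 | 304 | 717 | 41% |
| Box 123 | 42 | 443 | 9% |
| Box 124 | 15 | 383 | 3% |
| Box 139 | 40 | 40 | COMPLETE |
| Box 141 | 4 | 381 | 1% |
| Box 149 | 5 | 581 | 1% |
| Box 150 | 961 | 972 | 98% |
| Box 169 | 193 | 728 | 26% |
| Add MS 537 | 730 | 744 | 98% |
| Add MS 538 | 774 | 858 | 90% |
| Add MS 539 | 834 | 948 | 87% |
| Add MS 540 | 137 | 1012 | 13% |
| Add MS 541 | 385 | 1258 | 30% |
| Overall | 13,323 | 31,560 | 42% |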

Thank you, as always, to everyone who has so generously donated their time to Transcribe Bentham during the last seven days. It is enormously appreciated by us all.

 

Transcribing Bentham … with the help of a machine?

By Tim Causer, on 9 July 2015

Since the start of 2013 we – we being the Bentham Project, the UCL Centre for Digital Humanities, and the University of London Computer Centre (ULCC) – have participated in the exciting EC-funded tranScriptorium project with some fantastic colleagues from around Europe. The tranScriptorium consortium’s key aim is to develop and advance handwritten text recognition (HTR) technology which can index, search, and transcribe historic manuscript images.

When we were invited to take part in the project way back in 2012, the idea that a machine might be able to automatically transcribe a Bentham manuscript seemed, to us – knowing very little about HTR – a little fanciful. And yet here we are, with experiments carried out by our colleagues showing transcription word error rates of between 8% and 15% for some manuscripts. These results are admittedly for some of the less complex Bentham papers, but they are remarkable nevertheless.
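For readers unfamiliar with the term, a word error rate is commonly calculated as the number of word-level corrections (substitutions, insertions, and deletions) needed to turn the machine's transcript into the correct one, divided by the number of words in the correct transcript. The short Python sketch below illustrates that common definition only; we are assuming it here for the sake of explanation, and it is not a description of how our colleagues' experiments were actually scored. The function name and the sample sentences are made up for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: assumes whitespace-separated words and exact matching,
    with no normalisation of punctuation or capitalisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match, or one word substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a correct transcript and an imagined HTR output.
correct = "the greatest happiness of the greatest number"
machine = "the greatest happyness of the number"
print(f"{word_error_rate(correct, machine):.0%}")
```

In this made-up example the machine has misread one word and missed another, so two corrections are needed against seven words, giving a rate of 2/7, or about 29%.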

Our main task in tranScriptorium is to bring this technology to users, by developing and testing a crowdsourced transcription platform, known as TSX. This platform, put together by ULCC, incorporates the HTR technology and puts the user firmly in control of it, letting you take part in three interconnected ways depending upon your level of experience, preference, and/or amount of available free time.

In the first instance you can transcribe and encode a manuscript as you would do in Transcribe Bentham, while taking advantage of useful features such as the segmentation of manuscript images into lines, and colour-coded TEI mark-up, which makes it more straightforward to distinguish the mark-up from the text of your transcript.

TSX: transcription and encoding

If you are new to transcription or don’t have much time to spare, you might like to request from the HTR engine a full transcript of a given manuscript. As it is unlikely that any HTR transcript will ever be entirely right, you can then correct it against the image.

TSX: HTR transcript correction

Finally, and perhaps most excitingly, there is the facility to request from the HTR engine suggestions for certain words. There is nothing more frustrating when transcribing than to come across a word (or several words!) which you can’t decipher, and then to lose the flow and context of what follows. Being able to ask the HTR engine for possible suggestions could help fill in at least some of these gaps and reduce the level of frustration when trying to decipher just what Bentham wrote.

TSX: word suggestions

If you would like to try out transcribing Bentham manuscripts using TSX and its HTR technology, please do visit the website. (TSX currently works best running in Chrome on Windows 7 and upwards. There are known issues with it running on MacOS – please do bear with us.)

We must stress that this HTR technology could never replace volunteer transcribers or their superb work, but it may help to make their lives a little easier. Future development of the platform will include improving the way in which word suggestions are delivered to users, adding a user page to keep track of your contributions, and introducing a what-you-see-is-what-you-get transcription interface in which the TEI mark-up is hidden from view, leaving you to concentrate on transcription alone. A forthcoming follow-on project to tranScriptorium also promises many more exciting developments, but that’s for another day…

We are very excited about the potential for HTR in widening access to digitised manuscript material, and hope that it – and the Transkribus infrastructure and tool developed by our colleagues at the University of Innsbruck, on which TSX runs – will support scholars and institutions in establishing their own crowdsourcing initiatives.

We hope in the meantime that you enjoy trying out TSX. Please do let us know if you have any feedback on it (good or bad!).

Progress update, 27 June to 3 July 2015

By Tim Causer, on 3 July 2015

Welcome to the progress update for the period 27 June to 3 July 2015, during which time further steady progress has been made by Transcribe Bentham volunteers.

13,286 manuscripts have now been transcribed or partially-transcribed, which is an increase of 31 on this time last week. Of these transcripts, 12,266 (92%) have now been checked and approved by TB staff.

The more detailed state of progress is as follows:

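| Box | No. of manuscripts worked on | No. of manuscripts in box | Completion |
|---|---|---|---|
| Box 1 | 254 | 794 | 32% |
| Box 2 | 472 | 753 | 62% |
| Box 4 | 2 | 694 | 1% |
| Box 5 | 200 | 290 | 68% |
| Box 7 | 3 | 167 | 1% |
| Box 8 | 6 | 284 | 2% |
| Box 9 | 41 | 266 | 15% |
| Box 15 | 79 | 914 | 10% |
| Box 18 | 4 | 192 | 2% |
| Box 27 | 350 | 350 | COMPLETE |
| Box 29 | 22 | 122 | 18% |
| Box 30 | 2 | 193 | 1% |
| Box 31 | 19 | 302 | 6% |
| Box 34 | 40 | 398 | 10% |
| Box 35 | 286 | 439 | 65% |
| Box 36 | 34 | 418 | 8% |
| Box 37 | 32 | 487 | 6% |
| Box 38 | 125 | 424 | 30% |
| Box 39 | 12 | 282 | 4% |
| Box 41 | 87 | 528 | 14% |
| Box 42 | 89 | 910 | 9% |
| Box 44 | 53 | 201 | 26% |
| Box 50 | 171 | 198 | 85% |
| Box 51 | 385 | 940 | 40% |
| Box 57 | 18 | 420 | 4% |
| Box 62 | 72 | 565 | 12% |
| Box 63 | 155 | 345 | 44% |
| Box 70 | 306 | 350 | 87% |
| Box 71 | 663 | 663 | COMPLETE |
| Box 72 | 614 | 664 | 92% |
| Box 73 | 151 | 151 | COMPLETE |
| Box 79 | 199 | 199 | COMPLETE |
| Box 95 | 126 | 147 | 85% |
| Box 96 | 534 | 539 | 99% |
| Box 97 | 139 | 296 | 46% |
| Box 98 | 224 | 499 | 44% |
| Box 100 | 194 | 422 | 43% |
| Box 106 | 123 | 581 | 21% |
| Box 107 | 502 | 538 | 93% |
| Box 110 | 7 | 671 | 1% |
| Box 115 | 276 | 307 | 89% |
| Box 116 | 506 | 864 | 58% |
| Box 117 | 365 | 853 | 42% |
| Box 118 | 255 | 880 | 28% |
| Box 119 | 535 | 990 | 54% |
| Box 120 | 29 | 686 | 4% |
| Box 121 | 134 | 526 | 24% |
| Box 122 | 304 | 717 | 41% |
| Box 123 | 42 | 443 | 9% |
| Box 124 | 15 | 383 | 3% |
| Box 139 | 40 | 40 | COMPLETE |
| Box 141 | 4 | 381 | 1% |
| Box 149 | 5 | 581 | 1% |
| Box 150 | 960 | 972 | 98% |
| Box 169 | 193 | 728 | 26% |
| Add MS 537 | 730 | 744 | 98% |
| Add MS 538 | 774 | 858 | 90% |
| Add MS 539 | 834 | 948 | 87% |
| Add MS 540 | 120 | 1012 | 11% |
| Add MS 541 | 370 | 1258 | 29% |
| Overall | 13,286 | 31,560 | 42% |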

Box 150 is now pretty much complete! Thanks to everyone who has transcribed something from that set of manuscripts.

Thank you, as always, to everyone who has contributed to Transcribe Bentham during the last seven days. It remains as greatly appreciated as ever by all of us here at Bentham HQ.