Archive for the 'Transcription' Category

Guest post: a placement student’s experience

By Tim Causer, on 1 June 2015

Below is a guest post from Milandeep Singh, a UCL Digital Humanities MSc student, who has been working with us during the past three weeks. It has been a pleasure to host Milandeep, and here she describes what she has been up to during her time with us.

————————————————————————–

Hello and welcome to a personal update on behind-the-scenes work at Transcribe Bentham. As a current MSc Digital Humanities placement student, I’ve had the honour of being supervised by Dr Tim Causer while working on this amazing project. I cannot believe that this is already the third week of my placement! I have had the privilege of digitising Bentham manuscripts, processing them, transcribing them, using the latest Handwritten Text Recognition and Transkribus software, proof-checking transcribers’ submissions, and encoding these in XML, so it has been a fairly steep learning curve.

A special thanks to all the Transcribe Bentham project volunteers for the steady progress made on the project. Thankfully, a lot more manuscripts have been transcribed and checked since I’ve had the pleasure to work on this wonderful initiative.

One of the most daunting tasks I encountered was transcribing Bentham’s manuscripts from Box 150 on the first day. I had no previous experience of working with manuscripts, and learning that Bentham had written 60,000 manuscripts was a real surprise! It is unbelievable that a busy, incredibly talented individual like Bentham would have the time to write an estimated 30,000,000 words! Between us, reading Bentham’s handwriting, and coping with his newly invented words, spelling errors and blotches of ink, is indeed a major skill in itself.

However, Bentham’s ideas are fascinating, and the phenomenal effort volunteer transcribers display on a daily basis is highly commendable and evident in the quantity and quality of contributions every week. This consistent, high rate of transcription is testament to the immense amount of work the volunteers have put in. The most heavily transcribed boxes of material during the three weeks of my time here were boxes 150, 538, and 541. In Bentham’s own words, your effort gives ‘the greatest happiness’ and undoubtedly encourages everyone else.

Prior to my placement on Transcribe Bentham, I had no idea just how dedicated volunteer transcribers were, nor how much goes on behind the scenes. I am grateful to UCL Creative Media Services for allowing me to visit their digitisation suite, single-handedly digitise one hundred untranscribed manuscripts from Box 64, and electronically process them! Digitisation takes place in a dark room, with minimal light and humidity to ensure the manuscripts are looked after. I must say, this was an interesting experience, so do keep an eye out for when Box 64 manuscripts are uploaded to the Transcription Desk! The manuscripts I was working with each had at least one pin through them, and many were grouped together by an old paperclip, probably dating from when a cataloguer took on the daunting task of sorting Bentham’s manuscripts.

Being confronted with the task of removing each pin and paperclip was no easy challenge, and required being extremely careful to avoid doing any damage to the precious manuscripts. Many of these paperclips were bent and had caused some curling to the manuscripts, so I used the lifesaving tool – the ‘Munich Finger’. This non-reflective, transparent tool saved me from taking countless photographs of a single manuscript, to increase readability. Just by placing this tool on the edge of each manuscript, the appearance of each manuscript improved dramatically when photographed. Digitisation was, I found, an extremely satisfying task.

Digitising Bentham manuscripts (photograph by Milandeep Singh)

The most challenging part of the digitisation process for this particular set of manuscripts was photographing the manuscripts with the least amount of writing. To overcome this issue, one had to play around with the camera settings, including the ISO, each time one was faced with such a manuscript. This required special attention. Processing of the images was much more straightforward, as one simply rotated the images and labelled them using the batch processing feature of the Lightbox software. This process also involved cropping the images and removing unnecessary shadows in Photoshop, to maximise the readability of each manuscript. Once the images were processed, they were compressed to a suitable format to be uploaded to the Bentham image server and, ultimately, displayed on Transcribe Bentham for volunteers to view.

Besides this, another interesting subject I encountered whilst working on the project is the Handwritten Text Recognition (HTR) technology being developed for tranScriptorium. It is astonishing that this technology can achieve up to 90% accuracy for Bentham images (at least for neatly written manuscripts). I look forward to seeing how volunteers embrace this amazing technology in the future. As you may be aware, some public trials are being conducted using the TSX tool, and I did have the pleasure of using it. Transcribing with the assistance of this technology is fantastic! In order to enable the HTR technology to display accurate results for each manuscript line, the manuscripts are initially prepared in a behind-the-scenes management tool known as Transkribus. Using this tool, we draw text regions, text lines and baselines on each line of the manuscript. At first, this may sound incredibly easy; however, this process is semi-automated, and correcting erroneous baselines can take a lot of concentration and effort to ensure that they do not overlap other lines. Once each batch has been baselined, it is then sent abroad to the Spanish computer scientists, who generate word clouds based on a particular set, to be used to generate accurate results in the HTR.

Nevertheless, my favourite task while on my placement was in checking the encoding of transcripts submitted by volunteer transcribers. It is truly rewarding to finalise the outstanding work of volunteers, lock the transcripts, save the encoded XML file, and send a message of thanks to the transcriber. This triumph would not have been possible without the sheer hard work, enthusiasm and dedication of Transcribe Bentham’s brilliant volunteers. Once again, my most sincere thanks go to all of them for their contributions to Transcribe Bentham’s ongoing success.

Milandeep Kaur Singh

Progress update, 23 to 29 May 2015

By Tim Causer, on 29 May 2015

Welcome to the Transcribe Bentham progress update for the period 23 to 29 May 2015, during which time further excellent progress has been made by volunteer transcribers.

13,120 manuscripts have now been transcribed or partially-transcribed, which is an increase of 48 on this time last week. Of these transcripts, 12,090 (92%) have been checked and approved by TB staff.
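For readers who would like to check the headline figures themselves, here is a minimal sketch of the arithmetic in Python. The function name `progress_summary` is purely illustrative and is not part of any Transcribe Bentham tool; the previous week's total of 13,072 is taken from the 16 to 22 May update, and rounding to the nearest whole percent is an assumption.

```python
def progress_summary(total_now, total_last_week, approved):
    """Return (weekly increase, approved share as a whole percent).

    Hypothetical helper for illustration only; rounding to the
    nearest whole percent is an assumption.
    """
    increase = total_now - total_last_week
    approved_pct = round(100 * approved / total_now)
    return increase, approved_pct


# Figures quoted in the updates for 23 to 29 May and 16 to 22 May 2015.
increase, approved_pct = progress_summary(13120, 13072, 12090)
print(f"Increase on last week: {increase}")
print(f"Checked and approved: {approved_pct}%")
```

With the figures quoted above, this gives an increase of 48 and an approved share of 92%.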

The more detailed state of progress is as follows:

Box | No. of manuscripts worked on | No. of manuscripts in box | Completion
Box 1 | 253 | 794 | 32%
Box 2 | 472 | 753 | 62%
Box 4 | 2 | 694 | 1%
Box 5 | 200 | 290 | 68%
Box 7 | 3 | 167 | 1%
Box 8 | 6 | 284 | 2%
Box 9 | 41 | 266 | 15%
Box 15 | 79 | 914 | 10%
Box 18 | 4 | 192 | 2%
Box 27 | 350 | 350 | COMPLETE
Box 29 | 22 | 122 | 18%
Box 30 | 2 | 193 | 1%
Box 31 | 19 | 302 | 6%
Box 34 | 40 | 398 | 10%
Box 35 | 286 | 439 | 65%
Box 36 | 34 | 418 | 8%
Box 37 | 32 | 487 | 6%
Box 38 | 117 | 424 | 28%
Box 39 | 12 | 282 | 4%
Box 41 | 87 | 528 | 14%
Box 42 | 88 | 910 | 9%
Box 44 | 53 | 201 | 26%
Box 50 | 171 | 198 | 85%
Box 51 | 385 | 940 | 40%
Box 57 | 18 | 420 | 4%
Box 62 | 67 | 565 | 11%
Box 63 | 155 | 345 | 44%
Box 70 | 306 | 350 | 87%
Box 71 | 663 | 663 | COMPLETE
Box 72 | 614 | 664 | 92%
Box 73 | 151 | 151 | COMPLETE
Box 79 | 199 | 199 | COMPLETE
Box 95 | 126 | 147 | 85%
Box 96 | 534 | 539 | 99%
Box 97 | 139 | 296 | 46%
Box 98 | 223 | 499 | 44%
Box 100 | 194 | 422 | 43%
Box 106 | 113 | 581 | 19%
Box 107 | 502 | 538 | 93%
Box 110 | 7 | 671 | 1%
Box 115 | 276 | 307 | 89%
Box 116 | 506 | 864 | 58%
Box 117 | 365 | 853 | 42%
Box 118 | 255 | 880 | 28%
Box 119 | 535 | 990 | 54%
Box 120 | 29 | 686 | 4%
Box 121 | 134 | 526 | 24%
Box 122 | 304 | 717 | 41%
Box 123 | 42 | 443 | 9%
Box 124 | 15 | 383 | 3%
Box 139 | 40 | 40 | COMPLETE
Box 141 | 4 | 381 | 1%
Box 149 | 5 | 581 | 1%
Box 150 | 884 | 972 | 90%
Box 169 | 193 | 728 | 26%
Add MS 537 | 730 | 744 | 98%
Add MS 538 | 753 | 858 | 87%
Add MS 539 | 834 | 948 | 87%
Add MS 540 | 94 | 1012 | 9%
Add MS 541 | 353 | 1258 | 28%
Overall | 13,120 | 31,560 | 40%

Thank you, as always, to everyone who has contributed to Transcribe Bentham during the last seven days. It remains as greatly appreciated as ever by us all!

Progress update, 16 to 22 May 2015

By Tim Causer, on 22 May 2015

Welcome along to the Transcribe Bentham progress update for the period 16 to 22 May 2015, during which time further excellent progress has been made by volunteer transcribers.

13,072 manuscripts have now been transcribed or partially-transcribed, which is an increase of 52 on this time last week. Of these transcripts, 12,044 (92%) have been checked and accepted by TB staff.

The more detailed state of progress is as follows:

Box | No. of manuscripts worked on | No. of manuscripts in box | Completion
Box 1 | 253 | 794 | 32%
Box 2 | 472 | 753 | 62%
Box 4 | 2 | 694 | 1%
Box 5 | 200 | 290 | 68%
Box 7 | 3 | 167 | 1%
Box 8 | 6 | 284 | 2%
Box 9 | 41 | 266 | 15%
Box 15 | 79 | 914 | 10%
Box 18 | 4 | 192 | 2%
Box 27 | 350 | 350 | COMPLETE
Box 29 | 22 | 122 | 18%
Box 30 | 2 | 193 | 1%
Box 31 | 19 | 302 | 6%
Box 34 | 40 | 398 | 10%
Box 35 | 286 | 439 | 65%
Box 36 | 34 | 418 | 8%
Box 37 | 32 | 487 | 6%
Box 38 | 113 | 424 | 27%
Box 39 | 12 | 282 | 4%
Box 41 | 87 | 528 | 14%
Box 42 | 88 | 910 | 9%
Box 44 | 53 | 201 | 26%
Box 50 | 171 | 198 | 85%
Box 51 | 385 | 940 | 40%
Box 57 | 18 | 420 | 4%
Box 62 | 67 | 565 | 11%
Box 63 | 155 | 345 | 44%
Box 70 | 306 | 350 | 87%
Box 71 | 663 | 663 | COMPLETE
Box 72 | 614 | 664 | 92%
Box 73 | 151 | 151 | COMPLETE
Box 79 | 199 | 199 | COMPLETE
Box 95 | 126 | 147 | 85%
Box 96 | 534 | 539 | 99%
Box 97 | 139 | 296 | 46%
Box 98 | 222 | 499 | 44%
Box 100 | 194 | 422 | 43%
Box 106 | 112 | 581 | 19%
Box 107 | 502 | 538 | 93%
Box 110 | 7 | 671 | 1%
Box 115 | 276 | 307 | 89%
Box 116 | 506 | 864 | 58%
Box 117 | 365 | 853 | 42%
Box 118 | 255 | 880 | 28%
Box 119 | 535 | 990 | 54%
Box 120 | 29 | 686 | 4%
Box 121 | 134 | 526 | 24%
Box 122 | 304 | 717 | 41%
Box 123 | 42 | 443 | 9%
Box 124 | 15 | 383 | 3%
Box 139 | 40 | 40 | COMPLETE
Box 141 | 4 | 381 | 1%
Box 149 | 5 | 581 | 1%
Box 150 | 875 | 972 | 90%
Box 169 | 193 | 728 | 26%
Add MS 537 | 730 | 744 | 98%
Add MS 538 | 752 | 858 | 87%
Add MS 539 | 834 | 948 | 87%
Add MS 540 | 67 | 1012 | 6%
Add MS 541 | 348 | 1258 | 27%
Overall | 13,072 | 31,560 | 40%

Thank you, as always, to everyone who has contributed to Transcribe Bentham during the last seven days – it remains as greatly appreciated as ever by us all.

 

Progress update, 9 to 15 May 2015

By Tim Causer, on 15 May 2015

Welcome along to the Transcribe Bentham progress update for the period 9 to 15 May 2015, during which time further excellent progress has been made by volunteer transcribers.

13,020 manuscripts have now been transcribed or partially-transcribed, which is an increase of 43 on this time last week. Of these transcripts, 12,008 (92%) have been checked and approved by TB staff. This is a doubly-amazing achievement on the part of volunteer transcribers: not only have they broken through the 13,000th transcript mark, but that such a high proportion of their work continues to be approved is a real testament to the quality of their efforts.

The more detailed state of progress is as follows:

Box | No. of manuscripts worked on | No. of manuscripts in box | Completion
Box 1 | 253 | 794 | 32%
Box 2 | 472 | 753 | 62%
Box 4 | 2 | 694 | 1%
Box 5 | 200 | 290 | 68%
Box 7 | 3 | 167 | 1%
Box 8 | 6 | 284 | 2%
Box 9 | 41 | 266 | 15%
Box 15 | 79 | 914 | 10%
Box 18 | 4 | 192 | 2%
Box 27 | 350 | 350 | COMPLETE
Box 29 | 22 | 122 | 18%
Box 30 | 2 | 193 | 1%
Box 31 | 19 | 302 | 6%
Box 34 | 40 | 398 | 10%
Box 35 | 286 | 439 | 65%
Box 36 | 34 | 418 | 8%
Box 37 | 32 | 487 | 6%
Box 38 | 112 | 424 | 27%
Box 39 | 12 | 282 | 4%
Box 41 | 87 | 528 | 14%
Box 42 | 88 | 910 | 9%
Box 44 | 53 | 201 | 26%
Box 50 | 171 | 198 | 85%
Box 51 | 385 | 940 | 40%
Box 57 | 18 | 420 | 4%
Box 62 | 67 | 565 | 11%
Box 63 | 155 | 345 | 44%
Box 70 | 306 | 350 | 87%
Box 71 | 663 | 663 | COMPLETE
Box 72 | 614 | 664 | 92%
Box 73 | 151 | 151 | COMPLETE
Box 79 | 199 | 199 | COMPLETE
Box 95 | 126 | 147 | 85%
Box 96 | 534 | 539 | 99%
Box 97 | 139 | 296 | 46%
Box 98 | 222 | 499 | 44%
Box 100 | 194 | 422 | 43%
Box 106 | 111 | 581 | 19%
Box 107 | 502 | 538 | 93%
Box 110 | 7 | 671 | 1%
Box 115 | 276 | 307 | 89%
Box 116 | 506 | 864 | 58%
Box 117 | 365 | 853 | 42%
Box 118 | 255 | 880 | 28%
Box 119 | 535 | 990 | 54%
Box 120 | 29 | 686 | 4%
Box 121 | 134 | 526 | 24%
Box 122 | 304 | 717 | 41%
Box 123 | 42 | 443 | 9%
Box 124 | 15 | 383 | 3%
Box 139 | 40 | 40 | COMPLETE
Box 141 | 4 | 381 | 1%
Box 149 | 5 | 581 | 1%
Box 150 | 848 | 972 | 87%
Box 169 | 193 | 728 | 26%
Add MS 537 | 730 | 744 | 98%
Add MS 538 | 746 | 858 | 86%
Add MS 539 | 834 | 948 | 87%
Add MS 540 | 67 | 1012 | 6%
Add MS 541 | 331 | 1258 | 26%
Overall | 13,020 | 31,560 | 40%

Thank you, as always, to everyone who has taken part in Transcribe Bentham during the last seven days. Your continued contributions to the success of the project are as highly appreciated as ever.

 

 

Progress update, 2 to 8 May 2015

By Tim Causer, on 8 May 2015

Welcome along to the Transcribe Bentham progress update for the period 2 to 8 May 2015, during which time further steady progress has been made by volunteer transcribers.

12,977 manuscripts have now been transcribed or partially-transcribed, up 31 on this time last week. Of these transcripts, 11,978 (92%) have been checked and approved. All being well, by this time next week the 13,000th transcript should have been worked on, and we will have checked and approved our 12,000th!

The more detailed state of progress is as follows:

Box | No. of manuscripts worked on | No. of manuscripts in box | Completion
Box 1 | 252 | 794 | 31%
Box 2 | 471 | 753 | 62%
Box 4 | 2 | 694 | 1%
Box 5 | 200 | 290 | 68%
Box 7 | 3 | 167 | 1%
Box 8 | 6 | 284 | 2%
Box 9 | 41 | 266 | 15%
Box 15 | 78 | 914 | 10%
Box 18 | 4 | 192 | 2%
Box 27 | 350 | 350 | COMPLETE
Box 29 | 22 | 122 | 18%
Box 30 | 2 | 193 | 1%
Box 31 | 19 | 302 | 6%
Box 34 | 40 | 398 | 10%
Box 35 | 286 | 439 | 65%
Box 36 | 34 | 418 | 8%
Box 37 | 32 | 487 | 6%
Box 38 | 110 | 424 | 25%
Box 39 | 12 | 282 | 4%
Box 41 | 87 | 528 | 14%
Box 42 | 88 | 910 | 9%
Box 44 | 53 | 201 | 26%
Box 50 | 171 | 198 | 85%
Box 51 | 385 | 940 | 40%
Box 57 | 18 | 420 | 4%
Box 62 | 67 | 565 | 11%
Box 63 | 155 | 345 | 44%
Box 70 | 306 | 350 | 87%
Box 71 | 663 | 663 | COMPLETE
Box 72 | 614 | 664 | 92%
Box 73 | 151 | 151 | COMPLETE
Box 79 | 199 | 199 | COMPLETE
Box 95 | 126 | 147 | 85%
Box 96 | 534 | 539 | 99%
Box 97 | 138 | 296 | 46%
Box 98 | 222 | 499 | 44%
Box 100 | 194 | 422 | 43%
Box 106 | 109 | 581 | 18%
Box 107 | 502 | 538 | 93%
Box 110 | 7 | 671 | 1%
Box 115 | 276 | 307 | 89%
Box 116 | 506 | 864 | 58%
Box 117 | 365 | 853 | 42%
Box 118 | 255 | 880 | 28%
Box 119 | 535 | 990 | 54%
Box 120 | 29 | 686 | 4%
Box 121 | 134 | 526 | 24%
Box 122 | 304 | 717 | 41%
Box 123 | 42 | 443 | 9%
Box 124 | 15 | 383 | 3%
Box 139 | 40 | 40 | COMPLETE
Box 141 | 4 | 381 | 1%
Box 149 | 5 | 581 | 1%
Box 150 | 823 | 972 | 84%
Box 169 | 193 | 728 | 26%
Add MS 537 | 730 | 744 | 98%
Add MS 538 | 737 | 858 | 85%
Add MS 539 | 834 | 948 | 87%
Add MS 540 | 67 | 1012 | 6%
Add MS 541 | 330 | 1258 | 26%
Overall | 12,977 | 31,560 | 40%

Thank you, as always, to everyone who has contributed to Transcribe Bentham during the last seven days. It remains as warmly appreciated as ever.