
LCCOS staff news


News for colleagues within the LCCOS department.


Adding the Bain Graffy Film Collection to Explore

By Thomas P Meehan, on 12 December 2023

We have just completed a project to convert the Bain Graffy Film Collection metadata from a set of HTML pages on the SSEES website to MARC records on Alma and Explore. This means that the collection can now be searched alongside the other library materials and handled in much the same way as other video records.

What is the Bain Graffy Film Collection?

The Bain Graffy Film Collection consists of approximately 14,500 films on around 11,000 DVDs, Blu-ray discs, recorded discs, and VHS videos at SSEES Library. The collection was created, added to, described, and indexed over many years by Professor Julian Graffy and Cameron Bain, after whom the collection is named. Cameron Bain sadly died in 2014, but Julian Graffy still submits detailed records to the project.

Converting the Metadata

Professor Graffy submits records in Word format. For many years these were converted by Library staff, largely by Vladimir Smith Mesa, to static HTML pages, one for each physical item. While consistent, easily readable, and well indexed by separate index pages, these did not conform to a formal metadata schema or file format.

Because the old server needed to be replaced, we decided to try to move the many web pages to some kind of database. We tried several methods, including a web form, conversion to an Excel spreadsheet using Python, and lastly conversion to MARC records using MarcEdit software. MarcEdit is commonly used in libraries, including UCL, for editing large numbers of records in bulk, especially ebook records, but it is also capable of converting non-MARC data to MARC and performing powerful textual transformations.

The eventual process involved obtaining HTML and Word documents from Gill Long; each HTML file contained data for one physical item with one or more films on it. I then combined these files in Word so that there were one hundred per file and passed these to Vlad. Vlad did two important things to the Word documents: first, he made sure that every film (rather than each video or DVD) could have its own record on Alma; second, he marked up each element by prefixing the relevant line with a label, such as “Director: ”. Explicitly marking up the data made it much easier for any script to “read” the file and work out what each part of it meant. For example, it is much clearer in the second example below what the title is, without having to rely on it being formatted in a certain way.

Example original HTML file:

Call number: DVD-2000

STOLPER, Aleksandr
Dni i nochi [Days and Nights]

Mosfil´m, 1944; released 28 August 1945
Screenplay: Konstantin Simonov, from his story of the same name
Photography: Evgenii Andrikanis
Production design: Morits Umanskii, Sergei Voronkov
Music: Nikolai Kriukov

Example Word file edited by Vlad:

Call number: DVD-2000

Director: STOLPER, Aleksandr
Title: Dni i nochi [Days and Nights]

Production: Mosfil´m, 1944; released 28 August 1945
Date: 1945

Screenplay: Konstantin Simonov, from his story of the same name
Photography: Evgenii Andrikanis
Production design: Morits Umanskii, Sergei Voronkov
Music: Nikolai Kriukov
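
To illustrate why the labels help, here is a minimal Python sketch that reads a labelled record into a dictionary. It is not the script used in the project (the conversion was done in MarcEdit); the function name `parse_labelled_record` and the label pattern are assumptions made for this example only.

```python
import re

# Assumed pattern: a label starts with a capital letter, contains only
# letters and spaces, and is followed by a colon. Real files may differ.
LABEL_LINE = re.compile(r"^([A-Z][A-Za-z ]*?):\s*(.*)$")


def parse_labelled_record(text):
    """Collect 'Label: value' lines into a dict of label -> list of values."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between groups of fields
        match = LABEL_LINE.match(line)
        if match:
            label, value = match.groups()
            # A label may repeat, so values are kept in a list.
            record.setdefault(label, []).append(value)
    return record


sample = """Call number: DVD-2000

Director: STOLPER, Aleksandr
Title: Dni i nochi [Days and Nights]
Date: 1945
"""

record = parse_labelled_record(sample)
print(record["Title"])
```

With the unlabelled version of the same record, a script would have to guess that the second non-blank line is the title, which breaks as soon as the layout varies.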

 

I then converted the Word documents to a basic format that MarcEdit could read (.mrk), then used a number of transformations (essentially, fancy cut-and-pastes) to change the labels to MARC fields. In other cases, I could just use the formatting where it was consistent (e.g. a number followed by “ minutes” is obviously a running time, which can be fed into a 300 field, and anything in square brackets in the title forms a translated title that can be moved into a 246 field).
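
The two formatting-based rules described above can be sketched in Python with regular expressions. This is only an illustration of the idea, not the MarcEdit transformations actually used; the function names are hypothetical, the indicators are copied from the example record, and the “1 videodisc” wording assumes the item is a disc.

```python
import re


def split_title(title):
    """Split 'Original [Translation]' into (original, translation or None)."""
    match = re.match(r"^(.*?)\s*\[(.+)\]\s*$", title)
    if match:
        return match.group(1), match.group(2)
    return title.strip(), None


def running_time(text):
    """Return the number that precedes ' minutes', or None if there is none."""
    match = re.search(r"(\d+)\s+minutes", text)
    return int(match.group(1)) if match else None


def to_mrk_lines(title, minutes):
    """Build mnemonic (.mrk-style) lines for the 245, 246 and 300 fields."""
    original, translated = split_title(title)
    lines = [f"=245  00$a{original}"]
    if translated:
        lines.append(f"=246  33$a{translated}")
    if minutes is not None:
        # Two backslashes stand for two blank indicators in the mnemonic format.
        lines.append(f"=300  \\\\$a1 videodisc ({minutes} min.)")
    return lines


for line in to_mrk_lines("Dni i nochi [Days and Nights]", running_time("87 minutes")):
    print(line)
```

Rules like these only work where the source formatting is consistent, which is why the explicit labels were still needed for most elements.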

The initial MARC file in MarcEdit:
=LDR  01604ngm a2200445 i 4500
=008  010121s1945####||#|||############vlxx##d
=035  \\$aDVD-2000
=245  00$aDni i nochi
=246  33$aDays and Nights
=264  \1$c1945
=300  \\$a1 videodisc (87 min.) ; $c 4 3/4 in.
=500  \\$aFrom his story of the same name.
=500  \\$aOriginally released in 1945.
=546  \\$a
=546  \\$aIn Russian without subtitles
=590  \\$auclbaingraffy uclbaingraffydvd20002099 mrcbaingraffy20221130l
=700  1\$aAndrikanis, Evgenii,$edirector of photography.
=700  1\$aSimonov, Konstantin,$escreenwriter.
=700  1\$aStolper, Aleksandr,$efilm director
The record imported into Alma: screenshot of Alma showing the record converted to MARC format
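
For readers unfamiliar with the mnemonic layout shown above, the following Python sketch breaks one line into its tag, indicators, and subfields. It assumes the usual conventions of this format (a three-character tag after “=”, two spaces, then two indicator characters with a backslash standing for a blank, and “$” introducing each subfield); `parse_mrk_line` is a hypothetical helper, not part of MarcEdit or Alma.

```python
def parse_mrk_line(line):
    """Parse one '=TAG  ...' mnemonic line into a dict."""
    if not line.startswith("="):
        raise ValueError(f"not a mnemonic field line: {line!r}")
    tag, body = line[1:4], line[6:]

    # The leader and control fields (001-009) have no indicators or subfields.
    if tag == "LDR" or (tag.isdigit() and int(tag) < 10):
        return {"tag": tag, "data": body}

    # A backslash in an indicator position means "blank".
    indicators = body[:2].replace("\\", " ")
    subfields = [
        (piece[0], piece[1:])
        for piece in body[2:].split("$")
        if piece  # the text before the first '$' is empty
    ]
    return {"tag": tag, "indicators": indicators, "subfields": subfields}


field = parse_mrk_line(r"=246  33$aDays and Nights")
print(field["tag"], field["indicators"], field["subfields"])

# An empty subfield, like the first 546 line above, is easy to spot this way.
empty = parse_mrk_line(r"=546  \\$a")
print(any(value == "" for _, value in empty["subfields"]))
```

A check of this kind could help find candidates for tidying, such as fields that were created with no content.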

The records were then converted to the standard binary MARC format (.mrc) so they could be imported into Alma, which created holdings and item data at the same time. Importing them into Alma automatically makes them available in Explore:

Screenshot of the record in Explore

The Get It section invites students and staff to contact SSEES Library, where they will be able to view materials on-site. The items cannot currently be borrowed.

The indexes created by Julian Graffy and Cameron Bain have been continued and are still really useful for finding films by country of origin and by genre. These web pages have been converted by Chris Carrington on Drupal and now point at the Alma record instead of the individual web entries. They will be maintained by Gill Long using data supplied by Julian Graffy.

Searching for Bain Graffy Items

Bain Graffy items can be found in several ways. In Explore, include the tag “uclbaingraffy” in the initial search to only search items from the collection. This is the search offered from the collection’s home page. You can also use the Collection facet to narrow down to the Bain Graffy Film Collection.

Screenshot of a search in Explore with the Bain Graffy collection selected as a facet

The home page also contains the indexes created by Professor Graffy where users can browse lists of films by genre, country of origin, and date of production. The results link to the records on Explore.

Next Steps

The collection is still being added to, so the most important activity is making sure that the flow of data to Explore and the indexes is maintained. Because of the quick and partly automated way the data was imported, there is some tidying work to be done, which Vlad has already made a good start on. We also hope to make some improvements that take advantage of the more standardised data format, such as implementing authority control and adding coded language information or genre headings.