Text and Data Mining (TDM) and Your Research: Copyright Implications and New Website Guidance
By Rafael, on 13 May 2024
This is the second blog post of our collaborative series between the UCL Office for Open Science and Scholarship and the UCL Copyright team. Here, we continue our exploration of important aspects of copyright and its implications for open research and scholarship. In this instalment, we examine Text and Data Mining (TDM) and its impact on research, along with the associated copyright considerations.
Image by storyset on Freepik.
The development of advanced computational tools and techniques for analysing large amounts of data has opened up new possibilities for researchers. Text and Data Mining (TDM) is a broad term referring to a range of ‘automated analytical techniques to analyse text and data for patterns, trends, and useful information’ (Intellectual Property Office definition). TDM has many applications in academic research across disciplines.
In an academic context, the most common sources of data for TDM include journal articles, books, datasets, images, and websites. TDM involves accessing, analysing, and often reusing (parts of) these materials. As these materials are, by default, protected by copyright, there are limitations around what you can do as part of TDM. In the UK, you may rely on section 29A of the Copyright, Designs and Patents Act, a copyright exception for making copies for text and data analysis for non-commercial research. You must have lawful access to the materials (for example via a UCL subscription or via an open licence). However, there are often technological barriers imposed by publishers preventing you from copying large amounts of materials for TDM purposes – measures that you must not try to circumvent. Understanding what you can do with copyright materials, what may be more problematic, and where to get support if in doubt should help you manage these barriers when you use TDM in your research.
The copyright support team works with e-resources, the Library Skills librarians, and the Office for Open Science and Scholarship to support the TDM activities of UCL staff and students. New guidance is available on the copyright website and the TDM libguide. It addresses questions that often arise during TDM, including:
- Can you copy journal articles, books, images, and other materials? What conditions apply?
- What do you need to consider when sharing the outcomes of a TDM analysis?
- What do publishers and other suppliers of the TDM sources expect you to do?
To learn more about copyright (including how it applies to TDM):
- Visit the new TDM guidance.
- Register for one of our copyright sessions.
- Contact the copyright team if you have specific questions or would like to organise a session for your department.
- The new UCL Copyright Literacy Community aims to bring together staff and students from across UCL to identify and collaborate on areas where an understanding of copyright should be strengthened. If you are interested in joining, please contact the copyright team.
Get involved!
The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Stay connected for updates, events, and opportunities. Follow us on X (formerly Twitter) and LinkedIn, and join our mailing list to be part of the conversation!