Open@UCL Blog

Copyright and AI, Part 1: How Does Copyright Apply to AI-Generated Works?

By Rafael, on 21 June 2024

Guest post by Christine Daoutis, UCL Copyright Support Officer. 

This is the third blog post of the collaborative series between the UCL Office for Open Science and Scholarship and the UCL Copyright team. Here, we continue our exploration of important aspects of copyright and its implications for open research and scholarship.

An artist’s illustration of artificial intelligence (AI). This illustration depicts language models which generate text. It shows distorted text on a screen seen through a glass container. The visible text at the top reads, "How do large language models work?" The rest is partially obscured, but includes mentions of "neural networks" and "machine learning."

Photo by Google DeepMind.

In a previous post we introduced questions that arise when using and creating materials protected by copyright. What options are available to you if you want to reuse others’ work (e.g. articles, theses, images, film, code) in your research? And what do you need to consider before you share your own research with others? Issues around copyright protection, permissions, exceptions, licences, and ownership need to be examined when creating new works and including others’ materials. These questions are also relevant when we think about works that are created with the use of GenAI tools, such as ChatGPT. However, with the use of these technologies still being relatively new and the legal aspects being shaped as we speak, answers are not always straightforward.

GenAI Training Data: GenAI models are trained on a large number of materials, most of which are protected by copyright (unless copyright has expired or been waived). Does this mean AI companies are infringing copyright by using these materials? How would copyright exceptions and fair dealing/fair use apply in different countries? How would licence terms, including the terms of open licences, be respected? Answers will come both from legislation and codes of practice introduced by governments and regulatory bodies (such as the EU AI Act) and from the outcomes of court cases (see, for example, Getty Images vs Stability AI and the Authors Guild against OpenAI and Microsoft).

User Prompts: The prompts a user provides to the model (instructions, text, images) may also be protected. You should also consider whether the prompts you enter include any confidential/commercially sensitive information that should not be shared. Please see UCL’s IP policy for guidance on this.

A digital illustration depicts a serene-looking young woman with glowing skin and braids that resemble threads. Text overlay reads "Zarya of the Dawn." The background has shades of green, black and blue forming an ethereal environment.

Image Credit: Kris Kashtanova using Midjourney AI, Public domain, via Wikimedia Commons.

AI-Generated Work: Is the AI-generated work an original work protected by copyright? Is it a derivative of other original works and therefore possibly infringing? If it is protected, who owns the copyright? The answers will vary by case and jurisdiction. In the US, the Copyright Office decided that the AI-generated images in a comic book were not protected, although the text and the selection and arrangement of the book’s elements were. In China, a court ruled that images generated with the use of GenAI tools would be protected, with the owner being the person who provided the prompts. In the UK, section 9(3) of the Copyright, Designs and Patents Act 1988 (CDPA) states that ‘in the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken’.

In short, GenAI raises questions about what constitutes an original work, what constitutes infringement, how copyright exceptions and fair dealing/fair use are applied, and how authorship is established. While these questions are still being shaped, here are three things you can do:

  1. Consider any limitations in using GenAI besides copyright (e.g., confidentiality, biases, publishers’ policies). See UCL’s Generative AI hub for guidance.
  2. Be transparent about how you use GenAI. See UCL Library guidance on acknowledging the use of AI and referencing AI.
  3. If you have any copyright-related questions on the use of GenAI, contact the copyright support service.

While GenAI has raised more questions than it has answered around copyright, it also offers an opportunity to think about copyright critically. Stay connected with us for Part 2 of this blog post, which will discuss how new technologies, including GenAI, are changing our understanding of copyright. We look forward to continuing this important conversation with you.

Get involved!

The UCL Office for Open Science and Scholarship invites you to contribute to the open science and scholarship movement. Stay connected for updates, events, and opportunities. Follow us on X (formerly Twitter) and LinkedIn, and join our mailing list to be part of the conversation!

 

 

Text and Data Mining (TDM) and Your Research: Copyright Implications and New Website Guidance

By Rafael, on 13 May 2024

This is the second blog post of our collaborative series between the UCL Office for Open Science and Scholarship and the UCL Copyright team. Here, we continue our exploration of important aspects of copyright and its implications for open research and scholarship. In this instalment, we examine Text and Data Mining (TDM) and its impact on research along with the associated copyright considerations.

Data processing concept illustration

Image by storyset on Freepik.

The development of advanced computational tools and techniques for analysing large amounts of data has opened up new possibilities for researchers. Text and Data Mining (TDM) is a broad term referring to a range of ‘automated analytical techniques to analyse text and data for patterns, trends, and useful information’ (Intellectual Property Office definition). TDM has many applications in academic research across disciplines.
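To make the definition concrete, here is a minimal sketch of one very simple TDM technique: counting how often each word appears across a set of texts. The sample sentences and the function name `term_frequencies` are hypothetical and chosen only for illustration. Real TDM projects typically work on much larger collections that you have lawful access to, and use more sophisticated methods.

```python
import re
from collections import Counter

# Hypothetical sample texts standing in for lawfully accessed documents.
DOCUMENTS = [
    "Open access articles support open research.",
    "Text mining finds patterns in research articles.",
]


def term_frequencies(documents):
    """Count how often each lowercase word appears across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep only runs of the letters a-z as words.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


if __name__ == "__main__":
    # Print the three most frequent terms with their counts.
    for term, count in term_frequencies(DOCUMENTS).most_common(3):
        print(term, count)
```

Even a small analysis like this involves copying the source texts into memory, which is why the copyright questions discussed in this post arise.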

In an academic context, the most common sources of data for TDM include journal articles, books, datasets, images, and websites. TDM involves accessing, analysing, and often reusing (parts of) these materials. As these materials are, by default, protected by copyright, there are limitations around what you can do as part of TDM. In the UK, you may rely on section 29A of the Copyright, Designs and Patents Act 1988, a copyright exception for making copies for text and data analysis for non-commercial research. You must have lawful access to the materials (for example via a UCL subscription or via an open licence). However, publishers often impose technological barriers that prevent you from copying large amounts of materials for TDM purposes, and you must not try to circumvent these measures. Understanding what you can do with copyright materials, what may be more problematic, and where to get support if in doubt should help you manage these barriers when you use TDM in your research.

The copyright support team works with e-resources, the Library Skills librarians, and the Office for Open Science and Scholarship to support the TDM activities of UCL staff and students. New guidance is available in the TDM libguide on the copyright website. It addresses questions that often arise during TDM, including:

  • Can you copy journal articles, books, images, and other materials? What conditions apply?
  • What do you need to consider when sharing the outcomes of a TDM analysis?
  • What do publishers and other suppliers of the TDM sources expect you to do?

To learn more about copyright (including how it applies to TDM):
