Text and Data Mining potential unleashed

ucylcjh, 28 November 2014

One of the significant new exceptions introduced this year enables Text and Data Mining (TDM) to be carried out on bodies of copyright material, as long as it is for a “non-commercial purpose” and the sources are acknowledged where possible.

TDM includes a range of advanced techniques for analysing vast quantities of data in order to draw out new facts or statistical trends, or to gather evidence of previously unexplored relationships (for example, between chemical substances and medical conditions). The potential uses of TDM are wide-ranging and may arise in any discipline.
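As a rough illustration of the kind of analysis described above, the following minimal Python sketch counts how often a substance and a condition are mentioned in the same sentence across a set of documents. It is only a toy example under stated assumptions: the function name `cooccurrences`, the term lists and the sample sentences are all invented for illustration, the matching only handles single-word terms, and real TDM projects use far more sophisticated methods.

```python
import re
from collections import Counter
from itertools import product


def cooccurrences(documents, substances, conditions):
    """Count sentences in which a substance and a condition both appear.

    `substances` and `conditions` are sets of lowercase single-word terms
    (a simplifying assumption made for this sketch).
    """
    counts = Counter()
    for text in documents:
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            found_substances = sorted(words & substances)
            found_conditions = sorted(words & conditions)
            # Record every (substance, condition) pair seen in this sentence.
            counts.update(product(found_substances, found_conditions))
    return counts


# Invented sample data, for illustration only.
sample_documents = [
    "Aspirin reduced headache severity. Caffeine had no effect.",
    "Patients taking aspirin reported less headache. Caffeine may worsen insomnia.",
]
pair_counts = cooccurrences(
    sample_documents,
    substances={"aspirin", "caffeine"},
    conditions={"headache", "insomnia"},
)
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Note that even this toy version has to read in the full text of each document before it can count anything, which is why copying is unavoidable in TDM.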

The new exception makes the application of TDM to copyright works possible by removing the copying of material (which is an essential part of the TDM process) from the realm of infringing activity.

A 2012 Jisc report on the value and benefits of text mining noted that, at that time, TDM activities in higher education were mainly focussed on Open Access materials because these were more readily available. Since the new TDM exception was introduced in June 2014, the content of the vast number of e-journals to which universities such as UCL subscribe should also be available for TDM. This is because, under the legislation, the terms of our contracts with the publishers of those journals cannot override the TDM exception:

“To the extent that a term of a contract purports to prevent or restrict the making of a copy which, by virtue of this section, would not infringe copyright, that term is unenforceable”, Copyright, Designs and Patents Act 1988, Section 29A(5).

This means that any terms in suppliers’ contracts which seek to restrict advanced computer analysis of their repertoire will no longer have any weight, at least where non-commercial research is concerned.