AI and Copyright: Could Trade Agreements Be Part of the Solution?

Published 5 December 2023

A controversial issue in the realm of AI is the use of copyrighted works to train AI systems. First, what happens legally if an AI developer uses someone else’s copyrighted works, protected under national copyright laws, to train creative AI systems such as ChatGPT? Second, what solutions or uncertainties might trade agreements, including the Comprehensive and Progressive Agreement for Trans-Pacific Partnership (CPTPP), introduce?

Generative AI systems—those that create written text, enhanced images, or audio—require a huge amount of data created by humans, which often includes copyrighted materials such as books, songs, and movies. Specifically, text and data mining (TDM) has emerged as an efficient practice, utilizing computational methods to analyse substantial amounts of data, and to extract patterns and trends. In recent years, we have witnessed AI systems successfully generating high-quality text, images, and audio, even contributing to the release of the final Beatles song and winning an art competition.

So, the question arises: is it legally permissible to use a copyright-protected work in TDM without obtaining permission from the copyright owner? Normally, you would need to get a licence from the copyright owner. However, there are instances where copyright law permits use, even in the absence of a licence. These are called ‘exceptions.’ Under UK copyright law, one such exception is Section 29A of the Copyright, Designs and Patents Act 1988 (CDPA), which allows for TDM for non-commercial research purposes. Given that Section 29A does not extend to commercial research purposes, the inclusion of copyrighted material in training datasets by profit-seeking AI developers without a licence would be considered an infringement in the UK.

However, not every jurisdiction follows the same rules as the UK. Japan, for instance, is renowned for its comparatively liberal copyright regime regarding TDM practices. Specifically, Article 30-4 of the Japanese Copyright Act (JCA) permits the exploitation of a work for: i) data analysis, ii) use in experiments aimed at improving or applying practical use technology related to the recording of sounds or visuals of a work; and iii) use during computer data processing. These clauses cover the exploitation of a work for commercial purposes. Given that Japan has a comparative advantage in robot development, would its wide interpretation of TDM serve as an optimal model for the UK?

The CDPA appears to discourage commercial AI research in the UK by mandating the licensing of copyrighted works, which is both costly and time-consuming. In June 2022, the UK Intellectual Property Office proposed a further exception allowing TDM for commercial purposes. It hoped that many stakeholders—including AI innovators, small and medium-sized enterprises (SMEs), and journalists—would benefit from this new exception. Unfortunately, the proposal backfired. The creative industries in the UK strongly opposed it, emphasizing that intellectual property is the ‘lifeblood’ of their industries and highlighting their reliance on the £335 million in revenue generated from TDM licensing. Ultimately, the UK Government decided not to advance the proposal to the next stage.

Other countries will soon need to take action, as this year has witnessed an increased number of copyright infringement cases. These are mostly class-action lawsuits brought by artists against AI developers across the United States, with famous authors such as George R.R. Martin and John Grisham joining in. Living in a digitized global village, it would be naive to think that the same AI practices and the related disputes will not spread to other parts of the world.

What role do multilateral or preferential trade agreements play?

At the multilateral level, Article 13 of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS) sets out the three-step test for determining the scope of limitations and exceptions to copyright. Such limitations and exceptions must i) be confined to certain special cases, ii) not conflict with a normal exploitation of the work, and iii) not unreasonably prejudice the legitimate interests of the right holder.

However, the three-step test is open-ended, lacks specificity and could lead to varied interpretations by national courts in each jurisdiction. Consequently, a Japanese court may find commercial TDM lawful under the JCA, whereas a UK court may deem the same conduct an infringement under the CDPA. Each country could regard these outcomes as complying with the trade agreements to which it is a party.

At the bilateral level, for example, the UK-Japan Comprehensive Economic Partnership Agreement (CEPA) (Article 14.14) and the UK-EU Trade and Cooperation Agreement (Article 233) merely repeat the content of Article 13 of TRIPS.

At the regional level, Article 18.65 of the CPTPP follows Article 13 of TRIPS. Article 18.66 obliges Parties to endeavour to achieve an appropriate balance through limitations or exceptions and provides examples of legitimate purposes, including, but not limited to, news reporting, teaching, and research. However, footnote 79 to Article 18.66 contains a crucial note, stating that a use that has 'commercial aspects' may be regarded as having a legitimate purpose, which could potentially include commercial TDM.

The real danger for the UK is losing the commercial development of AI to countries that are liberal towards commercial TDM, which would hurt UK competitiveness. In that sense, the CPTPP explicitly leaves the door open for commercial TDM. Of course, the counterargument is that we do not need an international harmonisation of national laws on TDM and that countries are free to establish their own policies. Given the ubiquitous use and increasing economic significance of AI systems around the world, however, some degree of international harmonisation seems likely in the future.

Why are trade agreements unable to provide international harmonisation regarding the use of copyrighted works to train creative AI systems? One important reason, among others, is that they are unable to keep pace with the novel legal challenges spurred by rapid technological advancements. In the 1990s, no one could have foreseen technology capable of storing millions of works and using them to train AI systems. Although a tech-neutral interpretation of existing rules (such as the prohibition on the reproduction of a work) could arguably extend to copying a work in TDM, the absence of specific rules leads countries to implement a patchwork of laws.


The issue is potentially critical for the UK, which was ranked as the world's fourth-largest exporter of creative services in 2020 according to UN data and earns significant licensing revenue from TDM practices. At the same time, the UK has declared in a White Paper its intention to strengthen its position as a global leader in AI. To achieve this policy goal and avoid losing commercial development to other countries, the UK may consider expanding the scope of exceptions for commercial text and data mining. However, neither permitting full exploitation of works nor entirely protecting conventional industries is a solution.

A middle ground might be achievable through international agreement, preferably at the multilateral level but perhaps more realistically at a regional or bilateral level, that would effectively lead to harmonisation of national laws. The middle-ground legal solution would be to introduce an exception permitting TDM for commercial purposes, combined with an ‘opt-out’ or ‘contract-out’ system as in the EU (Directive 2019/790), which allows copyright holders to exclude their works from TDM. If coherence between national laws is still not achieved, further measures may be invoked, ranging from dispute settlement mechanisms (such as Article 28.3 of the CPTPP) to technical assistance (Article 18.13(g) of the CPTPP).

Another solution might be a market-driven one based on private contracts, as discussed by Burkhard Schafer, which follows the business models of YouTube and Spotify. These models could financially incentivise copyright holders to allow their works to be used in training datasets, as Spotify does with musicians and YouTube with content creators. This would bypass the complexities of concluding multilateral (most difficult), regional (moderately difficult) and bilateral (most likely) trade agreements.

We do not know which potential solution will prevail in the future; it may also be a combination of the two. For example, those choosing to opt out their works may prefer compensation under a Spotify- or YouTube-style model. What we can foresee now is that the first solution will necessitate a consensus among conventional and emerging industries, whereas the second solution will be achieved through the invisible hand of the market.

