Penguin Random House says no AI training on its books

Penguin Random House (PRH) has taken a significant step in response to rising concerns about the use of intellectual property to train AI systems.

The publisher has added a new statement to the copyright pages of both new and reprinted books: “No part of this book may be used or reproduced in any manner for the purpose of training artificial intelligence technologies or systems.” The statement is accompanied by a clause that excludes PRH’s works from the European Union’s text and data mining exception, in accordance with applicable copyright laws.

As one of the first major publishers to address AI training explicitly, PRH is responding to the broader debate about how tech companies use copyrighted content to train large language models (LLMs), like those used in chatbots and other AI tools. Publishers have grown increasingly concerned in recent years about the possible misuse of their intellectual property, especially after reports emerged that AI firms had used copyrighted books to develop these technologies.

PRH’s decision to amend its copyright page is an attempt to protect its content pre-emptively, even though such statements do not change the legal framework of copyright. The clauses work much like a “robots.txt” file, which websites use to request that bots and AI systems not scrape their content. While these notices signal the publisher’s intent, they are not legally binding, and existing copyright protections apply even without them.
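To illustrate the robots.txt comparison, here is a minimal sketch of how a well-behaved crawler might check such a request before fetching a page, using Python's standard `urllib.robotparser` module. The crawler name `ExampleAIBot`, the sample rules, and the URLs are hypothetical and chosen only for illustration. They do not describe any real bot or any file PRH publishes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks one (made-up) AI crawler to stay away
# while leaving the site open to every other user agent.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/books/sample-chapter"
    print(may_fetch("ExampleAIBot", page))  # the hypothetical AI crawler
    print(may_fetch("SomeOtherBot", page))  # any other crawler
```

As with the copyright notice, nothing in this mechanism enforces the request. The check only has an effect if the crawler chooses to run it and respect the answer.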

PRH’s move also highlights the ongoing tension between content creators and the AI industry, as more authors, publishers, and other creatives call for stronger protections. The Authors’ Licensing and Collecting Society (ALCS) has been outspoken in its support for PRH’s actions. ALCS CEO Barbara Hayes expressed approval of the updated copyright language, emphasising the need for publishers to protect their works from unauthorised use in AI training.

However, some contend that simply changing copyright pages may not be enough. The Society of Authors (SoA) applauds PRH’s efforts, but believes more needs to be done to guarantee that authors’ rights are properly protected. SoA CEO Anna Ganley has called on publishers to go beyond these statements and incorporate explicit protections in author contracts, making sure that writers are informed before their work is used in AI-related initiatives.

As AI advances, the debate over its use of copyrighted content is far from over. PRH’s action could herald a larger shift in the publishing sector, but how other publishers and the legal system will react remains to be seen.

