ChatGPT et al.: Are AI Companies Violating Copyright Law?
WU Vienna research identifies legal uncertainty and need for reform
Can copyright‑protected works be used to develop and run generative artificial intelligence applications such as ChatGPT or Midjourney? This question lies at the heart of current research by Philipp Homar, professor of Intellectual Property Law and head of the Information Law and Intellectual Property Law Group at WU Vienna University of Economics and Business.
Homar’s doctrinal legal research, which interprets and systematizes existing copyright rules, focuses on a fundamental tension at the core of the debate surrounding AI:
On the one hand, it is important to strengthen innovation in artificial intelligence.
On the other hand, the rights of human authors must be protected.
“AI systems are trained on enormous amounts of copyright‑protected content,” Homar explains. “The key legal question is therefore whether, and under which conditions, such use is permissible without the consent of the rightholders.”
AI training as text and data mining
A central pillar of Homar’s analysis is the legal classification of AI training processes. Under current EU law, such activities might fall under the text and data mining (TDM) exceptions introduced in 2019. These exceptions allow the use of protected works without the need to obtain licenses, provided that specific conditions are met.
Homar’s research concludes that training AI systems can generally be classified as text and data mining under EU copyright law. However, this classification is subject to strict requirements – most notably the condition of lawful access to the data used for training.
Lawful access: A key legal barrier
One of the most significant legal uncertainties concerns what lawful access actually means. It remains unclear whether content that is freely available online can always be accessed lawfully, even when it originates from unauthorized sources such as so‑called shadow libraries.
EU law draws an important distinction between lawful sources and lawful access to works. As a result, the fact that works come from unlawful sources does not automatically imply that accessing them is unlawful as well. “Whether courts will adopt this distinction remains an open question, however,” says Homar. Existing case law by the Court of Justice of the European Union suggests a more restrictive interpretation, which may limit reliance on the TDM exceptions.
Opt‑out rights are difficult to enforce in practice
Another core issue is the opt‑out right available to authors under EU copyright law. Authors may reserve their rights and object to the commercial use of their works for text and data mining purposes. In such cases, AI companies must obtain authorization before using the content in question.
In practice, however, exercising this right involves significant challenges. “The law states that opt‑outs must be declared in a machine‑readable format, but it remains unclear what exactly counts as machine‑readable,” Homar notes. There is an urgent need for clear technical and legal standards that protect authors’ rights while also providing legal certainty for companies developing AI systems.
Persistent uncertainty and fundamental policy questions
Despite careful legal analysis, substantial uncertainty remains, both for AI developers and rightholders. This uncertainty highlights several broader policy questions that legislators will ultimately need to address:
Does the current legal framework adequately support innovation in artificial intelligence?
Are human creators fairly compensated for the works that underpin AI systems?
Should we replace today’s complex text and data mining regime with clearer and simpler rules?
“Our role as scholars is to critically assess and guide these developments and provide a solid analytical basis for political decision‑making,” Homar emphasizes.
With his research, Philipp Homar contributes an important perspective to the ongoing European debate on the future of copyright law in the age of artificial intelligence.
Publications
Homar, P. 2026 (forthcoming). Text and Data Mining, Lawful Access and the Role of Contracts. In: Bonadio/Mezei/Alonso (eds.), The Cambridge Handbook of Generative AI and IP in Europe, Cambridge University Press.
Homar, P. 2022. § 42h UrhG. In: Thiele/Burgstaller (eds.), UrhG, 4th ed.
Video
Meet Our Researchers: Philipp Homar. ChatGPT et al.: Are AI Companies Violating Copyright Law? (YouTube, 5:18 min.)
Interview on WU.ac.at