Universal Music Group chairman & CEO Lucian Grainge has had it up to here with generative AI.
“The recent explosive development in generative AI will, if left unchecked, both increase the flood of unwanted content hosted on platforms, and create rights issues with respect to existing copyright law, in the U.S. and other countries, as well as laws governing trademark, name and likeness, voice impersonation, and right of publicity,” he said on UMG’s otherwise upbeat Q1 earnings call this week.
“Much of the latest generative AI is trained on copyrighted material, which clearly violates artists’ and labels’ rights, and would put [streaming] platforms completely at odds with the partnerships with us and our artists,” he continued. “Any way you look at it, this oversupply, whether or not AI-created, is, simply, bad. Bad for artists. Bad for fans. And bad for the platforms themselves.”
UMG, of course, is a high-profile recent victim of generative AI with the release of the pseudonymously credited track “Heart on My Sleeve” featuring AI-generated “vocals” by Drake and The Weeknd, both UMG-affiliated artists. Universal managed to get that track taken down from most streaming platforms, at least temporarily, but the label’s concerns about the use of its content to train AIs predate the appearance of deep-fake Drake.
“We have become aware that certain AI systems might have been trained on copyrighted content without obtaining the required consents from, or paying compensation to, the rightsholders who own or produce the content,” UMG wrote in an email sent to digital service providers (DSPs) in March that was viewed by the Financial Times. “We will not hesitate to take steps to protect our rights and those of our artists.”
Among the steps it wants the DSPs to take is denying generative AI systems access to their application programming interfaces (APIs), so as to limit the AIs’ ability to scrape streaming services’ libraries for training material.
“We have a moral and commercial responsibility to our artists to work to prevent the unauthorized use of their music and to stop platforms from ingesting content that violates the rights of artists and other creators,” UMG said in a statement confirming the email. “We expect our platform partners will want to prevent their services from being used in ways that harm artists.”
Saber-rattling aside, UMG’s focus on using commercial agreements as leverage against wanton use of its content to train AI models reflects an emerging trend among rights owners. Earlier this month, the social news aggregation portal Reddit announced changes to its API policies, including new terms of use for its Data API (formerly known as its public API).
Rather than making access to its Data API and its vast archive of user-generated content freely available to developers, Reddit will now charge commercial enterprise developers for access, and will limit the number of API calls such customers can make. The new terms single out AI training as one of the commercial purposes they cover:
“Except as expressly permitted by this section, no other rights or licenses are granted or implied, including any right to use User Content for other purposes, such as for training a machine learning or AI model, without the express permission of rightsholders in the applicable User Content.”
Because of its size, and its trove of idiomatic, conversational language use, the Reddit archive has been prized by AI developers, including OpenAI and Google, as well as by academic researchers and social listening tool developers. Until now, however, the company had not tried to directly monetize it.
“The Reddit corpus of data is really valuable,” CEO Steve Huffman told the New York Times in an interview. “We don’t need to give all of that value to some of the largest companies in the world for free.”
One likely reason for the focus on commercial agreements to regulate access to copyrighted material by AI models is uncertainty around whether rights-based claims can be sustained.
In UMG’s earnings call, chief digital officer Michael Nash was emphatic on the validity of the label’s copyright position with respect to generative AI.
“First of all, in terms of copyright, to reiterate our very clearly articulated position – and echo Lucian’s excellent summary earlier – sophisticated generative AI that’s enabled by large language models, which is trained on our intellectual property, violates copyright law in several ways,” he said. “Companies have to obtain permission and execute a license to use copyrighted content for AI training or other purposes, and we’re committed to maintaining these legal principles.”
Yet for all the bravado, UMG’s “Heart on My Sleeve” takedown notices to DSPs actually targeted an unlicensed sample from producer Metro Boomin, not any infringement of its copyright in the music of Drake or The Weeknd.
“We own all sounds captured on a sound recording,” Nash insisted. “That is, in fact, the very nature of sound recording copyright and ownership. And here too, depending on the instance, we may also employ name and likeness, voice impersonation, right of publicity protections as well. Specifically, soundalikes which serve to confuse the public as to the source or origin, or which constitute a commercial appropriation of likeness in the form of a distinctive voice, are all clearly illegal.”
Illegal perhaps, but whether soundalikes violate copyright law is still unsettled.
“I think it’s very important that governments around the world interpret and enforce the existing laws correctively – correctly and actively,” Nash said. “Copyright covers all training of AI and copyrighted music regardless of the technical means employed.”
Maybe. But in the meantime, expect to see record companies and other large rights owners leveraging commercial agreements to try to keep their content out of AI training datasets.