Fixing AI’s Market Failure

Large Language Models (LLMs) require large amounts of data for training. Very large. Like the entire textual content of the World Wide Web large. In the case of the largest such models (OpenAI’s GPT, Google’s Gemini, Meta’s LLaMA, France’s Mistral), most of the data used is simply vacuumed up from the internet, if not by the companies themselves then by third-party bot-jockeys like Common Crawl, which provides structured subsets of the data suitable for AI training. Other tranches come from digitized archives like the Books 1, 2 and 3 collections and Z-Library.

In nearly all cases, the hoovering and archive-compiling have been done without the permission or even the knowledge of the creators or rights owners of the vacuumed-up haul.

For Data-Guzzling AI Companies, the Internet Is Too Small

Companies racing to develop more powerful artificial intelligence are rapidly nearing a new problem: The internet might be too small for their plans. Ever more powerful systems developed by OpenAI, Google and others require larger oceans of information to learn from. That demand is straining the available pool of quality public data online at the same time that some data owners are blocking access to AI companies.

Source: For Data-Guzzling AI Companies, the Internet Is Too Small

AI has arrived in Hollywood. It’s a lot more boring than you might think

The new horror film Late Night with the Devil hit theaters late last month amid a lot of really good buzz. It has a 96% on Rotten Tomatoes and has broken box office records for its distributor, IFC Films. It seemed poised to become the indie movie success story of the first half of 2024. But that buzz has curdled quite a bit once word started to circulate that generative AI had been used in the film.

Source: AI has arrived in Hollywood. It’s a lot more boring than you might think

OpenAI deems its voice cloning tool too risky for general release

A new tool from OpenAI that can generate a convincing clone of anyone’s voice using just 15 seconds of recorded audio has been deemed too risky for general release, as the AI lab seeks to minimise the threat of damaging misinformation in a global year of elections. “We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities,” OpenAI said in an unsigned blogpost.

Source: OpenAI deems its voice cloning tool too risky for general release

UK Govt: “Pronounced Inaccuracies” in Press Reports on IP-Related Matters

A study on emerging public perceptions of intellectual property in UK media has found that there are “pronounced inaccuracies in the reporting on IP related matters in the UK Press.” An initial review published by the UK’s Intellectual Property Office notes that inaccurate reporting may be due to a “lack of understanding.” Further investigation would be required to find out the “cause and extent” and the subsequent impact on IP rights as understood by the public.

Source: UK Govt: “Pronounced Inaccuracies” in Press Reports on IP-Related Matters | TorrentFreak

Generative AI ‘FOMO’ is driving tech heavyweights to invest billions in startups

Tech giants aren’t doing much acquiring these days, due mostly to an unfavorable regulatory environment. But they’re finding other ways to spend billions of dollars on the next big thing. Amazon’s $2.75 billion investment in artificial intelligence startup Anthropic, announced this week, was its largest venture deal and the latest example of the AI gold rush that’s prompting the biggest tech companies to fling open their wallets.

Source: Generative AI ‘FOMO’ is driving tech heavyweights to invest billions of dollars in startups

Despite the Panic, Generative AI Won’t Be on the Big Screen Any Time Soon

No one confused early Sora demonstrations with art, but what startled filmmakers is these semi-professional and quasi-lifelike images introduced a doomsday scenario: Studios will use this rapidly evolving tech to replace them. “It’s a fraught time because the messaging that’s out there is not being led by creators,” said producer Diana Williams, a former Lucasfilm executive. “It’s really being led by the business people and by publicly owned companies.”

Source: Despite the Panic, Generative AI Won’t Be on the Big Screen Any Time Soon

OpenAI built a voice cloning tool, but you can’t use it… yet

As deepfakes proliferate, OpenAI is refining the tech used to clone voices — but the company insists it’s doing so responsibly. Today marks the preview debut of OpenAI’s Voice Engine, an expansion of the company’s existing text-to-speech API. Under development for about two years, Voice Engine allows users to upload any 15-second voice sample to generate a synthetic copy of that voice.

Source: OpenAI built a voice cloning tool, but you can’t use it… yet | TechCrunch

‘It’s very easy to steal someone’s voice’: how AI is affecting video game actors

Just as in film and TV, only more so, AI represents a gathering storm for video game actors. Some studios are experimenting with tools that can clone voices, alter voices and generate audio from text. In interactive, multi-choice games, this can generate a potentially endless number of characters and conversations – and is far more efficient than asking performers to record huge quantities of dialogue.

Source: ‘It’s very easy to steal someone’s voice’: how AI is affecting video game actors

‘Machine Unlearning’ May Be the Solution to Problematic AI Data

With the breakneck speed at which artificial intelligence has been progressing, there have undoubtedly been some stumbles. One of the biggest issues has been the use of copyrighted materials to train AI models as well as images that may be inappropriate or raise privacy issues. But a technique referred to as “machine unlearning” developed by researchers at the University of Texas at Austin may offer a solution to those concerns.

Source: ‘Machine Unlearning’ May Be the Solution to Problematic AI Data
