Data

Amazon’s secret GitHub data grab

To create powerful AI models, you need mountains of good data. Amazon is going to great lengths to collect this type of valuable information. The company recently told employees to sign up for Microsoft’s GitHub software-development platform and share their accounts so Amazon can scrape data from GitHub more quickly, Business Insider has learned. This is a key step in Amazon’s efforts to train its upcoming in-house AI model.

Source: Amazon’s secret GitHub data grab

Meta to give EU users an opt out for AI data training

Meta will start training its AI models using everyone’s social media posts, though European Union users can opt out, a luxury the rest of the world won’t enjoy. The move, which the Facebook parent detailed in an announcement today, is ostensibly to bring its machine-learning systems to Europe. Meta has so far not included its European userbase in its AI training data, presumably to avoid legal conflict with the continent’s privacy regulations.

Source: Meta to give EU users an opt out for AI data training

Europe’s music tastes becoming more local

The amount of royalties generated by European Union artists on Spotify has tripled in the past six years, and listeners’ tastes are becoming increasingly local. So says Spotify’s inaugural European Union-focused Loud & Clear report, which for the first time breaks down Spotify listener and royalty data specifically for the European Union.

Source: Europe’s music tastes becoming more local, as royalties generated by European Union artists triple in 6 years

How today’s artists find sustainable success in a turbulent music industry

Over the past decade, the music industry’s approach to talent discovery, marketing, and artist careers has become too data-obsessed, near-sighted, and damaging to the industry’s lifeblood. Artists are being sold short and, in turn, this has created a dysfunctional creative-commercial ecosystem.

Source: Stability from chaos: How today’s artists find sustainable success in a turbulent music industry

LinkedIn Begins Labeling AI-Generated Content

LinkedIn has announced it will begin adding labels to in-stream content created by generative AI so its users better understand the posts they are interacting with. To carry out the labeling process, the business-to-business social network is partnering with the Coalition for Content Provenance and Authenticity (C2PA), a project that aims to develop technical standards for certifying the origins of digital content.

Source: MediaDailyNews: LinkedIn Begins Labeling AI-Generated Content

Reddit’s deal with OpenAI will plug its posts into “ChatGPT and new products”

OpenAI has signed a deal for access to real-time content from Reddit’s data API, which means it can surface discussions from the site within ChatGPT and other new products. It’s an agreement similar to the one Reddit signed with Google earlier this year that was reportedly worth $60 million. The deal will also “enable Reddit to bring new AI-powered features to Redditors and mods” and use OpenAI’s large language models to build applications.

Source: Reddit’s deal with OpenAI will plug its posts into “ChatGPT and new products”

The Battle Over Using Journalism to Build AI Models is Just Starting

ChatGPT will tell you that the news is factual, includes language variation and cultural awareness, comprises complex sentence structures, includes quotes that convey real-world conversations, and excels at summarization and condensation. In fact, the news is so valuable to this endeavor that it makes up half of the top 10 sites incorporated into one of Google’s datasets that is being used to train some of the most popular large language models.

Source: The Battle Over Using Journalism to Build AI Models is Just Starting | Nieman Reports

TikTok is adding an “AI-generated” label to watermarked third-party content

TikTok already automatically applies an “AI-generated” tag to content on its platform made using TikTok’s AI tools, and that same label will now apply to content created on other platforms. Now, TikTok will detect when images or videos are uploaded to its platform containing metadata tags indicating the presence of AI-generated content and says it’s the first social media platform to support the new Content Credentials.

Source: TikTok is adding an “AI-generated” label to watermarked third-party content

Can Regulation Deep Six Deepfakes?

The National Institute of Standards and Technology (NIST), a basic science and research arm of the Commerce Department best known, if at all, for tackling knotty challenges like accurately centering quantum dots in photonic chips and developing standard reference materials for measuring the contents of human poop used in medical research and treatments, last week took up the problem of identifying AI-generated and manipulated audio, video, images and text.

Tasked by President Biden’s Executive Order on AI with helping to improve the safety, security and trustworthiness of AI systems, NIST has issued a GenAI Challenge inviting teams of researchers from academia, industry and other research labs to participate in a series of challenges intended to evaluate systems and methods of identifying synthetic content.

This Week in AI: Generative AI and the problem of compensating creators

A recently published research paper co-authored by Boaz Barak, a scientist on OpenAI’s Superalignment team, proposes a framework to compensate copyright owners “proportionally to their contributions to the creation of AI-generated content.” How? Through cooperative game theory.

Source: This Week in AI: Generative AI and the problem of compensating creators | TechCrunch
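
For readers wondering what “cooperative game theory” could look like in practice: one standard tool from that field for splitting a jointly created value is the Shapley value, which averages each participant’s marginal contribution over every order in which participants could join. The excerpt above does not say which method the paper uses, so the sketch below is only an illustration of the general idea. The rights holders "A" and "B", the utility numbers, and the names `shapley_values` and `toy_utility` are all hypothetical.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Average each player's marginal contribution over all join orders.

    `utility` maps a frozenset of players to the value that coalition creates.
    Exhaustive enumeration is only practical for a handful of players.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += utility(with_p) - utility(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical numbers: the "value" of a model trained on each subset of
# two rights holders' works. These are made up for illustration only.
TOY_UTILITY = {
    frozenset(): 0.0,
    frozenset({"A"}): 4.0,
    frozenset({"B"}): 2.0,
    frozenset({"A", "B"}): 9.0,
}


def toy_utility(coalition):
    return TOY_UTILITY[frozenset(coalition)]


shares = shapley_values(["A", "B"], toy_utility)

# Split a hypothetical revenue pool in proportion to each share.
revenue = 100.0
total = sum(shares.values())
payouts = {p: revenue * v / total for p, v in shares.items()}
print(shares)
print(payouts)
```

In this toy setup the shares always add up to the value of the full coalition, which is what makes a proportional payout well defined. A real system would need some way to estimate the utility of each subset of training data, which is the hard part and is not addressed here.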
