Humans have nuanced awareness. They draw on their experiences to make inferences and logical decisions. AI models are, however, only as good as their training data. An AI model’s accuracy doesn’t entirely depend on the underlying algorithms’ technical sophistication or the amount of data processed. Instead, accurate AI performance depends on trustworthy, high-quality data during training and analytical performance tests.
Data
How Google forced publishers to accept AI scraping as price of appearing in search
Google considered allowing publishers to opt out of their data being used for AI grounding and still appear in search results but described it as a “hard red line.” New documents disclosed in the remedies portion of an antitrust trial into Google’s search monopoly in the US reveal that the tech giant preferred not to give publishers the option as it was “evolving into a space for monetization.”
Source: How Google forced publishers to accept AI scraping as price of appearing in search
SoundCloud says it isn’t using your music to train generative AI tools
While the company says it hasn’t used user-created content for model training, it doesn’t rule out the possibility that it will in the future. A spokesperson for SoundCloud provided the following statement: “SoundCloud has never used artist content to train AI models, nor do we develop AI tools or allow third parties to scrape or use SoundCloud content from our platform for AI training purposes.”
Source: SoundCloud says it isn’t using your music to train generative AI tools
WMG launches Pulse app for artists to track streams, earnings, and fan engagement
Warner Music Group has launched a new app called WMG Pulse, described as a “powerful new platform that puts clear, meaningful insights into the hands of artists and songwriters”. The app lets WMG artists, songwriters, their managers, and teams access real-time information about streaming performance, fan engagement, and earnings across various platforms.
Merlin Officially Joins the Music Fights Fraud Alliance
Months after bringing on its first director of content integrity, Merlin has officially joined the Music Fights Fraud Alliance (MFFA). Established in 2023, Music Fights Fraud, in keeping with its name, bills itself as “a global task force aimed at eradicating streaming fraud.” And in pursuit of the objective, MFFA is said to collaborate with the 23-year-old National Cyber-Forensics and Training Alliance (NCFTA) on “a shared database of identified fraud markers.”
Source: Merlin Officially Joins the Music Fights Fraud Alliance
Only 17% of Music Creator College Students Familiar With MLC
MusicAnswers has spent two years conducting a survey of college student music creators to better understand their experience with song registration agencies, specifically the Mechanical Licensing Collective (MLC). The results are pretty grim, with a majority of college students being unaware of the MLC’s purpose or what it does in the industry.
Source: Only 17% of Music Creator College Students Familiar With MLC
Wikipedia is giving AI developers its data to fend off bot scrapers
Wikipedia is attempting to dissuade artificial intelligence developers from scraping the platform by releasing a dataset that’s specifically optimized for training AI models. The Wikimedia Foundation announced on Wednesday that it had partnered with Kaggle — a Google-owned data science community platform that hosts machine learning data — to publish a beta dataset of “structured Wikipedia content in English and French.”
Source: Wikipedia is giving AI developers its data to fend off bot scrapers
Netflix is revamping search with AI to improve discovery
Netflix is building a new search experience aimed at improving discovery, and it’s going to use AI to do it, the company’s CEO Greg Peters said during its first-quarter results conference call. Peters said Netflix is working on “interactive search that’s based on generative technologies” to help people find different titles.
Source: Netflix is revamping search with AI to improve discovery | TechCrunch
Pex acquired by copyright protection and content monetization company Vobile
Los Angeles-based Pex, an audio content identification platform, has been acquired. Pex’s new owner is a company called Vobile, which offers digital content protection and transaction services for entertainment companies, platforms, sports leagues, music labels, and publishers. Vobile has confirmed that Pex COO Amadea Choplin has joined the company as Head of Music Business, while founder Rasty Turek, formerly CEO, will act as a consultant to Vobile going forward.
Source: Pex acquired by copyright protection and content monetization company Vobile
‘Catastrophic overtraining’ could harm large language AI models
Researchers from Carnegie Mellon, Stanford, Harvard, and Princeton are challenging one of AI development’s accepted core beliefs – that the more pre-training data, the better the performance. As reported by HPCwire, a new paper discusses the concept of “catastrophic overtraining,” whereby extended pre-training can harm a model’s performance after fine-tuning.