OpenAI and Google reportedly used YouTube transcripts to train their AI models

Training artificial intelligence models requires a lot of data to help them better understand the context of queries and ultimately provide better responses. In the constant search for more data, both OpenAI and Google have turned to using YouTube videos, created by others, to train their large language models (LLMs), The New York Times reported over the weekend, citing people who claim to have knowledge of the companies' activities.

In 2021, OpenAI developed Whisper, a speech recognition tool, and used it to transcribe audio from more than 1 million hours of YouTube videos. Those transcripts were then used to train GPT-4, according to the Times' sources.

Google, meanwhile, also transcribed YouTube videos, according to the report. What's more, the search giant changed its terms of service in 2023 to make it easier to sweep up public Google Docs, Google Maps restaurant reviews, and other publicly available content for use in its AI models, according to the Times.

It's no secret that AI models require significant troves of data to operate efficiently. More data, including text, audio, and videos, gives models the ability to understand human context, human interaction, and other critical communication details that make them more effective.

However, there's increasing tension between the companies developing those models and the content creators. What content, if any, should be permissible to use in training AI models? In a growing number of cases, news outlets, websites, and content creators themselves are calling on OpenAI, Google, Meta, and other tech companies to pay for access to their content before it can be used to train LLMs.

In some cases, model makers have complied and signed agreements with companies, including Reddit and Stack Overflow, to get access to user data. In other cases, not so much.

According to The New York Times' report, for instance, OpenAI's alleged transcription of more than 1 million hours of YouTube videos may run afoul of the terms of service of Google-owned YouTube, which prohibit third-party applications from using its videos for "independent" purposes. Additionally, the companies' alleged decisions to transcribe videos may run afoul of copyright laws, since creators who upload videos to YouTube still retain the copyright to the content they create.

To be clear, the Times report cannot be independently verified. Also, neither Google nor OpenAI has acknowledged scraping data illegally. We do know, however, that the companies are running out of ways to access more content. What's worse, a Times source said that it's possible tech companies will run out of content to ingest into their models by 2026.

What then? It's entirely possible, and perhaps likely, that the tech companies will move to sign licensing agreements with content creators, media outlets, and even musical artists to access their creations. It's also possible they will further change their terms of service or, worse, find ways to skirt privacy laws to access the data they currently can't.

It's clear that the amount of data companies like Meta, Google, and OpenAI will need in the coming years will only increase. It's critical that as they access that data, they do so in a way that doesn't harm the people who created the content in the first place.
