How tech giants cut corners to harvest data for AI
ETtech.com

In late 2021, OpenAI faced a supply problem. The artificial intelligence lab had exhausted every reservoir of reputable English-language text on the internet as it developed its latest AI system. It needed more data to train the next version of its technology, and lots of it. So OpenAI researchers created a speech recognition tool called Whisper. It could transcribe the audio from YouTube videos, yielding new conversational text that would make an AI system smarter. Some OpenAI employees discussed...
