Reddit sues Anthropic for allegedly using its data to train AI without permission
Reddit's content is a gold mine for teaching AI to sound human — and Reddit isn’t about to let it go for free.

Reddit sued the artificial intelligence company Anthropic, maker of the chatbot Claude, this week, accusing it of accessing Reddit’s data without permission to train AI models.
Large language models are hungry for two things: compute power (that’s all those chips Nvidia is selling) and data — in this case, that’s potentially millions of user posts across Reddit’s thousands of niche communities. That kind of content is a gold mine for teaching AI to sound human, and Reddit isn’t about to let it go for free.
Richard Lachman ran into some problems hooking up a new computer monitor recently, so he took to the internet for advice.
“I'm getting the vendor’s description — not helpful. I'm getting people who have very complex 15 layers deep on some help site, and I'm not finding what I need because the quality is not there,” he said.
Lachman, who’s a professor of media at Toronto Metropolitan University, finally found what he needed on Reddit.
“It's kind of a meme at this point of saying I have some weird, obscure problem, I will do a search, and someone, three years ago in some other part of the world, had exactly the same problem,” he said. “So Reddit really is a store of highly specialized knowledge that is updated very frequently.”
That’s made “Reddit” one of the most popular search keywords as users seek authentic human voices on an internet full of advertising and SEO drivel.
It’s also made the platform attractive to AI companies, said Chirag Shah, a computer science professor at the University of Washington.
“It's not just people posting things, but also people reacting to it,” he said.
So Reddit can teach AI not only facts and information, but how to communicate, too.
“What things get positive reaction, negative reaction. What things get discussed, how people discuss it,” said Shah. “This allows a model to be a lot more fine-tuned to what people may be looking for.”
And Reddit has caught on to this, said Baird tech strategist Ted Mortonson. A couple of years ago, before the company went public, it restricted free public access to its data.
“I think upper management just realized that, ‘Hey, we're sitting on a gold mine here,’” he said.
Last year, Reddit made deals with Google and OpenAI to license that data for more than $100 million.
It’s a big boost in revenue for a platform that, unlike Facebook or Instagram, has struggled to monetize its user base with advertising.