Major publishers sue Perplexity AI for scraping without paying

Major US news publishers Dow Jones & Co and NYP Holdings have sued AI search engine startup Perplexity for scraping their content without paying for it.

The lawsuit, filed on behalf of The Wall Street Journal and its sister tabloid New York Post by their parent company News Corporation, alleges two counts of copyright infringement and one of false designation of origin and dilution of trademarks. The plaintiffs accuse the AI biz of stealing the hard work of journalists to feed the data requirements of its training models. News Corp's CEO Robert Thomson claimed this could be the first of many such lawsuits against AI developers.

"The perplexing Perplexity has willfully copied copious amounts of copyrighted material without compensation, and shamelessly presents repurposed material as a direct substitute for the original source. Perplexity proudly states that users can 'skip the links' - apparently, Perplexity wants to skip the check," he told The Register in a statement.

"We applaud principled companies like OpenAI, which understands that integrity and creativity are essential if we are to realize the potential of Artificial Intelligence. Perplexity is not the only AI company abusing intellectual property and it is not the only AI company that we will pursue with vigor and rigor. We have made clear that we would rather woo than sue - but, for the sake of our journalists, our writers and our company, we must challenge the content kleptocracy."

News Corp isn't against sharing its intellectual property to train AI systems - but it wants the money upfront. In May it inked a deal with the aforementioned OpenAI for just this purpose, with a reported price tag of more than $250 million. The machine learning juggernaut also has similar deals in place with Reddit and Stack Overflow.

According to court documents [PDF] filed in the US District Court for the Southern District of New York, News Corp first contacted Perplexity about the matter in July but received no response. It wants $150,000 for every proven infringement - which, if enforced, could severely impact or even bankrupt the startup.

The news giant isn't just peeved at the data scraping itself, but also that Perplexity doesn't cite its sources. It claimed that Perplexity's AI "answer engine" lets users "skip the links" and that this deprives publishers of direct revenue. Even worse, it gets things wrong.

"In addition to using Plaintiffs' copyrighted work to develop a substitute product that reproduces or imitates Plaintiffs' original content, Perplexity also harms Plaintiffs' brands by falsely attributing to Plaintiffs certain content that Plaintiffs never wrote or published," the lawsuit claims.

"Not infrequently, if Perplexity is asked about what Plaintiffs' publications reported, Perplexity 'answers' with false information. AI developers euphemistically call these factually incorrect outputs 'hallucinations.' Perplexity's hallucinations can falsely attribute facts and analysis to content producers like Plaintiffs, sometimes citing an incorrect source, and other times simply inventing and attributing to Plaintiffs fabricated news stories."

One case cited is an August 2024 New York Post article about European attempts to "silence great Americans like Elon Musk." It claims Perplexity, when asked for a summary, copied the first 139 words of the piece, and then added five more paragraphs of factually incorrect information.

On the data scraping side, there is a mechanism for website operators to opt out of adding their content to the voracious maw of AI training databases: the robots.txt file, implemented by Google, OpenAI, and Cloudflare. While Perplexity CEO Aravind Srinivas has claimed his business does respect the do-not-scrape command, some third parties it uses might not be so ethical.
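For readers unfamiliar with how that opt-out works in practice, here is a minimal sketch using Python's standard-library robots.txt parser. The crawler name ExampleAIBot, the example.com URL, and the file contents are all hypothetical and are not taken from the lawsuit or from any publisher's actual robots.txt. Note that the file is purely advisory: it only has an effect if the crawler chooses to check it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.com/news/story"
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        print(agent, "allowed:", may_fetch(ROBOTS_TXT, agent, story))
```

A well-behaved scraper would run a check like this before every request. Nothing in the protocol stops one that skips it, which is why disputes over compliance end up turning on what crawlers actually did.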

Perplexity had no comment at the time of going to press. ®
