No reliable way to detect AI-generated text, boffins sigh

The popularity of word salad prepared by large language models (LLMs) like OpenAI's ChatGPT, Google's Bard, and Meta's LLaMA has prompted academics to look for ways to detect machine-generated text.

Sadly, existing detection schemes may not be much better than flipping a coin, raising the possibility that we're destined to ingest statistically composed copy as a consequence of online content consumption.

Five computer scientists from the University of Maryland in the US - Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, and Soheil Feizi - recently looked into detecting text generated by large language models.

Their findings, detailed in a paper titled Can AI-Generated Text be Reliably Detected?, can be predicted using Betteridge's law of headlines: any headline that ends in a question mark can be answered by the word no.

Citing several purported detectors of LLM-generated text, the boffins observe, "In this paper, we show both theoretically and empirically, that these state-of-the-art detectors cannot reliably detect LLM outputs in practical scenarios."

LLM output detection thus, like CAPTCHA puzzles [PDF], seems destined to fail as machine-learning models continue to improve and become capable of mimicking human output.

The boffins argue that the unregulated use of these models - which are now being integrated into widely used applications from major technology companies - has the potential to lead to undesirable consequences, such as sophisticated spam, manipulative fake news, inaccurate summaries of documents, and plagiarism.

It turns out that simply paraphrasing the text output of an LLM - something that can be done with a word substitution program - is often enough to evade detection. This can degrade the accuracy of a detector from a baseline of 97 percent to anywhere from 57 to 80 percent - not much better than a coin toss.
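A minimal sketch of the word-substitution idea, with an invented synonym table; the attacks in the paper use a learned "light paraphraser" rather than a lookup like this, but the effect is the same - the meaning survives while the surface tokens a detector relies on change:

```python
# Toy paraphraser: swap words for synonyms while preserving meaning.
# The synonym table is invented for this example.
SYNONYMS = {
    "utilize": "use",
    "commence": "begin",
    "numerous": "many",
    "demonstrate": "show",
}

def paraphrase(text: str) -> str:
    """Replace each word found in the table with its synonym."""
    return " ".join(SYNONYMS.get(w.lower(), w) for w in text.split())

print(paraphrase("numerous studies demonstrate this"))
# many studies show this
```

A detector trained on the statistical fingerprints of the original wording now sees different tokens, even though a human reader gets the same message.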

"Empirically, we show that paraphrasing attacks, where a light paraphraser is applied on top of the generative text model, can break a whole range of detectors, including the ones using the watermarking schemes as well as neural network-based detectors and zero-shot classifiers," the researchers explained in their paper.

In an email to The Register, Soheil Feizi, assistant professor of computer science at UMD College Park and one of the paper's co-authors, explained, "The issue of text watermarking is that it ignores the complex nature of the text distribution. Suppose the following sentence S that contains misinformation is generated by an AI model and it is 'watermarked,' meaning that it contains some hidden signatures so we can detect this is generated by the AI."

"This was actually generated by a watermarked large language model OPT-1.3B," said Feizi. "Now consider a paraphrased version of the above sentence:"

"It contains the same misinformation but this goes undetected by the watermarking method," said Feizi.

"This example points to a fundamental issue of text watermarking: if the watermark algorithm detects all other sentences with the same meaning to an AI-generated one, then it will have a large type-I error: it will detect many human-written sentences as AI-generated ones; potentially making many false accusations of plagiarism."

"On the other hand," Feizi added, "if the watermark algorithm is limited to just AI-generated text, then a simple paraphrasing attack, as we have shown in our paper, can erase watermarking signatures meaning that it can create a large type-II error. What we have shown is that it is not possible to have low type I and II errors at the same time in practical scenarios."

And reversing the application of paraphrasing to a given text sample doesn't really help.

"Suppose reversing paraphrasing is possible," said Vinu Sankar Sadasivan, a computer science doctoral student at UMD College Park and one of the paper's authors, in an email to The Register. "There is a crucial problem in this for detection. A detector should only try to reverse paraphrasing if the sentence is actually generated by AI. Else, reversing paraphrasing could lead to human text falsely detected as AI-generated."

Sadasivan said there are so many ways a sentence can be paraphrased that it's not possible to reverse the process, particularly if you don't know the source of the original text.

He explained that watermarking text is harder than watermarking images: it requires the model to output words in a specific pattern, imperceptible to humans, that a detector can look for.

"These patterns can be easily removed using paraphrasing attacks we propose in our paper," said Sadasivan. "If they can't be, it's very likely a human-written text is falsely detected as watermarked by a watermarking-based detector."

It gets worse. The boffins describe "a theoretical impossibility result indicating that for a sufficiently good language model, even the best-possible detector can only perform marginally better than a random classifier."
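The bound itself is short. As this summary reads the paper, for any detector D the area under its ROC curve is capped by the total variation distance between the machine text distribution M and the human text distribution H:

```latex
% Hedged restatement of the paper's impossibility bound: the AUROC of
% any detector D is limited by TV(M, H), the total variation distance
% between machine-generated and human-written text distributions.
\mathrm{AUROC}(D) \;\le\; \frac{1}{2}
    + \mathrm{TV}(\mathcal{M}, \mathcal{H})
    - \frac{\mathrm{TV}(\mathcal{M}, \mathcal{H})^{2}}{2}
```

As the model improves, TV(M, H) shrinks toward zero and the bound collapses to 1/2 - the AUROC of a coin flip.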

Asked whether there's a path to a more reliable method of detecting LLM-generated text, Feizi said there isn't one.

"Our results point to the impossibility of AI-generated text detection problems in practical scenarios," Feizi explained. "So the short answer is, unfortunately, no."

The authors also observe that LLMs protected by watermarking schemes may be vulnerable to spoofing attacks, in which malicious individuals infer the watermarking signatures and add them to text of their own, so that whoever is identified as the publisher of that text is falsely accused of being a plagiarizer or spammer.
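A toy sketch of why spoofing is plausible: an attacker who can query a watermarked model repeatedly can estimate which word pairs the watermark favors, then compose their own text from those pairs so it scores as "watermarked". Everything below, including the pair statistics, is invented for illustration.

```python
# Spoofing sketch: tally (previous word, word) pairs seen in watermarked
# output, then prefer the pairs the watermark appears to favor.
from collections import Counter

observed_pairs = Counter()

def observe(watermarked_text: str) -> None:
    """Record adjacent word pairs from one watermarked sample."""
    tokens = watermarked_text.split()
    for prev, cur in zip(tokens, tokens[1:]):
        observed_pairs[(prev, cur)] += 1

def likely_green(prev: str, candidates) -> str:
    """Pick the candidate most often seen after `prev` in watermarked text."""
    return max(candidates, key=lambda c: observed_pairs[(prev, c)])

# Invented "watermarked" samples the attacker has collected:
observe("the cat sat on the mat")
observe("the cat ran on the mat")
print(likely_green("the", ["cat", "dog"]))  # cat
```

Text assembled from such favored pairs would trip a signature-counting detector, pinning machine authorship on whoever is blamed for publishing it.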

"I think we need to learn to live with the fact that we may never be able to reliably say if a text is written by a human or an AI," said Feizi. "Instead, potentially we can verify the 'source' of the text via other information. For example, many social platforms are starting to widely verify accounts. This can make the spread of misinformation generated by AI more difficult." ®
