Detecting AI-Generated Text
There are no reliable ways to distinguish text written by a human from text written by a large language model. OpenAI writes:
Do AI detectors work?
- In short, no. While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content.
- Additionally, ChatGPT has no “knowledge” of what content could be AI-generated. It will sometimes make up responses to questions like “did you write this [essay]?” or “could this have been written by AI?” These responses are random and have no basis in fact.
- To elaborate on our research into the shortcomings of detectors, one of our key findings was that these tools sometimes suggest that human-written content was generated by AI.
- When we at OpenAI tried to train an AI-generated content detector, we found that it labeled human-written text like Shakespeare and the Declaration of Independence as AI-generated.
- There were also indications that it could disproportionately impact students who had learned or were learning English as a second language and students whose writing was particularly formulaic or concise.
- Even if these tools could accurately identify AI-generated content (which they cannot yet), students can make small edits to evade detection.
There is some good research on watermarking LLM-generated text, but the watermarks are not generally robust.
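To make the watermarking idea concrete, here is a minimal sketch of the red/green-list scheme common in that research: at each step, the previous token pseudo-randomly splits the vocabulary into a “green” and a “red” half, the generator biases its sampling toward green tokens, and the detector runs a statistical test on the green-token count. The toy vocabulary, the SHA-256 seeding, and the 50/50 split below are illustrative assumptions, not any particular vendor’s scheme.

```python
import hashlib
import math

# Toy vocabulary; a real scheme operates over the model's full token vocabulary.
VOCAB = [f"tok{i}" for i in range(1000)]
GREEN_FRACTION = 0.5  # fraction of the vocabulary marked "green" at each step

def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the
    previous token. A watermarking generator would prefer green tokens."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] / 255.0 < GREEN_FRACTION

def green_z_score(tokens: list[str]) -> float:
    """z-score of the observed green-token count against the null hypothesis
    that the text was written without knowledge of the green lists."""
    n = len(tokens) - 1
    greens = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (greens - expected) / std
```

Human text should score near zero, while heavily watermarked text scores many standard deviations above it. The sketch also shows why the watermarks are fragile: paraphrasing or swapping tokens breaks the prev-token/token pairs the detector counts, dragging the z-score back toward zero.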
I don’t think the detectors are going to win this arms race.
Clive Robinson • September 19, 2023 8:44 AM
@ Bruce, ALL,
First note, it’s written by OpenAI who have a significant interest in the answer being “NO”.
Secondly, note that what LLMs produce is a variation on an average mimicry…
That is, they end up following “the common style” or similar. Thus both the creation and the detection are always going to be probabilistic, not deterministic.
I’ve seen a reasonable amount of LLM generated text, and it usually triggers my “hinky feeling” based on lowest common denominator “style”.
Most of us can recognize “ad-prose” or “PR-prose” by its general vanilla style.
However, when human produced, the vanilla style gets small occasional cracks/defects where the human’s personal style breaks through.
This is currently missing in LLM text due to regression to the mean removing any such cracks/defects.
Now that I’ve mentioned it, I suspect OpenAI et al. will run around and make adjustments to put in such style cracks/defects.
Which is why I think Bruce’s conclusion, that the detectors are not going to win this arms race, would probably be true if left long enough. But as in the ECM/ECCM arms race of the 1980s and 1990s, two effects will kill the race, leaving a Pyrrhic victor of sorts:
1, The exponential rise in design cost.
2, The exponential rise in product resource cost.
Thus the question falls to what is already known to be a “con game”: the plagiarism detectors used in education, which have a really bad track record with both false positives and false negatives…
I got pulled up for “plagiarism” by one, and when I insisted on having it revealed exactly what had been matched… it turned out I had been accused of plagiarising myself… Which has given me the idea of maybe one day asking an LLM to generate a piece “in the style of Clive Robinson”…