Detecting AI-Generated Text

There are no reliable ways to distinguish text written by a human from text written by a large language model. OpenAI writes:

Do AI detectors work?

  • In short, no. While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content.
  • Additionally, ChatGPT has no “knowledge” of what content could be AI-generated. It will sometimes make up responses to questions like “did you write this [essay]?” or “could this have been written by AI?” These responses are random and have no basis in fact.
  • To elaborate on our research into the shortcomings of detectors, one of our key findings was that these tools sometimes suggest that human-written content was generated by AI.
    • When we at OpenAI tried to train an AI-generated content detector, we found that it labeled human-written text like Shakespeare and the Declaration of Independence as AI-generated.
    • There were also indications that it could disproportionately impact students who had learned or were learning English as a second language and students whose writing was particularly formulaic or concise.
  • Even if these tools could accurately identify AI-generated content (which they cannot yet), students can make small edits to evade detection.

There is some good research on watermarking LLM-generated text, but the watermarks are not generally robust.
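One prominent line of that research is the “green list” approach of Kirchenbauer et al., where the sampler is nudged toward a pseudorandom subset of the vocabulary and a detector counts how often the text lands on that subset. Here is a toy sketch of the detection side only; the hash scheme, split fraction, and test sentence are illustrative assumptions, not any deployed system:

```python
# Toy "green list" watermark detector in the style of Kirchenbauer et
# al. (2023). Each token is pseudorandomly assigned to a green list
# seeded by the previous token; a watermarking generator would have
# preferred green tokens, so watermarked text shows a green-count
# z-score well above zero.
import hashlib
import math

GAMMA = 0.5  # assumed fraction of the vocabulary on the green list

def is_green(prev_token: str, token: str) -> bool:
    """Pseudorandom green-list membership, seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GAMMA

def watermark_z_score(tokens: list[str]) -> float:
    """z-score of the green count against the unwatermarked null (rate GAMMA)."""
    n = len(tokens) - 1
    greens = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return (greens - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

print(watermark_z_score("the quick brown fox jumps over the lazy dog".split()))
```

The sketch also shows why robustness is hard: the statistic hangs off adjacent token pairs, so paraphrasing or even light editing re-rolls most of the pairs and erases the signal.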

I don’t think the detectors are going to win this arms race.

Posted on September 19, 2023 at 7:08 AM • 34 Comments

Comments

Clive Robinson September 19, 2023 8:44 AM

@ Bruce, ALL,

“There are no reliable ways to distinguish text written by a human from text written by a large language model. OpenAI writes”

First, note that it’s written by OpenAI, who have a significant interest in the answer being “NO”.

Secondly, note that what LLMs produce is a variation on average mimicry…

That is, they end up following “the common style” or similar. Thus both the creation and the detection are always going to be probabilistic, not deterministic.

I’ve seen a reasonable amount of LLM-generated text, and it usually triggers my “hinky feeling”, based on its lowest-common-denominator “style”.

Most of us can recognize “ad-prose” or “PR-prose” by its general vanilla style.

However, when human produced, the vanilla style gets small occasional cracks/defects where the human’s personal style breaks through.

This is currently missing in LLM text due to regression to the mean removing any such cracks/defects.

Now that I’ve mentioned it, I suspect OpenAI et al. will run around and make adjustments to put in such style cracks/defects.

Which is why I think,

“I don’t think the detectors are going to win this arms race.”

Would probably be true if left long enough. But as in the ECM/ECCM arms race of the 1980s and 1990s, two effects will kill the race, leaving a Pyrrhic victor of sorts:

1, The exponential rise in design cost.
2, The exponential rise in product resource cost.

Thus the question falls to what is already known to be a “con game”: the plagiarism detectors used in education, which have a really bad track record with false positives and false negatives…

I got pulled up for “plagiarism” by one, and when I insisted it be revealed what I had supposedly copied… it was found I had been accused of plagiarizing myself… Which has given me the idea of maybe one day asking an LLM to generate a piece “in the style of Clive Robinson”…

Keith Douglas September 19, 2023 8:52 AM

Self-plagiarism is an interesting item, because it is so vexed already. Some authors are accused of writing the same book or paper n times, and that this is a form of plagiarism. I for one really hate papers in journals that cite lots of “unpublished manuscripts” – particularly the authors’ own. (This is an example of where anonymous review works against you!)

As for the arms race, I imagine there will be opposites – races to show that something was AI when it wasn’t, for example, or tools to help students and scholars defend against accusations. The false accusations, I think, will bother me the most, but I am unsure.

My own profession of application security is pulled in all directions by this; in some of my practices (e.g., code review) it really doesn’t matter where the code is coming from, so long as the introducer(s) can be accountable for it. In others, however, like when setting up CI, I think some supply chain security is in order. And that’s darn hard, for the reasons in the article.

jones September 19, 2023 9:43 AM

The only way these detectors would work is if you had an unedited output and also knew the exact version of the model used to generate the text, in which case, it would be possible to project the text back into the latent space and search for a match.

Nevertheless, for many use cases, these models do have a distinct style that can be detected by attentive humans. They often write text lacking in specific detail, with perfect grammar, and with very evenly measured paragraphs that are all about the same length.
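This “exact model” requirement can be made concrete without a literal latent-space search: if you hold the model, you can score how probable it finds the text, since low-temperature generations tend to sit at suspiciously high likelihood. A minimal sketch, assuming the Hugging Face transformers package, with GPT-2 standing in for the known model:

```python
# Perplexity of a text under a known model: a rough "could this model
# have written it?" signal. Low perplexity relative to a human baseline
# is evidence of generation, not proof.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return float(torch.exp(loss))

print(perplexity("The quick brown fox jumps over the lazy dog."))
```

Even then, a few manual edits or a different sampling temperature shifts the score, which is exactly the evasion problem the post describes.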

Anonymous September 19, 2023 10:13 AM

@jones

Nevertheless, for many use cases, these models do have a distinct style that can be detected by attentive humans. They often write text lacking in specific detail, with perfect grammar, and with very evenly measured paragraphs that are all about the same length.

The heuristics you’ve described here are so prone to false positives as to be useless.

GregW September 19, 2023 12:23 PM

Besides the Google watermarking research mentioned by Bruce, there is work on watermarking being done by OpenAI that is not covered in their linked “school FAQ” and is not yet publicly released.

Specifically, Scott Aaronson (a CS professor at UT Austin with long expertise in computational complexity and quantum computing; see https://scottaaronson.blog/) has spent the last year away from academia working on the watermarking problem for OpenAI, and has published some slides on his research findings and approach here: https://www.scottaaronson.com/talks/llmwatermark.pptx
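The slides describe, at a high level, keying the sampler with a pseudorandom function so the output distribution is unchanged but anyone holding the key can verify the watermark. Here is a toy sketch of that “Gumbel trick” idea; the key, context window, and token-level PRF are assumptions for illustration, not OpenAI’s scheme:

```python
# Exponential-minimum sampling watermark, as sketched in Aaronson's
# talks: a keyed PRF maps (recent context, candidate token) to r in
# [0, 1), and the sampler picks the token maximizing r ** (1 / p).
# This is distributionally identical to ordinary sampling, but the
# key holder can later check that chosen tokens have r near 1.
import hashlib
import math

KEY = b"hypothetical-watermark-key"

def prf(context: tuple[str, ...], token: str) -> float:
    h = hashlib.sha256(KEY + "|".join(context).encode() + b"#" + token.encode())
    return int.from_bytes(h.digest()[:8], "big") / 2**64

def sample(context: tuple[str, ...], probs: dict[str, float]) -> str:
    """Watermarked sampling: deterministic given the key and context."""
    return max(probs, key=lambda t: prf(context, t) ** (1 / probs[t]))

def detect_score(tokens: list[str], window: int = 3) -> float:
    """Average of -ln(1 - r) over the text: about 1 for unwatermarked
    text, noticeably larger for watermarked text."""
    score = 0.0
    for i in range(window, len(tokens)):
        r = prf(tuple(tokens[i - window:i]), tokens[i])
        score += -math.log(1 - r)
    return score / max(1, len(tokens) - window)

print(detect_score("a watermark only the key holder can verify".split()))
```

Note the scheme only works while the key stays secret and the text is not paraphrased, which loops back to the robustness worry in the post.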

Scott Lewis September 19, 2023 12:59 PM

@Clive

It’s a valid point to question whether every LLM-generated text you’ve encountered would actually trigger a “hinky feeling.” If there are instances where LLM-generated text is written well enough to not evoke that feeling, it’s possible that you may be unaware of its machine-generated nature. Therefore, it becomes challenging to definitively prove that all LLM-generated text would indeed trigger your “hinky feeling.”

Morley September 19, 2023 1:11 PM

I guess we’ll have to reduce our trust in anonymous information and fund creative people differently.

I don’t think the content will be regulate-able in the future.

lurkerl September 19, 2023 2:44 PM

Shakespeare? Declaration of Independence? Nobody, but nobody, writes like that today. Of course it must have been an AI.

No, seriously, using a bad example like that switches on my “what else are they fudging?” filter.

Anonymous September 19, 2023 3:55 PM

Well, Shakespeare and the Declaration of Independence are likely to be in most AI training sets. It’s not so surprising they are flagged as something an AI would write.

Brad Templeton September 19, 2023 5:17 PM

I differ, Bruce, because the detectors get to improve after the fact. To use an AI to write something where there is a prohibition on doing so (such as a school essay) you must not just escape detection today, but years into the future. You must bet that there won’t be a detector in the future that, since it contains within it a copy of the AIs of the present, can spot its own work.

More to the point, this doesn’t even have to be a certainty; you just have to fear that it’s probable and that the risk of being caught out is high, as long as there are consequences you can feel in the future, such as loss of credit for courses, loss of degree, professional shaming, etc.

Would you make the bet that no detector in the future will ever have confidence in its assessment of what you did?

Allie Hancock September 19, 2023 8:10 PM

Watermarking reminds me of the idea, from a certain Schwarzenegger film, of marking an Nth-generation human clone with N dots under their eyelid. I think it’s about as realistic.

It might be a fun idea to have a dewatermarking contest. Perhaps, as in a recent contest where an AI was “tricked” into giving up a credit card number, someone could convince the AI to omit or remove its own watermark—or remove a competitor’s.

The whole concept only works at all, though, if everyone’s using the same handful of centralised AI services, and they’re all on board. Maybe some Chinese or Russian company will decide not to do it, or to offer unwatermarked output for an extra fee. If you’re running the AI yourself, you just comment out the watermarking code (or NOP it out—I’m sure more than a few of Bruce’s readers have experience with that). “Robustness” doesn’t much matter.

Clive Robinson September 19, 2023 8:23 PM

@ Steve, ALL,

“Recently, bloggers Amy Castor and David Gerard quoted someone…”

Look back and you will find I’ve already said it on this blog before them and those they quote, as have others…

So as I’ve said before you just need to read here first 😉

Clive Robinson September 19, 2023 8:56 PM

@ Scott Lewis, ALL,

“If there are instances where LLM-generated text is written well enough to not evoke that feeling, it’s possible that you may be unaware of its machine-generated nature.”

As I said, it’s probabilistic, not deterministic, and it’s based on style.

The point is that currently LLMs do little more than average toward the lowest common denominator. So: synthetic vanilla.

Humans, however, even when trying to be vanilla, are actually not very good at sinking to that level all the time. So the individual’s style breaks through in places.

Now that I’ve said it, I suspect people at OpenAI etc. will build individual style in to give their output that lift.

However, “spotting style telltales” is somewhat similar to doing malware code attribution: whilst it’s hard to tell who the tells belong to, actually spotting the tells is a lot easier.

Can faux-AI people brush up their LLMs? Well, we already know that what are allegedly “Mechanical Turks” are quite often low-paid workers, earning 50¢ an answer or less, in various nations to the north, east, and south of the eastern end of the Mediterranean, past Italy.

As long as their answers are short – and they tend to be – style tells are much less obvious.

Long answers don’t earn a human any extra money, just extra keypresses; LLMs, however, care not about typing time, so they tend to splurge out a lot of vanilla-average text, or the nutbar nonsense some choose to call hallucinations.

There are other stylistic cues that can act as marginal distinguishers; with each additional one, the confidence in saying “LLM” or “human” goes up…

Does it get to 100% confidence? No, but how close it gets depends on the number of tells and on whether the LLMs try to hide, which currently they do not.

If this changes and LLMs start to have faux-human tells added, then in all probability the “hallucination rate” will increase as a direct consequence…

Matt S September 19, 2023 9:09 PM

In the near term, I suspect this can be mitigated by a plugin in Google Sheets or Microsoft Word that monitors inputs from a user.

It can detect typing activity and output a score of human-written authenticity.

Maybe someone will steal this business idea?
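As a back-of-the-envelope sketch of what such a plugin could score (the edit-event format and paste threshold here are invented for illustration; a real add-in would hook the editor’s change API):

```python
# Provenance signal: what fraction of the final text arrived
# keystroke-by-keystroke versus in large paste events?
from dataclasses import dataclass

@dataclass
class EditEvent:
    timestamp: float   # seconds since the document was opened
    chars_inserted: int

PASTE_THRESHOLD = 25  # assumed: a single insert this large is likely a paste

def typed_fraction(events: list[EditEvent]) -> float:
    """Share of characters that arrived in small (typed-looking) inserts."""
    total = sum(e.chars_inserted for e in events)
    typed = sum(e.chars_inserted for e in events
                if e.chars_inserted < PASTE_THRESHOLD)
    return typed / total if total else 0.0

# 120 single keystrokes followed by one 900-character paste:
log = [EditEvent(float(i), 1) for i in range(120)] + [EditEvent(130.0, 900)]
print(f"human-typed fraction: {typed_fraction(log):.2f}")  # ~0.12
```

The obvious evasion is retyping the generated text by hand, so at best this raises the cost of cheating rather than detecting it outright.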

KS September 19, 2023 9:15 PM

The problem of detecting AI plagiarism in academic settings can be greatly simplified by taking an individual’s writing history into account.

RobertT September 19, 2023 10:24 PM

I find the whole AI LLM debate to be absurd.
Why do I care if the text was AI generated?
Why do I care if the student “cheated”?
Why do we care?
I suspect the real answer is Credentialism. So many of us have spent a small fortune getting a sheet of paper; this sheet of paper was the key to our future. Suddenly we’ve discovered that the sheet of paper is practically worthless, and the “skills” that earned us the paper are equally worthless – but, but, but, we say to ourselves, I’ve got the good job and he hasn’t. This is the state of things that needs to be protected. IMHO in the 21st century we’ve seen Credentialism replace meritocracy as the western world’s most popular religion.
But we tell ourselves it’s all at risk if an AI bot can write better than me – or I suspect that’s the secret feeling underlying all this AI angst.

Just my thoughts

benjamin shropshire September 19, 2023 11:05 PM

My prediction on the eventual resolution of this arms race: “Whoever has more resources wins.” (With a slight advantage to detection.)

Also, the reliability of detections is going to follow a distribution, and that can be exploited by the defense. If an LLM is used to generate 10M tweets at a cost of $100k and someone spends $10k looking for them, I’d expect they could find thousands of the generated tweets that pop as “almost certainly generated”. For some applications, even a <1% detection rate could be useful. (E.g.: What topics are being spammed? Are the bots agitating or amplifying? Are the bots leading the mob or following it? Etc.)

Winter September 20, 2023 2:02 AM

@Clive

2, The exponential rise in product resource cost.

I think that this will be the real killer. The cost of ensuring the training data is human will skyrocket.

We saw earlier that if you train an LLM on LLM output, the resulting model will be worse than one trained on the original material. [1]

This is a real problem for LLMs because, as their name indicates, they need a really large amount of training data: many hundreds of billions of words – actually, as much as is accessible on the internet. And a lot of text on the internet is now generated by AI, and that amount is increasing fast.

If the makers of AI want to evaluate a sizeable fraction of a trillion words of text on “humanness”, the costs will become prohibitive.

Building AI with less training material can theoretically be done [2], but not by LLMs in the current sense of a transformer/predictive system.

The solution will be some other type of AI that relies less on such massive amounts of input. [3]
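The feedback loop is easy to demonstrate in miniature. In the toy sketch below, a Zipf-like word distribution stands in for an LLM (purely an assumption to keep the example small); each generation is refit on samples from the previous one, and the rare “tail” words vanish permanently:

```python
# Miniature model collapse: refitting a distribution on its own samples
# erodes the tail, because rare words that draw zero samples can never
# come back.
import random
from collections import Counter

vocab = [f"w{i}" for i in range(200)]
dist = {w: 1 / (i + 1) for i, w in enumerate(vocab)}  # Zipf-ish long tail

for generation in range(1, 6):
    words = list(dist)
    samples = random.choices(words, weights=[dist[w] for w in words], k=500)
    dist = {w: c / 500 for w, c in Counter(samples).items()}  # refit
    print(f"gen {generation}: vocabulary down to {len(dist)} of 200 words")
```

Real models degrade more subtly, but the direction is the same: synthetic training data forgets exactly the rare, human material that made the original interesting.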

Which has given me the idea of maybe one day asking an LLM to generate a piece “in the style of Clive Robinson”…

I already tried that with the “original” ChatGPT. It did not work.

[1] Many articles, e.g.,
Model collapse explained: How synthetic training data breaks AI
‘https://www.techtarget.com/whatis/feature/Model-collapse-explained-How-synthetic-training-data-breaks-AI

[2] Humans do it with a minute fraction of this data input.

[3] See the suggestion in, e.g., ‘https://www.linkedin.com/pulse/nasty-surprises-why-llm-model-collapse-theory-misjudges-daniel-sack

Robin September 20, 2023 3:31 AM

@RobertT

I have some sympathy with your POV. These anxieties surface quite regularly: remember the furore about allowing students to use calculators in exams? And a few decades ago, when CAD software for the detailed design of chemical plants became routine and integrated into teaching, assessors were getting tied in knots about how to certify student engineers’ “knowledge”. The point is that employers wanted to recruit engineers who were absolutely adept at using the latest tools. In effect, the availability of powerful tools changed the skill set that landed you a good job.

AI will do the same.

RobertT September 20, 2023 4:56 AM

@Robin, yep, that’s true, but for me it’s more along the lines that the top students won’t cheat because they value the education more than just the grade.
Personally, I only interview top students from top universities; I never even consider employing anyone outside the top 20% of their university cohort. That’s just me, that’s my metric, that’s my minimum.

I know that these students didn’t get where they are by cheating. If they used ChatGPT, they did so because it’s a valuable tool to complement their own knowledge/capabilities. If they have this knowledge (of new AI methods) then, to be honest, I want to know what they know; I want them to be my mentor as much as I’m their mentor.

I know the top students value my time and they’ll value the opportunity I’m offering, as for the rest well I’ve no idea what they do and I certainly don’t care what AI tools they might have used to “cheat”.

Winter September 20, 2023 5:07 AM

@RobertT

IMHO in the 21st century we’ve seen Credentialism replace meritocracy as the western world’s most popular religion.

When I visit a hospital, I really, really want those working there to have their “sheet of paper”. Having your brain tumor, or knee joint, treated by someone who got their surgeon’s degree online, or not at all, is something I want to avoid at all costs.

The same holds for all working in health-care, or hardware care. I admire any self-made car mechanic or Integrated Circuit designer, but I really want good references and good work examples before I would employ them. A sheet of paper as evidence they completed accepted courses in the field alleviates a lot of that uncertainty.

The fact that current exams do not test the knowledge and know-how that they are supposed to test is a different matter altogether. Knowing how to use the tools that are available is more important than knowing how to use antiquated tools you won’t get to use anymore in real life.

You should learn the practices that you need to perform, less so the tools to do them. I am old enough to have learned to use a slide rule. I have never ever used one in real life. But I learned programming in Algol68 and Fortran 4 (I am that old), and what I learned is still very applicable [1] and was easily transferred to C, C++, Pascal, Perl, Prolog, and Python.

LLMs are just spelling checkers on steroids. We do not prevent students from using spelling checkers.

The real challenge is not to prevent students from using LLMs, because they will have to use them, or their descendants, after they finish school. The challenge is to teach them how to produce and evaluate texts. Writing essays is just one way of doing that.

LLMs are better at writing essays than most students because students are still in the process of learning how to write essays. But you can still teach students to produce good essays, using whatever tools, and to explain why the result is a good text. It is just that multiple-choice tests and dumped full-text essays will experience an end-of-life moment in student evaluation, just as math tests asking for the outcome of a specific calculation were retired decades ago.

[1] Mostly: do as much as possible of what Algol68 did, and avoid everything Fortran 4 did.

Winter September 20, 2023 5:21 AM

@RobertT

I only interview top students from top universities; I never even consider employing anyone outside the top 20% of their university cohort.

That is the most short-sighted view on education I have ever seen. You attribute perfect foresight to educational institutions. You should know that “work” has very little in common with “school”; being good at the one is only a weak predictor of the other.

My experience is that grades are only a meager extract of the educational history of a student. Whenever I want to match a student to a job, I ask myself what in the educational history of this student overlaps with the requirements of the job, and how the inevitable “motivation letter” squares with this history.

I have seen too many students who choose their courses and subjects to get the highest grades, instead of choosing the subjects that really interest and motivate them, even at the risk of a sub-80th-percentile grade.

A student who travels to a department on the other side of the continent to do an internship with a specific lab or researcher, and gets a 7/10 there, is worth much, much more to me than a student who gets straight 9/10s with a hopscotch of “light” courses at home.

RobertT September 20, 2023 6:50 AM

@winter
That is the most short-sighted view on education I have ever seen

Hmmm, so if I understand you correctly, the metric by which I decide whom I will interview for a position that I will personally fund, is myopic.

That’s useful, that’s good to know, I appreciate the feedback.
Cheers

Winter September 20, 2023 7:29 AM

@RobertT

… whom I will interview for a position that I will personally fund, is myopic.

I have been told that people can spend their money unwisely.[1] Generally, like you, I too am grateful to people who point out to me that I might rethink whether I am doing the best I can.

Wrt the specific subject of educational grades, I have, again, been told, and experienced myself, that students can be adept at gaming the grading system. [2]

[1] I believe there are even English proverbs trying to drive this point home.

[2] The local saying was that such students graduate with honors: “summa cum fraude”.

benjamin shropshire September 20, 2023 1:54 PM

Re: Credentialism

It exists as a first-order proxy for what people really want to measure: do people have the needed skills? And for that, it’s actually a reasonably effective (but still imperfect) proxy.

One good reason to take issue with cheating is that it hurts everyone:

  • The cheater is still spending money on training but isn’t getting value from it.
  • The employer is less able to predict who has the skills they need (they can’t run their own skills test for every last candidate, and outsourcing that is another term for certification by exam, which is itself credentialism).
  • The educator’s product loses value because the credential becomes less correlated with what people actually need.
  • Other workers also lose because it becomes harder and more expensive to demonstrate their skills.

FWIW, I think this is yet another problem that can be improved by free-market non-monopolistic systems: if there is a large diversity of credentials relevant to a given field that workers can acquire and that employers can trust, then everyone wins. This is particularly true if it’s relatively cheap and easy for a competent worker to get redundant credentials.

Most likely that would require credentials to move towards a proof-of-competency system (e.g. practical tests) and away from proof-of-education systems. That move also has other benefits: if getting credit for the class is divorced from taking the class, then there will be a lot less incentive to cheat on the class work and a lot fewer opportunities to cheat on the credit.

RobertT September 20, 2023 6:18 PM

Re Credentialism
I also like to believe that any surgeon employed at a hospital has a minimum and verifiable skill set. However, for me, Credentialism is not about what’s happening in the top quartile as much as what’s happening in the bottom quartile.
Today (in many countries) it is common for a police constable to have some sort of social science degree. It’s not at all clear why, yet as a consequence the very nature of our police force is changing. So add in a little Credentialism and suddenly the guy/girl who 20 years ago would have been the ideal candidate now doesn’t make the cut, all because they don’t have a degree.

The same goes for the baker at the local pastry shop, and I’ve even heard of cleaners not getting a job because they lack the relevant certifications. This, for me, is Credentialism: jobs that 20 years ago had no formal certification requirements now (from a practical, entry-level perspective) require some form of formal credentials.

When these credentials are not really necessary to do the job, they become a hurdle the applicant needs to find a way over or around; if cheating gets them the sheet of paper (qualification), then they cheat – it’s that simple. Once they’re in the job, they’re also inclined to raise the bar, so at first a high-school certificate is required, but within a few years the very same job requires a BA, followed by a Masters, etc. None of these degrees are needed to do the underlying job, yet the entry-level requirement just ramps up and up and up. And of course, it follows that we need to be diligent and weed out the “cheats”.

Winter September 21, 2023 1:37 AM

@RobertT

So add in a little Credentialism and suddenly the guy/girl who 20 years ago would have been the ideal candidate now doesn’t make the cut, all because they don’t have a degree.

Part of this problem stems from the drive to increase “job mobility” – or rather, the race to the bottom.

No company wants to pay for on-the-job training, as employees, once trained, will move to an employer that can afford higher pay by not offering such training. [1]

Hence the shortcut of offloading the education costs onto the employees – which requires credentialism.

[1] Actually, employers want to be able to fire their employees at random times to “increase shareholder value”. That becomes difficult if you have invested actual money in their training.

Clive Robinson September 21, 2023 6:38 AM

@ Winter, RobertT,

Re: Credentialism and Crime.

“Part of this problem stems from the drive to increase “job mobility” – or rather, the race to the bottom.”

As I mentioned a day or so ago, in some places the student has to pay both the academic institution and, via scams, the lecturers as well; I noted this was prevalent in India.

It turns out the difference credentials make there is very large, which is why the qualification scams work as well as they do.

But there is a flip side to this rampant credentialism scam, and that’s rampant crime caused by the socio-economic gap.

Fresh off the press today:

“India’s biggest tech centers named as cyber crime hotspots”

https://www.theregister.com/2023/09/21/india_cybercrime_trends_report/

It makes interesting reading for those who have not been keeping an eye on what has been going on in India for the past decade or so. I saw this coming, and why, even though I was told I was discriminating by entities that are unlikely to return to apologise.

Oh the reason to keep an eye on it…

If you think about it, the trend in the US and the West has been to “export work” because “the labour costs less over there”. Well, the reverse also holds for all the places jobs have been outsourced to this century (the BRICs etc.). As the infrastructure appears that facilitates the “export of work” from the West, they can use it to “export crime back”, because “the money is over here and they can do it from over there”. Throw in the fact that the chances of criminal investigation and punishment happening have, for more than a decade, been about as close to zero as makes no preventative difference…

Clive Robinson September 22, 2023 4:21 PM

@ SpaceLifeForm,

“Then human readers may notice.”

Or not, as the case may be 😉

I remember when our host @Bruce used to do a weekly piece for “The Guardian” newspaper and blogged it here. Someone pulled him up on his spelling…

It got pointed out that it was “English English”, not “US English”, as The Guardian is a UK newspaper…

The funny thing is that correct spelling is such a recent thing in human terms. As you might remember, I once linked to an article that demonstrated that William Shakespeare was responsible for a significant jump in the size of the average vocabulary, up until the Victorian era, when the invention of words went nuts. [1]

Since that Victorian peak the range of words in common use has shrunk, and it’s been estimated that effective, though tedious, communication can be carried out with a vocabulary of as little as a couple of hundred words. [2]

But… this century has seen a new phenomenon with smart devices and the like. Small keyboards and autocorrection give us “fat finger syndrome”, where the user might, say, think they have typed “happy” but “haoot” or similar comes up from adjacent keys. Likewise there are top-row issues, such as meaning to type “quest” but getting “question,some” or similar as the finger hits words on the suggestion line above the top row of the keyboard. Then there are dropped letters, where “one” becomes “on”, “off” becomes “of”, etc.

The “who or whom” issue, and the “to, too, or two” issue, etc., are also not uncommon.

But there is also certain word usage… A few years back people said things like “that which is”; some now say “that’s”, others just “which is”. So “that which is good” becomes “that’s good” or “which is good”. Mostly such things just pass by, but they can cause meanings to be altered inadvertently. In London you hear the wince-worthy use of “is” instead of “are”, and similar.

In the UK we have the expression “Estuary Essex”, referring to a mutilated form of spoken and written English allegedly from Dagenham and environs, which once spawned the “Don’t bash your grammar” phrase. Stand-up comedians associate it with a liking for “furry dice” hanging from rearview mirrors, name decals of “Stew n Tracy” across the top of the windscreen, and even – thankfully now rare – “shopping trolley” and “blonde” jokes.

So there is a lot to notice in “style” that acts as “tells”, and it might also account for some of the AI/ML hallucinations we’ve seen with LLMs.

[1] With the invention of words such as “phlegmatic”, meaning to have the feeling of being full of phlegm (effectively that wheezy chest etc. of a heavy cold clearing out the secondary bio-invaders). Also “phlogiston” and other delights of thinking that have since been replaced by more rational but less fun ideas.

[2] Various primate studies have shown that whilst vocalisation is at the very best limited, the learning and correct usage of up to 350 or so sign-language words as a vocabulary is possible:

https://en.m.wikipedia.org/wiki/Washoe_(chimpanzee)

Jacob October 2, 2023 1:46 PM

A large part of the issue is one of repeatability and transparent evaluation of detectors in the face of changing models and changing detectors. I have been working to at least compare detectors as part of the evaluation of my own (open-source) detector [1] (the compression-based idea it builds on is sketched below), and there is a lot of nuance in how the evaluations are performed that can drastically change measured performance. That said, this detector [2] scores very highly, though it is incredibly slow.

As there is more co-written work out there, I expect detection performance to decrease, but there is some evidence that at present LLM-generated text can be detected with moderate accuracy.

[1] https://github.com/thinkst/zippy
[2] https://contentatscale.ai/ai-content-detector/
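For the curious, here is a minimal sketch of the compression-ratio idea that detectors like [1] build on. This is the general technique only, with placeholder corpora you would have to supply; it is not zippy’s actual code:

```python
# A sample compresses better when appended to a corpus it resembles, so
# compare the marginal compressed size against known-human and known-AI
# reference corpora.
import lzma

def compressed_size(text: str) -> int:
    return len(lzma.compress(text.encode()))

def extra_bytes(corpus: str, sample: str) -> int:
    """Marginal cost of compressing `sample` on top of `corpus`;
    smaller means more shared structure."""
    return compressed_size(corpus + " " + sample) - compressed_size(corpus)

def classify(sample: str, human_corpus: str, ai_corpus: str) -> str:
    return ("AI-ish" if extra_bytes(ai_corpus, sample)
            < extra_bytes(human_corpus, sample) else "human-ish")

# Usage, given real reference corpora:
# print(classify(suspect_text, human_corpus, ai_corpus))
```

Like every statistical detector discussed above, it produces a likelihood, not a verdict, and its error rate depends heavily on the reference corpora.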

- October 5, 2023 6:59 AM

@Moderator:

1. Amar Dhoot, from the underlying link, is a repeat offender of unsolicited advertising.
