AI-Generated Text and the Detection Arms Race

In 2023, the science fiction literary magazine Clarkesworld stopped accepting new submissions because so many were generated by artificial intelligence. Near as the editors could tell, many submitters pasted the magazine’s detailed story guidelines into an AI and sent in the results. And they weren’t alone. Other fiction magazines have also reported a high number of AI-generated submissions.

This is only one example of a ubiquitous trend. A legacy system relied on the difficulty of writing and cognition to limit volume. Generative AI overwhelms the system because the humans on the receiving end can’t keep up.

This is happening everywhere. Newspapers are being inundated by AI-generated letters to the editor, as are academic journals. Lawmakers are inundated with AI-generated constituent comments. Courts around the world are flooded with AI-generated filings, particularly by people representing themselves. AI conferences are flooded with AI-generated research papers. Social media is flooded with AI posts. In music, open source software, education, investigative journalism and hiring, it’s the same story.

Like Clarkesworld’s initial response, some of these institutions shut down their submissions processes. Others have met the offensive of AI inputs with some defensive response, often involving a counteracting use of AI. Academic peer reviewers increasingly use AI to evaluate papers that may have been generated by AI. Social media platforms turn to AI moderators. Court systems use AI to triage and process litigation volumes supercharged by AI. Employers turn to AI tools to review candidate applications. Educators use AI not just to grade papers and administer exams, but as a feedback tool for students.

These are all arms races: rapid, adversarial iteration to apply a common technology to opposing purposes. Many of these arms races have clearly deleterious effects. Society suffers if the courts are clogged with frivolous, AI-manufactured cases. There is also harm if the established measures of academic performance – publications and citations – accrue to those researchers most willing to fraudulently submit AI-written letters and papers rather than to those whose ideas have the most impact. The fear is that, in the end, fraudulent behavior enabled by AI will undermine systems and institutions that society relies on.

Upsides of AI

Yet some of these AI arms races have surprising hidden upsides, and the hope is that at least some institutions will be able to change in ways that make them stronger.

Science seems likely to become stronger thanks to AI, yet it faces a problem when the AI makes mistakes. Consider the example of nonsensical, AI-generated phrasing filtering into scientific papers.

A scientist using an AI to assist in writing an academic paper can be a good thing, if used carefully and with disclosure. AI is increasingly a primary tool in scientific research: for reviewing literature, for writing code and for analyzing data. And for many, it has become a crucial support for expression and scientific communication. Pre-AI, better-funded researchers could hire humans to help them write their academic papers. For many authors whose primary language is not English, hiring this kind of assistance has been an expensive necessity. AI provides it to everyone.

In fiction, fraudulently submitted AI-generated works cause harm, both to the human authors now subject to increased competition and to those readers who may feel defrauded after unknowingly reading the work of a machine. But some outlets may welcome AI-assisted submissions with appropriate disclosure and under particular guidelines, and leverage AI to evaluate them against criteria like originality, fit and quality.

Others may refuse AI-generated work, but this will come at a cost. It’s unlikely that any human editor or technology can sustain an ability to differentiate human from machine writing. Instead, outlets that wish to exclusively publish humans will need to limit submissions to a set of authors they trust to not use AI. If these policies are transparent, readers can pick the format they prefer and read happily from either or both types of outlets.

We also don’t see any problem if a job seeker uses AI to polish their resume or write a better cover letter: The wealthy and privileged have long had access to human assistance for those things. But it crosses the line when AIs are used to lie about identity and experience, or to cheat on job interviews.

Similarly, a democracy requires that its citizens be able to express their opinions to their representatives, or to each other through a medium like the newspaper. The rich and powerful have long been able to hire writers to turn their ideas into persuasive prose, and AIs providing that assistance to more people is a good thing, in our view. Here, AI mistakes and bias can be harmful. Citizens may be using AI for more than just a time-saving shortcut; it may be augmenting their knowledge and capabilities, generating statements about historical, legal or policy factors they can’t reasonably be expected to independently check.

Fraud booster

What we don’t want is for lobbyists to use AIs in astroturf campaigns, writing multiple letters and passing them off as individual opinions. This, too, is an older problem that AIs are making worse.

What differentiates the positive from the negative here is not any inherent aspect of the technology, it’s the power dynamic. The same technology that reduces the effort required for a citizen to share their lived experience with their legislator also enables corporate interests to misrepresent the public at scale. The former is a power-equalizing application of AI that enhances participatory democracy; the latter is a power-concentrating application that threatens it.

In general, we believe writing and cognitive assistance, long available to the rich and powerful, should be available to everyone. The problem comes when AIs make fraud easier. Any response needs to balance embracing that newfound democratization of access with preventing fraud.

There’s no way to turn this technology off. Highly capable AIs are widely available and can run on a laptop. Ethical guidelines and clear professional boundaries can help – for those acting in good faith. But there won’t ever be a way to totally stop academic writers, job seekers or citizens from using these tools, either as legitimate assistance or to commit fraud. This means more comments, more letters, more applications, more submissions.

The problem is that whoever is on the receiving end of this AI-fueled deluge can’t deal with the increased volume. What can help is developing assistive AI tools that benefit institutions and society, while also limiting fraud. And that may mean embracing the use of AI assistance in these adversarial systems, even though the defensive AI will never achieve supremacy.

Balancing harms with benefits

The science fiction community has been wrestling with AI since 2023. Clarkesworld eventually reopened submissions, claiming that it has an adequate way of separating human- and AI-written stories. No one knows how long, or how well, that will continue to work.

The arms race continues. There is no simple way to tell whether the potential benefits of AI will outweigh the harms, now or in the future. But as a society, we can influence the balance of harms it wreaks and opportunities it presents as we muddle our way through the changing technological landscape.

This essay was written with Nathan E. Sanders, and originally appeared in The Conversation.

EDITED TO ADD: This essay has been translated into Spanish.

Posted on February 10, 2026 at 7:03 AM • 20 Comments

Comments

wiredog February 10, 2026 9:22 AM

“Social media platforms turn to AI moderators.”
At least as a first cut on moderation this is a good thing. Moderation at scale is a hard-to-impossible task, and while Section 230 of the CDA provides some protection, it has limits. Lack of good moderation has killed a number of sites. So an LLM that can hear the various dog whistles and an image recognizer that can tag probable CSAM are good things. The problem comes with the appeal process, which can also get overwhelmed.

Carl Byor February 10, 2026 10:50 AM

Please observe how the author contrives the impression of balance by selling supposed “upsides” while not stating whether they are completely overwhelmed by the downsides. The subtext is clear: let’s use A.I. to get ourselves out of a hole that we’ve dug with A.I. In other words, the solution is always more A.I.

That, right there, is the basic premise of the tech industry. Regardless of what happens, the answer is always more tech.

And that should tell you everything you need to know about this author and the interests that put the spotlight on him.

mark February 10, 2026 11:43 AM

Actually, Neil shut down Clarkesworld submissions for three or four months. Afterward, he reopened them, but he has developed a method of detection which he does NOT advertise.

From what you wrote, Bruce, expert systems are useful. Chatbots are not. Deepfakes are negatively useful.

Clive Robinson February 10, 2026 12:16 PM

@ Bruce,

Whilst not quite a foregone conclusion the AI “arms race” was a reasonable thing to expect.

What I suspect caught most people who expected an arms race is the sheer breadth of the battle front. That is, people had not realised just how “generally” the LLM can be used, because they don’t tend to think about it in the right way.

But the main thing about the current AI LLM and ML systems is that they are actually not very good; mostly they are,

“Jokers of all trades and masters of none”

And that’s before we start talking about “the memory problem”. Which includes the “garbage bin” training models.

Though they are improving, the chance of an ROI greater than the rate of inflation is not at all likely. And it’s fairly clear that financial interests are “pivoting out” of tech stocks because they see that there is nothing of worth to support valuations.

We are back to the crazy notion of “burn rate” and we have previous experience of what that means.

But also it’s because people tend to forget about the deflationary effect of “new technology”: the price for the same capability has a habit of falling way, way faster than people actually understand.

Yes we can put LLMs on high end laptops now… especially if you pick the right model. This has kicked out most of the runway/moat of the likes of Nvidia and OpenAI etc. Those data centers people keep talking up are a half decade away at best due to external factors. All this nonsense about nuclear reactors or data centers in space is pretty much “blowing smoke” to try to keep hot air in the balloon.

But more important is the NIMBY effect: people don’t want the proposed data centers anywhere near them. Especially if they come with a jerry-built nuclear power station knocked up in a hurry next door. And that’s before we talk about the “environmental impact” of using fossil fuels like coal… which is most likely how the generators will be run. But communities are realising there is nothing in having a data center near them; they won’t get any jobs there, not even as janitors.

But as indicated LLMs can now run on high end laptops. So questions arise as to,

1, What exactly is going to go into these data centers anyway?
2, Will the data aggregation, collection and reliable collation be possible?
3, Will the ML side become sufficient to get error rates from over 30% down to a lot less than 5%?

And a whole load more that the answers are not going to keep the current AI Corps in business…

With the big one being,

Will there ever be a realistic return on investment?

Even Nvidia know the answer to that. Which is why that $100 billion circular investment deal is gone, and chances are OpenAI will have to sell itself to Microsoft for 1 cent on the dollar at best in a fairly short period of time.

But the real question people should be thinking about is,

“Does the future of general AI coincide with Current LLM and ML systems?”

The answer to which is “no”.

Which raises,

“Do Current LLM and ML systems actually have a realistic future?”

To which the answer is more complicated but yes, just as we are still using “Expert Systems” that were new in the 1980’s in niche activities LLMs if specific to niche areas have a future probably.

These are the realities and people should get used to what they actually mean…

One thing I can predict is that Nvidia has lost exclusivity and thus its valuation is going to sink. Will it stay above the $1 trillion it zoomed through like a rocket?

That is questionable but it may get quite close depending on retail inflation and tech deflation.

But people should consider the hard reality of,

“What will an Open Source model on a high end laptop actually get me?”

And I suspect that it will be like asking back in the 1970’s

“What will an electronic calculator get me?”

And in the 80’s

“What will an 8bit personal computer get me?”

These were times when the technologies were fresh and very expensive… But for any real measure of performance the price had gone from not affordable to Xmas present money in less than a decade.

Now apply that thinking to Current LLM systems…

ResearcherZero February 11, 2026 4:39 AM

Would it be possible to monitor nuclear weapons proliferation and conduct inspections with AI systems? Getting hold of the data to train models might be extremely difficult and can we then trust these systems to do the job? What happens when these systems make mistakes?

Relying on machines to replace complex human inspection of nuclear weapons systems.

‘https://dnyuz.com/2026/02/09/ai-is-here-to-replace-nuclear-treaties-scared-yet/

How might it work and would nations be willing to trust it?
https://fas.org/publication/inspections-without-inspectors/

What could go wrong?
https://www.politico.com/news/magazine/2025/09/02/pentagon-ai-nuclear-war-00496884

Clive Robinson February 11, 2026 8:25 AM

@ ResearcherZero,

With regards, your three questions

“Would it be possible to monitor nuclear weapons proliferation and conduct inspections with AI systems?

Getting hold of the data to train models might be extremely difficult and can we then trust these systems to do the job?

What happens when these systems make mistakes?”

The last is the easiest to answer,

It turns on three things,

1, The amount of agency/influence it has.
2, The type of sensors it’s given.
3, The type of past data it has.

If it has no more agency than existing input instruments, then it can replace existing systems.

As I’ve mentioned before Current AI LLMs are little more than large “adaptive tuned filters” so you would expect it to have similar or better indicating accuracy as existing systems.

However the false positive rate is likely to rise on new readings, because it can find more signals in the noise and not all of them will be real. And even if real, they may indicate something else, such as mining activity or natural disasters.

Which brings us onto what influence it is going to have.

We know there have been false positives in the past and they have not led to retaliative behaviour under MAD conditions.

However as you’ve noted, MAD conditions no longer apply, so honestly we don’t know.

In the past the technique used was to “turn the sensor gain up to eleven”, accept false positives, and hope the humans behave as rational actors.

But it’s no longer only MAD conditions that are gone. Russia has indicated it has a nuclear powered cruise missile that is both hypersonic and not ballistic in its flight path. Whilst I’m not aware of whether that is all true, we do know that there was a significant nuclear event at a Russian Navy experimental missile proving ground, which might have been such a device going rather more than “high order”, as unusual readings at nuclear sites across Europe were seen.

So they may or may not have an experimental device that has finally managed to fly once rather than turn itself into a cloud of low orbit scrap shortly after launch. Thus it may never go into production as a weapon, let alone one that is ever going to be used.

The point is it’s a “line drawn” for home news consumption rather than an actual threat to anyone other than the Russians themselves.

Does it matter if such a device is sensed or not? Probably not. For an LLM to be of use it has to have a signal it can filter from other signals to identify launch direction and likely height and trajectory etc. Unless Russia launches a half dozen or so such devices, the required data is unlikely to appear.

Thus the output from the LLM is likely to be an “oddity” rather than a definitive trace. As a general rule both the Russian and US forces don’t respond to “deer farts in the woods”. So the influence is likely to be at best treated with caution or just ignored…

So it’s not likely.

Rontea February 11, 2026 9:46 AM

AI-generated text is creating a new front in the detection arms race, and the policies we craft today will shape the security landscape for years to come. Oversimplifying the problem—treating all AI output as either safe or suspect—invites bad policy decisions that fail to address the nuanced ways generative tools interact with institutions. Yet complexity itself is a perennial enemy of security: the more intricate and opaque our detection and mitigation systems become, the more brittle and exploitable they are. We need layered strategies that balance usability, robustness, and transparency, or else we risk building defenses that collapse under their own weight.

somebody February 12, 2026 1:19 PM

The problem is that people are relying on heuristics and not evaluating the quality of ideas. This is not an AI problem, it is and always has been a people problem. Ask Dred Scott or George Eliot.

If a journal cannot distinguish a good paper from a bad paper without knowing the name of the author, it’s a bad journal. If science can’t discard random garbage, it’s not a science. If readers care about the biology of the author, they don’t care about literature.

If somebody or something comes up with Fermat’s non-marginal proof, who cares if it’s a Fields medalist, a precocious five year old from Uganda, or an AI.

As Randall Munroe says:
https://imgs.xkcd.com/comics/constructive.png

bye be ai February 12, 2026 1:20 PM

I hate the term hallucinating to describe what AI does, in part because it is an attempt at personification intended to mislead… it hallucinates, how human of it. More importantly, it misunderstands how AI operates. AI “hallucinating” is the acme of the principle that correlation does not equal causation.

Matt February 12, 2026 4:16 PM

Given that after four years of this shit we’ve seen basically zero net upside to LLM adoption, the only sane thing to do is to totally reject it at this point and consign such tools to very limited scopes where they can’t do mass damage.

To some degree this will happen when the AI bubble collapses and the immensely unprofitable LLM ecosystem disappears overnight (since it costs ten times as much to run this garbage as even its biggest boosters are willing to pay for it, and the costs are currently subsidized by investment funding and other sources that are rapidly running out), and despite the immense harm that will do, we’ll at least get to a point where using LLMs for stuff is so expensive that many of the use cases will disappear instantly. So we’ve got that going for us.

Clive Robinson February 13, 2026 3:48 AM

@ somebody,

I must have missed that XKCD but it certainly made me laugh sufficiently loud to cause others to notice through a closed door 🙂

As was once observed,

“A happy boss, is not necessarily a benign boss, but if he’s not laughing at you then you are less likely to feel pain”

(There is a “Blazing Saddles” –1974 film– clip that makes the same point).

Winter February 13, 2026 5:10 AM

@Somebody

If science can’t discard random garbage it’s not a science.

There are two problems.

1. Slop noise drowning out good papers. There is a dire shortage of peer reviewers. They are simply unable to review a deluge of garbage.

2. AI can be used to falsify measurements, experimental outcomes, and photographs to perfectly simulate any theory or hypothesis. Reproducing the study would be the only way to discover the fraud, which is far too expensive to actually do at scale.

Think vaccines-autism fraudulent study, but now by the thousands.

Clive Robinson February 14, 2026 12:11 AM

@ Somebody, Winter,

With regards the problem of AI and research and potential issue you raise of,

“The problem is that people are relying on heuristics and not evaluating the quality of ideas.”

This recent “result” was from GPT 5 running in a custom framework for half a day,

https://openai.com/index/new-result-theoretical-physics/

I’m not “qualified” to judge the results but then there are very very few that are.

Which is where the problems start.

You’ve probably heard of “log rolling” and “cherry picking” and “thumb on scales” behaviours.

They are very human failings, and because “judgment has to be exercised” they are actually an implicit part of the process of “winnowing down”: too little and “slop happens”, too much and “favouritism happens”.

Find a deterministic way to do or measure these “required things” and then you might be able to solve the problem. Otherwise…

Clive Robinson February 14, 2026 12:36 AM

@ Somebody, Winter,

I forgot to add a “full disclosure”

I know of the work of one of the authors because of my interest in “knots” not just practically but theoretically,

So have a look at the slides from a relevant talk,

https://www.damtp.cam.ac.uk/user/dbs26/talks/Knots.pdf

Also the basic idea of “scattering amplitudes” is known to me through Richard Feynman’s diagrams and the NMRI systems I was involved with the design of back in the 1980’s.

Good or bad we all have a past, and as they all to often say,

“Your past always comes back to haunt you.”

Winter February 14, 2026 7:15 AM

@Clive

You’ve probably heard of “log rolling” and “cherry picking” and “thumb on scales” behaviours.

What are inspiration, insight, intuition, revelation, and epiphany?

These are random ideas that solve a vexing question.

How could you implement these in an algorithm? With a genetic algorithm that generates random ideas and then selects and improves the best.

That is exactly what LLMs are good at. They can concoct endless variations of relevant ideas and then prune the worst. After that, humans can look at the lists and pick out what works.

Btw, that is also how human research works.

Research in progress.
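The generate-and-prune loop described above can be sketched as a toy genetic algorithm. This is pure illustration: the fitness function here is a numeric stand-in for the expensive judgment an LLM or a human reviewer would actually apply, and all names are hypothetical.

```python
import random

def mutate(idea, rate=0.3):
    # Perturb each "feature" of an idea with some probability.
    return [x + random.gauss(0, 1) if random.random() < rate else x for x in idea]

def generate_and_prune(score, seed_idea, population=30, generations=50):
    """Concoct variations of an idea, prune the worst, and repeat.
    `score` stands in for the judgment an LLM or human would apply."""
    pool = [mutate(seed_idea) for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=score, reverse=True)   # best ideas first
        survivors = pool[: population // 5]  # prune the worst 80%
        # Refill the pool with fresh variations on the survivors.
        pool = survivors + [
            mutate(random.choice(survivors))
            for _ in range(population - len(survivors))
        ]
    return max(pool, key=score)

# Stand-in fitness: how close the "idea" gets to a hidden optimum.
target = [3.0, -1.0, 2.0]
fitness = lambda idea: -sum((a - b) ** 2 for a, b in zip(idea, target))

best = generate_and_prune(fitness, seed_idea=[0.0, 0.0, 0.0])
```

The point of the toy is the shape of the loop, not the arithmetic: endless cheap variation followed by ruthless selection, with a human (or a scoring model) supplying the selection pressure.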

Clive Robinson February 14, 2026 8:41 AM

@ Winter,

With regards,

What are inspiration, insight, intuition, revelation, and epiphany?

The trite answer is they are “action” words not “passive” words.

Further, they are meaningfully tied to parts of the Feynman Technique[1],

https://en.wikipedia.org/wiki/Feynman_Technique

And more formalised in Bloom’s Taxonomy,

https://www.simplypsychology.org/blooms-taxonomy.html

Thus learning is an “active not passive” “two way” activity (and more recent scientific study backs this up).

Which kind of answers your second question of,

“How could you implement these in an algorithm?”

But it does not need an LLM as such; an LLM alone will fail because it is in effect static, which I mention from time to time as the “LLM Memory Issue”.

There are various ways around this,

1, “Retrieval-Augmented Generation”(RAG).
2, “Ralph Loop” under “Gas Town”.

And one or two others that put the static LLM DNN into an infrastructure / framework that kind of behaves like “short term memory”.
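A minimal sketch of that RAG-style "short term memory" idea follows. Everything here is a hypothetical illustration: retrieval is naive word overlap rather than real embeddings, and `echo_llm` is a stand-in for an actual model call.

```python
def word_overlap(query, doc):
    # Naive relevance score: count of shared lowercase words.
    # (Real RAG systems use embedding similarity instead.)
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def retrieve(query, corpus, k=2):
    # Pull the k documents most relevant to the query.
    return sorted(corpus, key=lambda doc: word_overlap(query, doc), reverse=True)[:k]

def rag_answer(query, corpus, llm):
    # Prepend retrieved context to the prompt: external "memory"
    # bolted onto an otherwise static model.
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)

corpus = [
    "The submission window reopened in April.",
    "Peer review backlogs grew through the year.",
    "The editor does not advertise the detection method.",
]
echo_llm = lambda prompt: prompt  # stand-in for a real model call
answer = rag_answer("When did the submission window reopen?", corpus, echo_llm)
```

The design point is that the model weights never change; freshness comes entirely from what the framework retrieves and stuffs into the prompt on each call.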

Does the LLM reason? Not in the slightest; it simply behaves like Searle’s Chinese Room set inside an external infrastructure. All it is is,

1, A Database.
2, A set of rules.
3, A fuzzer or randomizer.

That in effect uses the statistics of multilevel semantics as spectrums under selection masks. Thus the DNN is in effect a “DSP matched filter”.

[1] The Feynman Technique has four main steps but… It glosses over some things.

There are two basic types of information “facts that are” and “knowledge that can be reasoned”. To learn “facts that are” there only real way to learn them is by writing them out “longhand” (learning by rote). Feynman did not actually say this because when he reasoned it out most people did not use keyboards or other recording devices. It’s been found more recently that you learn twice as much in about half the time by longhand for most students for the fundamental “facts that are”. For knowledged that can be reasoned it drops to about twice as much over re-reading and similar.

He also did not emphasise certain other things like “chunk things up” and “mix them up”. That is take things in in small parts, then do something else. It makes the brain work like your muscles should be exercised in a gym.

He also did not emphasise the urgency of explaining “chunks to others” as you go; it’s just one reason why very small study groups or one-on-one work best. It’s what randomly shuffled flash cards try to simulate if you use them properly. The cognitive effort caused by “random” causes faster neuron development, if fMRI is to be believed. It is why having a study group where each person has a different subject actually works well for all in the group.

kiwano February 15, 2026 11:46 AM

In the case of political comments directed to elected representatives, I believe that there’s a relatively straightforward solution: have a terminal at the constituency office(s) where a constituent can go, sign in, plug in a USB key, and contribute their commentary from that. When they sign out, a bell should ring and an “available” light turn on, so that office staffers can notice if someone’s camping out at the terminal while signing out and back in again. Logs can show whether a whole bunch of submissions were made in a single session. There will likely need to be some accommodations for constituents who can’t make it to the office (and some varied/extended hours for those who can, but not during regular hours), but those needs are probably sparse enough for constituents to be able to request a visit from a staffer or some such.
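The log check described above is easy to automate. A toy sketch, assuming a hypothetical log format of (user, session_id) pairs with one entry per submission; the thresholds are arbitrary illustrations:

```python
from collections import defaultdict

def flag_suspicious(log, max_per_session=1, max_sessions_per_user=3):
    # log: list of (user, session_id) pairs, one entry per submission.
    per_session = defaultdict(int)     # submissions filed in each session
    user_sessions = defaultdict(set)   # distinct sessions per user
    for user, session in log:
        per_session[session] += 1
        user_sessions[user].add(session)
    # Sessions with a burst of submissions, and users who sign out
    # and back in repeatedly ("camping out" at the terminal).
    busy_sessions = [s for s, n in per_session.items() if n > max_per_session]
    campers = [u for u, s in user_sessions.items() if len(s) > max_sessions_per_user]
    return busy_sessions, campers

log = [("alice", "s1"), ("bob", "s2"), ("bob", "s2"), ("bob", "s3"),
       ("bob", "s4"), ("bob", "s5")]
busy, campers = flag_suspicious(log)
```

Flagging is only a triage signal, of course; the staffer with the bell and the "available" light still makes the call.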
