In my last two posts, I made two arguments that turned out to be more connected than I realized.
In the first, I suggested that we have reduced the entire AI debate to a single question, “is it allowed?”, when the more important question is whether what it produces is true or harmful.
In the second, I showed, through a classroom experiment I ran on literature reviews, that humans were not more accurate than AI. They were just more trusted. On the face of it, those two arguments are not the same. But both posts generated the same question in my inbox: So what do we actually do about it?
This post is my answer. It is also an explanation of why I started VeriFact.
Detectors ask who wrote it. VeriFact asks whether it’s true.
01 / Detection
The detector arms race nobody is winning
When ChatGPT crossed 100 million users in two months, many institutions panicked. Bans were issued. Expulsions were threatened. And a new industry emerged almost overnight: AI detection.
The premise of that industry is the same one that sustained the plagiarism detection industry: that the most important question about any piece of writing is who, or what, produced it. Detect the machine. Flag the output. Restore integrity.
I understand the impulse. I do not think it works.
Here is the core problem. The statistical regularity in word choice and sentence structure that makes AI writing detectable is also present in clear, confident human prose. Researchers have documented this repeatedly. Turnitin’s own studies acknowledge false positive rates that should alarm anyone using these tools for high-stakes decisions. And in 2023, a Stanford study found that AI detectors misclassified non-native English speakers’ writing as AI-generated at significantly higher rates than native speakers’ writing. That penalizes a student not for cheating, but for writing differently.
Meanwhile, the tools keep improving. GPT-5’s output is harder to detect than GPT-3’s was. Whatever comes next will be harder still. The detectors are running a race they are structurally unable to win.
02 / Truth
The question nobody is asking
In my second post, I described an experiment in which I had students and an AI assistant complete the same literature review task. The humans scored higher on perceived credibility. The AI scored comparably on factual accuracy and, in some categories, higher.
The conclusion I drew was simply that we trust humans more because we are used to trusting humans, not because human-produced work is inherently more reliable.
What actually makes work unreliable is not authorship. It is inaccuracy.
The legal profession learned this the hard way. In Mata v. Avianca in 2023, two attorneys submitted a brief citing cases that did not exist. ChatGPT had fabricated them, complete with plausible citations, realistic case names, and confident legal reasoning, all of it invented. The attorneys were fined $5,000. The judge noted there was “nothing inherently improper about using AI for assistance,” but that the lawyers had failed to verify anything.
In my own experiment, ChatGPT fabricated citations in 12% of cases. Not in all cases. Not even in most cases. But 12% is not a rounding error when the stakes are a published paper, a legal brief, a medical recommendation, or a health article being read by thousands of people.
A detector would have caught nothing here. The citations looked real. The writing was fluent. The only way to find the problem was to check whether the underlying claims were actually true.
03 / VeriFact
What VeriFact is
I spent several months thinking about what a service built around that insight would look like. Not one that asks who wrote this, but one that asks: is this true?
The result is VeriFact.
You submit a piece of writing and a human reviewer reads the full piece, identifies every factual claim, and checks each one against primary sources. You get back a structured report, including which claims are verified, which are unverified, and which are contradicted, with a direct link to the source for every finding.
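To make the report concrete, each finding reduces to three things: the claim, a verdict, and the source it was checked against. The sketch below is illustrative only; the field names and placeholder links are mine for this post, not a literal export format.

```python
# Illustrative sketch of what each finding in a report reduces to.
# Placeholder wording and URLs; not a real export format or real checks.
findings = [
    {"claim": "First claim quoted from the draft...", "verdict": "verified",
     "source": "https://example.org/primary-source-1"},
    {"claim": "Second claim quoted from the draft...", "verdict": "unverified",
     "source": None},  # no primary source could be located
    {"claim": "Third claim quoted from the draft...", "verdict": "contradicted",
     "source": "https://example.org/primary-source-2"},
]

for finding in findings:
    print(finding["verdict"].upper(), "|", finding["claim"],
          "|", finding["source"] or "no source found")
```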
For high-stakes content, or for anyone who wants extra assurance, there is a second tier. The reviewer annotates the piece line by line, suggests revisions for flagged sections, and delivers a plain-English explanation of every change alongside a clean revised version. This is where the value is for research, legal documents, health writing, and anything else with real consequences.
I built VeriFact because I could not find this service anywhere. There are detectors. There are plagiarism tools. There are grammar checkers. There is no straightforward way to submit a piece of writing and get back a rigorous answer to the question: how much of this can actually be verified?
04 / Audience
Who this is for
Anyone who publishes can, and should, use this. But in practice, three groups have shown the most need.
Health and wellness brands are the most obvious. If you are publishing anything that touches nutrition, supplements, fitness, or medical claims, the cost of being wrong is both reputational and regulatory. A single wrong claim on a product page can trigger an FTC inquiry. We verify health content against peer-reviewed literature and flag anything that oversteps the evidence.
Researchers and academics are the fastest-growing group. Citation errors and misquoted findings both weaken a paper and can trigger retractions. We check every referenced study, every statistic, and every attributed claim against its original source before your work goes to peer review or publication.
The third is freelance writers. As a freelancer, your reputation is your business. Delivering fact-checked work, especially on topics outside your own expertise, is the difference between a one-off gig and a long-term client relationship.
VeriFact
Need source-backed confidence before you publish?
Submit your draft for a claim-by-claim review and get a structured fact-check report with linked evidence, flagged risks, and suggested revisions.