Modern Snake Oil: AI Detectors
What are Text-Based AI Detectors?
Text-based AI detectors are software tools designed to identify whether a piece of text was generated by artificial intelligence. These detectors analyze various features of the text, such as sentence structure, word usage, and stylistic patterns, to determine if it matches "typical" AI writing characteristics.
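To make that concrete, here is a minimal sketch of the scoring idea behind many of these tools: measure how statistically "predictable" a passage is to a reference language model and flag highly predictable text as machine-generated. The model choice and the threshold below are illustrative assumptions, not any vendor's actual method.

    # Minimal perplexity-style "detector" sketch in Python; assumes the
    # open-source transformers and torch packages are installed. The GPT-2
    # model and the threshold are illustrative assumptions only.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # labels=ids makes the model return its average next-token loss
            loss = model(input_ids=ids, labels=ids).loss
        return float(torch.exp(loss))

    THRESHOLD = 25.0  # arbitrary cutoff, purely for illustration

    def classify(text: str) -> str:
        # low perplexity = "too predictable" = flagged as AI
        return "likely AI" if perplexity(text) < THRESHOLD else "likely human"

Note what a scheme like this actually measures: predictability, not authorship. Any sufficiently polished or formulaic human prose can land on the wrong side of the threshold, which is the root of every flaw described below.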
The basic flaws with AI Detection
The main weakness is that text characteristics cannot reliably indicate authorship. Because a detector's verdict cannot be independently validated, the approach is prone to errors, manipulation, and inherent bias.
Current detection claims are plagued by major accuracy issues, with independent testing revealing high error rates. The result is frequent false positives (human writing flagged as AI) and false negatives (AI writing that passes as human); the back-of-the-envelope sketch after this list shows why even small error rates are ruinous at scale.
Detectors are easily fooled by slight alterations to the content, such as paraphrasing or synonym swaps, rendering them largely ineffective.
The tools lack transparency in their detection algorithms, making it difficult for users to understand why certain texts are flagged and undermining any trust in their accuracy.
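To see why even seemingly small error rates are damning at scale, run the numbers. Every figure in this sketch is hypothetical, chosen only to show the arithmetic:

    # Hypothetical base-rate arithmetic; none of these numbers come from a
    # real tool, they only show how the math plays out at scale.
    essays   = 10_000   # essays screened in a semester
    ai_share = 0.10     # assumed fraction actually AI-written
    fpr      = 0.02     # assumed false positive rate ("98% accurate!")
    tpr      = 0.80     # assumed true positive rate

    innocent_flagged = essays * (1 - ai_share) * fpr   # 180 students
    guilty_flagged   = essays * ai_share * tpr         # 800 students
    share_innocent = innocent_flagged / (innocent_flagged + guilty_flagged)

    print(f"{innocent_flagged:.0f} innocent students flagged")
    print(f"{share_innocent:.0%} of all accusations are false")

Even under these generous assumptions, 180 innocent students are flagged and nearly one in five accusations is false.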
The "Unique Indicators" Myth
The myth is that these AI detectors represent a solution to the "AI problem". That belief rests on the assumption that AI writing carries unique, telltale indicators. Modern language models show this is not true: they are trained on human writing, and their output increasingly overlaps with the full range of human styles.
Research Papers:
Can AI-Generated Text be Reliably Detected?: https://arxiv.org/abs/2303.11156
The illusion of accuracy
While a detector might seem effective at identifying AI-generated content, it relies on superficial markers and patterns that produce both false positives and false negatives. The perceived accuracy is deceptive: slight modifications to AI-generated text can easily bypass detection (see the sketch below), while genuine human writing gets incorrectly flagged.
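As a toy illustration of how little it takes, here is a sketch of a "paraphrase attack": a handful of meaning-preserving substitutions shifts the surface statistics that score-based detectors measure. The substitution table is invented for this example.

    # Toy "paraphrase attack": trivial, meaning-preserving edits change
    # the token statistics without changing the content. The substitution
    # table is made up for illustration.
    SWAPS = {
        "it is important to note that": "note that",
        "in order to": "to",
        "furthermore": "also",
        "utilize": "use",
    }

    def perturb(text: str) -> str:
        for formal, plain in SWAPS.items():
            text = text.replace(formal, plain)
        return text

    sample = ("furthermore, it is important to note that we utilize "
              "this method in order to improve results.")
    print(perturb(sample))
    # -> "also, note that we use this method to improve results."

Run the original and the perturbed version through any score-based detector and you can easily get opposite verdicts for identical content; commercial paraphrasing tools automate exactly this.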
The problem is that there are no clear textual giveaways that reliably distinguish human writing from AI-generated writing. Humans have been writing for over 5,000 years, and there is no "style" an AI could be trained on that hasn't already been used by someone, somewhere, at some time. As soon as a detector encounters an unfamiliar style, or a style it was "taught" is machine-like, it will flag the text regardless of who wrote it or when (detectors have famously flagged the US Constitution as AI-written).
Research Papers:
Testing of Detection Tools for AI-generated text: https://arxiv.org/abs/2306.15666
Evaluating the efficacy of AI content detection tools in differentiating between human and AI-generated text: https://edintegrity.biomedcentral.com/articles/10.1007/s40979-023-00140-5
In the News:
https://www.fastcompany.com/91074029/can-using-grammarly-set-off-ai-detection-software
https://dailyfreepress.com/2023/09/28/com-responds-to-turnitins-false-ai-detection-rates/
There are hundreds of articles like these; the ones above are only the tip of the iceberg.
Ingrained Bias and Discrimination
Text-based AI detectors exhibit bias and discrimination, often rooted in the data they are trained on and the algorithms they use. These systems disproportionately flag certain writing styles or dialects associated with specific cultural or linguistic groups.
False positives are known to impact certain demographics more than others. For example, non-native English speakers are often wrongfully flagged for writing styles that deviate from "standard" English. Students with learning disabilities also suffer from excessive false flags.
The discrimination stems from the narrow datasets used to train these systems. Without diverse data, they fail to encompass normal variation in language. Detectors offer little to no transparency in their training process or detection criteria. With no public accountability, biases and flaws in proprietary algorithms go unchecked. Those targeted unfairly have no recourse against such an opaque system.
Research Papers:
GPT detectors are biased against non-native English writers: https://arxiv.org/abs/2304.02819
Unfair Punishment and no way to prove innocence
More and more instances are emerging of educators alleging cheating based on shaky detector evidence alone.
Often you'll hear the argument to "use this as only one piece of evidence" or "don't base your conclusion solely on this tool." This is a form of indemnification by the tool's users and by the AI detector software companies, an attempt to insulate themselves from lawsuits; it is not actual, workable advice. The fact of the matter is that there is no good way to prove your innocence. Some say to use version history, but that only works if you save every change you make, as you write, across multiple saved documents. The logic is flawed:
What if you're writing it offline?
What if you don't have access to tools that save any history?
Version history tools native to Microsoft Office and Google Docs don't save every change, because they merge deltas to save space (it's in their documentation).
Maybe you type too fast for the automatic histories to even register?
What if the computer crashes in the middle of writing?
What if the file is accidentally deleted after submission?
Maybe you copied blocks of text between different file iterations, wiping out any coherent history?
The list goes on; plenty of things can go wrong or fail to work the way you intend. The only semi-reliable approach is to use version-control software (e.g., Git with a host like GitHub) whose history can be independently validated to prove you didn't manipulate it, as sketched below. That is a bridge too far for non-technical folks and a bit draconian for a basic school paper!
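For completeness, this is roughly what that looks like: a small script that snapshots the document into a Git repository on every save, producing a timestamped, hash-chained history. The document path is hypothetical, Git must be installed, and pushing each commit to a third-party host is what makes the history semi-independent; a purely local history could still be rewritten.

    # Sketch of an auto-versioning workflow using Git (must be installed).
    # The document path is hypothetical; a real setup would also push each
    # commit to a host like GitHub so the timestamps are held by a third
    # party rather than only on the writer's machine.
    import subprocess
    from pathlib import Path

    DOC = Path("essay.txt")  # hypothetical document being written

    def snapshot(message: str = "autosave") -> None:
        if not Path(".git").exists():
            subprocess.run(["git", "init"], check=True)
        subprocess.run(["git", "add", str(DOC)], check=True)
        # the commit is a harmless no-op failure if nothing changed
        subprocess.run(["git", "commit", "-m", message], check=False)

    snapshot("draft after first editing session")

If the idea of a student wiring this up just to defend a book report sounds absurd, that is exactly the point.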
The result is guilty until proven innocent, with no real way to establish that innocence. Couple this with no recourse to overturn incorrect judgments, and you get unfair punishment.
OpenAI gives up on detection!
In September 2023, OpenAI themselves gave up on their AI detector and issued the following statement:
Do AI detectors work?
In short, no, not in our experience. Our research into detectors didn't show them to be reliable enough given that educators could be making judgments about students with potentially lasting consequences.
This failed effort from the makers of ChatGPT, the product that started all of this hype, further confirms the challenges of AI detection. If OpenAI cannot reliably identify synthetic text, there is significant doubt that any other company's tool can detect output from these highly opaque models.
After seeing the unintended consequences of inaccurate AI detection, OpenAI decided the harm outweighed any marginal benefit. This highlights the reality that viable detection tools do not exist and most likely never will.
References:
Electronic Frontier Foundation: To Best Serve Students, Schools Shouldn’t Try to Block Generative AI, or Use Faulty AI Detection Tools
Prompt Engineering & AI Institute: The Truth About AI Detectors - More Harm Than Good
Ars Technica: Why AI writing detectors don’t work
University of Rochester: AI Detection's High False Positive Rates and the Psychological and Material Impacts on Students
University of Maryland: Can AI-Generated Text be Reliably Detected?