OpenAI stressed that while the tool was an improvement on past iterations, it was still “not fully reliable.” The tool correctly identified 26 percent of artificially generated text but falsely flagged 9 percent of text from humans as computer generated.
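To see what those two rates can mean in practice, consider a rough back-of-the-envelope sketch. Only the 26 percent and 9 percent figures come from OpenAI; the mix of 200 machine-written and 800 human-written texts is a hypothetical chosen for illustration, and the function name is invented here.

```python
# Hypothetical illustration: only the 26% and 9% rates come from the article.
# The document counts passed in below are assumptions, not reported data.

def flag_summary(n_ai, n_human, detect_pct=26, false_flag_pct=9):
    """Estimate how many texts a detector would flag, given its two rates."""
    true_flags = n_ai * detect_pct / 100          # A.I. texts correctly flagged
    false_flags = n_human * false_flag_pct / 100  # human texts wrongly flagged
    flagged = true_flags + false_flags
    return {
        "true_flags": true_flags,
        "false_flags": false_flags,
        "missed_ai": n_ai - true_flags,           # A.I. texts that slip through
        # Share of flagged texts that really were machine-written.
        "precision": true_flags / flagged if flagged else 0.0,
    }

# Assumed mix: 200 A.I.-written texts and 800 human-written texts.
summary = flag_summary(n_ai=200, n_human=800)
print(summary)
```

Under that assumed mix, the detector would flag 52 machine-written texts and 72 human-written ones, so only about 42 percent of flagged texts would actually be machine-written, while 148 machine-written texts would go undetected. A different mix would give different numbers.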
The OpenAI tool is burdened with common flaws in detection programs: It struggles with short texts and writing that is not in English. In educational settings, plagiarism-detection tools such as Turnitin have been accused of inaccurately classifying essays written by students as being generated by chatbots.
Detection tools inherently lag behind the generative technology they are trying to detect. By the time a defense system is able to recognize the work of a new chatbot or image generator, like Google Bard or Midjourney, developers are already coming up with a new iteration that can evade that defense. The situation has been described as an arms race or a virus-antivirus relationship where one begets the other, over and over.
“When Midjourney releases Midjourney 5, my starter gun goes off, and I start working to catch up — and while I’m doing that, they’re working on Midjourney 6,” said Hany Farid, a professor of computer science at the University of California, Berkeley, who specializes in digital forensics and is also involved in the A.I. detection industry. “It’s an inherently adversarial game where as I work on the detector, somebody is building a better mousetrap, a better synthesizer.”
Despite the constant catch-up, many companies have seen demand for A.I. detection from schools and educators, said Joshua Tucker, a professor of politics at New York University and a co-director of its Center for Social Media and Politics. He questioned whether a similar market would emerge ahead of the 2024 election.