The number of people in the ecosystem who think it's even possible to detect whether something is AI-written when it's just a couple of sentences is staggeringly high. And somehow, people in power seem to put their faith in tools that promise a certain level of truthfulness when in reality they couldn't possibly guarantee it, and act on whatever these "AI vs. human-written" tools tell them.
So hopefully this can serve as another example that it's simply not possible to detect if a bunch of characters were outputted by an LLM or not.
So such a model is doomed from the start, unless its parameters are a closely guarded secret (and never leaked). Even then, it's foolable by those with access and nobody else. Which means there's a huge incentive for adversaries to make their own, etc., etc., until it's just a big arms race.
It's clear the actual answer needs to be: we need better automated tools to detect quality content, whatever that might mean, whether written by a human or an AI. That would be a godsend. And if it turned into an arms race, the arms we're racing each other to build are just higher-quality content.
could you contextualize your use of the word "easily" here?
I feel like "easily" might mean "with infinite funds and frictionless spherical developers."
The "detector" has extremely little information and the only somewhat reasonable criteria are things like style, where ChatGPT certainly has a particular, but by no means unique writing style. And as it gets better it will (by definition) be better at writing in more varied styles.
I listened to a podcast with Scott Aaronson that I'd highly recommend [0]. He's a theoretical computer scientist, but he was recruited by OpenAI to work on AI safety. He has a very practical view on the matter and is focusing his efforts on leveraging the probabilistic nature of LLMs to provide an undetectable digital watermark. It nudges certain words to be paired together slightly more often than chance, so you can mathematically derive, with some level of certainty, whether an output (or even a section of an output) was generated by the LLM. It's really clever, and apparently he has a working prototype in development.
One workaround he hasn't figured out how to defeat yet is asking for output in language X and then translating it into language Y. But that may eventually be addressed too.
I think watermarking would be a big step forward to practical AI safety and ideally this method would be adopted by all major LLMs.
That part starts around 1 hour 25 min in.
> Scott Aaronson: Exactly. In fact, we have a pseudorandom function that maps the N-gram to, let’s say, a real number from zero to one. Let’s say we call that real number r_i for each possible choice i of the next token. And then let’s say that GPT has told us that the i-th token should be chosen with probability p_i.
https://axrp.net/episode/2023/04/11/episode-20-reform-ai-ali...
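The rule he describes in that excerpt is simple enough to sketch. A minimal Python toy of it (my own names, not OpenAI's code; assumes probs only contains tokens with nonzero probability):

    import hashlib

    def prf(context_ngram, token, key):
        # Keyed pseudorandom function: (previous n-gram, candidate token) -> r in (0, 1)
        digest = hashlib.sha256(key + repr((context_ngram, token)).encode()).digest()
        return (int.from_bytes(digest[:8], "big") + 0.5) / 2**64

    def pick_token(context_ngram, probs, key):
        # Aaronson's rule: choose the token i maximizing r_i ** (1 / p_i).
        # This is distributed exactly like ordinary sampling from probs
        # (an "exponential race" / Gumbel-max argument), so output quality
        # is unchanged -- but anyone holding the key can later check whether
        # the chosen tokens have suspiciously large r_i values.
        return max(probs, key=lambda tok: prf(context_ngram, tok, key) ** (1.0 / probs[tok]))

The clever part is that without the key, the r values are indistinguishable from uniform noise, so the watermark is invisible.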
The point being that it's already possible to change ChatGPT's tone significantly. Think of how many people have done "Write a poem but as if <blah famous person> wrote it". The idea that ChatGPT could be reliably detected is kind of silly. It's an interesting problem but not one I'd feel comfortable publishing a tool to solve.
Moreover, the way to deal with AI in this context is not like the way to deal with plagiarism; do not try to detect AI and punish its use.
Instead, assign its use, and have the students critique the output and find the errors. This both builds skills in using a new technology and, more critically, builds the essential skills of vigilance for errors and deeper understanding of the material, really helping students strengthen their BS detectors, a critical life skill.
That doesn't mean that it can't be distinguishable by some other means.
Same goes for representing what it means. If people don't understand statistics or math and such, then show what it means with circles or coins or stuff like that. The point is, it never seems like a good thing for options to be removed, especially when it's out of cynicism, judging people as if they're beneath deserving it. It doesn't make sense.
If I have a tool that returns a random number between 0 and 1, indicating confidence that text is AI generated, is that tool good? Is it ethical to release it? I'd say no, it isn't. Removing the option is far better because the tool itself is harmful.
I saw that this report came out today which frankly is baffling: https://gpai.ai/projects/responsible-ai/social-media-governa... (Foundation AI Models Need Detection Mechanisms as a Condition of Release [pdf])
These models are clearly not good enough for decision-making, but still might tell an interesting story.
Here's an easily testable exercise: get a load of news from somewhere like newsapi.ai, run it through an open model, and there should be a clear discontinuity around the ChatGPT launch.
We can assume false positives and false negatives, but with a fat wadge of data we should still be able to discern trends.
Certainly couldn't accuse a student of cheating with it, but maybe spot content farms.
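A minimal sketch of what that analysis could look like (the article feed and the detector are stand-ins here, not real APIs; the point is aggregating noisy per-article scores by month):

    from collections import defaultdict
    from statistics import mean

    def monthly_trend(articles, detector):
        # articles: iterable of (date_str "YYYY-MM-DD", text) pairs,
        # e.g. pulled from newsapi.ai; detector: any text -> score in [0, 1].
        by_month = defaultdict(list)
        for date, text in articles:
            by_month[date[:7]].append(detector(text))
        # Individual scores are noisy, but a jump in the monthly mean around
        # 2022-11 (ChatGPT's launch) is the discontinuity to look for.
        return {month: mean(scores) for month, scores in sorted(by_month.items())}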
Yes, it’s still work, but it’s one step removed from having to think up the original content.
(that said, "may eventually be possible" is so weak a claim it's already meaningless. Quantum fluctuations may eventually turn me into a potato but it's not keeping me up at night)
It's like asking a 747 to be made into a dog.
It's completely nonsensical to me.
An analogous example: my local pizza delivery (where I worked) would seal the box with a safety sticker, to prevent tampering / dipping by the delivery boys. Now, sometimes they would forget to do this for various logistical reasons. Every one of the non-stickered boxes started getting returned, as customers worried a pepperoni had been stolen. They stopped forgetting shortly after.
The kind of people who can't get a job at a pizza place.
Personally, I never order delivery through these services. The incentives are all wrong. Not to mention the costs are super high: restaurants don't make any money, I pay out the @$$, and the drivers are given sub-minimum-wage pay after taking on the risks of delivery driving.
Kinda like if they forgot to put the security seal on your aspirin, I'm not going to take them all off because someone forgot to run production with all the bottles sealed.
The tool in question was used for AI text detection not generation.
Of course the smart student will easily figure out a way to stream the GPT output into Google Docs, perhaps jumping around to make "edits".
A clever and unethical student is pretty much undetectable no matter what roadblocks you put in their way. This just stops the not-so-clever ones. :)
Yes, anybody can write an agent to meander about typing the ChatGPT-generated text into Google Docs. Yes, Google could judge how likely it is that a document was typed by a human, but they won't, for the same reasons OpenAI just cancelled this.
Somebody (maybe reacting to this news, maybe reading this thread) will write such an editor or evaluator. Another solution is screen recording as you write. Another (the best one, and the hardest one for educators) is to not request or grade things a robot can write better than most humans.
Why not? Record a bunch of humans writing, train a model, release. That's orders of magnitude simpler than coming up with the right text to begin with.
Which sucks, because take-home projects are evaluating a different skill set, and some people thrive on one vs the other. But it is what it is.
No need to complicate it that much. Just start off writing an essay normally, and then paste in the GPT output normally. A teacher probably isn't going to check any of the revision history, especially if there's more than 30 students to go through.
The education bubble is about to implode - it will probably be one of the first industries killed by AI.
This was my conclusion as well testing the image detectors.
Current automated detection isn’t very reliable. I tried out Optic’s AI or Not, which boasts 95% accuracy, on a small sample of my own images. It correctly labeled those with AI content as AI-generated, but it also labeled about 50% of my own stock-photo composites as AI-generated. If generative AI were not a moving target, I would be optimistic such tools could advance and become highly reliable. However, that is not the case, and I have doubts this will ever be a reliable solution.
from my article on AI art - https://www.mindprison.cc/p/ai-art-challenges-meaning-in-a-w...
Could it be that a large proportion of the source stock photos were actually AI generated?
This is really painful, because some of my work needs high-quality images suitable for print. Now I can't just look at the thumbnail and say "this will work"; I have to examine it closely, which takes more of my time.
Starts talking like Shakespeare
Cryptographic signing means "I wrote this" or "I created this." Sure, you could sign an AI-generated image as yourself, but you could not sign an image as having been created by Getty or the NYT.
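For anyone who hasn't seen this done, here's roughly what it looks like (a minimal sketch using the Python `cryptography` package; the filename is a placeholder):

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    private_key = Ed25519PrivateKey.generate()  # held secretly by the publisher
    public_key = private_key.public_key()       # published for everyone

    image_bytes = open("photo.jpg", "rb").read()
    signature = private_key.sign(image_bytes)

    try:
        public_key.verify(signature, image_bytes)  # raises if image or signature was tampered with
        print("vouched for by the holder of this key")
    except InvalidSignature:
        print("not signed by this publisher")

Note that it proves who vouched for the image, not how the image was made.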
Possibly (who am I kidding. *PROBABLY*!) will use chatGPT to help them design the method :)
From my understanding this is a fool's errand in the long run, but there are current AI classifier detectors that can successfully detect ChatGPT and other models (Originality.ai being a big one) on longish content.
Their process is fairly simple: they train a classification model after generating tons of examples from all the major models (ChatGPT, GPT-4, LLaMA, etc.).
One obvious downside to their strategy is fine-tuning and how it changes the stylistic output. This same 'heavy hitter' has successfully bypassed Originality's detector using his specific fine-tuning method (which he said took months of testing and thousands of dollars).
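I don't know Originality's actual pipeline, but the general shape of that approach is just supervised text classification; a toy sklearn version:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Placeholder corpus: in practice, tons of human samples plus
    # generations from ChatGPT, GPT-4, LLaMA, etc.
    texts = ["a human-written sample...", "a ChatGPT-generated sample..."]
    labels = [0, 1]  # 0 = human, 1 = AI

    detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    detector.fit(texts, labels)
    print(detector.predict_proba(["some new text"])[:, 1])  # estimated P(AI-written)

Which is also why fine-tuning breaks it: shift the style distribution and the classifier's features stop lining up.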
The current state of Google is a disaster: every article is 100 paragraphs long, with the answer you're looking for buried halfway in, to make sure you spend more time and scroll to appease the algorithm.
I cannot wait for them to sink all these spam websites.
If we accept this ...
The challenge I am foreseeing is this:
We are only at the very beginning of the AI revolution -- and if LLMs are to get more sophisticated and powerful in the future, they will need good-quality, human-generated/curated training data at a scale that makes manual curation/cleansing/quality checks likely impossible.
And there is no doubt that every medium is going to get bombarded and spammed with AI-generated content in the coming years.
How, then, are we going to filter the data -- separating the real data from AI-generated noise -- to train future LLMs on, and really push them to their potential?
This problem has been bugging me for a while, and I commented on it here previously as well, tentatively calling it 'Data Pollution' for lack of a better term.
Curious to hear other perspectives on this.
¯\_(ツ)_/¯ try paper I guess. Time to brush up on our OCR.
But you know who has more real-world data on typing style? Google, Microsoft, Meta, and everyone else who runs SaaS docs, emails, or messaging. I imagine a lot of students write their essays on Google Docs, Word, or the like, and submit them as attachments or copy-paste into a textbox.
Maybe a better term would be Superior Intelligence (SI). I sure as hell would not be able to pass any legal or medical exams without dedicating the next decade or so to getting there. Nor do I have any interest in doing so. But GPT-4 is apparently able to wow its peers. Does that pass the Turing test because it's too smart, or fail it for the same reason? Most of humanity would fail that test.
So, assuming all that to be true, how can the likes of Turnitin claim to be an authority on AI writing detection? When I graduated a few years back, they only offered plagiarism checks.
Pretty easy - they lie to people.
If the first does a good job, the second fails. And vice versa.
(On the other hand, maybe there is a lot of money to be made selling both, to different groups?)
Only people using it deceptively would be affected. No idea what portion of ChatGPT's users that is, would be very interested to know.
It wouldn’t beat determined users but it would at least catch the unaware.
For educators evaluating students, essays, and the like: we possibly need different methods of evaluation, rather than relying on written asynchronous content for communicating concepts and ideas.
For civics, I would say yes.
Imagine you were talking to an online group about a design project for a local neighborhood. Based on the plurality of voices, it seemed like most people wanted a brown and orange design. But later, when you talked to actual people in real life, you could only find a few who actually wanted that.
Virtual beings are a great addition to the bot nets that generate false consensus.
https://www.reuters.com/technology/openais-sam-altman-launch...
Now the topic isn't about anything millennial- or Zelda-related, but I'd think that the language model would select sentence and paragraph phrasing differently.
Maybe I need to switch to the API.
First, it tends to print a five-paragraph essay, with an introduction, three main points, and a conclusion.
Second, it signposts really well. Each of the body paragraphs is marked with either a bullet point or a number or something else that says "I'm starting a new point."
Third, it always reads like a WikiHow article. There's never any subtle humour or self-deprecation or ironic understatement. It's very straightforward, like an infographic.
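Those tells are mechanical enough to sketch as a toy heuristic (every name and threshold below is made up, and it's trivially defeated by any prompt that changes the style):

    import re

    SIGNPOSTS = ("firstly", "secondly", "in conclusion", "overall",
                 "it's important to note")

    def looks_like_default_chatgpt(text):
        # Five-paragraph shape, bullet/numbered signposting, boilerplate phrases.
        paragraphs = [p for p in text.split("\n\n") if p.strip()]
        bullets = len(re.findall(r"^\s*(?:[-*]|\d+\.)\s", text, flags=re.M))
        hedges = sum(phrase in text.lower() for phrase in SIGNPOSTS)
        return len(paragraphs) == 5 or bullets >= 3 or hedges >= 2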
It's definitely easy to recognize a ChatGPT response to a simple prompt if the author hasn't taken any measures to disguise it. The conclusion usually has a generic reminder that your mileage may vary and that you should always be careful.
If so, nice meta-commentary.
I think this upcoming school year is going to be a wakeup call for many educators. ChatGPT with GPT-4 is already capable of getting mostly A's on Harvard essay assignments - the best analysis I have seen is this one:
https://www.slowboring.com/p/chatgpt-goes-to-harvard
I'm not sure what instructors will do. Detecting AI-written essays seems technologically intractable, without cooperation from the AI providers, who don't seem too eager to prioritize watermarking functionality when there is so much competition. In the short term, it will probably just be fairly easy to cheat and get a good grade in this sort of class.
Besides, even if they did win, they would still lose by shooting themselves in the foot.
It is important that humans learn to express themselves in writing. The only way I see this happening is if kids do their writing at school, supervised.
In fact it doesn't take much text to distinguish between two human beings. The humanly obvious version is that someone who habitually speaks in one dialect and someone else who speaks in another must be different people, but even without such obvious tells, humans separate themselves into characterizable subsets of this space fairly quickly.
I'm skeptical about generalized AI-versus-human detection, given that it's adversarial. But a constant, unmoving target of some specific AI in some particular mode would definitely be detectable; e.g., "ChatGPT's current default voice" would certainly be detectable, and "ChatGPT when instructed to sound like Ernest Hemingway" would be detectable. I just question whether ChatGPT in general can be characterized.
In OpenAI's case, its writing style usually comes from OpenAI's in-house dataset they used for RLHF. This is what gives it the ability to chat and respond with its signature (perhaps overly formal and apologetic) tone.
Although it can be used to write in other styles, sometimes it will refuse to because of this.
Not the educators' fault though; more like the system is bad.
My point is that, given knowledge is mostly free and available, the system should teach students to think rather than to use tools or memorize facts.
Bigger texts, e.g. reports, theses, etc., are probably easier and cheaper for humans to verify, with the help of AI tools (reference checking, searching, ...).
Here's a decent paper on it.
It covers private watermarking (you can't detect it exists without a key), resistance to modifications, etc. Essentially you wouldn't know it was there and you can't make simple modifications to fool it.
OpenAI could already be doing this, and they could be watermarking with your account ID if they wanted to.
The current best countermeasure is likely paraphrasing attacks https://arxiv.org/pdf/2303.11156.pdf
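The detection side of a scheme like that is worth sketching too (reusing the prf() idea from the Aaronson snippet upthread; only the key holder can compute this, since without the key the r values look uniform):

    import math

    def watermark_evidence(token_stream, key):
        # token_stream yields (context_ngram, chosen_token) pairs.
        # For unwatermarked text each term averages 1; watermarking pushes
        # r toward 1, so the total grows conspicuously with length --
        # which is also why short or heavily paraphrased text slips through.
        return sum(-math.log(1.0 - prf(ctx, tok, key)) for ctx, tok in token_stream)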
I suppose hosted solutions like ChatGPT could offer an API where you copy some text in, and it searches its history of generated content to see if anything matches.
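A toy sketch of that retrieval idea (everything here is made up for illustration): keep shingles of everything you generated and check overlap with submitted text.

    def shingles(text, k=8):
        return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}

    def best_match(candidate, stored_generations, k=8):
        # Jaccard overlap between the candidate and each stored generation;
        # near 1.0 means an (almost) verbatim match, while paraphrasing
        # knocks the overlap down quickly.
        cand = shingles(candidate, k)
        return max((len(cand & shingles(s, k)) / len(cand | shingles(s, k))
                    for s in stored_generations), default=0.0)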
> bUt aCtuAlLy...
It's not like I don't know the bajillion limitations here. There are many audiences for detection. All of them are XY Problems. And the people asking for this stuff don't participate on Hacker News aka Unpopular Opinions Technology Edition.
There will probably be a lot of "services" that "just" "tell you" if "it" is "written by an AI."
Watermarking needs to be subtle enough to be unnoticeable to opposing parties, yet distinctive enough to be detectable.
So, this is an arms race especially because detecting it and altering it based on the watermark is also fun :)
This would not impact output quality much, but it would only work for longish outputs. And the token-probability "key" could probably be reverse-engineered with enough output.
Pretty common steganographic technique, really.