Well, apart from the fact that ChatGPT is really incapable of developing a thought, and apart from the fact that half of them will fail to delete sentences like "I'm a language model, so I can't..." (insert gist of question here), it's painfully obvious when something is LLM-generated.
The moment a sentence like "it's crucial to remember" pops up, I know what I'm looking at. Then there's the fact that it always sounds like it's speaking to a child, and it avoids saying anything unequivocally without some sort of disclaimer, as the legal department's CYA filter ensures.
I remain thoroughly unimpressed by the entire venture. If this is Skynet 1.0, we're all safe for centuries to come.
Students who pay the $20 a month for it and are aware of its limitations will absolutely use it and it won't be obvious.
I just asked ChatGPT 4 to explain the religious significance of the Wizard of Oz as a literary critic. Here's some of what it gave me; it doesn't write anything like you claim it does:
"Moreover, Dorothy's companions -- the Scarecrow seeking a brain (wisdom), the Tin Man seeking a heart (love/compassion), and the Lion seeking courage (strength) -- symbolize spiritual virtues that are often extolled in religious texts. They embark on this quest together, mirroring the communal aspect of many religions.
The slippers (silver in the book, ruby in the film) can be viewed as sacred objects, or relics, that assist her in her journey, providing divine protection and eventually leading her to salvation (returning home).
Finally, the revelation that the Wizard is a mere mortal, and that Dorothy had the power to return home all along, imparts a spiritual lesson often found in religious narratives: the divine or the sacred is not external, but within us."
If I were a student, I could easily have expanded on these concepts (with or without GPT) and turned in a good essay.
This isn't a field like engineering, where there are objective right and wrong answers and someone could die if you pass unqualified students; nobody dies if you pass students who are not so great at writing essays on literature.
"As a literary critic, describe how Dorothy in the Wizard of Oz is a religious figure."
The idea of the divine being contained within would, I think, match Buddhism pretty well.
The reference to relics is too vague to pin down to any one religion; there are probably plenty of examples across many of them. If I had to defend it off the top of my head, I'd compare the ruby slippers to the "holy moly" herb Hermes gives Odysseus to protect him from Circe.
If anything, I think GPT went wrong in saying strength is one of the virtues associated with the Lion. It would be much easier to focus on courage and say he needs to learn to be like a brave believer who says things like "Yea, though I walk through the valley of the shadow of death, I will fear no evil; for thou art with me."
My point wasn't that this essay was particularly good, necessarily, only that it was good enough for undergraduate work.
Michael Bérubé tells a story about coming to class early and overhearing the students making great arguments about movies and shows they'd seen the night before, discussing them heatedly. Then, when the lesson started, all the arguments turned bland, banal, derivative.
Obvious conclusion: they -can- very well produce good insight, but the college and school systems discourage it. Students are rewarded for repeating ideas they read in books, or what the teacher said; an original idea is dangerous, because they're responsible for it themselves, and if the teacher doesn't like it, they'll get punished for it. Safer to say "Miller said..." and offload accountability to someone published.
I can see why you believe identifying GPT-generated text is easy: techniques like prompt engineering, few-shot learning, and fine-tuning aren't widely known or used yet. For instance, with a 32k-context model, you could input all your previous writings and instruct GPT to mimic your style, even down to the grammar mistakes.
This requires a large amount of previous writing to feed in; otherwise GPT struggles to differentiate styles well enough to reproduce one as consistently as a human would. Most students do not have enough personal writing data to draw from.
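The few-shot mimicry idea above can be sketched in code. This is a hypothetical illustration, not a real tool: the function name, the budget heuristic (roughly 100k characters for a 32k-token context), and the sample texts are all assumptions made up for the example.

```python
# Hypothetical sketch: assemble a few-shot "style mimicry" prompt from a
# student's previous writings. Everything here (build_style_prompt, the
# character budget, the samples) is illustrative, not a real library API.

def build_style_prompt(samples, question, max_chars=100_000):
    """Prepend prior writing samples as style examples before the task.

    max_chars is a rough stand-in for a 32k-token context window
    (assumed ~3 characters per token on average).
    """
    header = ("Below are writing samples by one author. "
              "Mimic their style exactly, including quirks and errors.\n")
    parts = [header]
    used = len(header)
    for i, text in enumerate(samples, 1):
        chunk = f"--- Sample {i} ---\n{text}\n"
        if used + len(chunk) > max_chars:
            break  # stay inside the assumed context budget
        parts.append(chunk)
        used += len(chunk)
    parts.append(f"--- Task ---\nIn the same style, answer: {question}\n")
    return "".join(parts)

prompt = build_style_prompt(
    ["My esay on Moby Dick argues the whale is, like, fate itself.",
     "Shakespear's sonnets repeat themselfs on purpose, I think."],
    "Describe how Dorothy in the Wizard of Oz is a religious figure.",
)
```

The resulting string would then be sent as a single prompt; with enough samples, the model picks up sentence rhythm and even recurring misspellings.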
This also ignores other strategies, such as having every student write on paper in a supervised environment and using that as a baseline for pattern assessment of submitted electronic work. It's very difficult for people to stay consistent and impose their own style on GPT output. Just ask the many creative writers trying to use GPT for stories how many of them have to treat its output as an extremely rough draft of plot points at best.
I think absolutely anyone claiming that detecting LLM-generated text is easy is flat-out lying to themselves, or has only spent a few tokens and very little time playing with it.
Take semi-decent output, give it a single proofread and a few edits... and I don't fucking believe anyone who says they'll detect it. They'll absolutely catch some of the most egregious examples, but assuming that's all of it is near-willfully naive at this point.
They aren't going to risk getting expelled. Schools have done a good job of putting the fear of God into kids about using ChatGPT. Better to just not turn in a paper than to be accused of plagiarism.
All ChatGPT shows me is that we have a ton of smart, incredibly closed-minded people who know what they know and think they have it all figured out.
My paper would be easy to spot if chatGPT helped because the writing would be so much better. The thoughts would be much better organized.
> "I'm a language model, so I can't..."
You won't catch the clever students who programmatically remove these (e.g. using LangChain).