We were using C, and instead of using globals or parameters, each function declared the same local variables in the same order. The stack, then, remained sufficiently consistent that each function had access to the values it needed.
When I confronted the two of them about plagiarism (and explained what they had done wrong), their defense was that they were working on the problem together, and thus had made the same mistake out of ignorance.
And frankly, it made perfect sense. I could easily see myself doing something like that.
I guess my point is that, for at least some small portion of the problem space, plagiarism isn't really plagiarism.
The other problem is that we have reduced scientific ethics to the subject of plagiarism. But there are other things, like publication ethics, medical ethics &c, and they don't even appear on the kids' radar screens.
Not a single company has ever asked me to invert a binary tree or implement any more complicated algorithm, nor asked about my grades; they have only cared about my personality and non-school projects.
Hell, we are encouraged to work together with other people at my university.
Why judge student performance on something that they are using _to learn_? It doesn't make any sense.
Every student is basically competing with one another to get the highest GPA possible - if you're going to give them cookie cutter homework whose solutions can be easily searched for on the internet _and_ which can only bring their GPA down, then they're going to cheat. Plain and simple.
Give them homework and "grade it" to give them feedback, sure, but don't make it count. That is, if the goal is to have students learn.
While I believe "almost definitely" is a bit too strong, one of the most important lessons undergraduates need to learn asap is learning how to learn effectively. Homework is only a tool that can be very helpful when learning, but it's not the only tool available.
> which is useful to instructors
That's nice, but it's the instructor's job to serve the students.
> Is the class understanding the lectures?
In most classes, this should be easy to determine simply from interacting with the students in class, the questions they ask during office hours, etc.
--
The ideal instructor should offer regular homework problems/projects that are clear examples of what the students will be expected to know, and check the work of anybody that wants to take advantage of the instructor's knowledge. However, if a student feels their time would be better spent elsewhere[1], that's their business.
The exam will reveal if they made a good decision. Like any skill, many students will make mistakes when they first attempt this kind of project/time management. Fortunately, as undergraduates, the consequences of failure are usually having to retake the class. Later in life, failure may mean losing a job or other far more damaging consequences.
(For the record, when I was an undergraduate at UC Davis, many classes had optional homework. I spent a lot of time working on it for some classes, and others I skipped because it was trivial compared to what I was doing every day at my job.)
[1] e.g. other classes, a job, their own self-study, or maintaining friendships at a social event
Then they fail?
I don't mean to sound glib, but I don't see the issue here.
Students that want to pass will pass, students that don't, won't.
I say this as a student that didn't pass.
Because it's not that simple. These are INTRO classes.
For many students, an intro CS class may be the first class they have encountered in their lives in which they finally have to work.
So, part of the job of a teacher is to teach the class material but also to teach good studying habits. Without grading homework, the first real feedback that a student is in trouble will come on a midterm when they fail--the feedback is WAY too late at that point.
That having been said--your professor isn't as stupid as you think he is. Plagiarism that fools the professor is as much work as just doing the assignment.
And professors have lots of ways of dealing with plagiarists far short of disciplinary proceedings. For example, partial credit on exams is quite subjective, and plagiarists tend to lose the "coin flip" if the professor is on the fence.
The problem I have is simply that there are quite a few professors who simply don't care. They make it far too easy to cheat--reusing a previous year's project or exam, for example, is a no-no.
> Give them homework and "grade it" to give them feedback
which addresses your main point.
To me, plagiarism is taking credit for someone else's ideas at their expense; it's a "sin" against the person being copied.
Copying someone else with their connivance, or paying some essay-mill writer to do your work for you, should be in the same category as taking a calculator into a mental-arithmetic test, not the same category as « My name in Dnepropetrovsk is cursed, when he finds out I publish first ».
That seems like an unworkable definition. If I submit John Milton's Paradise Lost as my own work on an application to grad school, it certainly wouldn't be an "expense" to the long-dead author.
Plagiarism is a fraud where you misrepresent the work of others as your own. In the academic world, what else is there but your own work?
(If I tried it with Paradise Lost there'd be no expense to their reputation, but that just means it's not a realistic example of someone attempting plagiarism.)
Work published under your name but performed by anonymous flunkies. It's... the norm in the academic world, actually.
You claim
> We have virtually eliminated false positives at this point
but offer no explanation for how you verify this.
You later rant about the fact that students have the audacity to challenge these (very serious) charges and the university actually expects you to follow up when they do. The horror!
IMO it's your system of senseless programming exercises and automated grading that is broken. Instructors need to put in the time and effort and assign homework where students have to actually think and be creative, rather than reuse the same assignments for the 10th year running and be shocked when submissions turn out to be similar.
These claims were not made.
> You claim
> > We have virtually eliminated false positives at this point
> but offer no explanation for how you verify this.
Yes, he did. He described the process that arrived at the conclusion including "keeping only the cases that contain indisputable evidence — for example, hundreds of lines copied right down to the last whitespace error". It's clear from the context that this was verified by following up with the alleged plagiarizers. Indeed, he notes one false positive.
I'm not sure what your concern is. The article presents the process they've used for catching plagiarism and rather than point out any actual flaws, you attack the strawman of blindly using anti-plagiarism software and treating it as literally infallible.
IMO it isn't acceptable for an instructor to say that they don't have time to provide an explanation when asked by the university.
Sounds like they don't want to deal with plagiarism, if you can avoid it by simply making your copying "disputable".
MOSS is clever-- instead of doing direct textual comparison, you compare streams of tokens. This means that even if a student reformats the whitespace or renames all the variables (a common obfuscation technique), the same stream of "TOKEN ASSIGN_OP TOKEN LPAREN TOKEN COMMA TOKEN RPAREN" will exist. TMOSS extends this to snapshots of code as a student develops it, which is apparently 2x more effective!
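To make the token-stream idea concrete, here is a rough sketch in Python. It is not MOSS's actual algorithm, only an illustration of the comparison described above; the function name `token_stream` and the placeholders `ID`, `NUM` and `STR` are made up for this example, and it tokenizes Python source rather than C because the standard library ships a tokenizer for it.

```python
import io
import keyword
import tokenize

# Token kinds that carry only layout or comments, which the comparison ignores.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def token_stream(source: str) -> list[str]:
    """Reduce Python source to token kinds, discarding names, literals and layout."""
    stream = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME:
            # Keywords keep their spelling; every identifier collapses to "ID".
            stream.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            stream.append("NUM")
        elif tok.type == tokenize.STRING:
            stream.append("STR")
        else:
            stream.append(tok.string)  # operators and punctuation
    return stream


original = "total = add(x, 1)  # sum\n"
disguised = "t=add( first ,2 )\n"   # renamed, respaced, comment removed
print(token_stream(original) == token_stream(disguised))  # True
```

A real detector would then look for long shared runs between two streams instead of requiring the whole stream to match.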
This author also delicately avoids the cultural side to plagiarism-- many students come from backgrounds where "group work" is common, and passing classes is a communal effort, including homework. It's an unfortunately common mistake to think the grade is what matters, not the fundamental skill development.
In fact, "build it yourself from scratch" is an anti-pattern in my opinion.
I'm not condoning cheating, but why would one not expect this to be the default behavior?
As others have suggested, there are much easier "solutions" related to logging keystrokes and commits should you really want to catch and punish this behavior.
In the context of getting things done, that is sometimes true.
However, as an employer, I want to know that you:
1) have enough knowledge to build it from scratch if you have to, starting with analyzing the problem and ending with a coded, tested, debugged, and working solution, preferably at least somewhat optimized.
2) have enough knowledge to be able to read the code that you might want to use in [not building it from scratch] and assess its value, considering A) whether it will meet the actual need, B) whether it will do so at a lower cost than writing it in-house from scratch, C) whether it will meet or exceed performance parameters, D) whether it will avoid introducing more problems than an in-house solution.
3) make a well-informed and reasoned decision between #1 and #2, and not merely be a copy-paste monkey.
Doing copy-paste as a regular practice in school eliminates all three of these capabilities.
In short, school is different from work, and you need to adhere to different practices.
Not sure how well plagiarism detection would work though...
Maybe something like: "Your homework was flagged as possible plagiarism. Report to this lab at this time, and code one additional problem which should be easy to anyone who understood the original homework."
Anyone who really did the homework will be in and out within five minutes. If you can't finish in an hour, you get a zero on that one homework assignment, not expelled.
That flips the incentives. Also, reducing the punishment cuts the drama of people arguing that the software is not 100% accurate.
"But wait, that would require huge amounts of time investment from the professors/TAs"
Yes, it's almost as if paying $65k a year for someone to teach you something should result in that person teaching you that thing instead of just checking in to see if you've learned it on your own.
It just seems ridiculous that the thing we are focusing on is that kids are cheating and not that we've made this massive, unscalable, expensive, ineffective education system. We are acting like it's the kids who aren't holding up their end of the bargain when they try and cheat it.
Interestingly enough, here's how this worked when I was an undergrad TA:
* The college's central office paid TAs for obligatory undergrad courses (so basically CS101 etc. up until sophomore year).
* Advanced courses were funded by the individual chairs and institutes.
And would you look at that, suddenly there were a whole lot fewer code submissions auto-checked for plagiarism, and a whole lot more "talking to the TA to explain your reasoning and problems you encountered". This also served the double purpose that these TAs got to know the students, their likes and dislikes, and led to some easy recruiting for PhD candidates and such.
That universities pay the TAs beans isn't the customer's problem.
Imagine you are teaching someone basic web development. The assignment is "make a webpage that displays some facts about your favorite tv show." Easy.
I can buy that it makes the grading more complicated, but coming up with assignments is not difficult.
I did both CS and economics in college. And in my CS classes, even discussing the homework with classmates was often "against the rules".
But in my business and economics classes, my classmates and I would regularly work together on the homework by straight up assigning certain problems to certain people and then copying from each other.
This was not only allowed by the professor, it was explicitly ENCOURAGED!
They understood that if you talk to classmates, you will be able to understand things better, instead of struggling and failing to do stuff on your own.
And with such wildly differing guidelines for different classes, things were often confusing to students.
One potential solution to "cheating" is to explicitly allow it, such that everyone is on the same playing field.
What matters, at the end of the day, is that the students learn the material.
For example, individual projects where everyone in the class is working on something different; or at the other extreme, proctored exams.
(Of course, neither of these systems is entirely free from cheating, but the barrier is higher.)
Coming up with hundreds of different small projects to e.g. get people to understand pointers isn't very realistic, and if you only test them in exams you're missing critical feedback both for the teachers and the students before it is too late.
If you're regularly failing your homework, it probably means you won't be ready for the exam that actually counts. Which should be enough feedback.
i've always wondered why they don't do something like select a random number of students and ask them to explain their work in an interview with the professor or TA.
i figure, even if they cheated, if they can give a satisfactory explanation, they probably learned the damn thing anyway. plus, you don't have to select even half the students before word gets around that they better understand the work they are turning in.
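The spot-check idea is easy to sketch. What follows is a minimal illustration in Python, not anything a particular course uses: the function names, the 10% fraction and the seed are all assumptions made for the example. The second function does the arithmetic behind "you don't have to select even half the students": with independent draws on every assignment, the chance of never being picked shrinks quickly.

```python
import random


def pick_for_interview(students, fraction, seed=None):
    """Pick a random subset of students to explain their work in person."""
    rng = random.Random(seed)  # a seed makes the draw reproducible for audits
    k = max(1, round(len(students) * fraction))
    return sorted(rng.sample(students, k))


def chance_picked_at_least_once(fraction, assignments):
    """Probability of being interviewed at least once over several assignments."""
    return 1 - (1 - fraction) ** assignments


roster = ["s01", "s02", "s03", "s04", "s05", "s06", "s07", "s08", "s09", "s10"]
print(pick_for_interview(roster, fraction=0.2, seed=1))
print(round(chance_picked_at_least_once(0.10, 10), 3))  # 0.651
```

So interviewing only 10% of the class per assignment still means roughly two in three students get asked to explain something over a ten-assignment semester.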
Students will pass when their program passes the grading/test suite. The test suite can be checked ahead of time by uploading a binary/zip/tarball to this location using their student id: <location here>. Detailed instructions <here>.
IIRC, this is what my university did 15 years ago. Not everyone had unique homework, but often the homework assignments were not all the same. I guess how unique they are depends on if you have some parameters in your homework generation or if you have a bunch of misc grad students to generate a few sets of homework that you can build up over time.
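For the "parameters in your homework generation" route, one possible shape is to derive each student's variant from their student id, so it is different per student but reproducible by the grader. This is only a sketch under that assumption; `homework_params`, the parameter names and the value ranges are invented for illustration and are not what any particular university did.

```python
import hashlib
import random


def homework_params(student_id: str, assignment: str) -> dict:
    """Derive a stable, per-student set of parameters for one assignment."""
    # Hash the assignment name and student id so the variant cannot be guessed
    # from the id alone, yet the grader can regenerate it at any time.
    digest = hashlib.sha256(f"{assignment}:{student_id}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return {
        "list_length": rng.randint(5, 12),
        "start_value": rng.randint(1, 100),
        "operation": rng.choice(["insert", "delete", "reverse"]),
    }


print(homework_params("912345678", "hw3-linked-lists"))
```

The grading/test suite would call the same function with the submitter's id to know which variant to check against.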
How might you grade style? IMO, don't. If you must, then save it for interactive review/lab sessions/office hours or style-specific spot checks (e.g. "At least once per semester your assignment will be additionally graded on program style according to <insert guidelines here>. This will be added as additional points to your semester total.")
When students are doing homework you want them to form collaborative study groups or freely consult any other source, if it helps them learn. Fear of plagiarism is antithetical to that.
That would be wonderful if more people were able to focus on the learning, and less on the grades for unrelated subjects that don't matter for your job.
There's another tool: the repo. My son was accused of plagiarism in his last year of high school. It could have been a "he said/he said" case -- in fact it started that way -- until I pointed out that if he believed he was in the right he had a record that could be checked.
The CS teacher had to explain to the principal why the repo proved who had copied whom (and left me wondering why the teacher hadn't looked there first????) which wasn't easy because the plagiarist's parents were big donors to the school. So in the end, despite what it says in the school handbook, the only penalty was a 0 on the assignment.
But a good lesson for my kid on both programming and the sociopathologies of organizations.
They even hint at that sort of solution in the piece by mentioning cosmetic changes to the files at the last minute.
Does git count as a block chain for proof of work? :)
This has so many advantages even beyond defending against plagiarism charges that it really wasn't hard to drive home.
The big advantages being defense against the inevitable computer crash and the inevitable directory deletion.
Then, the final project would be such that you'd have to explain your code, either in person with a TA, or by writing documentation for it.
We really need to move on from this academic mindset of homework, grades, and plagiarism toward something that is actually reflective of the world outside of academia. The concept of plagiarism doesn't really exist in the software industry -- it's a matter of what you can get done.
The problem isn't with CS at all, but rather with USA colleges in general. Indeed the only professor I've read who seems to even notice the problem is Harry Lewis. Most subjects should be taught very differently than they are taught. USA university education makes a great deal of unnecessary and counterproductive work for students and professors. The busywork threatens to drive out real academic work.
The reason for this is so that more such work might be created for administrators, who must multiply inexorably to absorb the ridiculous amounts of money that our ridiculous system of student debt generates. In fact it will be no surprise if some schools eventually do hire enough administrators to suspend 40% of every CS course every semester. One hopes that the professors who could restore some of the quality that universities used to possess, will realize by then that they can restore that and should restore that.
For example: What if we move toward a more seminar-style approach of having students discuss and critique each others' code on larger projects?
This might not get rid of all copy/pasting, but it would create a huge incentive for students to at least understand how their code works, in order to avoid embarrassment in front of their classmates. And, should two kids copy/paste the same code, and that becomes apparent in the course of a peer review session, well, that's an event that everyone will remember. No need for the instructor to make themself the bad guy in the process, either.
It also has the side benefit of giving students experience with code review, with reading and understanding others' code, and maybe helps them start to develop a sense for how to write clean, readable code several years before they start getting bludgeoned by senior devs at their first full-time job.
As for smaller problem set type homework, why not give them group work? It doesn't necessarily need to be graded, aside from credit/no credit, if you're worried about giving A's to duffers. I had a few classes that did that back when I was in school, and I really liked it. I felt like I learned faster, both from working together with classmates and because the format allowed them to give us more challenging problem sets.
It was my senior year, and I attended a systems programming course that was being piloted and was very challenging. Work in the CS department was very group heavy, especially in courses heavy on theory. I benefited a ton from working in groups with other students outside of class. In your data structures/theory/math courses, this wasn't an issue. But in this class in particular, people's submissions started to look similar.
It was resolved rather quickly because we just had to be honest, but I thought it was interesting. Specifically because, for classes so challenging that heavy collaboration was what pulled me through, I barely remember the course content anymore. But the soft skills I acquired from hours of collaborating with my peers after class have followed me for life and made a noticeable impact on my career.
Suppose that technique X can actually deter students from cheating 100% of the time.
So we apply technique X to intro class Z that has 300 students.
Now we have an intro class of 300 non-cheating students who sit quietly and listen to an instructor for an hour a week.
Then those non-cheating students sign in for a more reasonably sized class section of 40 to sit quietly and listen to a graduate student for an hour.
Finally, these non-cheating students take tests and do assignments written in such a way that the amount of grading time does not put the graduate students over the weekly allotted work time for their TA-ship in their particular program.
Ballpark-- by what percentage would one say the quality of the learning environment has improved by employing technique X?
Otherwise, we'd be promoting collaborative learning and letting those who don't contribute or cheat simply cheat themselves.
Most things related to grading support the certification function, not the education function.
The companies experimenting on such things are more likely to be able to adapt to a climate where certification is becoming meaningless.
Hopefully, more companies will realize certification designed for the industrial age has run its course and academia will no longer be incentivized to continue gatekeeping via certification and will get back to focusing on education.
Perhaps the solution is to be more creative with how computer science education is taught? If the students are copying homework problems they don't understand, they're not going to do well on the projects or exams that might be part of the rest of their grade.
classes that might have had 30 students in the past now have 300. you can’t hope to grade projects for all those students, even if you put them in groups.
Why not skew the grading more heavily towards in-class midterms and finals?
Or you could generate individualized homework for each student, but that may not be feasible in a 500 student intro to cs class.
oddly, post degree, we're actively encouraged to re-use code.
I. The time I got caught for "plagiarizing". In an intro systems class, I, a CS major, and my roommate, who wanted to minor in CS, were working together and I was "showing him the ropes". He was an intelligent student and we never worked together on the homeworks aside from general verbal discussions on what the solution could be. He used a Windows laptop, and for one of the assignments his C code wasn't compiling because he was missing some libraries. He told me he couldn't figure it out, and since we were approaching a deadline he asked me to compile it for him and send him back the binary. I did so, but when sending back the binary, in a rush, I mistook my HW folder for his (we'd downloaded this as a part of the assignment, and the folder structure was identical) and sent him my binary by mistake. Both of our solutions worked. Obviously, we got "caught" in the most naive way: our binaries had the same MD5 hash and the CMS flagged us. We were both confused at first, and then we realized what had happened and explained it to the professor. The proof was simple: just compile my roommate's code and run it. However, he gave us both a 0 on the assignment. We still both got As (because you could drop one homework), and while some may claim this was a gentle slap on the wrist, it felt unjust. We clearly made a dumb mistake and we shouldn't have been punished at all, especially when we knew how rampant actual plagiarism was.
II. The time I caught students for "plagiarizing". As Kevin points out in his post, there aren't really any incentives to catch students for cheating. As a TA, I get no benefit, and moreover, there's a cost. No one wants to be known as THAT TA who busts kids for using "a little help". Keeping that in mind, I was usually very lenient when it came to cheating. I'd noticed signs, but there was never enough proof to warrant the effort of calling someone out. However, at one point it went too far. Two students who were partners for the "projects" had submitted nearly identical solutions for a complex Graphics homework assignment. They got the answer right, but I looked into their working and they both said "(9/5) / (4/3) == (4/7) / (5*9) = 1/3". I don't remember the exact values, but it was two steps of nonsense numbers and then a correct answer. I ended up reporting the case, mostly because I felt like my intelligence as a TA had been insulted. Are you seriously going to submit random numbers with a correct solution hoping I won't see? In any case, it didn't go anywhere.
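For what it's worth, the check that "caught" us is about the simplest one there is. I don't know how the CMS actually implemented it; the following is just a guess at the idea in Python, with the function name and the sample submissions made up: hash every uploaded binary and flag any digest that shows up more than once.

```python
import hashlib
from collections import defaultdict


def identical_submissions(submissions: dict[str, bytes]) -> list[list[str]]:
    """Group students whose uploaded files are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for student, blob in submissions.items():
        by_digest[hashlib.md5(blob).hexdigest()].append(student)
    # Only digests shared by two or more students are worth a second look.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


uploads = {
    "me": b"\x7fELF-binary-built-from-my-code",
    "roommate": b"\x7fELF-binary-built-from-my-code",  # the wrong file got sent
    "someone_else": b"\x7fELF-binary-built-from-other-code",
}
print(identical_submissions(uploads))  # [['me', 'roommate']]
```

Note what this does and doesn't show: identical bytes prove the same file was submitted twice, not who wrote the source, which is why recompiling my roommate's own code settled the question.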
III. Discovering a cheating ring. At our university, one of my good friends and project partners told me there was an "enormous Asian cheating racket" - not to call out any specific race, I'm Asian too. I wasn't surprised - to be blatant, it made sense. We're very grade oriented with tiger parents. Then I learnt the extent of it. There were apparently Chinese forums and "outsourcers" you could send your homework problems to and they would solve it and give it back. In addition, there were special shared systems like DC++ where you could discover answers to homeworks for different classes at my university as well as Prelims, Midterms and Finals contributed by students of previous years. I was in shock. Students would leave exam halls to go to the bathroom just to look at these answers mid-exam. But was I gonna tattle? No.
IV. The reality at universities. Not just in CS, but in every other subject, almost everybody cheats. Excuses that go around are: "I've worked on it with someone else" "Oh the TA in office hours told everybody the exact same solution" "What? Cheating? me?" "Maybe he/she took it from me, I didn't do it"
And look, people aren't stupid. We all know how cheating works. You get a homework assignment, and you re-write the sentences in your own language. You get some code from someone else and you define some useless functions with 1-2 lines of code. Or you arbitrarily re-organize lines of code. You rename all the variables. You re-organize your functions. You create some unnecessary classes.
There were students who distributed 10 homework assignments between 10 people (in groups of 2), and had one do the assignment (use office hours, friends, google, whatever) and the other literally re-write the assignment in LaTeX 9 different ways for the others to use. No one would ever really have to do the work.
The well known key to cheating is plausible deniability - if there's enough evidence you didn't do it, you didn't do it.
And I can speak for Indians: CS education in India, aside from the IITs, the IIITs, BITS and some NITs, is dismal. Cheating is rampant there, and they're much better versed in the art because it's much harder to cheat and get away with it in India - you can't bring phones to your exam or freely go to the bathroom mid exam, for example.