Why we still can't stop plagiarism in undergraduate computer science (kevinchen.co)

51 points | kevinchen | 8y ago | 118 comments

I was teaching the lab portion of CS 101 (don't remember the actual course number off-hand) when I discovered that two students had the same remarkable code that shouldn't have worked but did.

We were using C, and instead of using globals or parameters, each function declared the same local variables in the same order. The stack, then, remained sufficiently consistent that each function had access to the values it needed.

When I confronted the two of them about plagiarism (and explained what they had done wrong), their defense was that they had been working on the problem together, and thus had made the same mistake out of ignorance.

And frankly, it made perfect sense. I could easily see myself doing something like that.

I guess my point is that, for at least some small portion of the problem space, plagiarism isn't really plagiarism.

HarryHirsch8y ago

In other words, we have made studying in groups a crime. Normally, to study a subject you would pose yourself problems and solve them, sometimes with friends. But we insist that homework be done individually, and we insist on assessing every last little thing. I think this is a problem in the American university.

The other problem is that we have reduced scientific ethics to the subject of plagiarism. But there's other things, like publication ethics, medical ethics &c, and they don't even appear on the kids' radar screens.

Hextinium8y ago

This is entirely true. On almost all of my homework I have been specifically instructed not to communicate on more than an idea level about any code that I write. Someone could just be missing a semicolon somewhere, but I am not supposed to even look at other people's code. This leads to group sessions where basically everyone is silent and just asks for syntax help. It's dystopian.

williamstein8y ago

I strongly agree with you. When I was an undergraduate long ago, I remember getting lectured at great length by a math professor who was angry that many of the students in his abstract algebra course had worked together on the homework. I was one of those students (who had worked together), and I was very puzzled and annoyed by the professor, since I knew that I had learned far more on that assignment than any other! As a result -- I'm a college professor (in computational math), and over the last fifteen years on EVERY homework assignment I've given, I've explicitly encouraged students to work together as long as they clearly acknowledge who they worked with (and how). I wish everybody else did the same.

openIce8y ago

Here's my input as an actual computer engineering student with good grades and who was hired after 2 years: I have no idea why professors wouldn't want students to collaborate on homework, forcing students to solve problems on their own will only create basement coders who can't cooperate with other people.

Not a single company has ever asked me to invert a binary tree or implement any more complicated algorithm, nor asked about my grades; they have only cared about my personality and non-school projects.

Hell, we are encouraged to work together with other people at my university.

wink8y ago

I am equally confused. It's been a while since I studied CS, but when stuff was to be handed in on paper we even had multiple names on it (which was allowed, but only up to 4 or so). And when it was switched to digital hand-in, why wouldn't you submit the same thing twice, with both names on it?

thatswrong08y ago

Here's the easiest solution: stop grading homework

Why judge student performance on something that they are using _to learn_? It doesn't make any sense.

Every student is basically competing with one another to get the highest GPA possible. If you're going to give them cookie-cutter homework whose solutions can be easily searched for on the internet _and_ which can only bring their GPA down, then they're going to cheat. Plain and simple.

Give them homework and "grade it" to give them feedback, sure, but don't make it count. That is, if the goal is to have students learn.

diabeetusman8y ago

Optional homework almost definitely won't get done. Making it count ensures that students would actually do it, which is useful to instructors (what do they need to focus on in class? Is the class understanding the lectures?).

thatswrong08y ago

You can still grade the homework (to give both lecturers and students feedback), but just have it not count towards the kids' letter grade in the course. If no one is doing it, then you could make the grade for homework pass/fail: did you hand something gradeable in or not?

pdkl958y ago

> almost definitely won't get done.

While I believe "almost definitely" is a bit too strong, one of the most important lessons undergraduates need to learn asap is learning how to learn effectively. Homework is only a tool that can be very helpful when learning, but it's not the only tool available.

> which is useful to instructors

That's nice, but it's the instructor's job to serve the students.

> Is the class understanding the lectures?

In most classes, this should be easy to determine simply from interacting with the students in class, the questions they ask during office hours, etc.

The ideal instructor should offer regular homework problems/projects that are clear examples of what the students will be expected to know, and check the work of anybody that wants to take advantage of the instructor's knowledge. However, if a student feels their time would be better spent elsewhere[1], that's their business.

The exam will reveal if they made a good decision. Like any skill, many students will make mistakes when they first attempt this kind of project/time management. Fortunately, as undergraduates, the consequences of failure are usually having to retake the class. Later in life, failure may mean losing a job or other far more damaging consequences.

(For the record, when I was an undergraduate at UC Davis, many classes had optional homework. I spent a lot of time working on it for some classes, and others I skipped because it was trivial compared to what I was doing every day at my job.)

[1] e.g. other classes, a job, their own self-study, or maintaining friendships at a social event

Teever8y ago

> Optional homework almost definitely won't get done.

Then they fail?

I don't mean to sound glib, but I don't see the issue here.

Students that want to pass will pass, students that don't, won't.

I say this as a student that didn't pass.

loeg8y ago

Pass/fail grading for making an attempt.

bsder8y ago

> Here's the easiest solution: stop grading homework

Because it's not that simple. These are INTRO classes.

For many students, an intro CS class may be the first class they have encountered in their lives in which they finally have to work.

So, part of the job of a teacher is to teach the class material but also to teach good studying habits. Without grading homework, the first real feedback that a student is in trouble will come on a midterm when they fail--the feedback is WAY too late at that point.

That having been said--your professor isn't as stupid as you think he is. Plagiarism that fools the professor is as much work as just doing the assignment.

And professors have lots of ways of dealing with plagiarists far short of disciplinary proceedings. For example, partial credit on exams is quite subjective, and plagiarists tend to lose the "coin flip" if the professor is on the fence.

The problem I have is simply that there are quite a few professors who simply don't care. They make it far too easy to cheat--reusing a previous year's project or exam, for example, is a no-no.

zrobotics8y ago

I think that there is a solution that is being overlooked, which worked pretty well in my intro class. The professor explained that he would be using plagiarism software and wouldn't be accepting excuses. He also explained that he expected to see comments explaining nearly everything, and provided a style guide. While this is training a bad coding practice (who wants to read code that has that many unnecessary comments?), I don't see any way around it for an intro course, since it's really hard to avoid reusing assignments. There aren't that many ways to code 'hello world', but if you make it clear that code must be thoroughly commented then it will drastically reduce false positives. It doesn't stop people copying wholesale from stackexchange, but with that level of commenting required I don't think students are really learning less. And this isn't required for higher-level courses, since by that point the assignments are complex enough that accidental plagiarism is extremely unlikely.

jkmcf8y ago

Not sure if you missed it, or I'm misunderstanding you, but he wrote:

> Give them homework and "grade it" to give them feedback

which addresses your main point.

matthewbauer8y ago

But then where does your grade come from? More testing doesn’t seem like a good solution.

webkike8y ago

Grade by completion. Maybe 50% of questions right rounds up to 100%?

sjg0078y ago

lab work.

aYsY4dDQ2NrcNzA8y ago

How about just stop grading on a curve? At least then, cheaters can't hurt the grades of non-cheaters.

mjw10078y ago

Maybe I'm being small-minded, but I strongly dislike using the word 'plagiarism' to refer to cheating on your homework by copying someone else.

To me, plagiarism is taking credit for someone else's ideas at their expense; it's a "sin" against the person being copied.

Copying someone else with their connivance, or paying some essay-mill writer to do your work for you, should be in the same category as taking a calculator into a mental-arithmetic test, not the same category as « My name in Dnepropetrovsk is cursed, when he finds out I publish first ».

ambulancechaser8y ago

> plagiarism is taking credit for someone else's ideas at their expense;

that seems like an unworkable definition. If I submit John Milton's Paradise Lost as my own work on an application to grad school, it certainly wouldn't be an "expense" to the long dead author.

Plagiarism is a fraud where you misrepresent the work of others as your own. In the academic world, what else is there but your own work?

mjw10078y ago

If I discovered a lost work of Milton and pretended it was my own, then that would be acting at the expense of Milton's reputation, in the sense I mean. So I don't think it matters if the "someone else" happens to be dead.

(If I tried it with Paradise Lost there'd be no expense to their reputation, but that just means it's not a realistic example of someone attempting plagiarism.)

thaumasiotes8y ago

> In the academic world, what else is there but your own work?

Work published under your name but performed by anonymous flunkies. It's... the norm in the academic world, actually.

ahelwer8y ago

Heh. I wanted to store my old university coursework somewhere, and GitHub seemed as good a place as any. Because I'm cheap and don't want to pay for a private repo, there it sits in all its public glory. Few years back I got an irate email from a professor that students were copying a program I wrote for a SPARC assembly course which has remained unchanged for, like, two decades. So, to some degree the whole plagiarism thing is due to professorial laziness.

Balgair8y ago

Oh, now that is great! Instead of just re-writing the problem for the class, the PI takes the time to write you an angry email and just make themself look foolish. Gotta love lazy professors.

dpark8y ago

People are turning in your SPARC assembly code for their assignments? Are they actually still teaching SPARC assembly?

ahelwer8y ago

I don't know if it's still the case this year, but for our computer architecture course we learned SPARC as well as x86. Was pretty neat to explore the different ways of doing things.

paxys8y ago

I have a big problem with the general theme of this article, which is that plagiarism detection software is infallible and every student who disagrees with its findings is wrong and dishonest.

You claim

> We have virtually eliminated false positives at this point

but offer no explanation for how you verify this.

You later rant about the fact that students have the audacity to challenge these (very serious) charges and the university actually expects you to follow up when they do. The horror!

IMO it's your system of senseless programming exercises and automated grading that is broken. Instructors need to put in the time and effort and assign homework where students have to actually think and be creative, rather than reuse the same assignments for the 10th year running and be shocked when submissions turn out to be similar.

dpark8y ago

> I have a big problem with the general theme of this article, which is that plagiarism detection software is infallible and every student who disagrees with its findings is wrong and dishonest.

These claims were not made.

> You claim

> > We have virtually eliminated false positives at this point

> but offer no explanation for how you verify this.

Yes, he did. He described the process that arrived at the conclusion including "keeping only the cases that contain indisputable evidence — for example, hundreds of lines copied right down to the last whitespace error". It's clear from the context that this was verified by following up with the alleged plagiarizers. Indeed, he notes one false positive.

I'm not sure what your concern is. The article presents the process they've used for catching plagiarism and rather than point out any actual flaws, you attack the strawman of blindly using anti-plagiarism software and treating it as literally infallible.

paxys8y ago

It doesn't matter if they think their process is perfect. It is still just an accusation at that point, and students have the right to appeal it.

IMO it isn't acceptable for an instructor to say that they don't have time to provide an explanation when asked by the university.

Scaevolus8y ago

> "Then, we apply another filter, keeping only the cases that contain indisputable evidence — for example, hundreds of lines copied right down to the last whitespace error."

Sounds like they don't want to deal with plagiarism, if you can avoid it by simply making your copying "disputable".

MOSS is clever-- instead of doing direct textual comparison, you compare streams of tokens. This means that even if a student reformats the whitespace or renames all the variables (a common obfuscation technique), the same stream of "TOKEN ASSIGN_OP TOKEN LPAREN TOKEN COMMA TOKEN RPAREN" will exist. TMOSS extends this to snapshots of code as a student develops it, which is apparently 2x more effective!
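As a rough illustration of the token-stream idea (not MOSS's actual algorithm, which this thread doesn't spell out), here is a minimal Python sketch. It assumes the submissions are Python source and uses the standard `tokenize` module. The names `token_stream`, `kgrams`, and `similarity` are made up for this example, and the 5-token window and Jaccard overlap are arbitrary choices.

```python
import io
import keyword
import tokenize

# Token types that only reflect layout or comments, not program structure.
_IGNORED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def token_stream(source: str) -> list[str]:
    """Reduce Python source to token kinds, dropping names, literals, and layout."""
    stream = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _IGNORED:
            continue
        if tok.type == tokenize.NAME:
            # Keywords carry structure; every other identifier becomes "ID".
            stream.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            stream.append("NUM")
        elif tok.type == tokenize.STRING:
            stream.append("STR")
        else:
            stream.append(tok.string)  # operators and punctuation
    return stream


def kgrams(stream: list[str], k: int = 5) -> set[tuple[str, ...]]:
    """All runs of k consecutive tokens in the stream."""
    return {tuple(stream[i:i + k]) for i in range(len(stream) - k + 1)}


def similarity(a: str, b: str, k: int = 5) -> float:
    """Jaccard overlap of the two submissions' k-gram sets (0.0 to 1.0)."""
    grams_a, grams_b = kgrams(token_stream(a), k), kgrams(token_stream(b), k)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

Calling `similarity(a, b)` on two submissions that differ only in variable names, spacing, and comments gives the maximum score, because their token streams are identical. A real tool would also need to discount instructor-provided skeleton code and common boilerplate.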

This author also delicately avoids the cultural side to plagiarism-- many students come from backgrounds where "group work" is common, and passing classes is a communal effort, including homework. It's an unfortunately common mistake to think the grade is what matters, not the fundamental skill development.

uberman8y ago

Almost every aspect of our discipline encourages open source code sharing and code reuse. This is a discipline wide mind set.

In fact, "build it yourself from scratch" is an anti-pattern in my opinion.

I'm not condoning cheating, but why would one not expect this to be the default behavior?

As others have suggested, there are much easier "solutions" related to logging keystrokes and commits should you really want to catch and punish this behavior.

toss18y ago

> "In fact, "build it yourself from scratch" is an anti-pattern in my opinion."

In the context of getting things done, that is sometimes true.

However, as an employer, I want to know that you:

1) have enough knowledge to build it from scratch if you have to, starting with analyzing the problem and ending with a coded, tested, debugged, and working solution, preferably at least somewhat optimized.

2) have enough knowledge to be able to read the code that you might want to use in [not building it from scratch] and assess its value, considering A) whether it will meet the actual need, B) whether it will do so at a lower cost than writing it in-house from scratch, C) whether it will meet or exceed performance parameters, D) whether it will introduce more problems than an in-house solution.

3) make a well-informed and reasoned decision between #1 and #2, and not merely be a copy-paste monkey.

Doing copy-paste as a regular practice in school eliminates all three of these capabilities.

In short, school is different from work, and you need to adhere to different practices.

edit: format

2muchcoffeeman8y ago

Depends what you’re writing. I’d expect a student to write code for data structures and sort algorithms themselves.

Not sure how well plagiarism detection would work though...

zrobotics8y ago

Isn't that the point of these courses anyway? Unless you have very, very good reasons, why would you ever write your own bubble sort professionally outside of niche cases? But having to implement different sorts by hand makes it much clearer how they differ, as well as teaching basic algorithm knowledge.

kangnkodos8y ago

After implementing plagiarism detection campus-wide, design a process which is very time consuming on the students' part, and not as time consuming for staff.

Maybe something like: "Your homework was flagged as possible plagiarism. Report to this lab at this time, and code one additional problem which should be easy for anyone who understood the original homework."

Anyone who really did the homework will be in and out within five minutes. If you can't finish in an hour, you get a zero on that one homework assignment, not expelled.

That flips the incentives. Also, reducing the punishment cuts the drama of people arguing that the software is not 100% accurate.

wdewind8y ago

Here's an idea: why not make the assignments personal enough that you cannot cheat on them?

"But wait, that would require huge amounts of time investment from the professors/TAs"

Yes, it's almost as if paying $65k a year for someone to teach you something should result in that person teaching you that thing instead of just checking in to see if you've learned it on your own.

rhombocombus8y ago

Here's an idea: you try to do that teaching three or four sections and report back how that goes. I am not sure if you have ever taught post secondary classes before, but what you're describing is untenable for someone who wants to not work 12 hour days. It's a nice idea, and I know that I would have loved to do that when I was teaching, but the fact of the matter is it isn't feasible unless you have a squadron of TA's to help you carry it out, and that's not gonna happen.

wdewind8y ago

Totally! I definitely didn't mean to imply this is the Prof/TA's fault. It is the fault of many institutions and systems above them. I meant that TA's shouldn't be responsible for grading 4 sections of 60 students each. The only natural outcome of that is poor education, one facet of which is easily cheatable assignments.

It just seems ridiculous that the thing we are focusing on is that kids are cheating and not that we've made this massive, unscalable, expensive, ineffective education system. We are acting like it's the kids who aren't holding up their end of the bargain when they try to cheat it.

ThePadawan8y ago

> (...) unless you have a squadron of TA's to help you carry it out, and that's not gonna happen.

Interestingly enough, here's how this worked when I was an undergrad TA:

* The college's central office paid TAs for obligatory undergrad courses (so basically CS101 etc. up until sophomore year).
* Advanced courses were funded by the individual chairs and institutes.

And would you look at that, suddenly there were a whole lot fewer code submissions auto-checked for plagiarism, and a whole lot more "talking to the TA to explain your reasoning and problems you encountered". This also served the double purpose that these TAs got to know the students, their likes and dislikes, and led to some easy recruiting for PhD candidates and such.

ben5098y ago

It's a fair complaint from the student / customer's perspective, though, because university tuitions are constantly increasing for no apparent increase in value.

That universities pay the TA's beans isn't the customer's problem.

CaptSpify8y ago

Then schools should hire squadrons of TAs. We pay them enough that they can afford to, but they simply choose not to.

zrobotics8y ago

Fine for higher-level courses, but how exactly are intro courses supposed to make 'hello world' and the like personal enough that you can't cheat?

wdewind8y ago

The same way it's done in kindergartens around the world? They say "write me a story about your family" or "draw your family tree." So ok: "write me a program that is somehow relevant to your life."

Imagine you are teaching someone basic web development. The assignment is "make a webpage that displays some facts about your favorite tv show." Easy.

I can buy that it makes the grading more complicated, but coming up with assignments is not difficult.

stale20028y ago

One thing that I think people don't talk about enough on this topic is the wildly different plagiarism guidelines between different classes.

I did both CS and economics in college. And in my CS classes, even discussing the homework with classmates was often "against the rules".

But in my business and economics classes, my classmates and I would regularly work together on the homework by straight up assigning certain problems to certain people and then copying from each other.

This was not only allowed by the professor, it was explicitly ENCOURAGED!

They understood that if you talked to classmates, you will be able to understand things better, instead of struggling and failing to do stuff on your own.

And with such wildly differing guidelines for different classes, things were often confusing to students.

One potential solution to "cheating" is to explicitly allow it, such that everyone is on the same playing field.

What matters, at the end of the day, is that the students learn the material.

waqf8y ago

The article doesn't seem to even consider the possibility of assessing students in some other way than through standardized homeworks which are easily copied.

For example, individual projects where everyone in the class is working on something different; or at the other extreme, proctored exams.

(Of course, neither of these systems is entirely free from cheating, but the barrier is higher.)

detaro8y ago

For intro-level CS courses, neither one works all that well. You can do that in more advanced (but still undergraduate) courses, but for beginner-level you IMHO need the feedback loop of regular homework. It doesn't have to be strictly graded, but it basically has to be there, with at least some enforcement that it actually gets done (e.g. handing in homework is required, but it's not a problem if it doesn't actually work).

Coming up with hundreds of different small projects to e.g. get people to understand pointers isn't very realistic, and if you only test them in exams you're missing critical feedback both for the teachers and the students before it is too late.

thatswrong08y ago

Assess the homework, give them a grade on it but don't have it factor into their final grade at all. Just let the exams / big projects take care of that.

If you're regularly failing your homework, it probably means you won't be ready for the exam that actually counts. Which should be enough feedback

leetcrew8y ago

as a grader, it is already hard enough to keep up with the work that gets generated by thirty students attempting the same weekly homework assignment. i don't know how i could possibly grade thirty different projects in a week, let alone in any fair or reasonable way.

i've always wondered why they don't do something like select a random number of students and ask them to explain their work in an interview with the professor or TA.

i figure, even if they cheated, if they can give a satisfactory explanation, they probably learned the damn thing anyway. plus, you don't have to select even half the students before word gets around that they better understand the work they are turning in.
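The spot-check idea above could look something like this minimal Python sketch. The function name `pick_for_interview`, the 25% default, and the optional seed are assumptions made for illustration, not anything described in the thread.

```python
import math
import random


def pick_for_interview(student_ids, fraction=0.25, seed=None):
    """Pick a random subset of students to explain their submission in person.

    `fraction` is the share of the class to interview (hypothetical default).
    Passing a `seed` makes the draw reproducible, e.g. for an audit trail.
    """
    ids = list(student_ids)
    if not ids:
        return []
    rng = random.Random(seed)
    # Always interview at least one student, never more than the class size.
    count = min(len(ids), max(1, math.ceil(len(ids) * fraction)))
    return sorted(rng.sample(ids, count))
```

Drawing a fresh sample for every assignment keeps the selection unpredictable, which is what makes a small sample a deterrent for the whole class.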

emidln8y ago

Add a grading section to each assignment with something like this:

---------

Program is expected to maintain this interface <link to interface here> over stdin/stdout (or sockets, etc).

Students will pass when their program passes the grading/test suite. Test suite can be checked ahead of time by uploading binary/zip/tarball to this location using their student id: <location here>. Detailed instructions <here>.
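A grading suite like the one described above might be sketched as follows in Python, assuming the stdin/stdout variant of the interface. The names `run_case` and `grade`, the 5-second timeout, and the whitespace-insensitive comparison are all assumptions for this example, not details of the commenter's university system.

```python
import subprocess
import sys


def run_case(cmd, stdin_text, expected_stdout, timeout=5.0):
    """Run one test case: feed stdin to the program and compare its stdout."""
    try:
        result = subprocess.run(
            cmd, input=stdin_text, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False  # treat a hang as a failed case
    # Ignore leading/trailing whitespace so a stray newline doesn't fail a student.
    return result.returncode == 0 and result.stdout.strip() == expected_stdout.strip()


def grade(cmd, cases):
    """Pass/fail: the submission passes only if every (stdin, stdout) case passes."""
    return all(run_case(cmd, stdin_text, expected) for stdin_text, expected in cases)


if __name__ == "__main__":
    # Hypothetical submission: a one-line program that sums integers from stdin.
    submission = [sys.executable, "-c", "print(sum(int(x) for x in input().split()))"]
    print(grade(submission, [("1 2 3\n", "6\n"), ("10 -4\n", "6\n")]))
```

Because the grader only sees the program's input and output, it works for any language the students use, but it says nothing about style or about who wrote the code.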

IIRC, this is what my university did 15 years ago. Not everyone had unique homework, but often the homework assignments were not all the same. I guess how unique they are depends on if you have some parameters in your homework generation or if you have a bunch of misc grad students to generate a few sets of homework that you can build up over time.

How might you grade style? IMO, don't. If you must, then save it for interactive review/lab sessions/office hours or style-specific spot checks (e.g. "At least once per semester your assignment will be additionally graded on program style according to <insert guidelines here>. This will be added as additional points to your semester total.")

Larrikin8y ago

Standardized homeworks as a simple check on learning with an individual or group project at the end has been my favorite way of learning.

waqf8y ago

Sure! But the method of learning doesn't have to be the method of assessment. In fact, as pointed out upthread, it shouldn't be.

When students are doing homework you want them to form collaborative study groups or freely consult any other source, if it helps them learn. Fear of plagiarism is antithetical to that.

jessaustin8y ago

My recollection of CS-50/51 from decades ago is that the final project with the most value was an individual one. It seems infeasible to make all projects individualized, if you have a student/TA ratio greater than 3.

nkrisc8y ago

What I find fascinating about this problem is students paying all that money in order to deliberately avoid learning anything.

waqf8y ago

As you know, such students are paying for the piece of paper which certifies that they learned something. Who's to say it doesn't work?

matte_black8y ago

They are paying for the piece of paper not for the content.

nkrisc8y ago

I too paid for a piece of paper. But given how expensive that piece of paper was I damn well made sure I learned something too.

dpark8y ago

In an intro CS class, many of the students don't want to be there. They're taking the class because it's a requirement for the degree they actually want, which may have little or nothing to do with CS. Intro classes aren't just for people who want to specialize in that area.

stale20028y ago

Then perhaps the world needs to stop caring so much about the assessment part of a university.

That would be wonderful if more people were able to focus on the learning, and less on the grades for unrelated subjects that don't matter for your job.

nkrisc8y ago

I do agree with you there. If you didn't actually learn anything in school, that should become apparent when you're unable to perform your job function. If you can perform your job function regardless, then I suppose it doesn't matter all that much either way.

Scaevolus8y ago

They're very frustrating to interview.

ben5098y ago

They should be flagged in the phone screen.

gumby8y ago

This essay is mostly about culture, but it does mention an anti-plagiarism program (which sounds pretty hard to do except in very trivial cases, but who knows?)

There's another tool: the repo. My son was accused of plagiarism in his last year of high school. It could have been a "he said/he said" case -- in fact it started that way -- until I pointed out that if he believed he was in the right he had a record that could be checked.

The CS teacher had to explain to the principal why the repo proved who had copied whom (and left me wondering why the teacher hadn't looked there first????) which wasn't easy because the plagiarist's parents were big donors to the school. So in the end, despite what it says in the school handbook, the only penalty was a 0 on the assignment.

But a good lesson for my kid on both programming and the sociopathologies of organizations.

fr0sty8y ago

Requiring students to submit their VCS history along with the finished project would at least up the cost to the students for copy and pasting.

They even hint at that sort of solution in the piece by mentioning cosmetic changes to the files at the last minute.

emidln8y ago

I wrote assignments for cash in college. When it came time for me to take a class myself, my prof noticed that my homework was very similar to that of past students, and I tried using my VCS history as a defense. After questioning some former students, it became obvious that the reason my VCS history was reasonable and my style still so similar was that I was the person who had written other students' assignments in the past.

ngomez8y ago

I'm also a TA for the class mentioned in this article. We teach Git and have a submission system where students submit patches based on skeleton code; students are required to make at least five commits. We still have a significant number of students who copy code, and while it does help with picking up on that kind of behavior those students also don't seem to care about the increased cost and will pad their commits anyway.

rrauenza8y ago

Was going to post something similar -- to protect myself as a student, I'd quickly adopt git to keep a history of my work. Not that this couldn't be forged...

Does git count as a block chain for proof of work? :)

bsder8y ago

Pretty much. When I taught CS, I told people to commit early and commit often.

This has so many advantages even beyond defending against plagiarism charges that it really wasn't hard to drive home.

The big advantages being defense against the inevitable computer crash and the inevitable directory deletion.

pkamb8y ago

Wish I was taught or even knew what VCS was in University.

eecsninja8y ago

Engineering classes should really switch from being homework-based to being project-based. Even something as simple as small coding projects that can be done in a week.

Then, the final project would be such that you'd have to explain your code, either in person with a TA, or by writing documentation for it.

We really need to move on from this academic mindset of homework, grades, and plagiarism toward something that is actually reflective of the world outside of academia. The concept of plagiarism doesn't really exist in the software industry -- it's a matter of what you can get done.

jessaustin8y ago

TFA seems a bit at odds with itself. One reason academic boards don't care too much about CS plagiarism complaints is that CS generates so very many of them, compared to other fields. The reason isn't that CS students are degenerates (although they may be anyway), instead it's because it is so much easier to check for plagiarism in CS. So, sure closed-source is bad and definitely we can always use more TAs, but the problem is clearly not "we're only punishing 10% of our students while we should be punishing 40%!"

The problem isn't with CS at all, but rather with USA colleges in general. Indeed the only professor I've read who seems to even notice the problem is Harry Lewis. Most subjects should be taught very differently than they are taught. USA university education makes a great deal of unnecessary and counterproductive work for students and professors. The busywork threatens to drive out real academic work.

The reason for this is so that more such work might be created for administrators, who must multiply inexorably to absorb the ridiculous amounts of money that our ridiculous system of student debt generates. In fact it will be no surprise if some schools eventually do hire enough administrators to suspend 40% of every CS course every semester. One hopes that the professors who could restore some of the quality that universities used to possess, will realize by then that they can restore that and should restore that.

bunderbunder8y ago

Instead of coming up with punitive solutions, I wonder what can be done to re-structure computer science education in ways that move the incentive structure away from one that encourages plagiarism? Bonus points if it improves the quality of the education, too.

For example: What if we move toward a more seminar-style approach of having students discuss and critique each other's code on larger projects?

This might not get rid of all copy/pasting, but it would create a huge incentive for students to at least understand how their code works, in order to avoid embarrassment in front of their classmates. And, should two kids copy/paste the same code, and that becomes apparent in the course of a peer review session, well, that's an event that everyone will remember. No need for the instructor to make themself the bad guy in the process, either.

It also has the side benefit of giving students experience with code review, with reading and understanding others' code, and maybe helps them start to develop a sense for how to write clean, readable code several years before they start getting bludgeoned by senior devs at their first full-time job.

As for smaller problem set type homework, why not give them group work? It doesn't necessarily need to be graded, aside from credit/no credit, if you're worried about giving A's to duffers. I had a few classes that did that back when I was in school, and I really liked it. I felt like I learned faster, both from working together with classmates and because the format allowed them to give us more challenging problem sets.

ofcx8y ago

I was accused once in my undergrad and I thought it highlighted an interesting issue.

It was my senior year, and I attended a systems programming course that was being piloted and was very challenging. Work in the CS department was very group heavy, especially in courses heavy on theory. I benefited a ton from working in groups with other students outside of class. In the data structures/theory/math courses, this wasn't an issue. But in this class in particular, people's submissions started to look similar.

It was resolved rather quickly because we just had to be honest, but I thought it was interesting, specifically because in classes that challenging, heavy collaboration was what pulled me through. I barely remember the course content anymore. But the soft skills I acquired from hours of collaborating with my peers after class have followed me for life and made a noticeable impact on my career.

jancsika8y ago

> Finally, as educators, we also hope that the accused student can learn difficult lessons about ethical behavior in the classroom rather than the workplace.

Suppose that technique X can actually deter students from cheating 100% of the time.

So we apply technique X to intro class Z that has 300 students.

Now we have an intro class of 300 non-cheating students who sit quietly and listen to an instructor for an hour a week.

Then those non-cheating students sign in for a more reasonably sized class section of 40 to sit quietly and listen to a graduate student for an hour.

Finally, these non-cheating students take tests and do assignments written in such a way that the amount of grading time does not put the graduate students over the weekly allotted work time for their TA-ship in their particular program.

Ballpark-- by what percentage would one say the quality of the learning environment has improved by employing technique X?

matthewbauer8y ago

This seems like something accreditation orgs like ABET should be more worried about. If students are cheating their way to degrees, that hurts everyone with a CS degree. Professors can't really do much if their uni doesn't care.

crawfordcomeaux8y ago

College isn't about education, but about signaling your value as a contributor to capitalism.

Otherwise, we'd be promoting collaborative learning and letting those who don't contribute or cheat simply cheat themselves.

dragonwriter8y ago

College has two distinct but related functions: education and certification. While related, these things are sometimes in tension.

Most things related to grading support the certification function, not the education function.

crawfordcomeaux8y ago

Certification isn't a necessity in the face of accepting uncertainty. Once we accept the lack of certainty around hiring people, we can start coming up with solutions to overcome certification.

The companies experimenting on such things are more likely to be able to adapt to a climate where certification is becoming meaningless.

Hopefully, more companies will realize certification designed for the industrial age has run its course and academia will no longer be incentivized to continue gatekeeping via certification and will get back to focusing on education.

jefflinwood8y ago

I co-teach an upper-level undergraduate class where the students create independent programming projects - the computer science students couldn't copy anyone else in the class's code even if they wanted to, as it wouldn't make any sense for their context.

Perhaps the solution is to be more creative with how computer science education is taught? If the students are copying homework problems they don't understand, they're not going to do well on the projects or exams that might be part of the rest of their grade.

currymj8y ago

part of the reason this may be salient now is that there are SO MANY undergrads taking CS courses.

classes that might have had 30 students in the past now have 300. you can’t hope to grade projects for all those students, even if you put them in groups.

sampo8y ago

Maybe the party that detects and decides on consequences for plagiarism should be a separate entity in the university. Like the internal affairs division in police departments that ordinary police officers hate so much in movies and TV series. They would be an "external enemy", so the teaching staff would not have to suffer from friction with their students in these unpleasant matters, and also the consequences would be out of the hands of the teachers.

piracy18y ago

I wonder how much of the 'plagiarism' is just people copying the same StackOverflow snippet.

zombieprocesses8y ago

Because the source for most generic programming assignments is already online?

Why not skew the grading more heavily towards in-class midterms and finals?

Or you could generate individualized homeworks for each student, but that may not be feasible in a 500-student intro to CS class.
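Generating the variants themselves can be cheap. Below is a minimal sketch, assuming each assignment can be parameterized by a few values; the function `assignment_params`, the parameter names, and the roster are all hypothetical. Each student's variant is derived from a hash of the assignment name and student ID, so it is stable across runs and machines.

```python
import hashlib
import random


def assignment_params(student_id: str, assignment: str) -> dict:
    """Derive a stable, per-student variant of one assignment."""
    # Hash the (assignment, student) pair so the seed is the same on every run
    # and on every machine, unlike Python's salted built-in hash().
    digest = hashlib.sha256(f"{assignment}:{student_id}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return {
        "array_length": rng.randint(20, 40),
        "sort_order": rng.choice(["ascending", "descending"]),
        "test_seed": rng.randrange(1_000_000),
    }


# Hypothetical roster of 500 students.
roster = [f"student{n:03d}" for n in range(500)]
variants = {sid: assignment_params(sid, "hw3-sorting") for sid in roster}
print(variants["student007"] == assignment_params("student007", "hw3-sorting"))  # True
```

This only covers handing out different inputs. Grading 500 variants is a separate problem: a grader would have to regenerate each student's parameters and check the submission against them, which this sketch does not attempt.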

ggm8y ago

the other side of the coin is the group assignment where three of the five do all the work and all five live or die on the benefit.

oddly, post degree, we're actively encouraged to re-use code.

xenihn8y ago

It's a problem in MS programs too.

dd3678y ago

I was a TA for a graduate level class at one of the top universities in the US and I've had some interesting encounters with plagiarism.

I. The time I got caught for "plagiarizing". In an intro systems class, I, a CS major, and my roommate, who wanted to minor in CS, were working together and I was "showing him the ropes". He was an intelligent student and we never worked together on the homeworks aside from general verbal discussions on what the solution could be. He used a Windows laptop, and for one of the assignments his C code wasn't compiling because he was missing some libraries. He told me he couldn't figure it out, we were approaching a deadline, and he asked me to compile it for him and send him back the binary. I did so, but in a rush I mistook my HW folder for his (we'd both downloaded it as part of the assignment, and the folder structure was identical) and sent him my binary by mistake. Both of our solutions worked. Obviously, we got "caught" in the most naive way: our binaries had the same MD5 hash and the CMS flagged us. We were both confused at first, and then we realized what had happened and explained it to the professor. The proof was simple: just compile my roommate's code and run it. However, he set our grade for the assignment to 0. We still both got As (because you could drop one homework), and while some may claim this was a gentle slap on the wrist, it felt unjust. We clearly made a dumb mistake and shouldn't have been punished at all, especially when we knew how rampant actual plagiarism was.

II. The time I caught students for "plagiarizing". As Kevin points out in his post, there aren't really any incentives to catch students for cheating. As a TA, I get no benefit, and moreover, there's a cost. No one wants to be known as THAT TA who busts kids for using "a little help". Keeping that in mind, I was usually very lenient when it came to cheating. I noticed signs, but there was never enough proof to warrant the effort of calling someone out. However, at one point it went too far. Two students who were partners for the "projects" had submitted nearly identical solutions for a complex Graphics homework assignment. They got the answer right, but I looked into their working and they both said "(9/5) / (4/3) == (4/7) / (5*9) = 1/3". I don't remember the exact values, but it was two steps of nonsense numbers and then a correct answer. I ended up reporting the case, mostly because I felt like my intelligence as a TA had been insulted. Are you seriously going to submit random numbers with a correct solution hoping I won't see? In any case, it didn't go anywhere.

III. Discovering a cheating ring. At our university, one of my good friends and project partners told me there was an "enormous Asian cheating racket" - not to call out any specific race, I'm Asian too. I wasn't surprised - to be blunt, it made sense. We're very grade oriented with tiger parents. Then I learnt the extent of it. There were apparently Chinese forums and "outsourcers" you could send your homework problems to, and they would solve them and send them back. In addition, there were special shared systems like DC++ where you could discover answers to homeworks for different classes at my university as well as Prelims, Midterms and Finals contributed by students of previous years. I was in shock. Students would leave exam halls to go to the bathroom just to look at these answers mid-exam. But was I gonna tattle? No.

IV. The reality at universities. Not just in CS, but in every other subject, almost everybody cheats. Excuses that go around are: "I've worked on it with someone else", "Oh, the TA in office hours told everybody the exact same solution", "What? Cheating? Me?", "Maybe he/she took it from me, I didn't do it".

And look, people aren't stupid. We all know how cheating works. You get a homework assignment, and you re-write the sentences in your own language. You get some code from someone else and you define some useless functions with 1-2 lines of code. Or you arbitrarily re-organize lines of code. You rename all the variables. You re-organize your functions. You create some unnecessary classes.

There were students who would distribute 10 homework assignments between 10 people (in groups of 2), and have one do the assignment (use office hours, friends, google, whatever) and the other literally re-write the assignment in LaTeX 9 different ways for the others to use. No one would ever really have to do the work.

The well known key to cheating is plausible deniability - if there's enough evidence you didn't do it, you didn't do it.
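The MD5 flag from part I is easy to picture in code. This is a minimal sketch of that kind of check, assuming submissions are available as files on disk; `flag_identical` and the student names are hypothetical, and it is not a claim about how that university's CMS actually worked.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def flag_identical(submissions: dict[str, Path]) -> list[list[str]]:
    """Group students whose submitted files are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for student, path in submissions.items():
        by_digest[md5_of(path)].append(student)
    return [sorted(group) for group in by_digest.values() if len(group) > 1]


# Demo with three made-up "binaries", two of them identical.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    files = {
        "roommate_a": (root / "a.bin", b"\x7fELF same build"),
        "roommate_b": (root / "b.bin", b"\x7fELF same build"),
        "classmate_c": (root / "c.bin", b"\x7fELF other build"),
    }
    for path, data in files.values():
        path.write_bytes(data)
    flagged = flag_identical({name: path for name, (path, _) in files.items()})

print(flagged)  # [['roommate_a', 'roommate_b']]
```

A matching digest only shows that two files contain the same bytes, not who wrote them, which is why recompiling each student's own source is the natural follow-up check. It also catches nothing once a copy is changed at all, which is the plausible deniability point above.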

dd3678y ago

And it's an even bigger problem with MEng/MS students. These are usually unfunded cash cow programs even at top universities. They accept fairly mediocre students from China and India and the class is usually 80% Chinese/Indian. A generalization, of course, but they have 0 intellectual curiosity. They are here to pay $50k-60k for 1 or 2 years, make sure they have as close to a 4.0 as possible, and then go get a tech job where they will make $150k/yr, and little to none of their skills from class would be needed.

And I can speak for Indians, but CS education in India aside from the IIT, the IIIT, BITS and some NITs is dismal. Cheating is rampant there, and they're much more well versed with the art because it's much harder to cheat and get away with it in India - you can't bring phones to your exam or freely go to the bathroom mid exam, for example.
