Entries Tagged "plagiarism"

Identifying Programmers by Their Coding Style

Fascinating research on de-anonymizing code — from either source code or compiled code:

Rachel Greenstadt, an associate professor of computer science at Drexel University, and Aylin Caliskan, Greenstadt’s former PhD student and now an assistant professor at George Washington University, have found that code, like other forms of stylistic expression, are not anonymous. At the DefCon hacking conference Friday, the pair will present a number of studies they’ve conducted using machine learning techniques to de-anonymize the authors of code samples. Their work could be useful in a plagiarism dispute, for instance, but it also has privacy implications, especially for the thousands of developers who contribute open source code to the world.

Posted on August 13, 2018 at 4:02 PM

The Effectiveness of Plagiarism Detection Software

As you’d expect, it’s not very good:

But this measure [Turnitin] captures only the most flagrant form of plagiarism, where passages are copied from one document and pasted unchanged into another. Just as shoplifters slip the goods they steal under coats or into pocketbooks, most plagiarists tinker with the passages they copy before claiming them as their own. In other words, they cloak their thefts by scrambling the passages and right-clicking on words to find synonyms. This isn’t writing; it is copying, cloaking and pasting; and it’s plagiarism.

Kerry Segrave is a right-clicker, changing “cellar of store” to “basement of shop.” Similarly, he changes goods to items, articles to goods, accomplice to confederate, neighborhood to area, and women to females. He is also a scrambler, changing “accidentally fallen” to “fallen accidentally;” “only with” to “with only;” and, “Leon and Klein,” to “Klein and Leon.” And, he scrambles phrases within sentences; in other words, the phases of his sentences are sometimes scrambled.

[…]

Turnitin offers another product called WriteCheck that allows students to “check [their] work against the same database as Turnitin.” I signed up and submitted the early pages of Shoplifting. WriteCheck matched many of Shoplifting’s phrases to those of the New York Times articles in its library of student papers. Remember, I submitted them as a student paper to help Turnitin find them; now WriteCheck has them too! WriteCheck warned me that “a significant amount of this paper is unoriginal” and advised me to revise it. After a few hours of right-clicking and scrambling, I resubmitted it and WriteCheck said it was okay, being cleansed of easily recognizable plagiarism.

Turnitin is playing both sides of the fence, helping instructors identify plagiarists while helping plagiarists avoid detection. It is akin to selling security systems to stores while allowing shoplifters to test whether putting tagged goods into bags lined with aluminum thwart the detectors.

Posted on September 19, 2011 at 6:35 AM

Term Paper Writing for Hire

This recent essay (commentary here) reminded me of this older essay, both by people who write student term papers for hire.

There are several services that do automatic plagiarism detection — basically, comparing phrases from the paper with general writings on the Internet and even caches of previously written papers — but detecting this kind of custom plagiarism work is much harder.
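To make the phrase-comparison idea concrete, here is a minimal sketch. It is not how any particular commercial service works; the function names, the three-word window, and the sample sentences (built from the word swaps quoted in the entry above) are all my own assumptions for illustration.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(submission, source, n=3):
    """Fraction of the submission's shingles that also appear in the source."""
    sub = shingles(submission, n)
    if not sub:
        return 0.0
    return len(sub & shingles(source, n)) / len(sub)


# Hypothetical sentences, invented for this example.
source = "he had accidentally fallen into the cellar of the store"
copied = "he had accidentally fallen into the cellar of the store"
cloaked = "he had fallen accidentally into the basement of the shop"

print(overlap(copied, source))   # verbatim copy: every shingle matches
print(overlap(cloaked, source))  # reordered words and synonyms: no shingle matches
```

A verbatim copy scores 1.0, while the lightly reworded version scores 0.0 at this window size. A paper written from scratch by someone else shares no phrases with anything in the database, so this kind of matching has nothing to find.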

I can think of three ways to deal with this:

  1. Require all writing to be done in person, and proctored. Obviously this won’t work for larger pieces of writing like theses.
  2. Semantic analysis in an attempt to fingerprint writing styles. It’s by no means perfect, but it is possible to detect if a piece of writing looks nothing like a student’s normal writing style.
  3. In-person quizzes on the writing. If a professor sits down with the student and asks detailed questions about the writing, he can pretty quickly determine if the student understands what he claims to have written.
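The second idea, fingerprinting writing style, can be sketched roughly as follows. To be clear about the assumptions: this uses function-word frequencies and cosine similarity, a common stylometric stand-in rather than true semantic analysis, and the word list, the function names, and the 0.8 threshold are arbitrary choices of mine, not a validated method.

```python
import math
import re
from collections import Counter

# Assumed word list: common function words whose usage rates tend to vary by author.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in",
                  "that", "is", "it", "but", "which", "however"]


def style_vector(text):
    """Relative frequency of each function word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def looks_unlike_author(known_samples, submission, threshold=0.8):
    """True if the submission's style vector is less similar than threshold
    to the average style vector of the author's known writing."""
    vectors = [style_vector(s) for s in known_samples]
    baseline = [sum(col) / len(vectors) for col in zip(*vectors)]
    return cosine_similarity(baseline, style_vector(submission)) < threshold
```

In practice you would need far more text per student and many more features than a dozen words, and a low score is a reason to ask questions, not evidence of anything.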

The real issue is proof. Most colleges and universities are unwilling to pursue this without solid proof — the lawsuit risk is just too great — and in these cases the only real proof is self-incrimination.

Fundamentally, this is a problem of misplaced economic incentives. As long as the academic credential is worth more to a student than the knowledge gained in getting that credential, there will be an incentive to cheat.

Related note: anyone remember my personal experience with plagiarism from 2005?

Posted on November 16, 2010 at 6:36 AM

Plagiarism and Academia: Personal Experience

A paper published in the December 2004 issue of the SIGCSE Bulletin, “Cryptanalysis of some encryption/cipher schemes using related key attack,” by Khawaja Amer Hayat, Umar Waqar Anis, and S. Tauseef-ur-Rehman, is the same as a paper that John Kelsey, David Wagner, and I published in 1997.

It’s clearly plagiarism. Sentences have been reworded or summarized a bit and many typos have been introduced, but otherwise it’s the same paper. It’s copied, with the same section, paragraph, and sentence structure — right down to the same mathematical variable names. It has the same quirks in the way references are cited. And so on.

We wrote two papers on the topic; this is the second. They don’t list either of our papers in their bibliography. They do have a lurking reference to “[KSW96]” (the first of our two papers) in the body of their introduction and design principles, presumably copied from our text; but a full citation for “[KSW96]” isn’t in their bibliography. Perhaps they were worried that one of the referees would read the papers listed in their bibliography, and notice the plagiarism.

The three authors are from the International Islamic University in Islamabad, Pakistan. The third author, S. Tauseef-Ur-Rehman, is a department head (and faculty member) in the Telecommunications Engineering Department at this Pakistani institution. If you believe his story — which is probably correct — he had nothing to do with the research, but just appended his name to a paper by two of his students. (This is not unusual; it happens all the time in universities all over the world.) But that doesn’t get him off the hook. He’s still responsible for anything he puts his name on.

And we’re not the only ones. The same three authors plagiarized this paper by French cryptographer Serge Vaudenay and others.

I wrote to the editor of the SIGCSE Bulletin, who removed the paper from their website and demanded official letters of admission and apology. (The apologies are at the bottom of this page.) The Bulletin said it would ban the authors from submitting again, but has since backpedaled. Mark Mandelbaum, Director of the Office of Publications at ACM, now says that ACM has no policy on plagiarism and that nothing additional will be done. I’ve also written to Springer-Verlag, the publisher of my original paper.

I don’t blame the journals for letting these papers through. I’ve refereed papers, and it’s pretty much impossible to verify that a piece of research is original. We’re largely self-policing.

Mostly, the system works. These three have been found out, and should be fired and/or expelled. Certainly ACM should ban them from submitting anything, and I am very surprised at their claim that they have no policy with regard to plagiarism. Academic plagiarism is serious enough to warrant that level of response. I don’t know if the system works in Pakistan, though. I hope it does. These people knew the risks when they did it. And then they did it again.

If I sound angry, I’m not. I’m more amused. I’ve heard of researchers from developing countries resorting to plagiarism to pad their CVs, but I’m surprised to see it happen to me. I mean, really; if they were going to do this, wouldn’t it have been smarter to pick a more obscure author?

And it’s nice to know that our work is still considered relevant eight years later.

EDITED TO ADD: Another paper, “Analysis of Real-time Transport Protocol Security,” by Junaid Aslam, Saad Rafique and S. Tauseef-ur-Rehman, has been plagiarized from this original: “Real-time Transport Protocol (RTP) security,” by Ville Hallivuori.

EDITED TO ADD: Ron Boisvert, the Co-Chair of the ACM Publications Board, has said this:

1. ACM has always been a champion for high ethical standards among computing professionals. Respecting intellectual property rights is certainly a part of this, as is clearly reflected in the ACM Code of Ethics.

2. ACM has always acted quickly and decisively to deal with allegations of plagiarism related to its publications, and remains committed to doing so in the future.

3. In the past, such incidents of plagiarism were rare. However, in recent years the number of such incidents has grown considerably. As a result, the ACM Publications Board has recently begun work to develop a more explicit policy on plagiarism. In doing so we hope to lay out (a) what constitutes plagiarism, as well as various levels of plagiarism, (b) ACM procedures for handling allegations of plagiarism, and (c) specific penalties which will be leveled against those found to have committed plagiarism at each of the identified levels. When this new “policy” is in place, we hope to widely publicize it in order to draw increased attention to this growing problem.

EDITED TO ADD: There’s a news story with some new developments.

EDITED TO ADD: Over the past couple of weeks, I have been getting repeated e-mails from people, presumably faculty and administrators of the International Islamic University, to close comments in this blog entry. The justification usually given is that there is an official investigation underway so there’s no longer any reason for comments, or that Tauseef has been fired so there’s no longer any reason for comments, or that the comments are harmful to the reputation of the university or the country.

I have responded that I will not close comments on this blog entry. I have, and will continue to, delete posts that are incoherent or hostile (there have been examples of both).

Blog comments are anonymous. There is no way for me to verify the identity of posters, and I don’t. I have, and will continue to, remove any posts purporting to come from a person they do not come from, but generally the only way I can figure that out is if the real person e-mails me and asks.

Otherwise, consider this a forum for anonymous free speech. The comments here are unvetted and unverified. They might be true, and they might be false. Readers are expected to understand that, and I believe for the most part they do.

In the United States, we have a saying that the antidote for bad speech is more speech. I invite anyone who disagrees with the comments on the page to post their own opinions.

Posted on August 1, 2005 at 6:07 AM
