How plagiarism-detection programs became an unlikely political weapon
The plagiarism accusations first struck Claudine Gay when a right-wing activist published several examples of unattributed text from the Harvard president’s academic writings. Though insufficient attribution wasn’t the only controversy swirling around Gay — her response to congressional questions about antisemitism on campus played a much bigger role — it was the tipping point that forced her resignation this month.
The next volley hit Neri Oxman, a former MIT professor and the wife of hedge fund manager Bill Ackman, who had campaigned vigorously for Gay’s ouster. The publication Business Insider reported that several paragraphs and sentences from Oxman’s dissertation appeared to have been lifted from Wikipedia. Oxman apologized for the errors on social media.
In response, Ackman wrote on X that he would be getting into the plagiarism review game as well. Ackman said his review would cover all the published work of all of MIT’s faculty, its president, Sally Kornbluth, and the university’s board members — plus all the work of the staff at Business Insider, and possibly also the work of the faculties at Harvard, Yale, Princeton, Stanford, the University of Pennsylvania and Dartmouth.
“Vetting every publication from every academic over their career at a huge university like Harvard would take thousands of hours,” said Chris Caren.
He would know. Caren is the chief executive of Oakland-based Turnitin, the world’s largest provider of academic integrity software. The company’s products include Feedback Studio, a program designed for high school and college instructors, and iThenticate, a more rigorous offering favored by academic journal editors. According to the company, 80% of U.S. college students attend schools that use Turnitin’s software to check student work for plagiarism. So do 50% of U.S. high school students. Nearly all of the leading scholarly journals use the company’s products to check submitted articles for misappropriated language and missing citations, Caren said. (Turnitin’s programs analyze only text, he noted, and won’t catch fudged figures, manipulated images or other data-related chicanery.)
The widespread adoption of plagiarism-detection software in higher learning over the last decade means the prospect of a “plagiarism check” for most college graduates under the age of about 30 isn’t much of a threat. Their essays, papers, theses and dissertations were almost certainly vetted in this way when they handed them in. But for older academics, subjecting work to the software’s level of scrutiny could well reveal attribution errors — intentional or not — that have never come to light before.
And that’s what a small but highly motivated sector of Turnitin’s customer base is counting on.
“We allow anyone to use them — media organizations, political groups,” Caren said of Turnitin’s products. “If there are other firms that want to look into someone’s past, it’s the same technology, it’s just being used by people we didn’t design it for in the first place.”
The National Science Foundation describes plagiarism as “the appropriation of another person’s ideas, processes, results or words without giving appropriate credit.” Harvard and MIT define it in similar language in their academic integrity guidelines.
In academia particularly, it can be a devastating charge. “People get jobs, grants, and a litany of other opportunities based on their research that by default is assumed to be original to them. If it is later found out to not be, it would then be saying that they got these opportunities effectively based on fraud,” said Christian Moriarty, a professor of ethics and law at St. Petersburg College in Florida.
That’s why “an accusation, unfounded or not, undermines their authority and position,” he said.
No one has accused Gay or Oxman of stealing data or high-level ideas. But some of their published works appear to contain expository sentences and paragraphs that closely match language in sources available at the time — the type of plagiarism that software can most easily detect.
Gay’s accusers highlighted multiple instances of prose that echoed other sources. For instance, two paragraphs in her 1997 doctoral dissertation closely mirrored text in a paper by researchers who were not cited anywhere in the dissertation. Harvard said Gay requested corrections to some of the works.
In Oxman’s case, Business Insider identified 15 nonconsecutive paragraphs in her 2010 dissertation that closely resemble language that appeared in Wikipedia articles at that time. Most are definitions of technical terms and concepts. The publication also found passages in her research papers that echoed other sources. Neither Christopher Rufo, the activist who first raised allegations against Gay, nor Business Insider disclosed what software they used to identify the problematic text.
Turnitin programs were used to discover that parts of Melania Trump’s 2016 speech at the Republican National Convention matched Michelle Obama’s 2008 remarks to the Democratic National Convention, Caren said.
The CEO said he also believes that the company’s software was used to determine that Germany’s former defense minister, Karl-Theodor zu Guttenberg, had plagiarized in his doctoral dissertation, a massive political scandal in that country that led to the star politician’s downfall in 2011.
Though Feedback Studio is only available to institutions, iThenticate can be licensed by anyone. The program digests the text of a book, research paper or article in minutes and returns a detailed report that flags the percentage of phrases and passages in the document that match those published online and in Turnitin’s database of academic journals.
The report has to be checked by a human to weed out legitimate uses of quoted material. Though the process is time-consuming, it’s much faster than an equally thorough review would have taken in a predigital age. “It’s easier to search for plagiarism than ever before,” said Jonathan Bailey, a copyright and plagiarism consultant in New Orleans. “The easier something is to do, the more people are likely to do it.”
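To make the matching step concrete, here is a minimal sketch of how a text-overlap report might be computed. It is not Turnitin’s or iThenticate’s actual method, which the article does not describe. It assumes a simple technique, comparing runs of consecutive words (word n-grams) between a document and a set of sources. The function names, the `sources` dictionary and the default run length of five words are all hypothetical choices made for illustration.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of n-word runs in text, lowercased, punctuation dropped."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def match_report(document, sources, n=5):
    """Flag n-word runs in `document` that also appear in any source.

    `sources` maps a hypothetical source name to its text. The result gives
    the share of the document's n-word runs found in at least one source,
    plus the matching runs listed per source.
    """
    doc_grams = word_ngrams(document, n)
    if not doc_grams:
        # Document is shorter than n words, so there is nothing to compare.
        return {"percent_matched": 0.0, "matches": {}}

    matches = {}
    matched = set()
    for name, text in sources.items():
        shared = doc_grams & word_ngrams(text, n)
        if shared:
            matches[name] = sorted(" ".join(gram) for gram in shared)
            matched |= shared

    percent = 100.0 * len(matched) / len(doc_grams)
    return {"percent_matched": round(percent, 1), "matches": matches}
```

A sketch like this flags properly quoted and cited passages just as readily as unattributed ones, which is why the output still needs a human reader.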
The idea of using plagiarism accusations as a means to discredit rivals was around long before the invention of plagiarism-checking software, said Sam Bruton, director of the Office of Research Integrity at the University of Southern Mississippi. “People have always had the ability to raise allegations of scholarly integrity for ulterior motives, be those motives personal (grudges, resentments), political or something different,” Bruton wrote in an email.
He challenged the idea that the spread of the software is primarily responsible for an increase in plagiarism accusations, attributing it instead “to the hyper-politicization that has engulfed so many American institutions.”
But many educators and academics who use such programs in their daily work said that seeing them employed for political ends has been disheartening.
The technology is designed to support instructors and help enforce proper citation guidelines, said Moriarty, who teaches other professors how to use such tools.
“People in the academic integrity field often don’t like it or appreciate it or think it’s appropriate to use academic integrity software as a means to punish for punishment’s sake,” Moriarty said. Plagiarism-detection software can’t determine how or why language similar to other sources appeared in an author’s work, whether the issue violates an institution’s code of ethics or what the consequences of such an infraction should be.
For now, at least, only humans can do that.
“Human expertise is essential to maintaining the integrity of scholarly and academic work,” said Greer Murphy, director of academic honesty at University of Rochester’s College of Arts, Sciences and Engineering in New York. “But such has always been true — the sophistication of modern technology hasn’t changed things.”