Just two years after OpenAI released ChatGPT, publishers and content aggregators are developing a stream of AI-enhanced search and discovery tools. They’re also experimenting with the ways generative AI could help a strained research ecosystem with back-end tasks, such as editing, peer reviewing and documenting the experimental process.
Those are all signs that the scholarly publishing industry is set “for exponential growth in its use across the research and publication lifecycle,” according to a report published late last month by the education research firm Ithaka S+R.
It follows a report the group released in January that called for a shared infrastructure for handling the risks and opportunities digital technology presents for scholarly communication, though that report was based on interviews from early 2023, when ChatGPT had been on the market for only a few months.
‘A Lot of Variables’
But even as the technology influences the production and publication of academic research at an intensifying pace, researchers themselves have been slow to adopt generative AI widely, as various reports—including some from Inside Higher Ed—have shown.
To help understand this discrepancy, Ithaka S+R launched its most recent project this past summer.
“Generative AI has injected a lot of variables into the equation of scholarly publishing. And there’s not yet a shared framework for understanding what those implications are,” said Dylan Ruediger, co-author of both reports and a senior program manager of the research enterprise division at Ithaka S+R. “Thus, there’s also not a shared framework for understanding what is going to be involved in managing the effects of generative AI.”
Researchers are already raising concerns that the freely accessible information—some of which hasn’t undergone rigorous peer review—used to train some large language models (LLMs) could undermine the integrity of scholarly research.
Ithaka S+R’s new report, which is based on interviews with a dozen representatives from the scholarly publishing world, including librarians, scholarly society members, funders and publishers, reveals areas of agreement as well as divergence regarding what AI may mean for academic research practices going forward.
“The consensus among the individuals with whom we spoke is that generative AI will enable efficiency gains across the publication process,” Ruediger and his co-author, Tracy Bergstrom, program manager of collections and infrastructure at Ithaka, wrote in a blog post about the report. “Writing, reviewing, editing, and discovery will all become easier and faster. Both scholarly publishing and scientific discovery in turn will likely accelerate as a result of AI-enhanced research methods.”
However, the interviewees were divided on just how those efficiency gains will shape the future of scholarly publishing. Some suggested that they might help publishing function more smoothly but won’t “fundamentally alter its dynamics or purpose,” according to the report. But other respondents painted a “much hazier scenario,” in which generative AI transforms academic research to such an extent that it “dwarfs” the digital tools created over the past 30 years.
One of the most popular topics of discussion was the benefits and downsides of using AI to assist with peer review, work that scholars have long lamented they are rarely paid for.
“There were complicated ethical implications of making automation part of peer review,” Ruediger said. “But there was also a real recognition that this was one of the big bottlenecks in the publication process right now.” Generative AI could help alleviate that holdup by matching reviewers with authors and handling the basic editing and formatting of citations, which would “allow human reviewers to focus more on the content,” he said.
But academe hasn’t yet embraced the push to develop clear communication around generative AI the way other industries have.
Somewhere between 69 and 96 percent of biomedical researchers don’t use generative AI for any specific research purpose, according to another Ithaka S+R study published in October. Additionally, Inside Higher Ed’s most recent survey of college and university chief technology officers found that just 9 percent believe higher education is prepared to handle the new technology’s rise.
About half of respondents to the Inside Higher Ed survey also said their institutions emphasize using AI for individual cases rather than thinking about it at enterprise scale.
That may be partly because “it’s really difficult to get decentralized, fairly autonomous faculty and other people inside universities to all act and behave in a relatively congruent way,” Ruediger said. “Generative AI is an enterprise-level challenge. In order for this to be a productive technology for higher ed, we’re going to have to think about it as a system problem at the institutional level and beyond.”
Publishers, however, are already “thinking about this in a fairly systematic way,” he said, which has in turn created “a real space and a need for the other stakeholder communities.”
While publishers such as Taylor & Francis and Wiley have already sold millions of dollars’ worth of academic research data to train Microsoft and other proprietary LLMs, most academic researchers are still focused on getting promotions and tenure in an environment that warns them to “publish or perish.”
‘Mitigate Unintended Consequences’
Yet if those eager-to-publish researchers use LLMs that have been trained on free but faulty information, doing so has the potential to “deteriorate the quality” of future research, said Chhavi Chauhan, director of scientific outreach at the American Society for Investigative Pathology and program manager for Women in AI Accelerate and Raise Programs.
And repeatedly regurgitating bad data also poses a risk of “compromising the novelty of ideas,” she said. “Humans are thinking about things in creative ways, but large language models can only see what’s already out there. They don’t have creativity.”
While major publishers and the federal government have created guidance on uses of generative AI, to be most effective those plans also need buy-in from academic institutions, she said.
In an industry with “no clear benchmarks,” Chauhan said, “collaboration is the way to move forward.”
And although the diversity of stakeholders and platforms may make it difficult to develop “blanket policies,” Chauhan said that preserving the public’s trust in academic research will require at least “a minimal checklist that everyone should abide by.”
Whatever that looks like, she added, the guidance should aim to “mitigate unintended consequences” and focus on “what AI can do for humans.”