Tag: Publishing

  • AI-generated images threaten science — here’s how researchers hope to spot them

    A composite of six AI-generated micrographs

    All of these images were generated by AI. Credit: Proofig AI, 2024

    From scientists manipulating figures to the mass production of fake papers by paper mills, problematic manuscripts have long plagued the scholarly literature. Science sleuths work tirelessly to uncover this misconduct to correct the scientific record. But their job is becoming harder, owing to the introduction of a powerful new tool for fraudsters: generative artificial intelligence (AI).

    “Generative AI is evolving very fast,” says Jana Christopher, an image-integrity analyst at FEBS Press in Heidelberg, Germany. “The people that work in my field — image integrity and publication ethics — are getting increasingly worried about the possibilities that it offers.”

    The ease with which generative-AI tools can create text, images and data raises fears of an increasingly untrustworthy scientific literature awash with fake figures, manuscripts and conclusions that are difficult for humans to spot. Already, an arms race is emerging as integrity specialists, publishers and technology companies race to develop AI tools that can assist in rapidly detecting deceptive, AI-generated elements of papers.

    “It’s a scary development,” Christopher says. “But there are also clever people and good structural changes that are being suggested.”

    Research-integrity specialists say that, although AI-generated text is already permitted by many journals under some circumstances, the use of such tools for creating images or other data is less likely to be viewed as acceptable. “In the near future, we may be okay with AI-generated text,” says Elisabeth Bik, an image-forensics specialist and consultant in San Francisco, California. “But I draw the line at generating data.”

    Bik, Christopher and others suspect that data, including images, fabricated using generative AI are already widespread in the literature, and that paper mills are taking advantage of AI tools to produce manuscripts en masse (see ‘Quiz: can you spot AI fakes?’).

    Under the radar

    Pinpointing AI-produced images poses a huge challenge: they are often almost impossible to distinguish from real ones, at least with the naked eye. “We get the feeling that we encounter AI-generated images every day,” Christopher says. “But as long as you can’t prove it, there’s really very little you can do.”

    There are some clear instances of generative-AI use in scientific images, such as the now-infamous figure of a rat with absurdly large genitalia and nonsensical labels, created using the image tool Midjourney. The graphic, published by a journal in February, sparked a social-media storm and was retracted days later.

    Quiz: Can you spot AI fakes? A series of six images, three of which were produced by artificial-intelligence image software.

    Credit: Proofig (generated images)

    Most cases aren’t so obvious. Figures fabricated with Adobe Photoshop or similar tools before the rise of generative-AI — especially in molecular and cell biology — often contain telltale signs that sleuths can spot, such as identical backgrounds or an unusual absence of smears or stains. AI-made figures often lack such signs. “I see tonnes of papers where I think, these Western blots do not look real — but there’s no smoking gun,” Bik says. “You can only say they just look weird, and that of course isn’t enough evidence to write to an editor.”

    But signs suggest that AI-made figures are appearing in published manuscripts. Text written using tools such as ChatGPT is on the rise in papers, given away by standard chatbot phrases that authors forget to remove and telltale words that AI models tend to use. “So we have to assume that it’s also happening for data and for images,” says Bik.

    Another clue that fraudsters are using sophisticated image tools is that most of the issues that sleuths are currently detecting are in papers that are several years old. “In the past couple of years, we’ve seen fewer and fewer image problems,” Bik says. “I think most folks who have gotten caught doing image manipulation have moved on to creating cleaner images.”

    How to create images

    Creating clean images using generative AI is not difficult. Kevin Patrick, a scientific-image sleuth known as Cheshire on social media, has demonstrated just how easy it can be and posted his results on X. Using Photoshop’s AI tool Generative Fill, Patrick created realistic images — that could feasibly appear in scientific papers — of tumours, cell cultures, Western blots and more. Most of the images took less than a minute to produce (see ‘Generating bogus science’).

    “If I can do this, certainly the people who are getting paid to generate fake data are going to be doing this,” Patrick says. “There’s probably a whole bunch of other data that could be generated with tools like this.”

    Some publishers say that they have found evidence of AI-generated content in published studies. These include PLoS, which has been alerted to suspicious content and found evidence of AI-generated text and data in papers and submissions through internal investigations, says Renée Hoch, managing editor of PLoS’s publication-ethics team in San Francisco, California. (Hoch notes that AI use is not forbidden in PLoS journals, and that its AI policy focuses on author accountability and transparent disclosures.)

    Generating bogus science: Examples of AI-generated western blot, tumour sample and cell culture images.

    Credit: Kevin Patrick

    Other tools might also provide opportunities for people wishing to create fake content. Last month, researchers published a generative-AI model for creating high-resolution microscopy images, and some integrity specialists have raised concerns about the work. “This technology can easily be used by people with bad intentions to quickly generate hundreds or thousands of fake images,” Bik says.

    Yoav Shechtman at the Technion–Israel Institute of Technology in Haifa, the tool’s creator, says that the tool is helpful for producing training data for models because high-resolution microscopy images are difficult to obtain. But, he adds, it isn’t useful for generating fakes because users have little control over the output. Existing imaging software such as Photoshop is more useful for manipulating figures, he suggests.

    Weeding out fakes

    Human eyes might not be able to catch generative AI-made images, but AI might (see ‘AI images are hard to spot’).

    The makers behind tools such as Imagetwin and Proofig, which use AI to detect integrity issues in scientific figures, are expanding their software to weed out images created by generative AI. Because such images are so difficult to detect, both companies are creating their own databases of generative-AI images to train their algorithms.

    Proofig has already released a feature in its tool for detecting AI-generated microscopy images. Company co-founder Dror Kolodkin-Gal in Rehovot, Israel, says that, when tested on thousands of AI-generated and real images from papers, the algorithm identified AI images 98% of the time and had a 0.02% false-positive rate. Kolodkin-Gal adds that the team is now working on trying to understand what, exactly, their algorithm detects.

    “I have great hopes for these tools,” Christopher says. But she notes that their outputs will always need to be assessed by an expert who can verify the issues they flag. Christopher hasn’t yet seen evidence that AI image-detection software is reliable (Proofig’s internal evaluation has not been published). These tools are “limited, but certainly very useful, as it means we can scale up our effort of screening submissions,” she adds.

    AI images are hard to spot: Graph showing researchers struggle to identify AI-generated microscopy images, with a median success rate of 50%.

    Source: Proofig quiz

    Multiple publishers and research institutions already use Proofig and Imagetwin. The Science journals, for example, use Proofig to scan for image-integrity issues. According to Meagan Phelan, communications director for Science in Washington DC, the tool has not yet uncovered any AI-generated images.

    Springer Nature, which publishes Nature, is developing its own detection tools for text and images, called Geppetto and SnapShot, which flag irregularities that are then assessed by humans. (The Nature news team is editorially independent of its publisher.)

    Fraudsters, beware

    Publishing groups are also taking steps to address AI-made images. A spokesperson for the International Association of Scientific, Technical and Medical (STM) Publishers in Oxford, UK, said that it is taking the problem “very seriously” and pointed to initiatives such as United2Act and the STM Integrity Hub, which are tackling paper mills and other scientific-integrity issues.

    Christopher, who is chairing an STM working group on image alterations and duplications, says that there is a growing realization that developing ways to verify raw data — such as labelling images taken from microscopes with invisible watermarks akin to those being used in AI-generated text — might be the way forward. This will require new technologies and new standards for equipment manufacturers, she adds.

    Patrick and others are worried that publishers will not act quickly enough to address the threat. “We’re concerned that this will just be another generation of problems in the literature that they don’t get to until it’s too late,” he says.

    Still, some are optimistic that the AI-generated content that enters papers today will be discovered in the future.

    “I have full confidence that technology will improve to the point that it can detect the stuff that’s getting done today — because at some point, it will be viewed as relatively crude,” Patrick says. “Fraudsters shouldn’t sleep well at night. They could fool today’s process, but I don’t think they’ll be able to fool the process forever.”

  • resources for the artistically challenged

    After graduating from the medical and biological illustration programme at Johns Hopkins University in Baltimore, Maryland, Shiz Aoki fulfilled a long-held dream: she launched her own company. Founded in 2010 in Toronto, Canada, Anatomize Studios works with large clients — pharmaceutical companies, magazines and medical professionals with niche needs and capacious budgets. Yet Aoki would often also field requests from individual researchers. They wanted to create visualizations for papers, presentations or outreach, but struggled to distil their complex science down to something approachable, let alone visually appealing.

    “I must have turned away hundreds of scientists and saw them taking to PowerPoint to create content that, sadly, didn’t do justice to these really important scientific discoveries they were making,” Aoki recalls. “I realized that my love of art is not just a passion thing — that science was actually being stalled by a lack of tools and understanding of science communication.”

    Fortunately, it’s easier than ever for researchers to create compelling figures and images, even without a background in design. For one thing, there’s BioRender, a web-based app that Aoki co-founded in 2017. Akin to Adobe Illustrator, but for life scientists, BioRender includes both bioscience-specific drawing tools and a library of more than 50,000 scientifically accurate icons. This resource and others like it — including BioIcons, Reactome and Servier Medical Art — show just how far the fields of data visualization and scientific illustration have come in the past few years, and how scientists remain hungry for tools to help them depict and share their work.

    Nature contacted graphic designers, scientific and medical illustrators, and journal art directors to glean tips and resources for creating polished visuals. Here’s what they said.

    Prioritize illustrations

    The design of figures might seem secondary to running experiments and writing them up for publication. But visualizations can help readers to make sense of abstract concepts in a way that words alone cannot.

    “The figures you choose are actually really important,” says Kelly Krause, creative director for the Nature family of journals, who is based in New York City. “People make snap judgements based on visuals, and if they don’t look good, they can steer someone away from a paper that otherwise they might like to read.” Think, for instance, of a graphical abstract that can serve as an advertisement for a research article.

    So, devote time to your visuals. Decide what information is essential, make an outline of the content, and edit your figures as ruthlessly as you would any manuscript.

    “Almost every time I talk to a scientist, they initially give me way more information than I need, because every single detail feels important,” says Kelly Finan, a designer based in Hop Bottom, Pennsylvania. “But I often find that when I then ask them to explain their work, scientists become aware of what’s extraneous and what isn’t.”

    Identify your audience

    You wouldn’t write a popular-science talk as you would a research paper, and the same goes for visualizations (see ‘Focus on basic design principles’). Is the goal to inform the reader, elicit an emotion or present data in a unique way? The answer can guide not just the content, but also style choices. “In certain fields, there’s an established way of doing things, but in others, there’s room to be more creative while still maintaining accuracy,” says Nobles Green II, the founder of Amplify Biovisuals in Atlanta, Georgia, and president of the Association of Medical Illustrators.

    Focus on basic design principles

    Familiarity with the basics of design — such as hierarchy, composition, colour and typography — can go a long way when it comes to producing polished figures.

    Hierarchy

    • Let the graphic match the flow of the language used. Because most languages read from left to right, you might want to design your graphic to ‘start’ at the top left.

    • Use numbers, bold and italicized lettering, and different font sizes to guide the reader through the image.

    • Use left- or right-justified text. Centre-aligned text is harder for the brain to process.

    Composition

    • Be consistent with style choices, such as those concerning fonts, the colour palette and iconography.

    • Use white space to make your visualization easier to digest.

    • Focus your visualization on a single goal; create different graphics for different audiences.

    Colour

    • Use a colour-palette generator to make visualizations more accessible.

    • Supplement colour with different line styles (for example, solid, dotted, dashed) to aid comprehension.

    • Use the cyan, magenta, yellow and key (CMYK) colour model for print, and the red, green, blue (RGB) model for images that will remain digital.

    Typography

    • Opt for a font with a uniform line thickness (also called the stroke weight) such as Courier or Roboto Mono.

    • Avoid mixing many different fonts in a single image, although some designers will choose a serif font for the main text and a complementary sans serif for subheadings and labels.

    • Ensure that text is readable: use a point size of at least 12 for main text, and 7 for labels.

    Similarly, consider the intended audience. Nicolle Fuller, the founder and creative director of SayoStudio, a science-communication firm in Bellevue, Washington, says this helps to set boundaries around the amount and types of information necessary in a visualization. “You can get away with more complexity when you’re making graphics for other scientists,” she explains — for instance, by including membrane proteins on the cell surface that would overcomplicate an image for the lay reader.

    Some designers therefore warn against trying to make a single visualization serve too many purposes. Instead, they say, it’s better to design a range of items — an infographic for social media, a visual abstract and a figure for a seminar presentation, for instance — using the same information. Fuller says that considering the audience has helped clients to think creatively about their data, prompting occasional “aha moments”.

    Don’t over-design

    With academic manuscripts ballooning in size, it can be tempting to let figures do the same. But more information doesn’t necessarily lead to greater comprehension, and many illustrators live by the motto that less is more.

    “There’s a tendency to overly decorate a figure — add a gradient or a shadow to make it look more jazzy — that actually gets in the way,” Krause says. “You wouldn’t expect flowery prose in a scientific paper, so why would you do that to your figures?”

    Ashleigh Campsall, a senior graphic designer at the life-sciences magazine The Scientist, says lean graphics tend to look more professional, and the more white space, the better. “Letting everything breathe makes it easy to digest and interpret, and takes away some of the mental work for the reader,” she says.

    Think accessibility

    As dedication to diversity, equity and inclusion has grown, so too has the academic community’s embrace of inclusive visualization methods. For example, colour palettes should be suitable for people with a colour-vision deficiency or who are colour-blind, but should incorporate redundancy, too. A line graph might use different colours to indicate each treatment, for instance, but you can also use solid, dashed and dotted lines to increase comprehension, as well as more-descriptive captions.

    Create ‘alt text’, too — a written description of an image to be read aloud by a screen reader. One guideline is to limit alt text to roughly 280 characters, or about the length of a social-media post. And use that space creatively, Green advises: you’re trying to paint a picture with words.
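    The roughly 280-character guideline for alt text is easy to check before submission. The following is a minimal sketch, not a standard tool: the names `ALT_TEXT_LIMIT` and `check_alt_text` are hypothetical, and the limit is the rule of thumb quoted above rather than a formal requirement.

```python
from typing import Tuple

# Guideline quoted in the text: roughly the length of a social-media post.
ALT_TEXT_LIMIT = 280


def check_alt_text(text: str, limit: int = ALT_TEXT_LIMIT) -> Tuple[bool, int]:
    """Return (fits_within_limit, character_count) for a draft alt text.

    Leading and trailing whitespace is ignored when counting.
    """
    length = len(text.strip())
    return length <= limit, length


draft = (
    "Line graph of tumour volume over 30 days for three treatments. "
    "The control (solid line) rises steadily, while both drug-treated "
    "groups (dashed and dotted lines) level off after day 10."
)
fits, count = check_alt_text(draft)
print(f"{count} characters, within guideline: {fits}")
```

    A check like this only measures length; whether the description paints a useful picture still needs a human reader.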

    Use AI sparingly (or not at all)

    Image generators powered by artificial intelligence (AI) have made it easier than ever to create seemingly high-quality pictures from scratch. But almost as soon as these tools appeared, horror stories emerged. Several papers have been retracted owing to bizarre, AI-generated visualizations, including two published earlier this year, one showing a rat with overly large testes in Frontiers in Cell and Developmental Biology and the other containing an anatomically flawed figure with nonsense labels in Medicine.

    Many publishers now ban AI-generated images from manuscripts, and designers who spoke to Nature say they mostly avoid the technology. Campsall, for example, might pull a stock image into Adobe Illustrator and use its AI generator to extend the background. “But for wholesale image design, the technology is really just not there yet,” she says. (Citing the unstable legal framework surrounding generative-AI-based images, Nature has so far barred their use except in instances in which AI is the research focus.)

    But other designers, including Aoki, say there’s room to leverage AI creatively. Just as writers might use a chatbot to brainstorm headlines or check a draft for tone, image generators can be helpful during the mock-up process. BioRender, Aoki says, is beta-testing a handful of AI-powered tools that allow users to input a text description — say, a cell–cell interaction or an experimental timeline — and get a draft figure out.

    “The difference here is that the data that we’re training on isn’t just random data from the Internet, it’s our massive library of vetted icons,” says Aoki, adding that humans must still provide the final stamp of approval. “Scientific integrity and accuracy are so important, so we want to make sure we get this right.”

  • High-performers and specialists in neuroscience research

    Strong focus

    Among the top 25 countries for neuroscience output in the Nature Index, these ten have the highest proportion of neuroscience Share relative to their overall Share (neuroscience %). The United States, Germany, United Kingdom and Canada all rank within the top 10 overall for neuroscience; Norway and Portugal have the lowest overall ranks, at 22 and 23, respectively.

    On the up

    The Share of the fastest rising institutions in neuroscience for 2022–23 is shown over a five-year period. The University of Queensland in Australia is the only institution from outside China in the top five. The top-ranked institution in neuroscience overall, Harvard University in Cambridge, Massachusetts, was the sixth fastest riser, increasing its Share by 4.5% to reach 229.20 in 2023.

    Institution outputs

    Institutions with a special focus on neuroscience research are highlighted in this chart, which plots their neuroscience Share against their neuroscience %. Just over 10% of the top 200 institutions in neuroscience have more than 200 Share in the topic for the period 2019–23, and only 8.5% have more than 30% of their overall Share related to neuroscience.

    The Chinese Academy of Sciences in Beijing has a relatively low proportion of its Nature Index output focused on neuroscience research, but it has the 6th highest Share in the topic, at 378.76. Harvard University’s Share in neuroscience (996.17) dwarfs that of all other institutions. With 19.6% of its total Share in the Index related to neuroscience, this is a clear priority area. Neuroscience-related outputs represented 89.7% of the total Share of the Allen Institute in Seattle, Washington, for 2019–23. The institution is ranked 144th in the topic overall, with a Share of 53.15.
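    The ‘neuroscience %’ used in these charts is a simple ratio: an institution’s neuroscience Share divided by its overall Share. The sketch below shows that ratio and its inverse, using the figures quoted above. The function names are hypothetical, and the implied overall Shares are only rough estimates, because the published percentages are rounded and are assumed here to cover the same 2019–23 period as the Share values.

```python
def neuroscience_percent(neuro_share: float, total_share: float) -> float:
    """Neuroscience Share as a percentage of overall Share."""
    if total_share <= 0:
        raise ValueError("total_share must be positive")
    return 100.0 * neuro_share / total_share


def implied_total_share(neuro_share: float, neuro_percent: float) -> float:
    """Back out an approximate overall Share from a Share and its percentage."""
    if neuro_percent <= 0:
        raise ValueError("neuro_percent must be positive")
    return neuro_share / (neuro_percent / 100.0)


# Figures quoted in the text (Share in neuroscience, neuroscience %).
harvard_total = implied_total_share(996.17, 19.6)
allen_total = implied_total_share(53.15, 89.7)

print(f"Harvard implied overall Share: about {harvard_total:.0f}")
print(f"Allen Institute implied overall Share: about {allen_total:.0f}")
```

    The comparison illustrates why the two measures are plotted against each other: a specialist institution can have a very high neuroscience % with a modest Share, whereas a large generalist can dominate on Share with a much lower percentage.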

  • How long COVID could lift the fog on neurocognitive disorders

    Neurocognitive symptoms, including an impaired ability to process and memorize information, are among the most common and debilitating manifestations of long COVID, a disease experienced by as many as 400 million people worldwide, by one recent estimate (Z. Al-Aly et al. Nature Med. 30, 2148–2164; 2024). These symptoms, which can develop alongside those resulting from diseases of the lungs, heart and other organs, affect patients’ everyday functioning for months or even years following COVID-19. Matthew Fitzgerald, a 28-year-old former engineer at Tesla, described his long-COVID-related impairment during a clinic visit: “I’m a shell of myself. My physical issues aren’t half as bad as my brain problems. You can say brain fog, but that doesn’t come close to doing it justice.”

    Extreme cases of long COVID stand out — authors who cannot write; nurses who fear making a medical error — but symptoms for most people are more insidious. Many long-COVID patients have neurological problems that meet the criteria for what would normally be considered age-related mild cognitive impairment, or mild to moderate dementia.

    Over the past 30 years, US$42.5 billion have been spent on Alzheimer’s research, with limited progress. A decade ago, in part owing to the discovery of neurocognitive symptoms among younger, previously healthy people with complex illness in the intensive care unit, the US National Institutes of Health (NIH) designated a category known as Alzheimer’s disease and related dementias (ADRD) to describe neurological conditions that rob people of their memory and personhood. There is now ample evidence that both older and younger people with long COVID and other infection-associated chronic conditions are at risk of developing ADRD.

    Michael Peluso portrait.

    Michael J. Peluso. Credit: Noah Berger/UCSF

    As a result, the NIH and other institutions around the world have begun to expand the scope of dementia research to include long COVID under the funding umbrella of ADRD. We serve as co-investigators on a soon-to-launch National Institute on Aging-funded phase III trial to test whether baricitinib, an immune-modulating medication, can improve symptoms of patients with ADRD from long COVID. We hope that this and similar work will open the door for studies of other infection-associated chronic conditions, including myalgic encephalomyelitis/chronic fatigue syndrome and post-treatment Lyme disease.

    Brain studies of COVID patients have been among the most revealing science to emerge from the pandemic. Patient scans reveal structural changes, such as in regions near the olfactory tracts and in specific areas of the blood–brain barrier, a membrane that protects the central nervous system from blood-borne toxins and pathogens. Signs of inflammation are sometimes present, and viral remnants have been found in brain specimens of people who died.

    Wes Ely portrait.

    E. Wesley Ely. Credit: Heidi Ross

    Much remains unknown about how long COVID develops and can be treated, but research on the interplay between our immune and nervous systems could provide clues. Scientists have identified how vagal neurons, which connect the brain to the rest of the body, can relay information about pathogens to the brain stem by increasing or dampening the immune response, for example (H. Jin et al. Nature 630, 695–703; 2024). Many researchers have hypothesized that abnormalities in vagal signalling, potentially set off by the SARS-CoV-2 virus, can drive long COVID.

    Considering that long COVID affects more than 5% of people infected with SARS-CoV-2, and the risk that some of these patients will develop a rapidly acquired ADRD, there now exists a critical mass of people to study in this category. Vast resources will be needed to untangle how SARS-CoV-2 infection causes long COVID and how it might be prevented and treated. This line of research could have major implications for autoimmune diseases, in general, and neuro-inflammatory conditions, in particular.

    Funding organizations are beginning to respond. Beyond the NIH’s US$1.15 billion RECOVER initiative to support long-COVID research, institutes within the NIH are increasingly supporting studies of neurologic long COVID. Major funders in Europe and elsewhere are also stepping up. But more commitments are urgently needed. With sustained investment in long-COVID research, there is enormous potential to inform future directions in ADRD — an area that in the coming years will contend with rapidly escalating patient numbers that are expected to reach 139 million globally in 2050, up from 55 million in 2020. It is crucial that we do not lose momentum.

    Competing Interests

    The authors declare no competing interests.

  • Peer review by committee? New journal rethinks old model

    Low angle view of hands of a group of people working together around a table of paper documents.

    The Stacks Journal is upending conventional peer review by introducing collaboration into the process. Credit: FangXiaNuo/Getty

    The peer-review system has been stressed and stretched to a near-breaking point. It’s becoming harder to find reviewers, many of whom see reviewing as a burden that is not adequately rewarded. The rise of predatory publishers, many of which falsely claim to provide a peer-review process; paper mills, which are known to fabricate peer reviews; and plagiarism of peer-review reports have harmed trust in the system.

    The Stacks Journal is aiming to provide a faster, more transparent and trustworthy peer-review model by organizing committees of researchers to assess manuscripts.

    Launched in July as an open-access, digital-only publication, the Stacks Journal is the brainchild of David Green, an ecologist based in Portland, Oregon. The inspiration, says Green, was his own experience with the inefficiencies of academic publishing. In 2020, Green, who had finished a study on the impact of wildfire on carnivores, wanted to get the results out quickly so that they could inform land-management policy. But his paper languished in the publishing system for almost two years, with no clear explanation as to why. So, he resolved to change the process.

    Green spoke to Nature Index about the inspiration for the Stacks, and how he hopes it will fix some of the weaknesses of academic publishing.

    What inspired you to launch the Stacks Journal?

    I talked to other ecologists at conferences and field sites, and everyone was frustrated with the status quo of scientific publishing, from huge article-processing fees and long peer-review times to the rise of predatory journals. These and other factors undermine people’s ability to publish their research; estimates from clinical-trial data suggest that around 50% of good data never get published. We’re missing out on a lot of important information.

    I started researching peer review and learnt that it hasn’t changed much in the past 40 years. So, I explored what a new system could look like. I did in-depth interviews with dozens of researchers in different fields and surveyed hundreds more to test ideas.

    The result is the Stacks Journal’s peer-review process, which was designed to reflect how people discuss ideas in the Internet age: meeting online to collaborate across social-media platforms, for example. Advances in the way we communicate haven’t yet made it to the peer-review process.

    How does the Stacks Journal’s peer-review model work?

    We are shifting peer review away from an individual gatekeeper model, wherein an editor at a journal decides what should be published. Instead, we use a community-based model, in which we gather input from a group of people to collectively determine whether an article is published. We’ve designed this model to be rewarding to both authors and reviewers, and completely transparent.

    What’s key is that the Stacks Journal’s peer-review process happens in collaboration instead of isolation. This is how peer review and publishing used to work. For instance, in the nineteenth century, the Royal Society in London invited groups of scholars with expertise in specific topics to come together, debate new work and determine whether it would be published. Now, most journals have two reviewers who assess a manuscript separately. At the Stacks, we bring together communities of reviewers to collaborate. It’s double-blind, to ensure fairness, and reviewers can see each other’s comments and discuss whether they agree.

    All the peer-review reports, underlying data and code are publicly posted, along with the names of the reviewers.

    What else sets the Stacks Journal apart?

    We’ve created a ‘credibility score’ for each published article, so readers can quickly get a sense of the reviewers’ feedback. The credibility score is calculated as the percentage of reviewers who voted to accept the article for publication. So, for example, if six out of seven reviewers think an article should be published, its score will be 86%.
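    The credibility score described here is a plain percentage, so it can be sketched in a few lines. This is an illustration of the arithmetic only, not the journal’s own code: the function name is hypothetical, and rounding to the nearest whole number is an assumption inferred from the 86% example.

```python
def credibility_score(accept_votes: int, total_reviewers: int) -> int:
    """Percentage of reviewers who voted to accept, as a whole number.

    Rounding to the nearest integer is an assumption based on the
    six-out-of-seven example giving 86%.
    """
    if total_reviewers <= 0:
        raise ValueError("total_reviewers must be positive")
    if not 0 <= accept_votes <= total_reviewers:
        raise ValueError("accept_votes must be between 0 and total_reviewers")
    return round(100 * accept_votes / total_reviewers)


print(credibility_score(6, 7))  # 86
```

    With the review panel capped at seven, a single dissenting vote moves the score by about 14 percentage points, so small panels produce fairly coarse scores.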

    To recognize the role of the reviewers in contributing to the research, they can opt in to be credited as ‘collaborators’, listed just below the authors on the published article. That way, a reviewer can include their work on their CV.

    Our publishing model is also different — we offer an annual membership for US$199 that allows unlimited open-access publishing. In conventional publishing, it can cost thousands of dollars to publish one article. In our research, we found that this limits a lot of researchers from ever sharing or publishing their findings.

    How does the journal find and coordinate reviewers?

    The Stacks is built on communities of researchers that form around specific topics. Right now, we’re focused on ecology, but soon we’ll add chemistry, computer science and medicine. Any eligible researcher can sign up to be a reviewer on our website for free. To be eligible, you must have published at least one peer-reviewed article in the relevant field of study.

    When we receive a submission, we send it to reviewers with expertise in the paper’s topic. Reviewers submit their feedback on our online platform, which they use to discuss among themselves. The reviewers are all blinded to each other’s identities during the process, and no individual carries more weight than another.

    It has been easy for us to find reviewers. They find the process rewarding, and they keep coming back.

    What challenges have you encountered?

    We’ve had to cap the number of reviewers on each article at seven, because that’s what our software can handle. This means we’ve had to turn people away. We want to have unlimited reviewers on every article, so we are building new software to make this happen.

    Another challenge is that we are a new journal: we don’t have an impact factor or third-party marker of credibility, so some scientists are not ready to submit their research to us. However, authors who have submitted say that they love how streamlined the publishing process is and how much our review system strengthened their papers, which brings credibility to their research that is more long-lasting than that afforded by most journals.

    Over the next year, we aim to publish more than 100 articles, including our first special issue, and will continue finding ways to do peer review in a more productive and efficient way.

    This interview has been edited for length and clarity.

    Nature Index’s news and supplement content is editorially independent of its publisher, Springer Nature. For more information about Nature Index, see the homepage.

  • Data integrity concerns flagged in 130 women’s health papers — all by one co-author

    Close-up of a stack of magazines tied together with string

    Duplicated text and unusual statistics have been flagged in 130 studies by a single physician-researcher and his co-authors. Credit: Getty

    A team of scientist–sleuths has flagged data-integrity concerns in 130 studies authored by the same biomedical researcher, a specialist in women’s health and gynaecology, and his colleagues. The sleuths published their findings in a peer-reviewed paper earlier this year1.

    Some of the studies that were identified as potentially problematic have been cited by other researchers or included in analyses that could inform clinical practice. The number of papers being questioned is among the highest for a still-active life scientist, say some specialists.

    The 130 studies were published between 2014 and 2023 and report the results of clinical trials and other research on maternal and women’s health. The highlighted problems include oddities in reported statistics, unfeasible results and text that is identical to other papers. Ahmed Abbas, an obstetrician and gynaecologist at Assiut University in Egypt, is listed as a co-author or corresponding author for all 130 articles. Abbas did not respond to Nature’s request for comment.

    Some of the papers remain part of the literature. Eleven have been retracted. Before it was retracted, one of those 11 was included in a 2019 meta-analysis on a treatment to prevent miscarriage. The retractions of the paper by Abbas and his team and another, unrelated paper will probably change the conclusion of the analysis, says one of the 2019 work’s authors.

    The inclusion of a potentially unreliable study in a systematic review can have harmful consequences, because “it can immediately affect how a surgeon or an [obstetrician–gynaecologist] is doing their job”, says James Heathers, a forensic meta-scientist at Linnaeus University in Växjö, Sweden, who was not involved with the investigation that identified the data-integrity concerns.

    Women’s health specialists are actively developing strategies to prevent the publication of questionable data. But they say that once these papers are published, it’s difficult to purge them from the literature.

    Alaa Mohamed Ahmed Attia, the dean of the Faculty of Medicine at Assiut University, with which Abbas is affiliated, did not respond to Nature’s request for a comment on the concerns raised about Abbas’s publications in this year’s peer-reviewed paper.

    Rejection and retraction

    The 130 flagged studies were described in a paper published in May1 in the Journal of Gynecology Obstetrics and Human Reproduction by obstetrician and gynaecologist Ben Mol, at Monash University in Clayton, Australia, and his colleagues.

    In 2016, Mol peer-reviewed an unpublished manuscript co-authored by Abbas about a clinical trial of the hormone progesterone to prevent miscarriage. Mol noticed discrepancies in the paper and notified the journal, he says. The journal rejected the work by Abbas and his team. But in 2017, a different journal, The Journal of Maternal-Fetal & Neonatal Medicine, published a version2 of the manuscript that included changes to the sections that Mol had flagged, he says. The journal ultimately retracted the paper in December 2019.

    According to the retraction notice, the journal’s editors-in-chief learnt that previous versions of the manuscript “showed significant changes to the underlying data.” The notice also said that when contacted, the authors could not provide the original data to verify the results. According to the journal’s publisher, Taylor & Francis, concerns about the paper were first raised in February 2019. The resulting investigation led to the article’s retraction later that year, the publisher says. Abbas did not respond to Nature’s request for comment about the retraction.

    Massive database

    Mol’s team decided to survey all papers by Abbas with the exception of literature reviews, case reports and studies done as a part of an international collaboration. They identified 263 papers that included Abbas as an author. These studies collectively enrolled more than 74,000 participants between 2009 and 2022.

    Of the 263 studies analysed in the paper, 130 — almost half — raised the sleuths’ concerns. Some of the papers had statistics that seemed unfeasible. One used wording that was similar to that of a previously published paper. The articles that the team flagged appeared in journals produced by several publishers such as Taylor & Francis and Springer Nature, which also publishes Nature. Nature’s news team is editorially independent of its publisher. When asked for comment by the news team, Springer Nature did not respond.

    The sheer number of studies that were claimed to have been produced in such a short period of time caught the attention of Mol’s team. According to the reported registration and publication timeline of the papers, in May 2017, Abbas would have been conducting 88 simultaneous clinical studies. Catherine Cluver, a gynaecologist and obstetrician who leads the preeclampsia research unit at Stellenbosch University in South Africa, agrees with Mol’s team that it seems unfeasible to conduct such a large number of studies at one time. “Doing all of the regulatory work, the ethics approvals, making sure the trials are being run correctly … I think there is no way you could do more than four or five, and even then, it’s a push,” she says.
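    The kind of timeline check described above can be sketched as counting how many study intervals cover a given date. Everything in this example is hypothetical: the intervals are invented, and treating each study as running from its registration date to its publication date is an assumption made here, not a description of the method used by Mol’s team.

```python
from datetime import date

# Hypothetical (start, end) intervals, one per study. These are invented
# for illustration and are NOT taken from the flagged papers.
STUDIES = [
    (date(2016, 1, 1), date(2017, 12, 31)),
    (date(2017, 3, 1), date(2017, 6, 30)),
    (date(2017, 6, 1), date(2018, 1, 31)),
    (date(2015, 1, 1), date(2016, 12, 31)),
]


def studies_active_on(studies, day):
    """Count studies whose interval (inclusive at both ends) covers `day`."""
    return sum(1 for start, end in studies if start <= day <= end)


def peak_concurrency(studies):
    """Largest number of studies running at the same time (sweep over dates)."""
    events = []
    for start, end in studies:
        events.append((start, 1))   # a study begins
        events.append((end, -1))    # a study ends
    # On the same day, process starts before ends so that a study ending on
    # the day another begins is counted as overlapping with it.
    events.sort(key=lambda e: (e[0], -e[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


if __name__ == "__main__":
    print(studies_active_on(STUDIES, date(2017, 5, 15)))
    print(peak_concurrency(STUDIES))
```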

    Concern over numbers

    A common issue identified by Mol and his colleagues was statistical oddities. One paper they flagged, published in the journal Proceedings in Obstetrics and Gynecology3, evaluated the effect of the medication esomeprazole in women with the pregnancy complication preeclampsia. The sleuths noted that 31 of the 32 values in tables 2 and 3, including means and standard deviations, end in an even digit (see ‘Even numbers abound’). In scientific data, the final digits of such measurements and statistical results tend to be more equally distributed between odd and even numbers, so the chance of having so many values ending in even numbers would be low. The numbers are a “concern”, according to the paper by Mol and his team.
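    To give a sense of how low that chance is, here is a minimal sketch under two simplifying assumptions that are not taken from the paper by Mol and his team: each final digit is equally likely to be odd or even, and the 32 values are independent of one another. Real table entries can be correlated, so the result is a rough order of magnitude, not a formal test.

```python
from math import comb


def prob_at_least_k_even(n: int, k: int, p_even: float = 0.5) -> float:
    """Binomial probability that at least k of n final digits are even.

    Assumptions (for illustration only): digits are independent and each
    is even with probability p_even.
    """
    return sum(
        comb(n, i) * p_even**i * (1 - p_even) ** (n - i)
        for i in range(k, n + 1)
    )


if __name__ == "__main__":
    # 31 or more even final digits among 32 values.
    p = prob_at_least_k_even(32, 31)
    print(f"{p:.2e}")  # 33 / 2**32, about 7.68e-09
```

    Under these assumptions, the probability is 33 in 2^32, or roughly 1 in 130 million.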

    Even numbers abound: Two tables published in a paper which include a high proportion of values that end in even numbers.

    Source: Ref. 1

    The tables also feature numerous pairs of numbers that have identical digits after the decimal point — for example, 0.76. Some of the repetitious values are in the same table; some are split across the tables. This, too, is concerning, says the paper by Mol and his team.

    These unusual numbers should compel the authors to present their raw data, says Nicholas Brown, a psychologist and research-integrity specialist at Linnaeus University.

    The editor-in-chief of Proceedings in Obstetrics and Gynecology, Donna Santillan, said in a statement that all inquiries about research or publication misconduct are investigated by the journal. Santillan, a reproductive sciences researcher at the University of Iowa in Iowa City, declined to comment on whether this study is currently being investigated, citing privacy concerns.

    Continuing investigation

    Other studies flagged by Mol’s team describe apparently improbable results. In a 2020 survey4 in The European Journal of Contraception & Reproductive Health Care that assessed the attitudes of obstetricians and gynaecologists in Egypt towards abortion, for example, the mean age of physicians surveyed was 42.6, and their mean number of years in practice was 26.4. For these numbers to be correct, the mean age at which these physicians started practicing would be 16.2. The same paper contains phrases that are identical to those of a study5 published in 2009 by different authors (see ‘Textual echo’).

    Textual echo: Excerpts from two different papers which appear to show similar wording.

    Source: Ref. 1

    The journal’s publisher, Taylor & Francis, says that it is currently investigating the paper, after concerns were raised in December 2023. Abbas did not respond to a request for comment about the investigation.

    Mol says that he is not accusing the authors of data fabrication and it’s possible that the discrepancies are a result of unintentional errors. “We’re just presenting the facts and then other people can draw a conclusion.”

    Clinical-trial checklist

    Some publications that specialize in women’s health told Nature that they are actively working to keep problematic research from being published. For example, a group of journal editors are fighting against data falsification in the field of obstetrics and gynaecology by sharing information about potentially flawed papers. The group also drew up a checklist of seven requirements that randomized controlled trials must meet to be published, such as ethics-committee approval. If a trial’s authors don’t fulfil these requirements, “we’re not going to publish it”, says Vincenzo Berghella, editor-in-chief of the American Journal of Obstetrics & Gynecology Maternal-Fetal Medicine and a maternal-fetal specialist at Thomas Jefferson University in Philadelphia, Pennsylvania.

    If problematic studies do end up in journals, investigating them post-publication can be a “painstakingly difficult” process, says Žarko Alfirević, a specialist in fetal and maternal medicine at the University of Liverpool, UK. “The burden of proof needs to be enormously high” for journals to admit that fraud has been committed, he says.

    To mitigate the damage of problematic studies in the medical literature, Alfirević, who is an editor at Cochrane, a group that reviews medical evidence, is pushing for the adoption of trustworthiness assessments of randomized controlled trials as a condition for authors to include them in systematic reviews.

    Downstream effect

    The risk of flawed papers affecting medical care is real, says Mol. One example is the 2017 study by Abbas and his colleagues on the use of progesterone to prevent miscarriage and the 2019 systematic review in which the study was included. That same review by Cochrane also incorporated a second study, authored by a different group, that was also subsequently retracted. Both papers contributed to the review’s conclusion that progesterone supplements might reduce the risk of miscarriage in women who have experienced recurrent miscarriages. The review has been cited in ten clinical guidelines.

    Now it’s clear that, despite what the retracted studies suggested, the supplements are not effective for all women who have experienced recurrent miscarriages6. The review’s corresponding author, David Haas, an obstetrician and gynaecologist at Indiana University in Indianapolis, says that it is “highly likely” that the two retractions will change the review’s conclusion. He and his colleagues are now working to publish an updated version of the review in which the retracted studies have been removed. A notice on the current online version of the review says that the review authors have been advised that the study by Abbas and his colleagues is the subject of investigation and the review team has moved the study from ‘included studies’ to ‘studies awaiting classification’.

    Another review that included a paper authored by Abbas and his colleagues is also being updated. The meta-analysis7, published in 2023, analysed papers on a strategy combining progesterone and a procedure on the cervix to prevent pre-term birth and concluded that the combination could be successful. Among the papers analysed was a study8 by Abbas and his co-authors, which was published in the International Journal of Gynecology & Obstetrics in 2020.

    The journal retracted the paper in late 2023, noting that “inconsistencies were found within the dataset … which call into question the validity of the data.” The authors of the meta-analysis say they are aware that Abbas’s paper has been retracted and they are about to submit an amended version that excludes the retracted work. “Fortunately, removing this paper from our meta-analysis has not influenced the primary outcome,” says corresponding author Craig Pennell, an obstetrician and gynaecologist at the University of Newcastle in Australia.

  • why it’s time to banish bad-mannered reviews

    A black and white photograph of Cathy Foley at her Linfield Office in 1993

    Even Cathy Foley, Australia’s chief scientist, has encountered unhelpful peer-review comments on her work. Credit: Fairfax Media Archives/Getty

    Electromagnetics researcher Akhlesh Lakhtakia is head of a leading US department of engineering science at Pennsylvania State University in University Park, the author of more than 840 journal articles and a fellow of 9 learned societies. But in 1988, when he was an assistant professor, a peer reviewer said of a paper he had submitted for publication: “This is rubbish. Obviously the author or authors had no EM [electromagnetics] training nor physical intuition.”

    Peer review, which has for centuries been the standard tool to determine an academic paper’s suitability for publication, is known to be flawed1. Now, one of its major weaknesses, sheer bad manners on the part of the reviewer, has been highlighted in a YouTube video from IOP Publishing (IOPP), headquartered in Bristol, UK — a society-owned publisher of more than 90 journals.

    Released to mark Peer Review Week, which this year runs from 23 to 27 September, the one-minute film features four scientists who hold placards showing the rude, inappropriate or irrelevant reviews that they received when they were early-career researchers. At the same time, an overlay lists their stellar achievements since. Activities taking place during this year’s Peer Review Week, involving more than 35 organizations around the world, will focus on innovation and technology, including artificial intelligence (AI) and how it can be used to automate administrative tasks in the peer-review process.

    A study published in 2019 revealed that six in ten researchers in the international science, technology, engineering and mathematics (STEM) community have received at least one unprofessional review — and of those, seven in ten have received several2.

    Such reviews don’t always cause long-term damage. Lakhtakia tells Nature that he felt “outraged for a few weeks” after he was refused the chance to submit a revised manuscript that would refute the reviewer’s criticisms. “Then,” he adds, “I wrote a monograph on the broader research topic that led to my elevation to the fellowship of a major learned society in 1992.”

    Physicist Cathy Foley, Australia’s chief scientist, who has published 112 refereed papers in international journals, recalls a reviewer’s unhelpful comments on a manuscript that she and a younger colleague had “put our heart and soul into drafting” in 2009, when she was a research-programme leader.

    “It was written in a very personal way that suggested our team was substandard and we are not worthy of being researchers,” she says. “It took a lot of discussion and coaching to help us see beyond the nasty comments and look for the research advice. Focusing on that enabled us to revise the paper and move on.”

    But not everyone shrugs off insulting remarks. Their impact on self-confidence, productivity and career trajectories can be significant, says Laura Feetham-Walker, IOPP’s reviewer-engagement manager, who led the video project. Here, she explains why mean-spirited peer-review comments should be challenged, and why the science community needs to discuss this commonplace humiliation of its younger members.

    When did you realize unprofessional reviewer comments were an issue?

    I first heard academics discussing this problem shortly after I joined IOPP in 2020 as its first reviewer-engagement manager, while I was running training workshops for early-career reviewers. Before that, I had worked at BMJ Group and The Lancet. I see my role as engaging with reviewers and supporting them in submitting excellent, constructive reports.

    A still from IOP Publishing’s video showing Akhlesh Lakhtakia holding a piece of paper displaying a rude peer review comment

    Akhlesh Lakhtakia, who leads a US department of engineering science, holds a piece of paper showing a rude reviewer comment that he received early in his career. Credit: IOP Publishing

    It became clear at the workshops how many researchers have been affected by this issue — including the senior reviewers, who spoke up about the rude comments they’d received early in their careers.

    Is there a fine line between useful criticism and rudeness?

    Not at all. The two are very different. You can have a critical, even very negative, review that is not at all problematic. In the 2019 study, which received feedback from 1,106 STEM professionals who had been first authors on manuscripts submitted to peer-reviewed journals, the definition was clear-cut: unprofessional peer review is that which is unethical, irrelevant, mean-spirited or cruel and lacking constructive criticism2.

    Should academics learn to be a bit more thick-skinned?

    Maybe. Lakhtakia advises junior researchers to “dispassionately evaluate criticism and then proceed accordingly”. But people from groups that are historically under-represented in STEM — women, non-binary people and those from ethnic minorities — are most likely to report that their confidence as a scientist has been undermined by rude reviews2. They are also the groups most likely to report long-term setbacks in their productivity and career advancement. It’s easy to see how that might happen: if you’ve already got imposter syndrome, and your confidence is low, a mean comment might really get to you. This matters. The STEM community can’t afford to allow unprofessional peer review to disempower effective researchers or lead to important work going unpublished.

    What can be done?

    It’s an editor’s job to sift out unprofessional comments in reviews. IOPP policy, in line with guidance from the Committee on Publication Ethics, a non-profit organization in Eastleigh, UK, is for the editor to ‘rescind’ problematic reviews and ask the reviewer to revise and resubmit them. If the reviewer declines, then the editor might make minor amendments to remove any problematic comments. But editors are under time pressure and deal with many peer-review reports every day, so some inappropriate comments slip through the net. We need to define unprofessional reviewing, to make it easier to track and to filter such comments out.

    We also need to acknowledge that most reviewers do a brilliant job in difficult circumstances and will welcome support to improve their skill and confidence as a reviewer. IOPP now offers a free peer-review training course, available to everyone. Those who complete it earn a certificate, and early-career reviewers can include it on their CV.

    But most of all, scientists need to talk more about rude reviews. That’s why we made the video.

    Do anonymous reviews encourage rudeness?

    Perhaps, yes. There is some evidence of an ‘online disinhibition effect’, a lack of restraint that people feel when they communicate over the Internet, so rudeness might have increased as peer review has moved online, although more research is needed in this area.

    What about open peer review? Is that helping to end rude reviews?

    It has had a major impact, undoubtedly. At IOPP, in February 2022, we introduced a version of open peer review (OPR), called transparent peer review (TPR), throughout our open-access journals. TPR shows the complete peer-review process, from initial review to final decision, with the reviewer reports published alongside accepted articles. It requires both authors and reviewers to opt in. Anecdotally, senior staff on TPR journals say they have never seen rude or unprofessional comments in TPR reviewer reports. But TPR would have to be mandatory for reviewers to completely eliminate rudeness.

    Is there a downside to TPR?

    No. But uptake has been modest. Only about half of our authors choose to make their reviewer reports visible. The number of STEM journals that use TPR or OPR is relatively low, so often it isn’t an option for authors and reviewers.

    In 2021, we introduced double-anonymous (DA) peer review. We were the first physics publisher to adopt this approach across our entire portfolio. Under DA, both authors and reviewers are anonymous. The main aim is to reduce bias in science publishing with respect to gender, race, country of origin and affiliation, the latter reflecting the little-acknowledged risk of ‘prestige bias’, or the favouring of work by scientists associated with elite institutions.

    We were also the first society publisher to combine DA and TPR throughout our open-access journals. The reviewer and author are anonymous throughout the review process, but the reviewers’ names and full reports are published with the final research.

    We know DA peer review works. Women and non-binary people who choose to submit through DA are 11% more likely to have their papers accepted. In particular, submitting authors in Australasia (9.8%) and Africa (7.7%) are more likely to have their manuscript accepted if it is anonymized, and only authors in western Europe benefit from their papers not being anonymized. We also know, again anecdotally, that DA reduces the frequency of rude comments.

    Why do you think that is?

    Remember that reviewers are always ‘seen’ by their editors. Our internal editors rate every review on a scale of 1 to 5. Those who get a 5 achieve IOP trusted reviewer status. A review containing unprofessional comments will almost always get a score of 1, or occasionally 2 if the comments are borderline unprofessional.

    Last November, we introduced reviewer feedback, which allows reviewers to request their review’s score direct from the editorial team. More than 23,000 reviewers have already opted to receive this feedback. We also offer an overview of how to critique a scientific manuscript, which includes examples of reports, for papers from eight fields of physics, that were rated 1, 3 or 5.

    Reviewers’ comments can also be seen by other reviewers during co-reviewing, whereby a reviewer formally invites a colleague to collaborate with them. More than eight in ten reviewers who have participated in co-review have told us that they find it useful or very useful.

    The signs all indicate that these mechanisms work together to improve reviewers’ understanding of how to do good peer review — and that will benefit science at large.

  • Can AI be used to assess research quality?

    Illustration of an ethereal humanoid figure sitting at a table writing, with a human figure in a white coat reflected in the table

    Illustration: Neil Webb

    Do squirrel surgeons generate more citation impact? The question seems ludicrous, or perhaps the start of a bad joke. But the question, posed by data scientist Mike Thelwall, was not a joke. It was a test. Thelwall, who works at the University of Sheffield, UK, had been assessing the ability of large language models (LLMs) to evaluate academic papers against the criteria of the Research Excellence Framework (REF), the United Kingdom’s national audit of research quality. After giving a custom version of ChatGPT the REF’s criteria, he fed 51 of his own research works into the model and was surprised by the chatbot’s capability to produce plausible reports. “There’s nothing in the reports themselves to say that it’s not written by a human expert,” he says. “That’s an astonishing achievement.”

    However, the squirrel paper really threw the model. Thelwall had created the paper by taking one of his own rejected manuscripts, on whether male surgeons generate more citation impact than female surgeons, and making it nonsensical: throughout the paper, he replaced ‘male’ with ‘squirrel’, ‘female’ with ‘human’ and any reference to gender with ‘species’. During evaluation, his ChatGPT model could not determine that ‘squirrel surgeons’ were not a real thing, and the chatbot scored the paper highly.
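    A word swap of this kind is easy to get wrong if done naively, because ‘female’ contains ‘male’. The sketch below shows one way such a substitution could be scripted with whole-word matching. It is an illustration only: how Thelwall actually edited his manuscript is not described, and the function name, the swap table and the sample sentence are all made up for this example.

```python
import re

# Swap table taken from the substitutions described in the article.
SWAPS = {"male": "squirrel", "female": "human", "gender": "species"}

# \b anchors ensure whole-word matches, so the 'male' inside 'female'
# is never rewritten on its own. Plurals are not handled in this sketch.
PATTERN = re.compile(r"\b(" + "|".join(SWAPS) + r")\b", flags=re.IGNORECASE)


def swap_terms(text: str) -> str:
    """Replace each whole-word term, preserving an initial capital letter."""

    def repl(match: re.Match) -> str:
        word = match.group(0)
        new = SWAPS[word.lower()]
        return new.capitalize() if word[0].isupper() else new

    return PATTERN.sub(repl, text)


if __name__ == "__main__":
    sample = "Male surgeons and female surgeons differ by gender."
    print(swap_terms(sample))
    # A naive chained replace corrupts 'female':
    print("female".replace("male", "squirrel"))
```

    The whole-word pattern is the design choice that matters here: a plain chained `str.replace` would turn ‘female’ into ‘fesquirrel’ before the second substitution could apply.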

    Thelwall also found that the model was not particularly successful at applying a score based on REF guidelines to the 51 papers that were assessed. He concluded that although the model could produce authentic-sounding reports, it wasn’t capable of evaluating quality.

    The rapid rise of generative artificial intelligence (AI) such as ChatGPT and image generators such as DALL-E has led to increasing discussion about where AI might fit into research evaluation. Thelwall’s study1, published in May, is just one piece of a puzzle that academics, research institutions and funders are trying to piece together. It comes as researchers also grapple with the many other ways that AI is affecting science and the developing guidelines that are springing up around its use. These discussions, however, have rarely focused on providing a steer on how AI might be used in assessing research quality. “That is the next frontier,” says Gitanjali Yadav, a structural biologist at India’s National Institute of Plant Genome Research in New Delhi, and member of the AI working group at the Coalition for Advancing Research Assessment, a global initiative to improve research assessment practice.

    Notably, the AI boom also coincides with growing calls to rethink how research outputs are evaluated. Over the past decade, there have been calls to move away from publication-based metrics such as journal impact factors and citation counts, which have been shown to be prone to manipulation and bias. Integrating AI into this process at such a time provides an opportunity to incorporate it in new mechanisms for understanding, and measuring, the quality and impact of research. But it also raises important questions about whether AI can fully aid research evaluation, or whether it has the potential to exacerbate issues and even create further problems.

    Quality assessments

    Research quality is difficult to define, although there is a general consensus that good quality research is underpinned by honesty, rigour, originality and impact. There’s a wide variety of mechanisms, each operating at different levels of the research ecosystem, to assess these traits, and myriad ways to do so. The bulk of research-quality assessment happens in the peer-review process, which is, in many cases, the first external quality review performed on a new piece of science. Many journals have been using a suite of AI tools to supplement this process for some time. There’s AI to match manuscripts with suitable reviewers, algorithms that detect plagiarism and check for statistical flaws, and other tools aimed at strengthening integrity by catching data manipulation.

    More recently, the rise of generative AI has seen a rush of research aimed at exploring how well an LLM might be able to aid peer review — and whether scientists would trust those tools to do so. Some publishers allow AI to assist in manuscript preparation, if adequately disclosed, but do not allow its use in peer review. Even so, there’s a growing belief among academics in the ability of these tools, particularly those based on natural language processing and LLMs.

    Five proportion bars showing the responses to a survey of researchers who used an AI tool to generate feedback on research manuscripts.

    Source: Ref. 2

    A study published in July this year2, led by computer-science PhD student Weixin Liang in the lab of biomedical data scientist James Zou at Stanford University in California, assessed the capability of one LLM, GPT-4, to provide feedback on manuscripts. The study asked researchers to upload a manuscript and have it assessed by the study’s AI model. Researchers then completed a survey evaluating the feedback and how it compared with that of human reviewers. It received 308 responses, with more than half describing the AI-generated reviews as “helpful” or “very helpful”. But the study did highlight some problems with that feedback: it was sometimes generic and struggled to provide in-depth critiques.

    Zou thinks this doesn’t necessarily preclude the use of such tools in certain situations. One particular example he mentions is early-career researchers working on the first draft of a paper. They could upload a draft to a bespoke LLM and receive commentary about deficiencies or errors in their draft. But given the laborious and somewhat repetitive nature of peer review, some academics worry that there could be a tendency to lean on the outputs from a generative AI system capable of delivering reports. “There’s no kind of glory or funding associated with peer review. It’s just seen as a scientific duty,” says Elizabeth Gadd, head of research culture and assessment at Loughborough University, UK. There is already evidence that peer reviewers are using ChatGPT and other chatbots to some extent, despite the rules put in place by some journal publishers.

    Thelwall believes there’s more that AI could do in helping peer reviewers to evaluate research quality, but there is reason to move slowly. “We just need lots of testing,” he says. “And not just technical testing, but also pragmatic testing, where we gain confidence that if we provide the AI to the reviewers, for example, that they won’t abuse it.”

    Yadav sees great benefit in AI as a time-saving tool and has been working with it to help rapidly assess wildlife imagery from field-based cameras in India, but she sees peer review as too important to the scientific community to hand over to the bots. “I’m personally absolutely against peer review being done by AI,” she says.

    Quality savings

    One of the most discussed benefits of using AI is the idea that it could free up time. This is particularly apparent in institutional and national systems of evaluating research, some of which have incorporated AI. For instance, one funder in Australia, the National Health and Medical Research Council (NHMRC), already uses AI through “a hybrid model combining machine learning and mathematical optimisation techniques” to identify suitable human peer reviewers to judge grant proposals. The system helps to remove one of the administrative bottlenecks in the evaluation process, but that is where the AI use ends. An NHMRC spokesperson says the agency “does not use artificial intelligence, in any form, to directly assist with research quality evaluation” itself.

    Even using AI for such administrative support could be a major resource saving, however, especially for large national assessments such as the REF. Thelwall says the exercise is known for its incredible drain on researchers’ time. More than 1,000 academics help to assess research quality in the REF and it takes them about half a year to get it done.

    “If we can automate evaluations”, says Thelwall, then “it would be a massive productivity boost”. And there’s potential for huge savings: the most recent REF, in 2021, was estimated to have cost around £471 million (US$618 million).

    Similarly, New Zealand’s assessment of researchers, the Performance Based Research Fund, has previously been described by Tim Fowler, chief executive of the government’s Tertiary Education Commission, as a “backbreaking” exercise. In it, academics submit portfolios for assessment, placing an extreme burden on them and institutions. In April, the government scrapped it and a working group has been charged with delivering a new plan by February 2025.

    These examples suggest AI’s major potential to create more efficiency, at least for large, bureaucratic, assessment systems and processes. At the same time, the technology is developing as perspectives on what constitutes research quality are evolving and becoming more nuanced. “How you might have defined research quality in the early twentieth century is not how you define it now,” says Marnie Hughes-Warrington, deputy vice-chancellor of research and enterprise at the University of South Australia in Adelaide. Hughes-Warrington is a member of the Excellence in Research Australia transition group, which is considering the future of the country’s assessment exercise after a review in 2021 found that it placed a significant burden on universities. She says the research community is increasingly recognizing the need to assess more “non-traditional research outputs” — such as policy documents, creative works, exhibitions — and then beyond to social and economic impacts.

    As the conversations are happening alongside the AI boom, it makes sense that new tools could fit into revised methods of research-quality evaluation. For instance, Hughes-Warrington points to how AI is already being used to detect image manipulation in journals or to synthesize data from systems used to uniquely identify researchers and documents. Applying these kinds of methods would be consistent with the mission of institutions such as universities and national bodies. “Why wouldn’t organizations, driven by curiosity and research, implement new ways of doing things?” she says.

    However, Hughes-Warrington also highlights where incorporating AI will meet resistance. There are privacy, copyright and data-security concerns to acknowledge, inherent biases in the tools to overcome and a need to consider the context in which research assessments take place, such as how impacts will differ across disciplines, institutions and countries.

    Gadd isn’t against incorporating AI and says she is noticing it appear more often in discussions around research quality. But she warns that researchers are already one of the most assessed professions in the world. “My own general view on this is that we assess too much,” she says. “Are we looking at using AI to solve a problem that’s of our own making?”

    Having seen how bibliometrics-based assessments can damage the sector, with metrics such as journal impact factors misused as a substitute for quality and shown to hinder early-career researchers and diversity, Gadd is concerned about how AI might be implemented, especially if models are trained on these same metrics. She also says decisions involving allocation of promotions, funding or other rewards will always need human involvement to a far greater extent. “You have to be very cautious”, she says, about shifting to technology “to make decisions which are going to affect lives”.

    Gadd has worked extensively in developing SCOPE, a framework for responsible research evaluation by the International Network of Research Management Societies, a global organization that brings research management societies together to coordinate activities and share knowledge in the field. She says one of the key principles of the scheme is to “evaluate only where necessary” and, in that perhaps, there is a lesson for how we should think about incorporating AI. “If we evaluated less, we could do it to a higher standard,” she says. “Maybe” AI can support that process, but a “lot of the arguments and worries we’re having about AI, we had about bibliometrics.”


  • Rise of ChatGPT and other tools raises major questions for research


    It has been less than two years since Nature Index last looked at research data on artificial intelligence (AI), but it is a demonstration of the breathtaking speed of the field’s growth that it is now firmly rooted in the public’s consciousness as the technological revolution of our time. The launch of ChatGPT in November 2022 was a watershed moment, immediately raising questions about how large language models (LLMs) would transform society, especially the world of work.

    Research is just one area scrambling to understand the potential impact of AI technologies. In this supplement, we investigate some of the pressing issues it faces, including how AI might be used to evaluate studies and researchers, many of whom worry it will just increase the already heavy burden of assessment. There are also major questions about academia’s role as AI takes hold, especially given that current progress is largely driven by powerful companies with a commercial interest in keeping their research and data secret. Big tech’s grip on AI is vexing governments, too, as shown by the impact that lobbyists are having on emerging consumer regulation.

    In Nature Index journals, corporate research output is growing — in the United States, the leading country for AI research, it more than doubled from a Share of 51.8 in 2019 to 106.5 in 2021. But it still continues to represent a tiny proportion of total AI Share — just 3.8% in the United States last year — suggesting companies are either publishing the bulk of their research elsewhere or are keeping it under wraps. It is also concerning that the global south, where AI could help accelerate development, is under-represented; South Africa is the only African country in the top 40 nations for AI output, for instance. Although Nature Index journals represent a fraction of AI research, finding ways to redress these imbalances is essential to ensure that this revolution benefits everyone.


  • a researcher’s quest to keep his own work from being plagiarized


    Close-up of hands typing on a laptop keyboard

    Bioinformatician Sam Payne stumbled on a manuscript in March that included figures that, he says, looked identical to those in a paper he published in 2021. Credit: Getty

    When bioinformatician Sam Payne was asked to review a manuscript on a topic relevant to his own work, he agreed — not anticipating just how relevant it would be.

    The manuscript, which was sent to Payne in March, was about a study on the effect of cell sample sizes for protein analysis. “I immediately recognized it,” says Payne, who is at Brigham Young University in Provo, Utah. The text, he says, was similar to that of a paper1 he’d authored three years earlier, but the most striking feature was the plots: several were identical down to the last data point. He fired off an e-mail to the journal, BioSystems, which promptly rejected the manuscript.

    In July, Payne discovered that the manuscript had been published2 in the journal Proteomics, and he alerted the editors. On 15 August, the journal retracted the paper. An accompanying statement cited “major unattributed overlap between the figures” in it and Payne’s work. In response to questions from Nature, a spokesperson for Wiley, which publishes Proteomics, said, “This paper was simultaneously submitted to multiple journals and included plagiarized images.”

    The retraction statement also stated that four of the authors said they “did not participate in the writing and submission of the article and gave no consent for publication”, and that the fifth author did not respond. However, Nature’s news team found links between several of the authors and International Publisher, a paper mill based in Moscow. Neither the authors nor International Publisher responded to Nature’s requests for comment.

    The alleged plagiarism of Payne’s paper highlights systemic vulnerabilities in the global research community, says Lisa Rasmussen, editor-in-chief of the journal Accountability in Research. According to one analysis, roughly 70,000 papers with characteristics common to work produced by paper mills were published in 2022 alone.

    Despite the scale of the problem, there is no Interpol equivalent for journals, nor an official authority to provide industry-wide alerts about suspicious manuscripts. “It was just a complete lucky break that the person asked to review it was the author,” Rasmussen says. “Obviously our system should not depend on that kind of serendipity.”

    Carbon copy

    Although some figures in the BioSystems manuscript were direct copies of those in Payne’s paper, others were simply replotted using his data, which are publicly available, he says. He shared the disconcerting experience on X, formerly known as Twitter. “Well, it happened,” he wrote, explaining that the manuscript he was reviewing included “a direct copy of the figures” in one of his own papers.

    A very close match: Comparison of Fig. 1a from Boekwig et al. 2021 and Fig. 3a from Popova et al. 2024.

    Source: Ref. 1 and Ref. 2

    When, months later, he discovered the Proteomics paper, he posted a follow-up. “Well. It REALLY happened” — the paper that he had been asked to review had been published. Two weeks later, Proteomics retracted the paper, citing plagiarism of images.

    Unlike the figures, the main text of the Proteomics paper is similar to that of Payne’s, but not identical. For example, Payne and his colleagues wrote:

    “From the large population of 10,000 cells, we subsampled a given number of cells n_sample ∈ [7, 16, 20, 30, 100] and calculated S/Vest.”

    The corresponding paragraph of the Proteomics paper features the same numbers and many of the same words:

    “The authors calculated S/Vest using sample n = [7, 16, 20, 30, 100] cells from a population of 10,000 cells.”

    The use of the third person caught Payne’s eye. He says such oddities led him to think his paper had been paraphrased using artificial intelligence (AI) to create believable but different text.

    Paper pushing

    In the course of reporting, Nature found links between authors of the Proteomics paper and a paper mill. Two authors, Dmitrii Babaskin and Tatyana Degtyarevskaya, both at the I.M. Sechenov First Moscow State Medical University, had separate articles3,4 retracted from the International Journal of Emerging Technologies in Learning. Both retraction statements, issued in July 2022, use the same language: “The work could be linked to a criminal paper mill selling authorships and articles for publication.”

    As evidence, the statements cited the work of Brian Perron — who studies social work at the University of Michigan in Ann Arbor and also works as a misconduct sleuth — and his colleagues, who had found links between both of the retracted papers and International Publisher. Neither Babaskin nor Degtyarevskaya responded to Nature’s requests for comment about the retractions.

    International Publisher’s website advertises a selection of more than 10,000 manuscripts, on topics as diverse as the metallurgy of aluminium-alloy welding and the biological features of quails. Prospective buyers can see a paper’s title, and sometimes its abstract, as well as the expected ranking in the citation database Scopus of the journal of publication. They then select an author slot, with costs ranging from about US$500 to $3,000. The company promises that titles and abstracts shown online will be “completely changed” for publication. “No one will ever be able to find the manuscript anywhere,” the website declares.

    Nevertheless, in 2021, Perron and his colleagues reported on the scientific-fraud watchdog website Retraction Watch that they had identified nearly 200 published papers that probably originated from International Publisher. A number of the published titles “were almost word-for-word” the same as those listed for sale, Perron says. Many of the papers listed in the Retraction Watch report were later retracted. Asked for comment on allegations that it is a paper mill, International Publisher did not respond.

    Clearing the catalogue

    International Publisher removes paper listings from its online catalogue after papers are purchased. To get around this, Nature examined a database of past International Publisher paper listings, created by Perron, and combed through screenshots of the paper mill’s website taken by the non-profit organization Internet Archive, based in San Francisco, California. The search showed that the titles of multiple articles published by four of the five authors of the Proteomics study matched the titles of papers previously listed for sale by International Publisher.

    These paper listings do not include the full article text, but strong circumstantial evidence connects the paper mill’s listings to published studies. For example, a screenshot of the paper mill’s website taken in September 2021 shows that among the articles for sale was #1584, “The structure of forest vegetation on industrial dumps of different ages.” Degtyarevskaya was an author of a paper published in Ecology and Evolution5 in July 2023 with a nearly identical title and matching abstract. In response to an enquiry from the news team, Ecology and Evolution said that it is now investigating the matter.
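
    The kind of title comparison described above can be sketched in a few lines of Python. This is only an illustration, not the method used by Nature or by Perron: the function names, the 0.9 threshold and the second listing title are assumptions, and the "published" title is a hypothetical variant of the listing quoted in the article. The similarity score comes from the standard library's difflib.SequenceMatcher.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase a title and collapse punctuation and extra whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def title_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_matches(published, listings, threshold=0.9):
    """Pair each published title with any listing at or above the threshold.

    The 0.9 threshold is an assumed value chosen for illustration.
    """
    matches = []
    for pub in published:
        for listing in listings:
            score = title_similarity(pub, listing)
            if score >= threshold:
                matches.append((pub, listing, score))
    return matches


# First listing title is quoted in the article; everything else is hypothetical.
listings = [
    "The structure of forest vegetation on industrial dumps of different ages.",
    "Biological features of quails",  # hypothetical listing title
]
published = ["Structure of forest vegetation on industrial dumps of different ages"]

for pub, listing, score in find_matches(published, listings):
    print(f"{score:.2f}  {pub!r} ~ {listing!r}")
```

    A high score flags a pair for human inspection only. As the article notes, a matching title is circumstantial evidence, and it would still need to be weighed alongside abstracts, authorship and submission history.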

    Although Nature’s news team was unable to locate a sales listing on International Publisher’s website for the Proteomics paper, Perron says that the paper has several hallmarks of paper-mill articles. Nature could not find any other studies published by the authors on the paper’s subject matter, protein analysis. Moreover, the manuscript was submitted to BioSystems while it was still under review at Proteomics. Perron says that submitting a manuscript to more than one journal simultaneously is a classic tactic of researchers trying to publish paper-mill products.

    A spokesperson for Wiley did not specify whether the allegedly plagiarized Proteomics paper came from a paper mill, but said: “Our investigation confirmed that systematic manipulation of the publication process was at play.”

    Check and check again

    In recent years, some publishers and journals have taken extra countermeasures against plagiarism and paper mills. One such effort, developed by the International Association of Scientific, Technical and Medical Publishers (STM), a trade organization in The Hague, the Netherlands, is the STM Integrity Hub, a resource for scientific publishers that includes a ‘paper mill checker tool’ and ‘duplicate submission checker tool’. The latter is in use at more than 150 journals and scans more than 20,000 papers each month. More than 1% are identified as duplicates.

    There are no metrics for how often researchers spot plagiarism of their own work, but several researchers responded to Payne’s social-media posts by sharing that they had found themselves in a similar situation.

    For Payne, the prospect of paper mills taking advantage of AI is a daunting one. “This, I think, is a pretty good con,” he says. “I think it’s going to happen more.”
