The AI Co-Author Science Cannot Ignore
Science has always depended on instruments that extend the limits of human perception. The telescope made distant planets visible. The microscope opened the cellular world. Statistical software turned oceans of measurements into patterns that no mind could hold alone. Artificial intelligence now enters the laboratory and the manuscript draft not as a single new instrument, but as a system that touches nearly every stage of research: reading, coding, designing experiments, analyzing data, writing papers, reviewing submissions and communicating results. The result is not a simple story of cheating machines replacing scholars. It is a deeper shift in how scientific knowledge is produced, filtered and trusted.
AI Is Already Inside the Scientific Workflow
The most visible use of AI in science is text generation. Researchers use chatbots to polish English, rewrite abstracts, summarize literature, draft introductions, produce cover letters, explain code and respond to reviewers. In some fields, especially computer science and biomedicine, the practice has moved from novelty to routine.
Yet writing is only the surface. AI systems are increasingly used to screen papers, extract findings from large literature sets, generate hypotheses, predict protein structures, design molecules, write analysis scripts and assist with statistical interpretation. In laboratories, machine-learning models can help select promising experiments before expensive physical testing begins. In hospitals, AI can flag patterns in imaging or patient data. In climate science, it can accelerate simulations. In materials science, it can search chemical spaces that would be impossible to explore manually.
This makes the current debate difficult. When people ask whether AI is “writing science,” they often mean whether chatbots are producing the prose of academic papers. But science is more than prose. A paper is the final interface of a much larger process. AI can shape the question, the method, the analysis and the narrative. Sometimes it is a harmless assistant. Sometimes it becomes an invisible intellectual partner. Sometimes it is a shortcut that disguises weak work as polished scholarship.
How Much Scientific Text Is Generated by AI?
No one knows the exact share of scientific literature generated by AI. The number depends on the field, the year, the definition of “generated,” and the detection method. A paper lightly edited by ChatGPT is different from a paper whose entire introduction was drafted by a model. A translated manuscript is different from fabricated research generated around public data. Detection tools also make mistakes, especially with non-native English writing, formulaic academic language and heavily edited AI output.
Still, the best evidence suggests that AI-assisted writing has become substantial. Stanford researchers examining large sets of papers and peer reviews reported that about 17.5% of computer-science papers and 16.9% of peer-review text had at least some content drafted or modified by large language models. Their estimate was based not on a simple detector, but on changes in word usage after the release of ChatGPT, particularly the sudden rise of words that LLMs tend to favor.
Another analysis of biomedical abstracts estimated that at least 13.5% of 2024 abstracts were processed with LLMs. The key phrase is “at least.” These studies generally measure detectable traces, not total usage. If an author uses AI for planning, code, translation, or lightly edited prose, the signal may disappear. If a model’s output is carefully revised by a human, detection becomes even harder.
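The word-usage approach behind these estimates can be illustrated with a toy calculation. The sketch below is not the researchers' actual model. It only shows the basic idea: compare how often a marker word appears in documents before and after a cutoff date, and treat the excess as a floor on the share of affected documents. The function names, the marker words and the eight sample sentences are all invented for illustration.

```python
import re
from typing import Dict, List


def doc_frequency(docs: List[str], word: str) -> float:
    """Share of documents containing `word` as a whole word (case-insensitive)."""
    if not docs:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = sum(1 for doc in docs if pattern.search(doc))
    return hits / len(docs)


def excess_usage(before: List[str], after: List[str],
                 marker_words: List[str]) -> Dict[str, float]:
    """After-period frequency minus before-period frequency, per marker word."""
    return {
        word: doc_frequency(after, word) - doc_frequency(before, word)
        for word in marker_words
    }


def lower_bound_share(before: List[str], after: List[str],
                      marker_words: List[str]) -> float:
    """Largest single-word excess, floored at zero.

    It is a floor because documents whose AI use left no marker word
    behind are not counted at all.
    """
    gaps = excess_usage(before, after, marker_words)
    return max([0.0] + list(gaps.values()))


# Hypothetical toy corpora; real analyses use very large sets of abstracts.
BEFORE = [
    "We measure growth rates in yeast cultures.",
    "We delve into the limits of the assay.",
    "Results indicate a modest effect.",
    "Samples were collected over two years.",
]
AFTER = [
    "We delve into growth rates in yeast cultures.",
    "This study will delve into the limits of the assay.",
    "Results indicate a modest effect.",
    "Samples were collected over two years.",
]
MARKERS = ["delve", "intricate"]  # illustrative choices, not a validated list

print(excess_usage(BEFORE, AFTER, MARKERS))
print(lower_bound_share(BEFORE, AFTER, MARKERS))
```

A calculation like this inherits the weakness described above. Vocabulary shifts for reasons unrelated to chatbots, and carefully revised AI text leaves no trace, so the output is best read as a rough floor rather than a measurement.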
A separate analysis of open-access medical articles found that the monthly share of papers classified as containing AI-generated text rose from 0% in January 2022 to about 11% in March 2025. The same study found that disclosure remained rare: among articles flagged as AI-generated, only a small fraction openly acknowledged AI use.
The picture is therefore clear enough even if the exact percentage is not. AI text in scientific writing is no longer marginal. It is especially common in fast-moving, publication-heavy fields. It is also underreported.
Why Scientists Use Chatbots
The reasons are practical. Scientific publishing rewards speed, volume and fluency. Researchers compete for grants, jobs, promotions and citations. Many are writing in English as a second or third language. Many must produce papers while also teaching, reviewing, managing students and running experiments. A chatbot that can turn rough notes into a polished paragraph in seconds is tempting because it solves a real bottleneck.
For non-native English speakers, AI can be an equalizer. It can reduce the penalty imposed by a publishing system dominated by English. A researcher with strong data and imperfect prose can use a chatbot to make the work clearer. In that case, AI may make science fairer, not weaker.
AI can also help with routine tasks that do not deserve a scientist’s most creative hours. It can draft a plain-language summary, convert a dense paragraph into a clearer one, suggest alternative titles, check consistency in terminology, or produce first-pass code for data cleaning. Used carefully, it is closer to a calculator, spellchecker, statistical package or reference manager than to a ghostwriter.
But the same convenience creates a trap. When AI makes writing effortless, it can encourage writing without thinking. It can produce confident explanations for results the author barely understands. It can fill gaps in reasoning with elegant filler. It can make an ordinary study look more complete than it is. The danger is not only fake science. It is frictionless mediocrity.
Is AI-Generated Scientific Text a Problem?
AI-generated text is not automatically a problem. The problem is undisclosed, unverified or intellectually empty AI use.
A scientist who uses a chatbot to improve grammar, then checks every sentence and remains fully responsible for the claims, is not undermining science. A team that uses AI to summarize thousands of papers before manually verifying the relevant ones may be improving efficiency. A reviewer who uses AI to organize their own notes, without uploading confidential manuscripts or outsourcing judgment, may not be violating the spirit of peer review.
The serious problems begin when AI replaces expertise rather than supporting it. Large language models are designed to generate plausible text, not to guarantee truth. They can hallucinate references, invent mechanisms, misread statistical claims and flatten uncertainty. They often produce prose that sounds balanced while concealing weak logic. In science, style can become camouflage.
There is also a scale problem. Before generative AI, producing a bad paper still required time. Now a low-quality manuscript can be generated quickly from public datasets, templated methods and superficial analysis. Some submissions may not be outright fraudulent, but they add little value and consume editorial attention. Others may be worse: paper-mill products, fabricated literature reviews, invented citations, manipulated images or statistical claims nobody has properly checked.
Peer review was already strained before the chatbot era. AI increases the pressure by raising the volume of plausible-looking submissions. It also threatens the review process itself. If reviewers use chatbots to generate reports without deeply reading the paper, the quality-control layer becomes thinner. A scientific system in which AI drafts the paper and AI drafts the review is not necessarily efficient. It may simply be a machine for producing academic noise.
Trustworthiness Depends on Accountability
The central question is not whether a paper used AI. The central question is whether accountable humans can defend every claim, method, data point and interpretation.
Trustworthy scientific work has several features. The methods are clear. The data are available when possible. The analysis can be reproduced. The limitations are stated honestly. The citations exist and are relevant. The conclusions do not outrun the evidence. AI does not remove the need for any of these standards. If anything, it makes them more important.
A paper can be written in beautiful human prose and still be false. A paper can be edited by AI and still be rigorous. The trust problem arises when AI makes it easier to hide weakness. The reader sees polished language and assumes careful thought. The editor sees a familiar structure and assumes scholarly competence. The reviewer sees fluent explanations and may miss that the references are irrelevant, the statistical model is unsuitable, or the claimed novelty is overstated.
This is why disclosure matters, but disclosure alone is not enough. A sentence saying that ChatGPT helped edit the manuscript does not prove that the work is sound. It only tells us something about the writing process. Journals need policies, but they also need better checks for data integrity, image manipulation, citation accuracy and reproducibility. Universities need to train researchers not only in how to use AI, but in when not to trust it.
The old academic honor system was built around the assumption that writing a paper required sustained engagement with the work. That assumption is weaker now. The new system must focus less on detecting whether a machine touched the prose and more on proving that the science can survive scrutiny.
The Limits of AI Detection
Many institutions are tempted to solve the problem with AI detectors. That is understandable, but risky. Detection tools can provide signals, not verdicts. They are vulnerable to false positives and false negatives. They may misclassify non-native English writing as AI-generated because both can have simplified, formal or predictable phrasing. They can miss AI text that has been revised by a human. They can be gamed through paraphrasing.
This matters because a false accusation can damage a researcher’s career. It also matters because overreliance on detectors can create a false sense of security. A manuscript may pass an AI detector and still contain fabricated data. Another may be flagged for AI-like language while containing excellent, honest science.
The better approach is layered. Editors can use detectors as one input, but not as final evidence. They should combine them with citation checks, statistical review, image screening, data-availability requirements, author contribution statements and targeted questions to authors. A suspicious manuscript should be evaluated for substance: Are the methods coherent? Do the cited papers support the claims? Are the data real? Can the authors explain the analysis?
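One way to picture the layered approach is as a triage step that turns screening signals into questions for authors rather than into a verdict. The sketch below is a hypothetical illustration, not a description of any journal's system. The field names, the 0.8 threshold and the wording of the questions are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ManuscriptSignals:
    """Screening results for one submission (all fields are hypothetical)."""
    detector_score: float         # 0.0 to 1.0 from some AI-text detector
    unresolved_citations: int     # references that could not be matched to a real record
    has_data_availability: bool   # a data-availability statement is present
    image_flags: int              # issues raised by image screening
    stats_flags: int              # issues raised by statistical review


def questions_for_authors(signals: ManuscriptSignals,
                          detector_threshold: float = 0.8) -> List[str]:
    """Return follow-up questions; never an accept or reject decision."""
    questions: List[str] = []
    if signals.unresolved_citations > 0:
        questions.append("Please confirm the source of each unmatched reference.")
    if not signals.has_data_availability:
        questions.append("Please state where the underlying data can be accessed.")
    if signals.image_flags > 0:
        questions.append("Please supply the original, unprocessed image files.")
    if signals.stats_flags > 0:
        questions.append("Please explain the choice of statistical model.")
    # Design choice: a high detector score alone raises nothing. It only adds
    # a disclosure question when a substantive check has already raised one.
    if questions and signals.detector_score >= detector_threshold:
        questions.append("Please describe any use of generative AI in the manuscript.")
    return questions
```

Under these assumptions, a manuscript with a detector score of 0.99 and no other issues produces an empty list, while the same score combined with an unmatched reference produces two questions. That ordering reflects the argument above: substance is checked first, and prose style by itself triggers nothing.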
The scientific community should resist turning prose style into a moral test. The goal is not to punish researchers for using new tools. The goal is to protect the chain of responsibility.
Journal Policies Are Converging
Major publishers have moved toward a common position. AI tools generally cannot be listed as authors because authorship requires responsibility, consent, accountability and the ability to handle conflicts of interest. A chatbot cannot answer for misconduct. It cannot approve a final manuscript. It cannot retract a claim. It cannot be held accountable by an institution.
Publishers also increasingly require disclosure when generative AI is used in manuscript preparation, especially beyond basic grammar correction. Many prohibit AI-generated images unless explicitly allowed, because scientific images are evidence, not decoration. A generated figure can mislead readers if it appears to represent real observations. Peer reviewers are often warned not to upload confidential manuscripts into public AI systems, because doing so may violate confidentiality and data-protection rules.
These policies are evolving. Early reactions were sometimes blunt, including near-total bans. The direction now is more pragmatic: allow some AI assistance, require transparency, keep humans responsible and restrict uses that threaten confidentiality or evidentiary integrity.
This is sensible. A blanket ban would be unrealistic and unevenly enforced. AI is already built into writing tools, search systems, coding environments and data-analysis platforms. The more useful question is not whether AI was used, but how it was used and whether the use affected the scientific claims.
The Risk of Scientific Homogenization
One underappreciated risk is that AI may make scientific writing more uniform. Academic prose is already formulaic. Chatbots tend to amplify this tendency. They prefer safe transitions, balanced paragraphs and familiar phrases. They often smooth away intellectual personality. They can make thousands of papers sound as if they were written by the same careful but unimaginative committee.
This may seem cosmetic, but style affects thought. Strong scientific writing is not merely polished; it is precise. It reflects judgment about what matters, what is uncertain and what is surprising. If AI nudges every argument toward generic phrasing, it can dull the edges of scientific debate.
There is also a citation risk. AI systems may recommend well-known papers over obscure but more relevant ones. They may reproduce dominant framings and marginalize dissenting perspectives. In fields where consensus is still forming, this can narrow the intellectual landscape. Science advances through disciplined disagreement. A literature increasingly mediated by models trained on past literature may become more conservative, more repetitive and less willing to ask strange questions.
AI as a Research Accelerator
The optimistic case remains powerful. AI can accelerate science in ways that are not merely about writing faster. It can help researchers explore vast hypothesis spaces, detect patterns in complex datasets and automate tedious analytical steps. In drug discovery, machine learning can prioritize candidate molecules. In biology, AI can help predict protein structures and interactions. In physics and materials science, it can guide simulations and experimental design. In public health, it can analyze large-scale signals that no individual team could process manually.
Chatbots also make scientific knowledge more accessible within research teams. A biologist can ask for an explanation of a statistical method. A physicist can get help translating an idea into Python. A clinician can summarize a cluster of papers before deciding which ones deserve close reading. Used responsibly, AI can reduce the distance between disciplines.
This may be especially important for early-career researchers and smaller institutions. Elite labs have always had advantages: senior mentors, grant writers, statisticians, professional editors and large networks. AI can provide some support to researchers who lack that infrastructure. It cannot replace mentorship or funding, but it can lower certain barriers.
The future of science with AI, therefore, is not simply darker. It may be faster, more collaborative and more open. But only if speed does not become the main value.
The Coming Shift: From Writing Tool to Scientific Agent
Today’s common chatbot use is mostly conversational. The researcher asks, the model answers. The next phase is more agentic. AI systems will not merely draft paragraphs; they will plan tasks, search databases, run code, compare results, generate figures and suggest next experiments. Some will operate as semi-autonomous research assistants.
This will create new productivity and new hazards. An AI agent that can run analyses may discover errors faster than a human. It may also produce a chain of mistakes too complex for a tired researcher to audit. If a model selects data, cleans it, chooses statistical tests and writes the interpretation, where exactly does human judgment enter? At the final approval stage? That may be too late.
Science will need stronger provenance systems. Future papers may need machine-readable records of how data were processed, which tools were used, which prompts were given, which code was generated and which outputs were manually verified. The traditional methods section may expand into an audit trail.
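What such a machine-readable record might look like can be sketched in a few lines. The example below is an assumption-laden illustration, not an existing standard: the class names, field names and the sample paper identifier used in the test are invented. It records the items listed above (processing step, tool, prompt, generated code and whether a human verified the output) and serializes them to JSON.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


def sha256_of(text: str) -> str:
    """Fingerprint of generated code, so the exact version can be checked later."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class ProvenanceStep:
    """One step in the workflow (hypothetical schema)."""
    description: str
    tool: str
    prompt: Optional[str] = None                 # prompt given to an AI tool, if any
    generated_code_sha256: Optional[str] = None  # hash of AI-generated code, if any
    manually_verified: bool = False
    verified_by: Optional[str] = None


@dataclass
class ProvenanceRecord:
    """Audit trail attached to one paper (hypothetical schema)."""
    paper_id: str
    steps: List[ProvenanceStep] = field(default_factory=list)

    def add_step(self, step: ProvenanceStep) -> None:
        self.steps.append(step)

    def unverified_steps(self) -> List[ProvenanceStep]:
        """Steps that no human has signed off on yet."""
        return [s for s in self.steps if not s.manually_verified]

    def to_json(self) -> str:
        """Serialize the whole record, including nested steps, to JSON."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

A record like this would let an editor list every step that lacks human sign-off before a paper moves forward. Whether prompts and code hashes are the right level of detail is an open question that individual fields would need to settle.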
This could improve science beyond the AI issue. Many current papers are difficult to reproduce because methods are underspecified. AI may force journals to demand clearer workflows, versioned data, shared code and explicit responsibility. The arrival of a risky tool could push the system toward better documentation.
What Should Count as Acceptable AI Use?
A useful boundary is this: AI may assist with expression, exploration and execution, but it should not replace scientific responsibility.
Using AI to polish language is acceptable when the author verifies the final text. Using AI to generate code is acceptable when the code is tested and understood. Using AI to summarize literature is acceptable when key sources are checked directly. Using AI to brainstorm hypotheses is acceptable when the hypotheses are evaluated through proper methods.
By contrast, using AI to invent citations, fabricate data, produce fake peer reviews, generate images presented as observations, or write claims the authors cannot defend is misconduct or close to it. The same applies when researchers hide substantial AI use in contexts where disclosure is required.
The gray area is large. Suppose a chatbot drafts half of an introduction, and the author revises it heavily. Is that editing or ghostwriting? Suppose AI suggests an analysis pipeline that the researcher runs but only partly understands. Is that assistance or abdication? Suppose a reviewer uses AI to produce a first draft of feedback, then edits it carefully. Is that efficient or inappropriate? These questions will not be solved by slogans. They require field-specific norms.
The Human Role Becomes More Important
Paradoxically, AI makes human expertise more valuable. When machines can generate plausible text at scale, the scarce resource is not fluency. It is judgment.
A good scientist knows when a result is too clean, when a model assumption is fragile, when a citation is being stretched, when an effect size matters and when it only looks significant. A good reviewer can sense that a paper’s argument is elegant but hollow. A good editor can distinguish novelty from trend-chasing. These skills are harder to automate than paragraph generation.
The danger is that institutions may reward the wrong thing. If promotion systems continue to emphasize publication counts, AI will inflate the weakest incentives in academia. Researchers will be pushed to produce more papers, faster, with thinner contributions. Journals will receive more submissions. Reviewers will lean more heavily on automation. The literature will grow, but knowledge may not grow with it.
If institutions reward quality, reproducibility, data sharing, careful review and meaningful contribution, AI could become a genuine amplifier. The technology itself does not decide. The incentive system does.
Are AI-Assisted Works Trustworthy?
Some are. Some are not. The presence of AI is neither a stamp of fraud nor a badge of innovation.
A trustworthy AI-assisted paper is one where the human authors remain intellectually present. They understand the methods. They verify the references. They check the analysis. They disclose meaningful AI use according to the rules of the journal. They do not use polished language to overstate uncertain findings. They can answer detailed questions about every part of the work.
An untrustworthy paper is one where AI becomes a laundering mechanism. It turns shallow analysis into formal prose. It invents authority. It hides ignorance. It helps authors produce a manuscript they cannot truly defend. The reader’s problem is that both papers may look similar.
This is why the scientific community should not focus only on AI-generated text. The deeper question is whether the claims are traceable to evidence. Trust must move from style to verification.
The Future: More AI, More Disclosure, More Scrutiny
AI will not disappear from science. It will become more embedded, more capable and less visible. Future word processors, statistical tools, laboratory notebooks and journal platforms will include AI by default. The distinction between “AI-assisted” and “not AI-assisted” may eventually become less meaningful than the distinction between verified and unverified work.
The likely future is a hybrid scientific process. Researchers will use AI for literature mapping, coding, translation, drafting and quality checks. Journals will use AI for screening, plagiarism detection, image forensics, statistical red flags and reviewer matching. Reviewers may use controlled AI tools within secure systems. Readers may use AI to interrogate papers, compare claims against datasets and identify contradictions across the literature.
This future could be better than the present. Imagine reading a paper with an attached verification layer showing where each claim is supported, which data produced each figure, which code generated each result and which parts of the manuscript were AI-assisted. Imagine reviewers spending less time on formatting and more time on conceptual weaknesses. Imagine smaller labs gaining access to analytical support that once required large teams.
But the darker future is also plausible. The literature could be flooded with synthetic papers. Peer review could become an exchange of automated summaries. Real discoveries could be buried under polished noise. Public trust in science could erode if readers come to believe that papers are just machine-generated performances.
The difference between these futures will depend on governance, incentives and culture.
Science After the Chatbot Shock
The chatbot era forces science to clarify what it values. If science is merely the production of papers, AI will produce more of them. If science is the disciplined pursuit of reliable knowledge, AI must be subordinated to that mission.
The right response is neither panic nor blind adoption. Researchers should use AI where it improves clarity, speed and discovery. They should reject it where it weakens understanding, accountability or evidence. Journals should demand transparency without pretending that disclosure solves everything. Universities should teach AI literacy as part of research ethics. Funders and hiring committees should reward fewer, stronger contributions rather than inflated publication volume.
AI can help science think faster, but it cannot decide what is true. It can generate explanations, but it cannot take responsibility for them. It can map the literature, but it cannot replace the skeptical intelligence that turns information into knowledge.
The future of science with AI will be negotiated paper by paper, lab by lab and policy by policy. The best version is not machine-written science. It is human science with better instruments, stronger verification and a renewed respect for the difference between fluent text and reliable truth.