Research Integrity · Reference Verification

The errors behind retractions — and how verification catches them.

Retractions are rising faster than at any point in the recorded history of biomedical publishing. A growing share trace back to text-level errors — fabricated references, citations to already-retracted papers, statistical inconsistencies, and missing reporting items — that a systematic pre-submission check can catch before the journal ever sees the manuscript.

The premise

A retraction begins with a detectable error.

Not every retraction is preventable. Image manipulation, deliberate data fabrication that survives statistical scrutiny, plagiarism that requires similarity-database lookups, and ethics-process violations all sit outside what any text-only verification system can catch — and we say so explicitly later on this page.

But a meaningful share of retractions begin with errors that are visible in the manuscript text itself. References that do not exist. Citations to papers that were retracted years ago. Reported statistics that are arithmetically impossible. Reporting-guideline items that are simply missing. Each of these is detectable before submission, and each is the kind of error a careful systematic check eliminates.

The trend

What the numbers actually say.

The aggregate picture, sourced. Each figure links to the original audit or registry.
≈ 12×
Increase in fabricated references
A 2026 audit published in The Lancet, covering nearly 2.5 million PubMed-indexed papers, found that the rate of papers containing a fabricated reference rose roughly twelve-fold in two years, from about 1 in 2,828 papers in 2023 to about 1 in 277 in early 2026.
Source: Topaz et al., The Lancet, 2026
63,000+
Retractions in the Retraction Watch Database
The Retraction Watch Database held more than 63,000 retractions as of the end of 2025. It is integrated into Crossref, which makes it the canonical machine-readable source for the retraction status of a given DOI.
Source: Retraction Watch / Crossref
14,000+
Retraction notices in 2023 alone
More than 14,000 retraction notices were issued in 2023 — the highest annual total on record — followed by more than 9,000 in 2024. The rise reflects both increased detection and a real increase in problematic papers from paper mills and uncritical AI use.
Source: Retraction Watch annual recap

The sharpest rise in fabricated references began in mid-2024 and coincided with the broader adoption of generative AI writing tools. The authors of the Lancet audit are clear that the source of the rise is not uniformly malicious — paper mills, intentional misconduct, and uncritical use of AI assistants by otherwise legitimate authors all contribute. The common denominator is that none of these references would have survived a basic pre-submission check against PubMed or Crossref.

The patterns

Five error patterns that show up in the text.

Anonymized composites by error type. The patterns are real and documented in the post-publication literature; specific papers, authors, journals, and institutions have intentionally been omitted.
01

Fabricated / nonexistent citations

The pattern
An early-2020s oncology trial was retracted after a post-publication audit found that several of its cited references could not be located in any database. The references were plausible-sounding — author names that were real researchers in the field, journals that exist, year ranges that fit — but the specific paper-title-and-DOI combination did not resolve. The pattern is now the signature failure mode of uncritical generative-AI use during the writing process: the model produced a citation that looks correct and the author did not verify it.
How verification catches it
Every reference in a submitted manuscript is verified against PubMed and Crossref. References that cannot be matched to a published record are flagged as likely fabricated, with the specific failure (no DOI match, no title/author match, no PubMed record) shown so the author can investigate.
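As a rough illustration of the matching step, the sketch below classifies one cited reference against candidate records. It is a minimal sketch under stated assumptions, not the production implementation: the `Reference` type, the `classify` function, the 0.9 similarity threshold, and the example DOIs are all hypothetical, and `records` stands in for candidates that would already have been fetched from Crossref or PubMed.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Reference:
    title: str
    first_author: str
    doi: Optional[str] = None


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.lower().split())


def title_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify(cited: Reference, records: list[Reference],
             threshold: float = 0.9) -> tuple[str, str]:
    """Return (status, reason) for one cited reference.

    `records` is a hypothetical stand-in for candidate records already
    retrieved from Crossref or PubMed; no network call is made here.
    """
    if cited.doi:
        by_doi = [r for r in records
                  if r.doi and r.doi.lower() == cited.doi.lower()]
        if by_doi:
            if title_similarity(cited.title, by_doi[0].title) >= threshold:
                return "verified", "DOI and title match"
            return "partially matched", "DOI resolves to a different title"
    # No usable DOI match: fall back to title plus first author.
    for r in records:
        same_title = title_similarity(cited.title, r.title) >= threshold
        same_author = normalize(cited.first_author) == normalize(r.first_author)
        if same_title and same_author:
            return "partially matched", "title/author match, DOI missing or unmatched"
    return "unresolved", "no DOI match, no title/author match"
```

A reference that comes back as "unresolved" is a candidate for a fabricated citation, and the reason string tells the author which lookup failed.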
02

Citations to papers that were later retracted

The pattern
A 2022 systematic review in psychiatry cited a foundational paper that had been retracted three years earlier. Neither the review's authors nor the journal noticed; the retracted finding propagated into the meta-analytic estimate. This pattern — sometimes called a "zombie citation" — is common because retractions are not always salient to authors who first encountered the cited work years earlier. A 2021 cardiology consensus statement faced the same issue when an underpinning observational study was retracted between submission and acceptance.
How verification catches it
Every DOI-bearing reference is checked against the Retraction Watch dataset (now integrated into Crossref). Cited references that are themselves retracted are flagged in the reference verification output. Authors see the retraction notice, the date, and the reason so they can remove or replace the citation before submission.
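The screening step can be pictured as a lookup of each cited DOI in an index of retracted DOIs. The sketch below assumes a hypothetical in-memory `retraction_index` keyed by normalized DOI, and the `Retraction` type and `screen` function are illustrative names rather than part of any real API. In practice the index would be built from the Retraction Watch data.

```python
from dataclasses import dataclass


@dataclass
class Retraction:
    notice_doi: str  # DOI of the retraction notice
    date: str        # ISO date of the notice
    reason: str


def normalize_doi(doi: str) -> str:
    """Strip common URL and prefix forms and lowercase (DOIs are case-insensitive)."""
    d = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/",
                   "https://dx.doi.org/", "http://dx.doi.org/", "doi:"):
        if d.startswith(prefix):
            d = d[len(prefix):]
            break
    return d.strip()


def screen(cited_dois: list[str],
           retraction_index: dict[str, Retraction]) -> dict[str, Retraction]:
    """Return {cited DOI as written: Retraction} for every retracted citation.

    `retraction_index` is a hypothetical mapping from normalized DOI to
    retraction details.
    """
    flagged = {}
    for doi in cited_dois:
        hit = retraction_index.get(normalize_doi(doi))
        if hit is not None:
            flagged[doi] = hit
    return flagged
```

Normalizing before lookup matters because the same DOI can appear in a reference list as a bare identifier, a `doi:` string, or a full URL.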
03

Citation-claim mismatch

The pattern
A 2023 nephrology narrative review cited a randomized trial in support of a clinical recommendation. The cited trial was real, well-conducted, and indexed, but it did not actually report the outcome the review's author claimed. The author had remembered the wrong paper. This is one of the most common reasons editorial concerns are raised post-publication: the citation exists, the source is legitimate, but the source does not say what the citing manuscript implies.
How verification catches it
For each substantive claim that carries a citation, the cited reference's abstract is retrieved and the claim is checked against what the source actually reports. Mismatches — claims unsupported by, contradicted by, or substantially exceeding the cited source — are flagged with the specific reference and the relevant abstract passage so the author can fix the citation, soften the claim, or replace the source.
04

Statistical inconsistencies

The pattern
A 2023 surgical case series was retracted after the editor noticed that several reported p-values were arithmetically incompatible with the reported sample sizes and test statistics. The values had been transcribed incorrectly during late revisions, but the inconsistency made the analysis impossible to verify. A 2022 nutrition trial was retracted on a similar basis when reported percentages did not reconcile with the integer counts in the same table.
How verification catches it
Reported statistics are scanned for internal consistency: p-values that do not match the test statistic and degrees of freedom, percentages that do not reconcile with the underlying counts, confidence intervals incompatible with their point estimates, and means with implausibly small variance for the reported sample size. Inconsistencies are flagged with the specific values and the expected range.
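Three of these consistency checks reduce to simple arithmetic, sketched below. This is an illustrative sketch using only the standard library, with hypothetical function names. The p-value check is simplified to a standard normal (z) statistic, whereas a check against a t or F statistic with degrees of freedom would need the corresponding distribution from a statistics library.

```python
import math


def percent_matches_count(count: int, n: int, reported_pct: float,
                          decimals: int = 1) -> bool:
    """True if count/n lies within rounding distance of the reported percentage."""
    actual = 100.0 * count / n
    # Half a unit in the last reported digit, plus a small float guard.
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9
    return abs(actual - reported_pct) <= tolerance


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def p_matches_z(z: float, reported_p: float, decimals: int = 3) -> bool:
    """True if the reported p-value agrees with z to the reported precision."""
    tolerance = 0.5 * 10 ** (-decimals) + 1e-12
    return abs(two_sided_p_from_z(z) - reported_p) <= tolerance


def ci_contains_estimate(estimate: float, lower: float, upper: float) -> bool:
    """A confidence interval that excludes its own point estimate is inconsistent."""
    return lower <= estimate <= upper
```

For example, 17 of 40 participants is 42.5%, so a table reporting "17/40 (45.0%)" fails the first check, and a point estimate of 1.8 reported with an interval of 1.1 to 1.6 fails the third.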
05

Reporting-guideline gaps that mask integrity problems

The pattern
A 2020 observational study was retracted years after publication when reanalysis revealed that confounding had not been adequately controlled — but the original STROBE-relevant items (which variables were controlled, how they were measured, how missing data were handled) had been omitted from the methods. The retraction was traced not to fabrication but to reporting omissions that prevented anyone, including the authors themselves on later inspection, from reproducing the analysis. Similar patterns have driven retractions of trials with under-specified randomization and reviews with under-specified search strategies.
How verification catches it
On every manuscript, the study type is auto-detected and the relevant reporting guideline (CONSORT, STROBE, PRISMA, STARD, ARRIVE, and others) is loaded and evaluated item by item. Missing items are flagged with the specific text the guideline requires. The check covers not just whether a topic is mentioned but whether the specific information the guideline demands is actually present.
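The bookkeeping side of this check can be sketched as a mapping from study type to guideline, followed by a per-item report. Everything here is illustrative: the study-type labels and function name are hypothetical, the three STROBE items are paraphrased from the example above rather than quoted from the official checklist, and the hard part (deciding whether the manuscript actually supplies each item) is assumed to happen elsewhere and arrive as `items_present`.

```python
# Study design -> reporting guideline commonly applied to it.
GUIDELINE_BY_STUDY_TYPE = {
    "randomized trial": "CONSORT",
    "observational study": "STROBE",
    "systematic review": "PRISMA",
    "diagnostic accuracy study": "STARD",
    "animal study": "ARRIVE",
}

# Illustrative items only, paraphrased from the example above;
# this is not the official STROBE wording or the full checklist.
ILLUSTRATIVE_ITEMS = {
    "STROBE": [
        "which variables were controlled",
        "how those variables were measured",
        "how missing data were handled",
    ],
}


def missing_items(study_type: str, items_present: set[str]) -> list[str]:
    """List the checklist items for this study type that were not found.

    `items_present` is assumed to come from a separate step that reads
    the manuscript; this function only does the item-by-item comparison.
    """
    guideline = GUIDELINE_BY_STUDY_TYPE[study_type]
    checklist = ILLUSTRATIVE_ITEMS.get(guideline, [])
    return [item for item in checklist if item not in items_present]
```

Keeping the checklist as an ordered list means missing items are reported in the same order as the guideline itself.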
What we run

Five checks, every review.

Reference verification and retraction screening are first-class parts of every review — read the reference verification deep-dive for the full scope.
01
DOI + bibliographic verification
Every reference is verified against Crossref and PubMed — by DOI, by title and author, and by other published metadata. Each reference is reported as verified, partially matched, or unresolved, with the specific gap shown.
02
Retraction screening on every cited DOI
Cited DOIs are checked against Retraction Watch (via Crossref's integrated retraction metadata). Cited papers that are themselves retracted are flagged with the retraction notice, date, and reason.
03
Citation-claim alignment
For each substantive claim that carries a citation, the claim is checked against what the cited source actually reports. Mismatches between the claim and the source are surfaced with the relevant passage.
04
Statistical red-flag checks
Reported statistics are scanned for internal arithmetic consistency — p-value vs. test statistic, percentage vs. count, confidence interval vs. point estimate. Implausible variance, impossible distributional values, and rounding inconsistencies are flagged.
05
Reporting-guideline compliance, item by item
The correct reporting guideline (CONSORT, STROBE, PRISMA, STARD, ARRIVE, and others) is identified automatically and evaluated item by item. Missing items — including the methodological details whose omission has historically masked integrity problems — are flagged with what the guideline requires.
06
Run automatically on every review
All five checks run on every Peer Review ($29) and Author Review ($79) submission. There is no separate retraction-screening or reference-verification tier; the verification pass is part of the base product.
What verification cannot catch

The honest limits.

Text-only verification is one layer of defense, not the whole defense. The categories below are real causes of retractions that no system reading the manuscript text alone can detect.
01
Image manipulation and duplication
Forensic detection of duplicated, spliced, or digitally altered figures requires image analysis — pixel-level pattern matching against the figure itself and against image libraries. Reading the manuscript text cannot detect any of it. Authors should run figures through a dedicated image-integrity tool, and journals should run incoming submissions through one before acceptance.
02
Undetectable data fabrication
If a dataset is fabricated cleanly — internally consistent statistics, plausible distributions, no arithmetic contradictions in the reported numbers — text-level checks cannot distinguish it from a real dataset. Detection in this case typically requires access to the raw data, replication studies, or a sleuth noticing a deeper anomaly the manuscript text does not expose.
03
Plagiarism
Detecting text re-used from other manuscripts requires similarity-database infrastructure (the kind journals license from dedicated plagiarism vendors). PeerReviewAI does not provide that and does not claim to. Authors and journals should run a dedicated similarity check separately.
04
Peer-review fraud and authorship issues
Compromised peer review, fake reviewer suggestions, ghost-authorship, and gift-authorship are organizational and procedural failures. They are visible to journal editors with the right metadata but not in the manuscript text itself. Editorial integrity processes — not text verification — are the right tool.
05
Ethics-process violations
IRB approval, consent, and ethics-board involvement are typically reported in a few sentences of the methods, and a manuscript can contain the standard sentences while the underlying process was deficient. Verifying an ethics process requires institutional records, not text reading.
Go deeper

Read the tool pages.

Both pages cover what gets checked and what gets flagged, with the same honesty about scope.
Reference verification · Run on every review

Catch the errors before the journal does.

FAQ

Questions, answered.

Don't see yours? Email us — we read every one.