Pillar A — Detection
The Psychology-Statistics Lens: Understanding the “Why” Behind the Fraud
Most forensic approaches ask “Is this data real?” We start with a
different question: “Why would someone fabricate this?” By combining
behavioral psychology with statistical forensics, we create a dual-layer detection system
that is both mathematically rigorous and psychologically informed.
Human beings are poor random number generators. When researchers fabricate data, their
cognitive biases leave statistical fingerprints — patterns of “too-clean”
results, terminal digit preferences, and distributions that lack the natural noise of
authentic measurement. Our approach detects these signatures using the techniques below; short, illustrative code sketches for the first four follow the list:
- GRIM Testing — Verifying whether reported means are mathematically possible given stated sample sizes
- SPRITE Reconstruction — Reverse-engineering plausible raw datasets from reported summary statistics
- P-Curve Analysis — Evaluating whether a body of work shows the distribution of p-values expected from genuine effects versus selective reporting
- Benford’s Law Application — Identifying unnatural digit distributions in large quantitative datasets
- Behavioral Pattern Analysis — Assessing “motivational context” — career pressure, grant cycles, and promotion timelines that correlate with anomalous output
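To make the GRIM logic concrete: a mean of integer-valued data must equal an integer total divided by the sample size, so many reported means are simply impossible. A minimal Python sketch (the function name and rounding convention are ours, not a reference implementation):

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM check: can a mean reported to `decimals` places arise as an
    integer total divided by n?"""
    implied = reported_mean * n
    # Only the two integer totals bracketing the implied sum can work.
    for total in (int(implied), int(implied) + 1):
        if round(total / n, decimals) == round(reported_mean, decimals):
            return True
    return False

# A mean of 3.27 from 17 integer responses is impossible: 55/17 -> 3.24
# and 56/17 -> 3.29, so no integer total rounds to 3.27.
print(grim_consistent(3.27, 17))  # False
print(grim_consistent(3.29, 17))  # True
```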
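SPRITE goes a step further and asks whether any bounded integer dataset can produce the reported mean and SD together. Below is a simplified hill-climbing sketch; the published SPRITE procedure differs in detail, and the trial count and tolerance here are arbitrary choices of ours:

```python
import random
import statistics

def sprite_candidate(mean, sd, n, lo, hi, tries=20000, tol=0.005):
    """Search for one integer dataset on [lo, hi] whose mean and sample SD
    match the reported values. Returns a sorted list, or None on failure."""
    total = round(mean * n)              # the sum must be an integer (GRIM)
    data = [total // n] * n
    for i in range(total - sum(data)):   # distribute the remainder
        data[i] += 1
    for _ in range(tries):
        cur = statistics.stdev(data)
        if abs(cur - sd) < tol:
            return sorted(data)
        i, j = random.sample(range(n), 2)
        # Shift a (+1, -1) unit between two values: apart to raise the SD,
        # together to lower it; the mean never changes.
        if cur < sd:
            a, b = (i, j) if data[i] >= data[j] else (j, i)
        else:
            a, b = (j, i) if data[i] >= data[j] else (i, j)
        if data[a] + 1 <= hi and data[b] - 1 >= lo:
            data[a] += 1
            data[b] -= 1
    return None

# E.g., 1-7 Likert data: n = 25, reported mean 4.28, reported SD 1.21.
print(sprite_candidate(4.28, 1.21, 25, 1, 7))
```

A None return is only suggestive rather than conclusive, since the random search can stall even when a matching dataset exists.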
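The intuition behind p-curve: if significant results reflect real effects, p-values pile up near zero; if they reflect selective reporting of null effects, significant p-values are roughly uniform on (0, .05), so only about half should fall below .025. A deliberately simplified right-skew check (the full p-curve method uses pp-values and a Stouffer test; this binomial version is only a sketch):

```python
from math import comb

def right_skew_p(p_values, alpha=0.05):
    """One-sided binomial test: among significant p-values, are 'very
    small' ones (p < alpha/2) over-represented relative to the 50/50
    split expected when only selective reporting is at work?"""
    sig = [p for p in p_values if p < alpha]
    k = sum(1 for p in sig if p < alpha / 2)
    n = len(sig)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n  # P(X >= k)

# Smaller outputs indicate stronger right skew (more evidential value).
print(right_skew_p([0.001, 0.003, 0.010, 0.004, 0.041]))  # 0.1875
print(right_skew_p([0.048, 0.049, 0.032, 0.044, 0.041]))  # 1.0 (no skew)
```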
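Benford's law predicts leading-digit frequencies P(d) = log10(1 + 1/d) for many naturally occurring datasets that span several orders of magnitude; fabricated figures often deviate. A minimal chi-square screen in pure Python (the cut-off in the comment is the standard chi-square critical value, but treating it as a fraud verdict would overreach):

```python
from collections import Counter
from math import log10

BENFORD = {d: log10(1 + 1 / d) for d in range(1, 10)}  # P(leading digit = d)

def benford_chi_square(values):
    """Chi-square statistic of observed leading digits against Benford."""
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v]
    n = len(digits)
    counts = Counter(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p)
               for d, p in BENFORD.items())

# With 8 degrees of freedom, a statistic above ~20.1 is significant at
# the .01 level; that warrants a closer look, not a verdict of fraud.
```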
Pillar B — Mapping
Bibliometric Network Analysis: Exposing Hidden Structures
Individual fraud is a symptom. Systemic manipulation is the disease. Pillar B uses
scientometric and bibliometric tools to map the networks, cartels, and mill operations
that operate beneath the surface of legitimate scholarly publishing.
Citation cartels — closed networks of editors, authors, or journals that artificially
inflate each other’s metrics — are invisible to traditional analysis. They only
become visible when you map the relational structure of citations across time and geography.
Paper mills are similarly hidden until you analyze co-authorship patterns, submission velocities,
and template reuse at scale. Two of the techniques below are sketched in code after the list.
- Co-Citation Clustering — Identifying journals and authors that cite each other at statistically improbable rates
- Reciprocity Ratio Analysis — Measuring bilateral citation exchange to flag coordinated inflation
- Authorship Network Mapping — Detecting “stranger clusters” — co-authors with no institutional, geographic, or disciplinary connection
- Submission Velocity Auditing — Flagging authors with publication rates that exceed human research capacity
- Template Fingerprinting — Identifying structural similarities across manuscripts that suggest automated or mill-based production
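As one concrete reading of the reciprocity idea: for each journal pair, measure how balanced the two directed citation flows are and how large they loom in each journal's total outgoing citations. The data, thresholds, and flagging rule below are hypothetical illustrations, not calibrated values:

```python
from itertools import combinations

# Hypothetical counts: cites[(a, b)] = citations from journal a to journal b.
cites = {
    ("J_Alpha", "J_Beta"): 240, ("J_Beta", "J_Alpha"): 225,
    ("J_Alpha", "J_Gamma"): 12, ("J_Gamma", "J_Alpha"): 9,
    ("J_Beta", "J_Gamma"): 15,  ("J_Gamma", "J_Beta"): 11,
}

def out_total(j):
    return sum(n for (src, _), n in cites.items() if src == j)

def flag_reciprocal_pairs(min_share=0.5, min_balance=0.8):
    """Flag journal pairs whose exchange is both heavily bilateral
    (balance = min/max of the two directed counts) and a large share
    of each journal's outgoing citations."""
    journals = {j for pair in cites for j in pair}
    flagged = []
    for a, b in combinations(sorted(journals), 2):
        ab, ba = cites.get((a, b), 0), cites.get((b, a), 0)
        if not (ab and ba):
            continue
        balance = min(ab, ba) / max(ab, ba)
        share = min(ab / out_total(a), ba / out_total(b))
        if balance >= min_balance and share >= min_share:
            flagged.append((a, b, round(balance, 2), round(share, 2)))
    return flagged

print(flag_reciprocal_pairs())  # [('J_Alpha', 'J_Beta', 0.94, 0.94)]
```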
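A standard way to operationalize template fingerprinting is shingling: represent each manuscript as a set of overlapping word n-grams and compare sets by Jaccard similarity. A minimal sketch; the 5-gram size is conventional but any flagging threshold would need field-specific calibration:

```python
def shingles(text, k=5):
    """Set of k-word shingles, lowercased, for near-duplicate detection."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

# Mill manuscripts often share boilerplate far beyond what a shared topic
# explains; pairwise Jaccard well above a field baseline is a flag.
ms1 = "the results of the present study demonstrate that the proposed model"
ms2 = "the results of the present study demonstrate that the novel framework"
print(round(jaccard(ms1, ms2), 2))  # 0.56
```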
Pillar C — Accountability
The RI² Index: Research Integrity Risk at the Institutional Level
Individual papers can be retracted. Individual researchers can be investigated. But what
about the institutions that create the conditions for misconduct? The RI² (Research
Integrity Risk Index), inspired by the frameworks proposed by Meho and Ioannidis, shifts
the lens from individual actors to institutional environments.
The RI² evaluates institutions across three core risk metrics:
- D-Rate (Delisted Journal Exposure) — What percentage of an institution’s total publications appear in journals that have been delisted from Scopus or Web of Science for quality concerns? A high D-Rate suggests systemic tolerance for low-quality publishing channels.
- R-Rate (Retraction Density) — What is the ratio of retracted papers to total output, normalized by field and institution size? The R-Rate reveals whether retractions are isolated incidents or structural patterns.
- S-Rate (Self-Citation Inflation) — What proportion of an institution’s citation impact is generated internally through self-citation? Elevated S-Rates suggest metric gaming at the organizational level.
Combined, these three metrics produce a composite RI² score that classifies institutions as Green (low risk), Amber (moderate risk / monitoring recommended), or Red (high risk / investigation warranted). A minimal, illustrative scoring sketch follows.
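To be clear about what follows: the caps, weights, and Green/Amber/Red thresholds below are placeholder assumptions for illustration, not the published RI² formula.

```python
def ri2_score(d_rate, r_rate_per_1k, s_rate, weights=(0.4, 0.4, 0.2)):
    """Illustrative composite: normalize each metric to [0, 1] against a
    cap, then take a weighted sum. Caps and weights are assumptions."""
    caps = (0.10, 5.0, 0.30)  # 10% delisted, 5 retractions/1k, 30% self-cites
    norm = [min(x / cap, 1.0)
            for x, cap in zip((d_rate, r_rate_per_1k, s_rate), caps)]
    return sum(w * x for w, x in zip(weights, norm))

def classify(score, amber=0.33, red=0.66):  # illustrative thresholds
    return "Red" if score >= red else "Amber" if score >= amber else "Green"

# A hypothetical institution: 6% of output in delisted journals,
# 2.1 retractions per 1,000 papers, 18% self-citation share.
s = ri2_score(0.06, 2.1, 0.18)
print(round(s, 2), classify(s))  # 0.53 Amber
```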