An experimental test of the effects of redacting grant applicant identifiers on peer review outcomes

  1. Richard K Nakamura
  2. Lee S Mann
  3. Mark D Lindner
  4. Jeremy Braithwaite
  5. Mei-Ching Chen
  6. Adrian Vancea
  7. Noni Byrnes
  8. Valerie Durrant
  9. Bruce Reed (corresponding author)
  1. Retired, formerly Center for Scientific Review, National Institutes of Health, United States
  2. Center for Scientific Review, National Institutes of Health, United States
  3. Social Solutions International, United States
2 figures, 9 tables and 4 additional files

Figures

Figure 1
Study background and timeline.
Figure 2
Distributions of preliminary overall impact scores according to race of PI and format in which the applications were reviewed.

Boxes delineate the central 50% of scores, those falling between the 25th and 75th percentiles (interquartile range, IQR). Whiskers extend 1.5 × the IQR. Dots mark outliers. Horizontal lines within boxes indicate the median, and “x” marks the mean value. Lower scores are better.
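The quantities named in this legend can be sketched in a few lines of Python. This is an illustration only: the function name `box_plot_summary`, the example scores, and the choice of the "inclusive" quartile method are assumptions, not details taken from the article.

```python
import statistics


def box_plot_summary(scores):
    """Compute the quantities a box plot like the one described displays.

    The quartile method ("inclusive") is an assumption; the article does
    not say which interpolation rule was used.
    """
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers reach at most 1.5 x IQR beyond the box edges.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [s for s in scores if s < lower_fence or s > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "mean": statistics.fmean(scores),
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical impact scores (lower is better), not data from the study.
example_scores = [2.0, 3.0, 3.5, 4.0, 4.0, 4.5, 5.0, 5.5, 9.0]
summary = box_plot_summary(example_scores)
print(summary)
```

With these made-up scores the box spans 3.5 to 5.0, and the single score of 9.0 lies beyond the upper fence, so it would be drawn as a dot.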

Tables

Table 1
PI demographics and application characteristics by sample.
| Match criteria | Black (n = 400) | White matched (n = 400) | White random (n = 400) |
| --- | --- | --- | --- |
| Gender | | | |
| Male | 232 | 233 | 276 |
| Female | 166 | 167 | 120 |
| Unknown | 2 | | 4 |
| Institution NIH awarded dollars in $ millions, mean (SD) | 182.88 (172.02) | 171.12 (159.85) | 176.92 (157.13) |
| Type of application | | | |
| Type 1 (New) | 370 | 369 | 334 |
| Type 2 (Renewal) | 30 | 31 | 66 |
| Revision or resubmission | | | |
| A0 (original submission) | 290 | 290 | 263 |
| A1 (resubmission) | 110 | 110 | 137 |
| Early stage investigator | | | |
| Yes | 102 | 102 | 47 |
| No | 298 | 298 | 353 |
| Investigator age, mean (SD) | 48.66 (9.31) | 50.27 (10.20) | 51.96 (9.96) |
| Behavioral/social science IRG | | | |
| Yes | 174 | 173 | 75 |
| No | 226 | 227 | 325 |
| Degree held | | | |
| MD | 80 | 72 | 54 |
| PhD | 237 | 267 | 289 |
| MD/PhD | 37 | 33 | 40 |
| Others | 24 | 16 | 8 |
| Unknown | 22 | 12 | 9 |
| Original preliminary overall impact scores, mean (SD) | 4.35 (1.46) | 4.34 (1.36) | 3.94 (1.26) |
| % with multiple PIs | 24 | 18 | 21 |
Table 2
Reviewer’s guesses of applicant race in relation to actual race by application format.
| Reviewer guess of PI race | Standard format: Black PIs | Standard format: White PIs | Redacted format: Black PIs | Redacted format: White PIs |
| --- | --- | --- | --- | --- |
| Black | 683 (58%) | 49 (2%) | 336 (28%) | 48 (2%) |
| White | 432 (36%) | 2234 (93%) | 723 (61%) | 2081 (87%) |
| Other | 45 (4%) | 66 (3%) | 78 (7%) | 172 (7%) |
| No guess | 25 (2%) | 41 (2%) | 52 (4%) | 90 (4%) |
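The percentages in Table 2 are column percentages: each count is divided by the total number of guesses for that PI race and format. The sketch below recomputes them from the counts in the table. The variable and function names are illustrative, and rounding to whole percentages is assumed from the way the table is printed.

```python
# Counts copied from Table 2; keys are (application format, actual PI race).
guess_counts = {
    ("standard", "Black PIs"): {"Black": 683, "White": 432, "Other": 45, "No guess": 25},
    ("standard", "White PIs"): {"Black": 49, "White": 2234, "Other": 66, "No guess": 41},
    ("redacted", "Black PIs"): {"Black": 336, "White": 723, "Other": 78, "No guess": 52},
    ("redacted", "White PIs"): {"Black": 48, "White": 2081, "Other": 172, "No guess": 90},
}


def column_percentages(column):
    """Express each guess count as a whole-number percentage of its column total."""
    total = sum(column.values())
    return {guess: round(100 * count / total) for guess, count in column.items()}


percentages = {key: column_percentages(col) for key, col in guess_counts.items()}

for (fmt, pi_race), pct in percentages.items():
    print(fmt, pi_race, pct)
```

Read this way, the share of Black PIs whose race was guessed as Black falls from 58% in the standard format to 28% in the redacted format.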
Table 3
Reviewer confidence regarding their guesses of investigator demographics.
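| Applicant characteristic | Black investigators: standard reviews | Black investigators: anonymized reviews | White matched investigators: standard reviews | White matched investigators: anonymized reviews | White random investigators: standard reviews | White random investigators: anonymized reviews |
| --- | --- | --- | --- | --- | --- | --- |
| Race | 3.2 | 2.1 | 3.2 | 2.2 | 3.4 | 2.2 |
| Gender | 4.3 | 2.3 | 4.4 | 2.3 | 4.5 | 2.3 |
| Institution | 4.2 | 3.2 | 4.3 | 3.3 | 4.4 | 3.3 |
| Career stage | 4.2 | 3.1 | 4.2 | 3.2 | 4.4 | 3.2 |

Note: 5-point scale, 1 = low confidence, 5 = high confidence.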

Table 4
Rates of reviewer identification of name/research group in redacted applications.
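| PI race (n) | Correct, % (n) | Incorrect, % (n) | No guess, % (n) |
| --- | --- | --- | --- |
| Overall (3580) | 21.6% (775) | 6.1% (217) | 72.3% (2588) |
| Black (1189) | 18.9% (225) | 5.6% (67) | 75.4% (897) |
| White, matched sample (1194) | 19.4% (232) | 7.0% (84) | 73.5% (878) |
| White, random sample (1197) | 26.6% (318) | 5.5% (66) | 67.9% (813) |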
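The Overall row of Table 4 pools the three samples, so its counts are the sums of the per-sample counts and its rates are recomputed on the pooled total rather than averaged. A short sketch of that aggregation, using the counts from the table (the names `sample_counts` and `rates` are illustrative):

```python
# (correct, incorrect, no guess) counts per sample, copied from Table 4.
sample_counts = {
    "Black": (225, 67, 897),
    "White (matched sample)": (232, 84, 878),
    "White (random sample)": (318, 66, 813),
}


def rates(counts):
    """Convert a tuple of counts to percentages of their total, to one decimal."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


# Pool the samples column by column, then compute rates on the pooled counts.
overall_counts = tuple(sum(column) for column in zip(*sample_counts.values()))

print("Overall", overall_counts, rates(overall_counts))
for sample, counts in sample_counts.items():
    print(sample, counts, rates(counts))
```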
Table 5
Primary analysis.

Effects of race and application format on overall impact scores in matched White and Black application sets.

| | Estimate | p-Value | 95% confidence interval (CI) |
| --- | --- | --- | --- |
| Fixed effects | | | |
| Race | −0.17 | 0.01 | (−0.31, −0.04) |
| Application format | −0.10 | 0.02 | (−0.19, −0.02) |
| Race × application format | −0.12 | 0.17 | (−0.29, 0.05) |
| Intercept | 4.06 | < 0.001 | (3.99, 4.13) |
| Random effects | | | |
| Application intercept | 0.61 | | (0.51, 0.72) |

Note: The reference category for race is the Black group. The reference category for application format is the redacted format.

Table 6
Simple contrasts of average preliminary impact scores for redacted vs. standard format applications by PI race.

Matched White application set.

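| Race | Anonymization condition: Standard | Anonymization condition: Anonymized | Simple contrast (SE) | Effect size |
| --- | --- | --- | --- | --- |
| Black | 4.13 | 4.17 | 0.04 (0.06) | 0.04 |
| White matched | 3.89 | 4.05 | 0.16* (0.06) | 0.14 |
| Simple contrast (SE) | −0.23* (0.08) | −0.12 (0.08) | | |
| Effect size for race | 0.20 | 0.10 | | |

Note: * p < .05 (Bonferroni-adjusted).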
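As an arithmetic cross-check, the four cell means in Table 6 can be combined into a grand mean, two main-effect differences, and a difference of differences. The sketch below does only that; it is not the authors' mixed model, and the sign conventions (White minus Black, standard minus redacted) are an assumption based on the reference categories stated under Table 5.

```python
# Cell means copied from Table 6 (matched White application set).
cell_means = {
    ("Black", "standard"): 4.13,
    ("Black", "redacted"): 4.17,
    ("White", "standard"): 3.89,
    ("White", "redacted"): 4.05,
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand_mean = mean(cell_means.values())

# Main effects as differences of marginal means (assumed sign conventions).
race_effect = (
    mean(v for (race, _), v in cell_means.items() if race == "White")
    - mean(v for (race, _), v in cell_means.items() if race == "Black")
)
format_effect = (
    mean(v for (_, fmt), v in cell_means.items() if fmt == "standard")
    - mean(v for (_, fmt), v in cell_means.items() if fmt == "redacted")
)

# Interaction as a difference of differences.
interaction = (
    (cell_means[("White", "standard")] - cell_means[("White", "redacted")])
    - (cell_means[("Black", "standard")] - cell_means[("Black", "redacted")])
)

print(round(grand_mean, 2), round(race_effect, 2),
      round(format_effect, 2), round(interaction, 2))
```

Computed from the rounded cell means, these values come out at 4.06, −0.18, −0.10 and −0.12, which is within 0.01 of the intercept, race, application format and interaction estimates reported in Table 5. Small discrepancies are expected because the table entries are rounded and the reported estimates come from a fitted model.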

Table 7
Parameter estimates and standard errors from nested models predicting overall impact scores in the Black and random White application sets.
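Model 1: n = 4764, 800 applications. Model 2: n = 4728, 794 applications. Model 3: n = 4728, 794 applications. Model 4: n = 4315, 794 applications.

| | Model 1 Coef. | Model 1 SE | Model 2 Coef. | Model 2 SE | Model 3 Coef. | Model 3 SE | Model 4 Coef. | Model 4 SE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Fixed effects | | | | | | | | |
| Demographics | | | | | | | | |
| Race (White = 1) | 0.266ᵃ | 0.069 | 0.132ᶜ | 0.065 | 0.132ᶜ | 0.065 | −0.124 | 0.068 |
| Type 2 application | | | 0.492ᵃ | 0.101 | 0.491ᵃ | 0.101 | 0.484ᵃ | 0.104 |
| A1 application | | | 0.420ᵃ | 0.069 | 0.420ᵃ | 0.069 | 0.415ᵃ | 0.072 |
| Gender | | | −0.005 | 0.067 | −0.005 | 0.067 | 0.013 | 0.069 |
| Early-stage investigator | | | 0.178ᶜ | 0.084 | 0.178ᶜ | 0.084 | 0.186ᶜ | 0.087 |
| Low NIH institutional funding | | | 0.618ᵃ | 0.094 | 0.618ᵃ | 0.094 | 0.612ᵃ | 0.097 |
| Experimental covariates | | | | | | | | |
| Format (standard = 1) | | | | | 0.144ᵃ | 0.042 | −0.022 | 0.041 |
| Format × race | | | | | 0.186ᵇ | 0.083 | 0.237ᵇ | 0.080 |
| Perceptions | | | | | | | | |
| PI race guess Black | | | | | | | 0.155ᵇ | 0.069 |
| PI gender guess female | | | | | | | −0.069 | 0.061 |
| PI career stage guess early-stage investigator | | | | | | | 0.091 | 0.063 |
| Institutional funding guess ‘low’ | | | | | | | 0.447ᵃ | 0.134 |
| Grantsmanship indicators | | | | | | | | |
| Grant 1 | | | | | | | 0.519ᵃ | 0.027 |
| Grant 2 | | | | | | | 0.204ᵃ | 0.029 |
| Random effects | | | | | | | | |
| Grant 1 slope | | | | | | | 0.052 | |
| Institution slope | | | 0.489 | | 0.489 | | 0.477 | |
| Application intercept | 0.614 | | 0.400 | | 0.402 | | 0.511 | |
| Residual | 2.044 | | 2.041 | | 2.032 | | 1.561 | |

Note: Statistically significant parameter estimates (bolded in the original) are marked ᵃ p ≤ 0.001, ᵇ p ≤ 0.025, ᶜ p < 0.05.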

Table 8
Simple contrasts of average preliminary impact scores for redacted vs. standard format applications by PI race.

Randomly selected White application set.

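| Race | Anonymization condition: Standard | Anonymization condition: Anonymized | Difference (SE) | Effect size |
| --- | --- | --- | --- | --- |
| Black | 4.13 | 4.17 | 0.04 (0.06) | 0.04 |
| White random | 3.76 | 4.01 | 0.25* (0.06) | 0.21 |
| Difference (SE) | −0.37* (0.08) | −0.16 (0.08) | | |
| Effect size for race | 0.31 | 0.15 | | |

Note: * p < .05 (Bonferroni-adjusted).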

Author response table 1
Distribution of change (redacted score – standard score) according to PI race.
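| PI race | Min | 1st quartile | Median | Mean | 3rd quartile | Max |
| --- | --- | --- | --- | --- | --- | --- |
| Black | −3.67 | −0.67 | 0 | 0.04 | 1 | 3.33 |
| White matched | −3 | −0.67 | 0 | 0.16 | 1 | 3.33 |
| White random | −3.83 | −0.33 | 0.33 | 0.24 | 1 | 4.33 |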
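A six-number summary of this kind can be produced from paired scores as sketched below. The change is defined as in the table caption (redacted score minus standard score), so positive values mean the application scored worse when redacted. The function name `change_summary`, the example scores, and the "inclusive" quartile method are assumptions for illustration, not details from the article.

```python
import statistics


def change_summary(standard_scores, redacted_scores):
    """Summarize per-application change, redacted score minus standard score.

    The two lists must be paired: position i in each refers to the same
    application. The quartile method ("inclusive") is an assumption.
    """
    changes = [r - s for s, r in zip(standard_scores, redacted_scores)]
    q1, median, q3 = statistics.quantiles(changes, n=4, method="inclusive")
    return {
        "min": min(changes),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(changes),
        "q3": q3,
        "max": max(changes),
    }


# Hypothetical paired scores for five applications, not data from the study.
standard = [4.0, 3.0, 5.0, 4.0, 2.0]
redacted = [4.0, 4.0, 4.0, 5.0, 4.0]
summary = change_summary(standard, redacted)
print(summary)
```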

Additional files

Supplementary file 1

Additional methodological details.

(A) Sample size calculations. (B) Number of matched white applications per matching criteria. (C) Redacted fields and their locations. (D) Differences between standard National Institutes of Health (NIH) review and this study.

https://cdn.elifesciences.org/articles/71368/elife-71368-supp1-v2.docx
Supplementary file 2

Data collection instrument.

https://cdn.elifesciences.org/articles/71368/elife-71368-supp2-v2.docx
Transparent reporting form
https://cdn.elifesciences.org/articles/71368/elife-71368-transrepform1-v2.docx
Source data 1

Data used in reported analyses.

https://cdn.elifesciences.org/articles/71368/elife-71368-supp3-v2.xlsx

Cite this article

Richard K Nakamura, Lee S Mann, Mark D Lindner, Jeremy Braithwaite, Mei-Ching Chen, Adrian Vancea, Noni Byrnes, Valerie Durrant, Bruce Reed (2021) An experimental test of the effects of redacting grant applicant identifiers on peer review outcomes. eLife 10:e71368. https://doi.org/10.7554/eLife.71368