Early prediction of level-of-care requirements in patients with COVID-19
Tables
Hospitalization prediction model (test performance).
The values in parentheses are the standard deviation of the corresponding metric. Random refers to test set results from the five random training/test splits. BWH refers to training on the four other hospitals and testing on data from BWH. SVM-L1 and LR-L1 refer to the ℓ1-norm regularized SVM and LR models; LR-L2 refers to the ℓ2-norm regularized LR model. For the parsimonious model, we list the LR coefficient of each variable (Coef), the correlation of the variable with the outcome (Y-corr), the mean of the variable in the positive class (Y1 mean; hospitalized for this table), and the mean of the variable in the negative class (Y0 mean; non-hospitalized). Coef binary denotes the coefficient of the variable in the binarized model, for which we report the corresponding odds ratio (OR) and its 95% confidence interval (CI). Thresholds used for the binarized model are provided in Appendix 1—table 5.
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) | |||||
---|---|---|---|---|---|---|---|---|---|
Models using all 106 features | |||||||||
LR-L2 | 87.0% (1.7%) | 85.9% | 81.6% (1.3%) | 84.2% | |||||
SVM-L1 | 87.0% (1.6%) | 85.8% | 81.5% (1.5%) | 83.9% | |||||
XGBoost | 87.8% (1.9%) | 87.7% | 80.9% (1.8%) | 83.3% | |||||
RF | 88.2% (1.6%) | 88.1% | 81.2% (1.1%) | 83.2% | |||||
Models using 74 statistically selected features | |||||||||
LR-L2 | 87.1% (1.7%) | 86.0% | 82.0% (1.3%) | 83.9% | |||||
SVM-L1 | 87.1% (1.7%) | 85.8% | 82.0% (1.4%) | 84.0% | |||||
XGBoost | 87.9% (1.9%) | 87.6% | 81.2% (1.9%) | 84.2% | |||||
RF | 88.0% (1.7%) | 88.1% | 80.8% (1.7%) | 83.9% | |||||
Parsimonious Model using 11 features | |||||||||
LR-L2 | 83.4% (1.7%) | 83.7% | 78.7% (0.9%) | 81.0% | |||||
SVM-L1 | 83.4% (1.7%) | 83.8% | 78.1% (1.1%) | 79.9% | |||||
Variables for the Parsimonious Model | |||||||||
Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 95% CI | |
SpO2 (%) | −11.90 | 95.44 | 97.11 | <0.001 | −0.29 | 1.74 | 5.67 | 3.97 | 8.12 |
Temperature | 10.36 | 37.21 | 37.06 | <0.001 | 0.08 | 0.86 | 2.36 | 1.76 | 3.18 |
Respiratory Rate | 7.20 | 22.82 | 20.83 | <0.001 | 0.18 | −0.13 | 0.88 | 0.69 | 1.13 |
Age | 5.14 | 62.31 | 46.02 | <0.001 | 0.41 | 0.88 | 2.4 | 1.86 | 3.11 |
Pulse | 4.60 | 90.09 | 90.4 | <0.001 | −0.01 | 0.7 | 2.01 | 1.49 | 2.71 |
Diastolic BP | −3.56 | 73.07 | 77.21 | <0.001 | −0.23 | 1.51 | 4.51 | 2.88 | 7.06 |
Adrenal Insufficiency | 3.09 | 0.013 | 0.001 | <0.001 | 0.08 | 2.58 | 13.14 | 1.57 | 110.37 |
BMI | 2.30 | 31.34 | 31.64 | <0.001 | −0.04 | −0.09 | 0.91 | 0.71 | 1.17 |
Transplantation | 1.90 | 0.023 | 0.002 | <0.001 | 0.1 | 1.43 | 4.19 | 1.04 | 16.87 |
Dyspnea | 1.85 | 0.17 | 0.02 | <0.001 | 0.26 | 2 | 7.41 | 4.85 | 11.32 |
CKD | 1.55 | 0.14 | 0.02 | <0.001 | 0.25 | 0.81 | 2.25 | 1.35 | 3.74 |
Intercept | −2.51 |
SpO2: oxygen saturation; BP: Blood pressure; BMI: Body Mass Index; CKD: Chronic Kidney Disease.
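The OR column can be reproduced, up to rounding of the published coefficients, by exponentiating the binarized-model coefficient. The sketch below checks this for a few rows of the table above. The helper names are illustrative, and treating the intervals as Wald-type (symmetric on the log-odds scale) is an assumption that the table itself does not state.

```python
import math

# (Coef binary, OR, CI lower, CI upper) copied from the table above.
ROWS = {
    "SpO2": (1.74, 5.67, 3.97, 8.12),
    "Temperature": (0.86, 2.36, 1.76, 3.18),
    "Age": (0.88, 2.40, 1.86, 3.11),
    "Dyspnea": (2.00, 7.41, 4.85, 11.32),
}


def odds_ratio(coef):
    """Odds ratio implied by a logistic-regression coefficient."""
    return math.exp(coef)


def implied_se(lower, upper, z=1.96):
    """Standard error implied by a CI, ASSUMING a Wald interval exp(coef +/- z*SE)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


for name, (coef, or_pub, lower, upper) in ROWS.items():
    print(
        f"{name}: exp(coef) = {odds_ratio(coef):.2f} (published OR {or_pub}), "
        f"implied SE = {implied_se(lower, upper):.3f}"
    )
```

Small differences between exp(coef) and the published OR are expected because the coefficients are reported to two decimal places.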
ICU prediction model (test performance).
Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.
ICU prediction results with 2513 patients | |||||||||
---|---|---|---|---|---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) | |||||
Models using all 130 features | |||||||||
XGBoost | 86.0% (2.8%) | 83.1% | 90.0% (1.7%) | 91.7% | |||||
SVM-L1 | 85.9% (2.5%) | 80.2% | 89.9% (1.0%) | 89.2% | |||||
LR-L1 | 84.6% (2.8%) | 76.8% | 89.7% (1.0%) | 89.9% | |||||
RF | 86.9% (2.4%) | 83.7% | 90.4% (1.1%) | 91.1% | |||||
Models using 56 statistically selected features | |||||||||
XGBoost | 86.8% (3.1%) | 82.8% | 90.4% (1.4%) | 91.3% | |||||
SVM-L1 | 86.2% (2.6%) | 82.6% | 90.6% (1.2%) | 90.8% | |||||
LR-L1 | 85.8% (2.9%) | 81.8% | 90.2% (1.3%) | 91.3% | |||||
RF | 86.7% (2.0%) | 83.2% | 90.5% (1.7%) | 91.5% | |||||
Parsimonious Model using 10 features | |||||||||
LR-L1 | 85.8% (2.6%) | 83.9% | 90.0% (1.4%) | 89.1% | |||||
LR-L1 (binarized model) | 84.2% (2.2%) | 82.5% | 89.8% (1.1%) | 88.1% | |||||
Model using PSI or CURB-65 score | |||||||||
PSI score | 72.9% (4.9%) | 78.8% | 86.8% (0.7%) | 88.2% | |||||
CURB-65 score | 67.0% (5.0%) | 75.4% | 87.0% (0.5%) | 88.1% | |||||
Variables for the parsimonious model | |||||||||
Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI | |
Radiology Opacities | 0.54 | 0.76 | 0.27 | <0.001 | 0.30 | 1.41 | 4.08 | 2.83 | 5.89 |
Respiratory Rate | 0.46 | 24.61 | 21.37 | <0.001 | 0.16 | 0.50 | 1.66 | 1.14 | 2.41 |
Age | 0.45 | 62.61 | 50.58 | <0.001 | 0.18 | 0.56 | 1.76 | 1.27 | 2.43 |
Fever | 0.40 | 0.64 | 0.33 | <0.001 | 0.18 | 0.61 | 1.83 | 1.32 | 2.55 |
Male | 0.35 | 0.64 | 0.44 | <0.001 | 0.12 | 0.50 | 1.65 | 1.21 | 2.26 |
Albumin | −0.34 | 3.68 | 3.84 | <0.001 | −0.16 | 0.58 | 1.78 | 1.10 | 2.90 |
Anion Gap | 0.33 | 16.40 | 15.35 | <0.001 | 0.13 | −0.05 | 0.95 | 0.46 | 1.98 |
SpO2 (%) | −0.22 | 94.72 | 96.72 | <0.001 | −0.24 | 0.83 | 2.29 | 1.63 | 3.21 |
LDH | 0.22 | 400.40 | 327.48 | <0.001 | 0.15 | 0.96 | 2.62 | 1.74 | 3.94 |
Calcium | −0.21 | 8.84 | 9.01 | <0.001 | −0.10 | 0.55 | 1.73 | 1.21 | 2.48 |
Intercept | −0.93 |
SpO2: oxygen saturation; LDH: Lactate dehydrogenase.
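The AUC and F1-weighted columns in these tables can be computed from per-patient scores and predicted labels. The following is a minimal pure-Python sketch of both metrics, together with the mean (standard deviation) summary used for the Random columns. The per-split AUC values are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which estimator was used.

```python
import statistics


def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to each class's support."""
    total = len(y_true)
    score = 0.0
    for c in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        score += f1 * (tp + fn) / total
    return score


# Hypothetical test-set AUCs from five random splits (not values from the paper).
split_aucs = [0.84, 0.88, 0.86, 0.83, 0.89]
print(f"{100 * statistics.mean(split_aucs):.1f}% "
      f"({100 * statistics.stdev(split_aucs):.1f}%)")
```

A scalar risk score such as PSI or CURB-65 can be passed directly as `scores`, which is why those rows have an AUC without any model fitting.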
Restricted ICU prediction model (test performance).
Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.
ICU prediction results with 628 patients | |||||||||
---|---|---|---|---|---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) | |||||
Models using all 130 features | |||||||||
XGBoost | 82.5% (1.9%) | 67.3% | 81.4% (0.7%) | 72.6% | |||||
SVM-L1 | 77.8% (3.8%) | 72.8% | 79.7% (1.2%) | 73.6% | |||||
LR-L1 | 75.9% (3.6%) | 69.7% | 79.2% (2.5%) | 73.7% | |||||
RF | 80.9% (2.7%) | 76.9% | 78.8% (1.9%) | 73.6% | |||||
Models using 29 statistically selected features | |||||||||
XGBoost | 82.7% (2.7%) | 76.2% | 80.6% (2.1%) | 72.6% | |||||
SVM-L1 | 77.9% (3.7%) | 73.1% | 78.5% (1.4%) | 73.6% | |||||
LR-L1 | 78.4% (4.1%) | 71.5% | 79.5% (2.6%) | 74.4% | |||||
RF | 82.1% (2.8%) | 74.1% | 79.0% (2.4%) | 75.4% | |||||
Parsimonious Model using 8 features | |||||||||
LR-L1 | 80.1% (2.9%) | 74.2% | 80.9% (2.1%) | 77.2% | |||||
LR-L1 (binarized model) | 72.5% (5.4%) | 69.9% | 73.4% (2.8%) | 69.7% | |||||
Model using PSI or CURB-65 score | |||||||||
PSI score | 58.8% (7.4%) | 68.3% | 66.7% (2.2%) | 65.3% | |||||
CURB-65 score | 56.8% (4.5%) | 76.9% | 66.2% (1.5%) | 63.8% | |||||
Variables for the parsimonious model | |||||||||
Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI | |
LDH | 0.53 | 519.88 | 304.40 | <0.001 | 0.15 | 1.59 | 4.88 | 2.65 | 8.99 |
CRP (mg/L) | 0.47 | 127.17 | 67.43 | <0.001 | 0.35 | 0.76 | 2.13 | 0.70 | 6.47 |
Calcium | −0.35 | 8.83 | 9.01 | <0.001 | −0.13 | 0.71 | 2.03 | 1.25 | 3.31 |
IDDM | 0.30 | 0.25 | 0.12 | 0.003 | 0.15 | 1.00 | 2.73 | 1.62 | 4.60 |
SpO2 (%) | −0.29 | 94.13 | 95.59 | 0.003 | −0.22 | 0.34 | 1.41 | 0.92 | 2.16 |
Radiology Opacities | 0.25 | 0.88 | 0.71 | <0.001 | 0.16 | 0.62 | 1.86 | 1.05 | 3.29 |
Anion Gap | 0.20 | 16.66 | 15.28 | <0.001 | 0.20 | 0.34 | 1.40 | 0.48 | 4.12 |
Sodium | −0.16 | 136.13 | 137.53 | <0.001 | −0.14 | 0.47 | 1.60 | 1.05 | 2.43 |
Intercept | −0.34 |
LDH: Lactate dehydrogenase; CRP: C-reactive protein; IDDM: Insulin-dependent diabetes mellitus; SpO2: oxygen saturation.
Ventilation prediction model (test performance).
Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.
Ventilation prediction results with 2525 patients | |||||||||
---|---|---|---|---|---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) | |||||
Models using all 130 features | |||||||||
XGBoost | 85.8% (4.0%) | 83.8% | 91.0% (0.4%) | 91.6% | |||||
SVM-L1 | 82.6% (4.9%) | 83.8% | 90.9% (0.8%) | 91.6% | |||||
LR-L1 | 80.7% (5.4%) | 81.7% | 90.4% (1.2%) | 91.4% | |||||
RF | 85.7% (3.9%) | 83.7% | 91.2% (0.9%) | 91.8% | |||||
Models using 55 statistically selected features | |||||||||
XGBoost | 85.7% (3.3%) | 86.3% | 91.1% (0.6%) | 91.6% | |||||
SVM-L1 | 83.9% (3.7%) | 84.8% | 90.9% (1.1%) | 91.7% | |||||
LR-L1 | 83.3% (4.0%) | 83.9% | 90.8% (1.3%) | 91.4% | |||||
RF | 86.4% (3.4%) | 86.7% | 91.4% (1.1%) | 91.3% | |||||
Parsimonious Model using 8 features | |||||||||
LR-L1 | 85.2% (2.3%) | 87.0% | 90.3% (0.3%) | 90.7% | |||||
LR-L1 (binarized model) | 81.3% (3.1%) | 82.6% | 90.0% (0.6%) | 90.2% | |||||
Model using PSI or CURB-65 score | |||||||||
PSI score | 73.6% (4.1%) | 80.7% | 89.4% (0.4%) | 90.3% | |||||
CURB-65 score | 66.8% (3.1%) | 75.9% | 89.7% (0.1%) | 90.0% | |||||
Variables for the Parsimonious Model | |||||||||
Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI | |
Radiology opacities | 0.86 | 0.77 | 0.28 | <0.001 | 0.27 | 1.58 | 4.86 | 3.25 | 7.25 |
Albumin | −0.45 | 3.65 | 3.83 | <0.001 | −0.16 | 1.07 | 2.91 | 1.80 | 4.72 |
Fever | 0.43 | 0.66 | 0.33 | <0.001 | 0.17 | 0.72 | 2.05 | 1.42 | 2.95 |
Respiratory rate | 0.42 | 24.70 | 21.44 | <0.001 | 0.15 | 0.50 | 1.64 | 1.09 | 2.47 |
Glucose | 0.38 | 170.17 | 138.32 | <0.001 | 0.15 | 0.97 | 2.63 | 1.71 | 4.06 |
Male | 0.34 | 0.64 | 0.44 | <0.001 | 0.10 | 0.43 | 1.54 | 1.09 | 2.18 |
LDH | 0.33 | 408.56 | 328.78 | <0.001 | 0.14 | 0.91 | 2.48 | 1.58 | 3.89 |
Anion gap | 0.31 | 16.50 | 15.37 | <0.001 | 0.13 | 0.27 | 1.31 | 0.53 | 3.25 |
Intercept | −1.06 |
LDH: Lactate dehydrogenase.
Restricted ventilation prediction model (test performance).
Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.
Ventilation prediction results with 635 patients | |||||||||
---|---|---|---|---|---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) | |||||
Models using all 130 features | |||||||||
XGBoost | 80.6% (1.9%) | 74.7% | 79.4% (2.6%) | 75.7% | |||||
SVM-L1 | 79.4% (5.2%) | 71.3% | 80.8% (2.0%) | 75.7% | |||||
LR-L1 | 76.9% (3.9%) | 68.2% | 78.6% (3.2%) | 73.4% | |||||
RF | 81.0% (3.1%) | 75.8% | 79.8% (4.2%) | 72.7% | |||||
Models using 29 statistically selected features | |||||||||
XGBoost | 81.6% (3.2%) | 76.9% | 79.0% (2.9%) | 71.7% | |||||
SVM-L1 | 79.1% (4.6%) | 69.4% | 80.6% (2.5%) | 75.7% | |||||
LR-L1 | 80.9% (3.6%) | 70.9% | 80.4% (2.2%) | 75.7% | |||||
RF | 81.3% (2.6%) | 75.4% | 79.2% (1.7%) | 69.6% | |||||
Parsimonious Model using 5 features | |||||||||
LR-L1 | 82.4% (3.7%) | 75.2% | 81.8% (1.7%) | 71.7% | |||||
LR-L1 (binarized model) | 71.4% (6.2%) | 65.5% | 76.6% (3.5%) | 68.3% | |||||
Model using PSI or CURB-65 score | |||||||||
PSI score | 57.6% (4.5%) | 67.4% | 73.2% (1.3%) | 71.2% | |||||
CURB-65 score | 56.9% (7.1%) | 74.0% | 72.4% (0.2%) | 68.3% | |||||
Variables for the parsimonious model | |||||||||
Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI | |
CRP (mg/L) | 0.60 | 134.52 | 69.62 | <0.001 | 0.35 | 0.42 | 1.53 | 0.51 | 4.59 |
LDH | 0.55 | 550.41 | 311.01 | <0.001 | 0.16 | 1.87 | 6.47 | 3.19 | 13.10 |
Calcium | −0.39 | 8.82 | 9.00 | <0.001 | −0.13 | 0.58 | 1.79 | 1.07 | 2.98 |
IDDM | 0.36 | 0.26 | 0.12 | 0.002 | 0.15 | 1.18 | 3.26 | 1.90 | 5.58 |
Anion Gap | 0.29 | 16.81 | 15.32 | <0.001 | 0.19 | 18.66 | 1.27E+08 | 0.00 | inf |
Intercept | −0.39 |
CRP: C-reactive protein; LDH: Lactate dehydrogenase; IDDM: Insulin-dependent diabetes mellitus.
Mean and median hours from the reference date or lab results to the outcome, for the full and restricted ICU and ventilation prediction models.
Model | From reference date (mean) | From reference date (median) | From lab results (mean) | From lab results (median) |
---|---|---|---|---|
Restricted ICU | 38.13 | 28.08 | 22.55 | 9.90 |
Restricted intubation | 35.36 | 26.40 | 22.37 | 10.39 |
Full ICU | 22.86 | 17.28 | 15.86 | 12.99 |
Full intubation | 25.62 | 22.20 | 10.23 | 8.97 |
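As an illustration of how such summaries are obtained, the sketch below computes the mean and median number of hours between a start time (reference date or lab result) and an outcome time. The timestamps are hypothetical and the function name is illustrative; the example only shows why a few long delays pull the mean above the median, as in every row of the table above.

```python
from datetime import datetime
from statistics import mean, median


def hours_between(start, end):
    """Elapsed time from start to end, in hours."""
    return (end - start).total_seconds() / 3600.0


# Hypothetical (start time, outcome time) pairs; not patient data.
events = [
    (datetime(2020, 4, 1, 8, 0), datetime(2020, 4, 1, 20, 0)),
    (datetime(2020, 4, 2, 9, 30), datetime(2020, 4, 3, 9, 30)),
    (datetime(2020, 4, 3, 0, 0), datetime(2020, 4, 7, 0, 0)),
]

gaps = [hours_between(start, end) for start, end in events]
print(f"mean = {mean(gaps):.2f} h, median = {median(gaps):.2f} h")
```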
Representative patient statistics.
Variable | Admitted (36.2%): Yes | Admitted (36.2%): No | Admitted (36.2%): p-value | ICU (10.6%): Yes | ICU (10.6%): No | ICU (10.6%): p-value | Intubated (8.5%): Yes | Intubated (8.5%): No | Intubated (8.5%): p-value |
---|---|---|---|---|---|---|---|---|---|
Age | 62.3 | 46.0 | <0.001 | 63.3 | 50.6 | <0.001 | 63.3 | 50.9 | <0.001 |
Gender (male) | 55.3% | 40.1% | <0.001 | 63.0% | 43.5% | <0.001 | 63.6% | 43.9% | <0.001 |
Asian | 3.7% | 4.0% | 0.97 | 3.7% | 3.9% | 1 | 3.7% | 3.9% | 1 |
Black/African American | 15.7% | 17.8% | 0.61 | 14.7% | 17.3% | 0.75 | 14.3% | 17.3% | 0.74 |
Hispanic/Latino | 4.9% | 5.9% | 0.81 | 6.6% | 5.4% | 0.88 | 6.9% | 5.4% | 0.83 |
White | 45.4% | 43.9% | 0.91 | 39.6% | 45.0% | 0.40 | 39.6% | 44.9% | 0.53 |
Hypertension | 61.7% | 26.4% | <0.001 | 62.3% | 36.5% | <0.001 | 61.8% | 37.1% | <0.001 |
Diabetes | 34.2% | 9.7% | <0.001 | 40.7% | 15.9% | <0.001 | 42.9% | 16.3% | <0.001 |
Alzheimer | 6.7% | 0.6% | <0.001 | 2.6% | 2.8% | 1 | 3.2% | 2.7% | 0.98 |
Congestive Heart Failure (CHF) | 11.3% | 0.8% | <0.001 | 9.5% | 4.0% | <0.001 | 8.8% | 4.2% | 0.025 |
Chronic Kidney Disease (CKD) | 14.4% | 1.7% | <0.001 | 12.8% | 5.5% | <0.001 | 11.5% | 5.8% | 0.011 |
ACE Inhibitors (ACEIs) | 17.5% | 8.4% | <0.001 | 20.5% | 10.7% | <0.001 | 19.8% | 11.0% | 0.002 |
Acetaminophen Tylenol | 39.8% | 17.8% | <0.001 | 31.9% | 25.1% | 0.12 | 30.4% | 25.4% | 0.45 |
Amiodarone | 1.6% | 0.1% | <0.001 | 1.5% | 0.5% | 0.32 | 0.9% | 0.6% | 0.95 |
Anticoagulants | 9.4% | 1.7% | <0.001 | 9.9% | 3.8% | <0.001 | 11.1% | 3.8% | <0.001 |
Anti-depressants | 25.4% | 16.7% | <0.001 | 20.5% | 19.8% | 0.99 | 22.6% | 19.6% | 0.77 |
Angiotensin Receptor Blockers (ARBs) | 12.0% | 5.2% | <0.001 | 15.4% | 6.8% | <0.001 | 17.1% | 6.8% | <0.001 |
Aspirin related | 32.3% | 11.6% | <0.001 | 33.7% | 17.4% | <0.001 | 33.2% | 17.8% | <0.001 |
Beta-Blockers | 28.1% | 10.4% | <0.001 | 25.6% | 15.7% | <0.001 | 25.8% | 16.0% | 0.003 |
Calcium Channel Blockers (CCBs) | 2.6% | 0.7% | 0.001 | 4.4% | 1.0% | <0.001 | 4.6% | 1.1% | <0.001 |
Coumadin warfarin | 3.5% | 0.7% | <0.001 | 1.8% | 1.7% | 1 | 1.8% | 1.7% | 1 |
Diuretics | 16.0% | 4.5% | <0.001 | 13.9% | 8.1% | 0.015 | 13.4% | 8.3% | 0.089 |
Immunosuppressants | 5.3% | 2.6% | 0.005 | 3.7% | 3.5% | 1 | 4.1% | 3.5% | 0.97 |
Insulin related | 14.6% | 3.5% | <0.001 | 19.0% | 6.2% | <0.001 | 21.2% | 6.3% | <0.001 |
Metformin related | 19.5% | 8.6% | <0.001 | 23.8% | 11.2% | <0.001 | 24.9% | 11.4% | <0.001 |
Nonsteroidal anti-inflammatory drugs (NSAIDs) | 21.9% | 21.0% | 0.95 | 19.0% | 21.6% | 0.82 | 18.0% | 21.6% | 0.66 |
Proton Pump Inhibitors (PPIs) | 26.6% | 15.0% | <0.001 | 24.5% | 18.5% | 0.13 | 25.8% | 18.6% | 0.081 |
Statins | 45.1% | 17.3% | <0.001 | 47.6% | 24.9% | <0.001 | 45.6% | 25.7% | <0.001 |
Steroids | 30.5% | 23.0% | <0.001 | 30.8% | 25.2% | 0.26 | 30.4% | 25.3% | 0.44 |
Cough | 65.6% | 29.6% | <0.001 | 68.1% | 39.6% | <0.001 | 69.1% | 40.2% | <0.001 |
Dyspnea | 16.6% | 2.2% | <0.001 | 21.6% | 5.7% | <0.001 | 23.5% | 5.9% | <0.001 |
Chest pain | 21.1% | 5.6% | <0.001 | 22.0% | 9.9% | <0.001 | 24.4% | 10.0% | <0.001 |
Fever | 57.4% | 23.7% | <0.001 | 61.2% | 32.9% | <0.001 | 63.6% | 33.4% | <0.001 |
SpO2 | 95.2 | 97.4 | <0.001 | 93.4 | 96.7 | <0.001 | 93.3 | 96.7 | <0.001 |
Diastolic BP | 72.5 | 78.1 | <0.001 | 72.0 | 75.6 | <0.001 | 70.9 | 75.6 | <0.001 |
Pulse | 90.6 | 88.3 | <0.001 | 93.3 | 88.8 | 0.003 | 94.1 | 88.9 | 0.01 |
Respiratory Rate (RR) | 23.1 | 20.3 | <0.001 | 25.6 | 21.2 | <0.001 | 25.9 | 21.3 | <0.001 |
Temperature (°C) | 37.2 | 37.0 | <0.001 | 37.3 | 37.1 | 0.001 | 37.3 | 37.1 | 0.001 |
Anion Gap | 15.8 | | | 17.0 | 15.1 | <0.001 | 17.1 | 15.1 | <0.001 |
Sodium | 137.0 | | | 136.3 | 137.4 | <0.001 | 136.2 | 137.3 | <0.001 |
Calcium | 9.0 | | | 8.8 | 9.0 | <0.001 | 8.8 | 9.0 | <0.001 |
Lactic acid | 1.8 | | | 2.1 | 1.6 | <0.001 | 2.1 | 1.6 | <0.001 |
Glomerular filtration rate (GFR) | 67.0 | | | 64.8 | 72.3 | <0.001 | 64.7 | 71.9 | <0.001 |
Chloride | 98.1 | | | 97.2 | 98.8 | <0.001 | 97.1 | 98.8 | <0.001 |
Glucose | 149.6 | | | 171.5 | 135.8 | <0.001 | 173.9 | 137.2 | <0.001 |
Lactate Dehydrogenase (LDH) | 377.2 | | | 524.6 | 303.9 | <0.001 | 551.8 | 310.6 | <0.001 |
Albumin | 3.8 | | | 3.6 | 3.9 | <0.001 | 3.6 | 3.9 | <0.001 |
D-Dimer | 1373.5 | | | 1525.0 | 1223.7 | <0.001 | 1614.5 | 1214.0 | <0.001 |
C-reactive Protein (CRP) | 89.6 | | | 133.1 | 65.5 | <0.001 | 140.1 | 68.1 | <0.001 |
Blood Urea Nitrogen (BUN) | 21.4 | | | 24.3 | 18.5 | <0.001 | 23.8 | 18.9 | <0.001 |
Creatine Kinase (CK) | 385.2 | | | 563.4 | 282.7 | <0.001 | 620.3 | 285.1 | <0.001 |
Ferritin | 854.2 | | | 1349.5 | 601.6 | <0.001 | 1477.1 | 621.8 | <0.001 |
Mean Platelet Volume (MPV) | 10.5 | | | 10.6 | 10.5 | <0.001 | 10.6 | 10.5 | <0.001 |
Atelectasis | 19.0% | 4.6% | <0.001 | 15.8% | 9.2% | 0.008 | 16.6% | 9.2% | 0.007 |
Consolidation | 5.9% | 0.6% | <0.001 | 10.3% | 1.6% | <0.001 | 11.1% | 1.7% | <0.001 |
Nodule | 4.9% | 0.6% | <0.001 | 4.4% | 1.9% | 0.072 | 3.7% | 2.0% | 0.47 |
Opacity | 64.8% | 13.7% | <0.001 | 78.4% | 26.7% | <0.001 | 80.6% | 27.8% | <0.001 |
Pleural Effusion | 8.8% | 1.1% | <0.001 | 11.7% | 3.0% | <0.001 | 13.8% | 3.0% | <0.001 |
Distribution of patients in different hospitals and outcome groups.
Hospital | Positive | Admitted | ICU | Intubated |
---|---|---|---|---|
Brigham and Women's Hospital (BWH) | 648 | 171 | 67 | 56 |
Newton-Wellesley Hospital (NWH) | 434 | 145 | 33 | 18 |
Massachusetts General Hospital (MGH) | 1195 | 475 | 144 | 121 |
North Shore Medical Center (NSM) | 97 | 63 | 16 | 12 |
Faulkner Hospital (FH) | 192 | 76 | 13 | 10 |
Total | 2566 | 930 | 273 | 217 |
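The counts above are enough to check the outcome rates quoted elsewhere in the tables and to see how large the training pool is when one hospital is held out for testing, as in the BWH columns. The sketch below is illustrative only: the function names are hypothetical, and it works on the positive-case counts in this table, not on the slightly smaller cohorts (for example 2513 or 2525 patients) used by the ICU and ventilation models.

```python
# Counts copied from the table above: (positive, admitted, ICU, intubated).
HOSPITALS = {
    "BWH": (648, 171, 67, 56),
    "NWH": (434, 145, 33, 18),
    "MGH": (1195, 475, 144, 121),
    "NSM": (97, 63, 16, 12),
    "FH": (192, 76, 13, 10),
}


def totals(hospitals):
    """Column-wise sums over all hospitals."""
    return tuple(sum(col) for col in zip(*hospitals.values()))


def hold_out(hospitals, test_site):
    """Split into (training hospitals, held-out hospital)."""
    train = {k: v for k, v in hospitals.items() if k != test_site}
    test = {test_site: hospitals[test_site]}
    return train, test


positive, admitted, icu, intubated = totals(HOSPITALS)
for label, count in (("Admitted", admitted), ("ICU", icu), ("Intubated", intubated)):
    print(f"{label}: {100 * count / positive:.1f}%")

train, test = hold_out(HOSPITALS, "BWH")
print("training positives:", totals(train)[0], "test positives:", totals(test)[0])
```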
List of 164 features used for hospitalization, ICU, and ventilation models.
Category | Features |
---|---|
Demographics | Marital status, Gender, Race, Age, Language, Tobacco, Alcohol, Height, Weight, BMI |
Vitals | Systolic BP, Diastolic BP, Temperature, Pulse, Respiratory Rate, SpO2 percentage |
Symptoms | Fever, Cough, Dyspnea, Fatigue, Diarrhea, Nausea, Vomiting, Abdominal pain, Loss of smell, Loss of taste, Chest pain, Headache, Sore throat, Hemoptysis, Myalgia |
Pre-existing medications | Steroids, ACEIs, ARBs, NSAIDs, Anti-depressants, CCBs, Diuretics, Digoxin, Statins, Beta-Blockers, Acetaminophen Tylenol, Immunosuppressants, Anticoagulants, Aspirin related, Coumadin warfarin, Amiodarone, Insulin related, Metformin related, PPIs |
Comorbidities | Hypertension, COPD, Diabetes, CKD, CAD, MI, Asthma, Osteoarthritis arthritis, SLE, HLD, Arrhythmia, Thyroid disease, Stroke, Migraine, Epilepsy, Alzheimer, Parkinson, Nephrolithiasis, Cushing, Adrenal Insufficiency, Diverticulosis, GERD, IBS, IBD, Cholelithiasis, Inguinal hernia, Hepatitis, Cirrhosis, Valvular disease, CHF, PAD, Osteoporosis, Cancer, TB, Cardiomyopathy, AAA, DVT, vWD, Anemia, Transplantation, HIV, Depression, Anxiety |
Radiology | Opacity, Atelectasis, Consolidation, Pleural Effusion, Pneumothorax, Nodule |
Labs | RDW, PLT, MCH, HGB, MCHC, HCT, MCV, RBC, WBC, MPV, NRBC (%), GFR (estimated), Creatinine, Potassium, Chloride, Sodium, Anion Gap, BUN, Glucose, Calcium, Carbon Dioxide, Absolute Neutrophil count, Absolute Lymphocyte count, Absolute Monocyte count, Absolute Eosinophil count, Absolute Basophil count, Immature Granulocytes, ALT, Total Protein, Albumin, Globulin, AST, Bilirubin (Total), Alkaline phosphatase, NRBC Auto (#), LDH, Ferritin, CK, Magnesium, CRP, PT, D-Dimer, Lactic acid, Phosphorus, PTT, PCO2 (Venous), pH (Venous), Fibrinogen, Lipase, Bands (manual), PO2 (Venous), Base Deficit (Venous), Iron, Bilirubin (Direct), Myelocytes, HCO3 (unspecified), TIBC, Base Deficit (Arterial), PCO2 (Arterial), Metamyelocytes, Plasma cells (%), PO2 (Arterial), Ionized Calcium, pH (Arterial), Osmolality |
Performance of the NLP models.
Method | Precision (%) | Recall (%) | F1-score (%) |
---|---|---|---|
NER+NLI model | 93.60 | 87.97 | 90.70 |
Regular expression matching | 99.01 | 96.15 | 97.56 |
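The F1-scores in this table are consistent with the harmonic mean of the reported precision and recall. A short check, with an illustrative function name:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall (%) copied from the table above.
print(f"NER+NLI model: {f1_score(93.60, 87.97):.2f}")
print(f"Regular expression matching: {f1_score(99.01, 96.15):.2f}")
```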
Abnormal ranges for laboratory tests and vitals.
Variable | Abnormal range |
---|---|
Albumin | <3.3 |
Chloride | <95 |
Lactic acid | ≥2 |
LDH | ≥250 |
CRP (mg/L) | ≥10 |
Calcium | ≤8.5 |
Anion gap | ≥12 |
Glucose | ≥110 |
Total protein | ≤6.5 or ≥8.3 |
D-Dimer (ng/mL) | ≥500 |
GFR | ≤60 |
Sodium | <135 |
Globulin | ≤2 or ≥4 |
SpO2 | ≤94 |
Systolic blood pressure | ≤100 |
Pulse | ≥100 |
Respiratory rate | ≥20 |
Age | ≥65 |
Diastolic blood pressure | ≤60 |
BMI | ≥30 |
Temperature | ≥37.5 °C or ≥98.7 °F |
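A binarized feature can be derived from these ranges by flagging a measurement as 1 when it falls in the abnormal range and 0 otherwise. The sketch below encodes a subset of the rows above. The 0/1 encoding, the dictionary layout, and the function name are assumptions made for illustration, and only the Celsius cutoff is used for temperature.

```python
import operator

# (comparison, cutoff) rules copied from the table; a value is abnormal if ANY rule matches.
ABNORMAL = {
    "Albumin": [(operator.lt, 3.3)],
    "LDH": [(operator.ge, 250)],
    "Calcium": [(operator.le, 8.5)],
    "Total protein": [(operator.le, 6.5), (operator.ge, 8.3)],
    "SpO2": [(operator.le, 94)],
    "Respiratory rate": [(operator.ge, 20)],
    "Age": [(operator.ge, 65)],
    "Temperature": [(operator.ge, 37.5)],  # Celsius cutoff only
}


def binarize(variable, value):
    """Return 1 if value lies in the abnormal range for variable, else 0."""
    return int(any(compare(value, cutoff) for compare, cutoff in ABNORMAL[variable]))


# Hypothetical patient measurements.
patient = {"SpO2": 93, "Age": 58, "Total protein": 7.0, "LDH": 410}
print({name: binarize(name, value) for name, value in patient.items()})
```

Two-sided ranges such as total protein are handled by listing both rules for the same variable.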
Derivation cohort performance for the hospitalization prediction model.
Abbreviations and metrics reported are as in Table 1.
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
---|---|---|---|---|
Models using all 106 features | ||||
LR-L2 | 88.3% (0.4%) | 88.3% | 82.9% (0.5%) | 82.3% |
SVM-L1 | 88.2% (0.4%) | 88.2% | 82.8% (0.5%) | 82.1% |
XGBoost | 91.5% (2.1%) | 90.9% | 85.7% (2.3%) | 85.2% |
RF | 96.0% (0.7%) | 95.3% | 92.9% (1.2%) | 90.8% |
Models using 74 statistically selected features | ||||
LR-L2 | 87.8% (0.4%) | 87.8% | 82.4% (0.4%) | 81.7% |
SVM-L1 | 87.8% (0.4%) | 87.7% | 82.5% (0.7%) | 81.7% |
XGBoost | 91.9% (1.8%) | 91.9% | 86.0% (1.8%) | 86.2% |
RF | 94.9% (0.9%) | 96.6% | 91.3% (1.3%) | 93.2% |
Parsimonious Model using 11 features | ||||
LR-L2 | 82.6% (0.5%) | 82.4% | 77.6% (0.1%) | 76.9% |
SVM-L1 | 82.5% (0.5%) | 82.3% | 77.5% (0.3%) | 76.9% |
Derivation cohort performance for the ICU prediction model.
Abbreviations and metrics reported are as in Table 1.
ICU prediction results (training performance) with 2513 patients | ||||
---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
Models using all 130 features | ||||
XGBoost | 94.5% (3.6%) | 96.1% | 94.0% (1.7%) | 94.1% |
SVM-L1 | 89.7% (0.7%) | 91.4% | 91.5% (0.4%) | 91.9% |
LR-L1 | 91.3% (0.6%) | 92.9% | 91.5% (0.5%) | 91.9% |
RF | 93.4% (3.2%) | 97.0% | 94.3% (1.6%) | 95.4% |
Models using 56 statistically selected features | ||||
XGBoost | 94.1% (1.5%) | 95.1% | 93.6% (0.6%) | 93.7% |
SVM-L1 | 88.5% (0.7%) | 89.7% | 91.2% (0.4%) | 91.4% |
LR-L1 | 89.3% (0.7%) | 90.4% | 91.2% (0.2%) | 91.4% |
RF | 91.0% (1.9%) | 94.9% | 93.0% (1.0%) | 94.2% |
Parsimonious Model using 10 features | ||||
LR-L1 | 86.2% (0.6%) | 83.8% | 90.4% (0.4%) | 89.1% |
LR-L1 (binarized model) | 84.0% (0.6%) | 80.6% | 89.4% (0.1%) | 88.2% |
Model using PSI or CURB-65 score | ||||
PSI score | 74.3% (1.2%) | 72.3% | 87.5% (0.2%) | 87.1% |
CURB-65 score | 67.9% (1.3%) | 65.3% | 87.3% (0.2%) | 86.8% |
Derivation cohort performance for the restricted ICU prediction model.
Abbreviations and metrics reported are as in Table 1.
ICU prediction training performance with 628 patients | ||||
---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
Models using all 130 features | ||||
XGBoost | 89.6% (4.8%) | 92.5% | 85.4% (5.8%) | 87.6% |
SVM-L1 | 80.1% (0.6%) | 80.8% | 79.4% (0.5%) | 80.4% |
LR-L1 | 87.1% (0.8%) | 88.0% | 83.5% (0.5%) | 83.6% |
RF | 95.6% (2.9%) | 95.7% | 91.0% (3.3%) | 90.2% |
Models using 29 statistically selected features | ||||
XGBoost | 86.3% (1.0%) | 87.4% | 81.9% (0.4%) | 83.8% |
SVM-L1 | 80.5% (0.9%) | 80.4% | 79.1% (0.5%) | 80.4% |
LR-L1 | 80.9% (1.0%) | 81.6% | 79.0% (0.3%) | 80.3% |
RF | 89.8% (2.6%) | 92.8% | 85.0% (1.9%) | 88.2% |
Parsimonious Model using 8 features | ||||
LR-L1 | 80.4% (0.9%) | 81.4% | 79.7% (0.5%) | 80.0% |
LR-L1 (binarized model) | 75.4% (1.1%) | 77.2% | 75.2% (0.7%) | 77.5% |
Model using PSI or CURB-65 score | ||||
PSI score | 60.5% (1.7%) | 59.0% | 68.6% (0.5%) | 68.7% |
CURB-65 score | 60.2% (1.2%) | 57.2% | 67.5% (0.4%) | 67.3% |
Derivation cohort performance for the ventilation prediction model.
Abbreviations and metrics reported are as in Table 1.
Ventilation prediction training performance with 2525 patients | ||||
---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
Models using all 130 features | ||||
XGBoost | 97.2% (1.5%) | 95.2% | 95.8% (1.0%) | 94.5% |
SVM-L1 | 92.3% (0.7%) | 92.8% | 93.1% (0.1%) | 93.4% |
LR-L1 | 93.8% (0.6%) | 94.3% | 93.3% (0.2%) | 93.2% |
RF | 95.1% (0.8%) | 94.7% | 95.4% (0.5%) | 94.3% |
Models using 55 statistically selected features | ||||
XGBoost | 96.9% (1.4%) | 98.3% | 95.6% (0.9%) | 96.6% |
SVM-L1 | 90.8% (0.7%) | 91.3% | 92.7% (0.2%) | 93.0% |
LR-L1 | 91.4% (0.7%) | 92.0% | 92.6% (0.3%) | 92.8% |
RF | 94.8% (0.7%) | 94.1% | 95.5% (0.3%) | 94.8% |
Parsimonious Model using 8 features | ||||
LR-L1 | 86.9% (0.5%) | 88.1% | 91.6% (0.2%) | 91.9% |
LR-L1 (binarized model) | 84.4% (0.7%) | 86.7% | 91.1% (0.2%) | 91.2% |
Model using PSI or CURB-65 score | ||||
PSI score | 74.0% (1.0%) | 71.4% | 89.9% (0.1%) | 89.6% |
CURB-65 score | 67.6% (0.8%) | 64.7% | 89.7% (0.0%) | 89.6% |
Derivation cohort performance for the restricted ventilation prediction model.
Abbreviations and metrics reported are as in Table 1.
Ventilation prediction training performance with 635 patients | ||||
---|---|---|---|---|
Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
Models using all 130 features | ||||
XGBoost | 91.8% (2.2%) | 98.6% | 87.4% (2.0%) | 95.3% |
SVM-L1 | 81.2% (0.7%) | 83.2% | 82.4% (1.1%) | 83.9% |
LR-L1 | 89.7% (0.6%) | 89.6% | 86.9% (1.0%) | 85.8% |
RF | 93.5% (4.2%) | 93.7% | 89.5% (3.8%) | 89.7% |
Models using 29 statistically selected features | ||||
XGBoost | 89.9% (2.3%) | 89.9% | 86.1% (1.6%) | 86.0% |
SVM-L1 | 81.5% (1.6%) | 84.4% | 82.2% (1.2%) | 83.7% |
LR-L1 | 82.6% (0.7%) | 84.0% | 83.0% (0.9%) | 83.6% |
RF | 92.3% (4.8%) | 94.3% | 88.8% (3.7%) | 89.3% |
Parsimonious Model using 5 features | ||||
LR-L1 | 80.3% (1.0%) | 79.0% | 82.1% (0.7%) | 81.7% |
LR-L1 (binarized model) | 73.1% (1.4%) | 66.5% | 78.3% (0.9%) | 73.5% |
Model using PSI or CURB-65 score | ||||
PSI score | 58.8% (1.0%) | 57.2% | 73.9% (0.3%) | 74.2% |
CURB-65 score | 58.5% (1.7%) | 55.8% | 73.2% (0.1%) | 73.7% |
AUC and weighted F1-score on an extended BWH test set, excluding patients whose lab-to-outcome time is less than or equal to the given time gap.
Time gap | 6 hr | 12 hr | 18 hr | 24 hr | 48 hr |
---|---|---|---|---|---|
Restricted ICU model - AUC | 86.05% | 84.73% | 86.85% | 86.14% | 84.62% |
Restricted ICU model - weighted-F1 | 83.10% | 82.17% | 86.47% | 86.09% | 86.28% |
Restricted intubation model - AUC | 68.00% | 64.44% | 63.85% | 63.85% | 64.34% |
Restricted intubation model - weighted-F1 | 65.75% | 66.59% | 69.81% | 69.81% | 72.33% |
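The exclusion rule behind this table can be sketched as a simple filter applied before the metrics are computed. The record layout, field name, and patient values below are hypothetical, and keeping patients with no recorded outcome (the negative class) is an assumption about how the test set was built.

```python
def exclude_short_gaps(patients, gap_hours):
    """Drop patients whose outcome occurred within gap_hours of their lab results.

    Patients with no outcome (lab_to_outcome_hours is None) are kept.
    """
    return [
        p for p in patients
        if p["lab_to_outcome_hours"] is None or p["lab_to_outcome_hours"] > gap_hours
    ]


# Hypothetical cohort; None means the outcome never occurred.
cohort = [
    {"id": "a", "lab_to_outcome_hours": 4.0},
    {"id": "b", "lab_to_outcome_hours": 12.0},
    {"id": "c", "lab_to_outcome_hours": 30.0},
    {"id": "d", "lab_to_outcome_hours": None},
]

for gap in (6, 12, 18, 24, 48):
    kept = exclude_short_gaps(cohort, gap)
    print(f"{gap} hr: kept {[p['id'] for p in kept]}")
```

Because the comparison is strict, a patient whose outcome falls exactly on the gap boundary is excluded, matching the "less than or equal to" wording of the caption.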
Additional files
- Source code 1
  COVID-19 Early Prediction Scripts, Network Optimization and Control Lab, Boston University.
  - https://cdn.elifesciences.org/articles/60519/elife-60519-code1-v2.zip
- Transparent reporting form
  - https://cdn.elifesciences.org/articles/60519/elife-60519-transrepform-v2.docx