| Structured Definitions | Recall (Sensitivity) | Specificity | Precision (PPV) | Supervised Machine Learning | Recall (Sensitivity) | Specificity | Precision (PPV) |
|---|---|---|---|---|---|---|---|
| Single ICD-9 710.0 | 0.99 | 0.97 | 0.79 | All ICD-9 codes and counts¹ | 0.89 | 0.99 | 0.89 |
| Single ICD-9 710.0 + any lupus medication | 0.96 | 0.98 | 0.86 | All ICD-9 codes and counts + NLP of clinical notes² | 0.90 | 0.99 | 0.89 |
| Single ICD-9 710.0 + any lupus medication + any positive lupus-serology | 0.93 | 0.98 | 0.87 | All ICD-9 codes and counts + NLP of clinical notes + all serologic data³ + all medication data | 0.91 | 0.99 | 0.92 |
| | | | | All ICD-9 codes and counts + NLP of clinical notes + all serologic data + all medication data + demographics⁴ | 0.85 | 0.99 | 0.96 |

¹ Supervised machine learning algorithms included all available ICD-9 codes for each patient, as well as their counts and the locations in the medical record where they were found (e.g., clinical encounters, problem lists, medication orders).
² All text data from clinical notes associated with a patient's medical record were included in the ML algorithm.
³ Serologic data included ANA, double-stranded DNA, anti-Smith antibody, anti-RNP, SSA, and SSB.
⁴ Demographic information included age, gender, race/ethnicity, insurance status, and employment status.
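
The structured definitions in the table are cumulative rules, and each column is a ratio of confusion-matrix counts. The sketch below shows one way such rules and metrics could be computed. It is only an illustration: the `Patient` layout, the `structured_definition` and `metrics` helpers, and the five-patient `TOY_COHORT` are all hypothetical and are not taken from the study or its data.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout (an assumption, not the study's schema).
    icd9_codes: set[str]
    on_lupus_medication: bool
    positive_lupus_serology: bool
    has_sle: bool  # reference-standard label, e.g. from chart review


def structured_definition(p: Patient, level: int) -> bool:
    """Level 1: ICD-9 710.0; level 2: + any lupus medication;
    level 3: + any positive lupus serology."""
    hit = "710.0" in p.icd9_codes
    if level >= 2:
        hit = hit and p.on_lupus_medication
    if level >= 3:
        hit = hit and p.positive_lupus_serology
    return hit


def _ratio(num: int, den: int) -> float:
    # Return NaN instead of raising when a denominator is empty.
    return num / den if den else float("nan")


def metrics(patients: list[Patient], level: int) -> dict[str, float]:
    """Recall (sensitivity), specificity, and precision (PPV) of a rule."""
    tp = fp = tn = fn = 0
    for p in patients:
        predicted = structured_definition(p, level)
        if predicted and p.has_sle:
            tp += 1
        elif predicted and not p.has_sle:
            fp += 1
        elif not predicted and p.has_sle:
            fn += 1
        else:
            tn += 1
    return {
        "recall": _ratio(tp, tp + fn),       # TP / (TP + FN)
        "specificity": _ratio(tn, tn + fp),  # TN / (TN + FP)
        "precision": _ratio(tp, tp + fp),    # TP / (TP + FP)
    }


# Invented toy cohort, for illustration only (not study data).
TOY_COHORT = [
    Patient({"710.0"}, True, True, True),
    Patient({"710.0"}, True, False, True),
    Patient({"710.0"}, False, False, False),
    Patient({"714.0"}, False, False, False),
    Patient({"710.0"}, False, True, True),
]

if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, metrics(TOY_COHORT, level))
```

On this toy cohort, each added criterion lowers recall (from 1.0 at level 1 to 1/3 at level 3) and raises precision (from 0.75 to 1.0). The structured-definition columns of the table show the same direction of trade-off.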