Background Studying births to women with systemic lupus erythematosus (SLE) is difficult given its rarity and the challenges of prospective cohort studies. While the electronic health record (EHR) is a powerful tool to capture coded diagnoses at a population level, accurately identifying SLE births is challenging. Our objective was to develop and externally validate algorithms for identifying births to SLE patients.
Methods We used two EHR-based datasets: Vanderbilt's Synthetic Derivative and Duke's Clarity. Potential cases had at least 1 SLE code (ICD-9: 710.0 or ICD-10: M32.1*, M32.8, M32.9) and at least 1 ICD-9 or ICD-10 code for a pregnancy-related diagnosis. At Vanderbilt, 100 potential cases were randomly selected for chart review, and each was classified as a case if SLE was diagnosed by a rheumatologist, nephrologist, or dermatologist. Using this dataset, positive predictive values (PPVs) and sensitivities were calculated for combinations of counts of SLE ICD-9 or ICD-10 codes provided by any clinician and by a rheumatologist (rheumatology coded), antimalarial use, positive ANA, and checked lupus labs (dsDNA, C3, or C4). The F-score measured the performance of each algorithm. At Duke, potential cases were compared with the Duke Autoimmunity in Pregnancy Registry; cases outside of this registry underwent chart review. Vanderbilt served as the training set and Duke as the validation set.
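The metrics named above can be sketched in code. This is a minimal illustration, not the authors' implementation: the record fields (`sle_code_count`, `lupus_labs_checked`, `confirmed_sle`), the function names, and the toy records are all hypothetical, and the F-score is assumed to be the balanced F1 (harmonic mean of PPV and sensitivity), which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class PotentialCase:
    """One chart-reviewed potential case (field names are hypothetical)."""
    sle_code_count: int        # number of SLE ICD-9/ICD-10 codes on record
    lupus_labs_checked: bool   # dsDNA, C3, or C4 ever checked
    confirmed_sle: bool        # chart-review result (reference standard)


def algorithm(case: PotentialCase, min_codes: int = 4) -> bool:
    """Example rule: at least `min_codes` SLE codes and lupus labs checked."""
    return case.sle_code_count >= min_codes and case.lupus_labs_checked


def evaluate(cases, rule):
    """Return (PPV, sensitivity, F-score) of `rule` against chart review."""
    tp = sum(1 for c in cases if rule(c) and c.confirmed_sle)
    fp = sum(1 for c in cases if rule(c) and not c.confirmed_sle)
    fn = sum(1 for c in cases if not rule(c) and c.confirmed_sle)
    ppv = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # Assumed definition: balanced F1, the harmonic mean of PPV and sensitivity.
    f_score = (2 * ppv * sensitivity / (ppv + sensitivity)
               if ppv + sensitivity else 0.0)
    return ppv, sensitivity, f_score


# Toy records for illustration only; these are not study data.
toy_cases = [
    PotentialCase(6, True, True),
    PotentialCase(5, True, True),
    PotentialCase(4, True, False),
    PotentialCase(2, True, True),
    PotentialCase(1, False, False),
    PotentialCase(1, False, False),
]

ppv, sensitivity, f_score = evaluate(toy_cases, algorithm)
print(f"PPV={ppv:.2f} sensitivity={sensitivity:.2f} F={f_score:.2f}")
```

Raising `min_codes` in this sketch makes the rule stricter, which is one way to explore how a code-count threshold trades PPV against sensitivity on a reviewed sample.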
Results From Vanderbilt's 2.8 million subject records, we identified 433 potential cases. Of the 100 cases randomly selected for chart review, 39 had confirmed SLE and a history of a birth. Of Duke's 659 potential cases, 545 were included in a validation set, of which 208 had confirmed SLE. In the training set, algorithms with ICD-10 codes had higher PPVs than algorithms with ICD-9 codes (table 1). The algorithm with the highest F-score (88%) was 4 counts of ICD-9 or ICD-10 codes and checked lupus labs. Algorithms validated well in the Duke dataset. In the validation set, 1 ICD-9 or ICD-10 code (by a rheumatologist) performed best (F-score: 82%).
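As a quick arithmetic check on the counts above, the share of reviewed potential cases that were confirmed can be computed directly. Reading this share as the PPV of the initial screening criteria (at least 1 SLE code plus a pregnancy-related code) is our interpretation, not a figure reported in the abstract, and the variable names are illustrative.

```python
# Counts taken from the Results text.
vanderbilt_reviewed, vanderbilt_confirmed = 100, 39
duke_validation, duke_confirmed = 545, 208

# Fraction of reviewed potential cases confirmed at each site.
vanderbilt_share = vanderbilt_confirmed / vanderbilt_reviewed
duke_share = duke_confirmed / duke_validation

print(f"Vanderbilt: {vanderbilt_share:.1%}")
print(f"Duke: {duke_share:.1%}")
```

Under that reading, fewer than half of the screened potential cases at either site were true SLE births, which is the gap the more specific algorithms are meant to close.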
Conclusions We have developed and validated algorithms to detect SLE patients with births in the EHR. The highest performing algorithms use SLE ICD-9 or ICD-10 codes and clinical parameters, or ICD-10 codes alone. Algorithms using more SLE coded visits have greater PPVs at the cost of sensitivity. While the PPV and sensitivity near 90%, EHR cohorts remain complementary to prospective cohorts. However, in the era of big data, developing methods to identify SLE births accurately is critical to examine adverse outcomes such as preterm births.
Funding Source(s): None
Research reported in this abstract was supported by NIH/NICHD 5K12HD043483-12 (Barnado), NIH/NIAMS 1 K08 AR072757-01 (Barnado), and NIH/NCATS 1UL1TR002553 (Duke). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.