604 Predicting adverse pregnancy outcomes in women with systemic lupus erythematosus: external validation of the PROMISSE model using multiple independent cohorts
Melissa Fazzari (1), Marta Guerra (2), Marta Mosca (3), Dina Zucchi (3), Jill Buyon (4), Anna Brode (5), Jane Salmon (2, 6) and Mimi Kim (1)

  1. Albert Einstein College of Medicine, New York, USA
  2. Hospital for Special Surgery, New York, USA
  3. University of Pisa, Pisa, Italy
  4. New York University School of Medicine, New York, USA
  5. Hackensack Meridian School of Medicine, New Jersey, USA
  6. Weill Cornell Medical College, New York, USA

Abstract

Background Nearly 20% of pregnancies in patients with systemic lupus erythematosus (SLE) result in an adverse pregnancy outcome (APO), so early identification of women with SLE who are at high risk of APO is vital. We previously examined several regression and machine learning (ML) predictive models for APO using data from the PROMISSE Study, a large multi-center, multi-ethnic/racial study of APO in women with mild/moderate SLE and/or antiphospholipid antibodies (aPL). Penalized logistic regression (LASSO) and several “black box” ML algorithms (Random Forest, Support Vector Machine, and Super Learner) each achieved good internal cross-validated performance, with an area under the receiver operating characteristic curve (AUC) of 0.77-0.78. The goal of this study was to externally validate the performance of these promising APO risk models using three independent cohorts.

Methods The PROMISSE data set used to develop the initial APO prediction models consisted of N=385 pregnancies, 71 APO events (18.4%), and 32 known or potential APO risk factors that are routinely measured in clinical practice early in pregnancy. APO was defined as preterm delivery due to placental insufficiency or preeclampsia, fetal or neonatal death, or fetal growth restriction. Three independent prospective cohorts were provided by a team of international investigators with expertise in SLE pregnancy (Bronx, NY: N=96; NYC, NY: N=62; Pisa, Italy: N=152). Patient demographics were summarized for each cohort, and missing data were handled using multiple imputation by chained equations. Using the APO risk models developed with the PROMISSE data, we computed for each cohort: 1) the standard deviation (SD) of predicted risk scores, to summarize the degree of heterogeneity in patient characteristics, and 2) the AUC, to summarize the ability of each model to discriminate between patients with and without APO.
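The two per-cohort summaries described above can be sketched as follows. This is a minimal illustration, not the study's code: it assumes a simple logistic model, and the coefficient values, the intercept, the variable names (`lac_positive`, `pga_gt_1`, `antihypertensive_use`), and the six toy pregnancies are all hypothetical. The point is only that the model is frozen after development and then scored, without refitting, on an external cohort.

```python
import math
import statistics


def predicted_risk(coefs, intercept, row):
    """Risk for one pregnancy from a frozen logistic model (no refitting)."""
    z = intercept + sum(coefs[name] * row[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def auc(scores, labels):
    """AUC as the probability that a random APO case scores higher than
    a random non-case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical frozen coefficients (illustrative values, not from the study).
COEFS = {"lac_positive": 1.2, "pga_gt_1": 0.8, "antihypertensive_use": 0.9}
INTERCEPT = -2.0

# Toy external cohort: (risk factors, observed APO), entirely made up.
cohort = [
    ({"lac_positive": 1, "pga_gt_1": 0, "antihypertensive_use": 1}, 1),
    ({"lac_positive": 0, "pga_gt_1": 0, "antihypertensive_use": 0}, 0),
    ({"lac_positive": 0, "pga_gt_1": 1, "antihypertensive_use": 0}, 0),
    ({"lac_positive": 1, "pga_gt_1": 1, "antihypertensive_use": 0}, 1),
    ({"lac_positive": 0, "pga_gt_1": 0, "antihypertensive_use": 1}, 0),
    ({"lac_positive": 0, "pga_gt_1": 0, "antihypertensive_use": 0}, 1),
]

risks = [predicted_risk(COEFS, INTERCEPT, row) for row, _ in cohort]
labels = [y for _, y in cohort]

risk_sd = statistics.stdev(risks)   # heterogeneity of predicted risk
cohort_auc = auc(risks, labels)     # discrimination in this cohort

print(f"SD of predicted risk: {risk_sd:.3f}")
print(f"AUC: {cohort_auc:.3f}")
```

A small SD of predicted risk in an external cohort would suggest a more homogeneous case mix, which by itself tends to lower the attainable AUC, so the two summaries are best read together.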

Results The three external cohorts and the PROMISSE development cohort showed distributional differences in previously identified APO risk factors (table 1). Non-Hispanic White patients comprised 49.3% of the PROMISSE cohort, compared to 98.7% in Pisa, 27.4% in NYC, and 0% in the Bronx. Lupus anticoagulant (LAC) positivity varied from 8.1% in PROMISSE to 22.6% in the NYC cohort, while physician global assessment (PGA) > 1 varied from 10.6% in the development cohort to 4.4% in the Bronx cohort. Current anti-hypertensive use was 8.6% in PROMISSE, higher in the Bronx cohort (12.6%), and lower in the NYC (4.8%) and Pisa (5.3%) cohorts. APO rates were the same in PROMISSE and Pisa (18.4%) and higher in the Bronx (24%) and NYC (25.8%) cohorts. The SD of predicted risk scores indicated similar levels of heterogeneity within each external cohort compared to the PROMISSE cohort. Model performance in the external validation cohorts varied depending on the algorithm used. As expected, AUCs in the external cohorts were generally lower than the cross-validated internal estimates, but still indicated satisfactory performance of the different models on the independent data sets (table 2). Super Learner, the highest performing algorithm in PROMISSE, performed well across all three external cohorts, with a minimum AUC of 0.63 in the NYC cohort and a maximum of 0.71 in the Pisa cohort (table 2). LASSO also maintained consistent external performance, with a minimum AUC of 0.60 and a maximum of 0.66. Overall, performance was highest using data from the Pisa cohort, which was the largest and most complete of the three external validation data sets.

Conclusions Penalized regression and ML approaches using variables obtained early in pregnancy show potential for discriminating pregnancies at high APO risk from those at lower risk. This study provides confirmation of the geographic transportability of the best performing algorithms developed with PROMISSE. While Super Learner showed the most satisfactory performance across external cohorts, LASSO also performed well and yielded a parsimonious model that may be easier and more efficient to use as a risk assessment tool in practice. Data from additional external cohorts in the US and abroad will be obtained in the future for further validation and refinement of our APO prediction models.

Abstract 604 Table 1

Patient demographics by SLE Pregnancy Cohort

Abstract 604 Table 2

AUC (95% CI) of all algorithms based on internal and external assessments
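The abstract does not say how the 95% confidence intervals in table 2 were computed. One common option, shown here purely as an assumption, is a percentile bootstrap that resamples pregnancies with replacement. The function name `bootstrap_auc_ci`, its parameters, and the toy scores and labels below are all hypothetical.

```python
import random


def auc(scores, labels):
    """AUC via pairwise comparison of cases and non-cases (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    n = len(scores)
    if not 0 < sum(labels) < n:
        raise ValueError("need both cases and non-cases")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples in which the AUC is undefined.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo_idx = max(0, round((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]


# Toy predicted risks and observed APO labels (made up for illustration).
scores = [0.05, 0.10, 0.12, 0.18, 0.22, 0.25, 0.31, 0.38, 0.44, 0.52, 0.61, 0.70]
labels = [0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1]

point_auc = auc(scores, labels)
ci_low, ci_high = bootstrap_auc_ci(scores, labels)
print(f"AUC {point_auc:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

With cohorts as small as the external ones here (N=62 to N=152), intervals of this kind are typically wide, which is worth keeping in mind when comparing algorithms on point estimates alone.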

Acknowledgments This work was supported by NIH grant R21 AR076612.

Trial Registration ClinicalTrials.gov Identifier: NCT00198068

Lay summary Nearly 20% of pregnancies in patients with systemic lupus erythematosus (SLE) result in an adverse pregnancy outcome (APO), so early identification of women with SLE who are at high risk of APO is vital. We previously explored several regression and machine learning methods to predict APO using data from the PROMISSE Study, a large multi-center, multi-ethnic/racial study of APO in women with mild/moderate SLE and/or antiphospholipid antibodies (aPL). We sought to determine which of the best performing algorithms in PROMISSE continued to perform well using data from other SLE pregnancy cohorts in the US and abroad. Most models showed satisfactory performance across cohorts in differentiating patients who did and did not have an APO using variables measured early in pregnancy, indicating their potential for use in clinical practice to manage pregnant SLE patients.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ .
