Abstract
Introduction To investigate health disparities in systemic lupus erythematosus (SLE), geographic variables must be considered when predicting patient outcomes. The Lupus Index is a GIS-based data set developed by the Lupus Research Alliance in collaboration with the Centers for Medicare & Medicaid Services (CMS) and the National Minority Quality Forum. This database includes diagnosis codes; geographic, demographic, and provider specialty information; and outcomes for fee-for-service Medicare patients in the years 2014–2016. Mortality, prevalence, hospital and emergency room visit rates, and cost-of-care outcomes can be exported at the state, 3-digit zip code, zip code, metropolitan area, US Congressional district, state upper and lower house district, and county levels. This study was designed to evaluate the performance of algorithms that use CMS data to define SLE and lupus nephritis (LN) in a clinically well-phenotyped population of lupus and control patients in both a longitudinal registry and the rheumatology clinics at the Medical University of South Carolina (MUSC).
Methods Extracting identified data from the Lupus Index is not possible, so we created a shadow CMS population to simulate the Lupus Index data set. All claims from 1/1/2016 to 12/31/2018 were extracted for all patients in the linkage CMS data who had one or more claims of any type submitted by a limited set of National Provider Identifier numbers (NPIs), restricted to those of rheumatology providers at the Medical University of South Carolina. The Division of Rheumatology and Immunology at the Medical University of South Carolina maintains a longitudinal registry of SLE and healthy control volunteers that spans the years 2003 to the present. Attending providers classify SLE patients by the American College of Rheumatology (ACR) 1997 criteria or by one immunologic criterion plus biopsy-proven LN. An honest broker limited this data set to participants in the MUSC Rheumatology longitudinal registry using previously published methods of linkage. A separate cohort was generated by selecting a random 5% sample of all Medicare patients in the MUSC Research Data Warehouse with one or more claims of any type submitted by any of the above NPIs from 1/1/2016 to 12/31/2018. The diagnoses of SLE and LN and the number of ACR criteria were determined by keyword text search of electronic health records and manual chart review. For each patient, the number of claims with the ICD-10 code M32.* and the time interval between these codes and all encounter codes were determined. For lupus nephritis, the codes M32.14 and M32.15, or M32.* codes combined with proteinuria or nephritis codes, were used. The performance characteristics of algorithms using one to four SLE or LN diagnosis codes, compared with algorithms using the number of codes separated by zero, 30, 60, and 90 days, respectively, were described by the sensitivity, specificity, and area under the receiver operating characteristic curve (ROC AUC) for prediction of SLE classification or LN diagnosis.
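The code-counting step described in the Methods can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names, the (code, date) claim layout, and the greedy way of counting codes separated by a minimum number of days are all assumptions made for illustration, since the text does not specify how the separation was evaluated.

```python
from datetime import date

def is_sle_code(icd10):
    """True for any code in the ICD-10 M32 family (M32.*)."""
    return icd10.strip().upper().startswith("M32")

def count_separated(dates, min_gap_days=0):
    """Count dates so that each counted date falls at least
    min_gap_days after the previously counted one (assumed rule)."""
    count = 0
    last = None
    for d in sorted(dates):
        if last is None or (d - last).days >= min_gap_days:
            count += 1
            last = d
    return count

def meets_algorithm(claims, n_codes, min_gap_days=0):
    """claims is a list of (icd10_code, claim_date) pairs (hypothetical layout)."""
    sle_dates = [d for code, d in claims if is_sle_code(code)]
    return count_separated(sle_dates, min_gap_days) >= n_codes

# Hypothetical claims for one patient, for illustration only.
example_claims = [
    ("M32.10", date(2016, 1, 5)),
    ("M32.14", date(2016, 1, 20)),
    ("I10", date(2016, 2, 1)),
    ("m32.9", date(2016, 3, 1)),
]

two_codes_any_time = meets_algorithm(example_claims, n_codes=2)
two_codes_30_days = meets_algorithm(example_claims, n_codes=2, min_gap_days=30)
two_codes_60_days = meets_algorithm(example_claims, n_codes=2, min_gap_days=60)
```

With a gap of zero days every M32.* claim counts, which corresponds to a time-agnostic rule; larger gaps make the same code threshold harder to meet.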
Results Of 1,474 suspected SLE and control volunteers, 701 were confirmed by SLE classification criteria, and 596 controls were confirmed as having no features of connective tissue disease. Of these, 254 confirmed SLE and five control patients were linked to the CMS data set. Eighty-two had one to three ACR criteria for SLE and had alternative diagnoses of undifferentiated connective tissue disease, cutaneous lupus, inflammatory arthritis, mixed connective tissue disease, Sjogren syndrome, systemic sclerosis, drug-induced lupus, sarcoidosis, multiple sclerosis, IgA nephropathy, and juvenile idiopathic arthritis. Similarly, from the random sample of patients seen at MUSC, eight were confirmed as having SLE and eighty-six were confirmed as having no connective tissue disease by manual chart review. The manually validated Medicare SLE cases within the SC SLE longitudinal cohort tended to be younger than the controls (53 ± 15 vs. 64 ± 13, p < 0.001), and the validated cohort tended to be more female (91% vs. 83%, p = 0.02). Reflecting the demographics of SLE and the country, Black patients made up the majority of the SLE patients, while White patients were more represented in controls (75% vs. 27%, respectively, p < 0.001).
The ROC AUC for the SLE algorithms was good for both the time-agnostic and the time-limited algorithms (0.877 (95% confidence interval (CI) 0.838–0.917) vs. 0.881 (0.842–0.921), respectively). The curves for the two algorithms matched except at the 2-diagnosis code cut point (figure 1A). The specificity differed slightly for the 2-diagnosis code algorithm if the codes were separated by 30 days or more (0.768 vs. 0.761).
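For readers less familiar with the ROC AUC, the sketch below shows one standard way to compute it when the score is a per-patient code count: the AUC equals the probability that a randomly chosen case has a higher count than a randomly chosen control, with ties counted as one half. The text does not state how the reported AUCs were computed, so this is only an illustration, and the counts below are hypothetical toy values, not study data.

```python
def roc_auc(case_scores, control_scores):
    """AUC via pairwise comparison of cases and controls
    (equivalent to the Mann-Whitney U statistic divided by n_cases * n_controls)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical numbers of SLE codes per patient, for illustration only.
toy_case_counts = [3, 4, 2, 0]
toy_control_counts = [0, 1, 0, 2]

toy_auc = roc_auc(toy_case_counts, toy_control_counts)
```

An AUC of 0.5 corresponds to a score that does not separate cases from controls, and 1.0 to perfect separation.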
A total of 265 individuals had defined SLE and LN status. Of these, ninety-four had confirmed LN. The ROC AUC for the number of LN codes (M32.14 and M32.15) was 0.898 (95% CI 0.848–0.947), with all but one coded as M32.14. The sensitivity and specificity of a single LN code were 0.798 and 0.994, respectively. Requiring two LN codes reduced the sensitivity to 0.596 but increased the specificity to 1.000. Combining SLE codes with either proteinuria or nephritis codes produced an ROC AUC of 0.529 (0.455–0.603) (figure 1B). Adding 30-day intervals between codes did not improve the specificity of the algorithm but significantly reduced the sensitivity. All individuals with a single LN code had at least two SLE codes. The sensitivity and specificity of coordinates on each curve are detailed in table 1A for SLE and table 1B for LN.
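The coordinates of such a curve are the sensitivity and specificity obtained at each code-count cut point. A minimal sketch of that calculation follows; the function name and the toy data are hypothetical and are not drawn from the study, and the function assumes at least one patient with and one without the disease.

```python
def sensitivity_specificity(code_counts, has_disease, min_codes):
    """Sensitivity and specificity of the rule 'count >= min_codes'."""
    tp = fn = tn = fp = 0
    for count, diseased in zip(code_counts, has_disease):
        predicted = count >= min_codes
        if diseased and predicted:
            tp += 1
        elif diseased:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical LN code counts and chart-review status, for illustration only.
toy_ln_counts = [2, 1, 0, 3, 1, 0, 0, 1, 0, 0]
toy_confirmed_ln = [True, True, True, True, True,
                    False, False, False, False, False]

# One (sensitivity, specificity) coordinate per cut point from one to four codes.
toy_coordinates = {
    k: sensitivity_specificity(toy_ln_counts, toy_confirmed_ln, k)
    for k in (1, 2, 3, 4)
}
```

In this toy example, raising the cut point from one code to two lowers the sensitivity while raising the specificity, the same direction of trade-off described above.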
Conclusions and Discussion For most purposes, an algorithm using two SLE diagnosis codes, either with or without a 30-day interval, was sufficient for defining SLE, while a single LN code and at least two SLE codes had sufficient sensitivity and specificity to define LN. Proteinuria and nephritis codes associated with SLE codes did not improve the specificity but significantly reduced the sensitivity for diagnosing LN and would likely not be helpful in the setting of rheumatology practices where the LN-specific codes are used. It is quite possible that the LN codes would be less sensitive outside the setting of nephrology and rheumatology practices. The Lupus Index allows users to specify coding from rheumatology practices to achieve the specificity for an LN diagnosis that was observed in this study. When less specificity is necessary, inclusion criteria could be loosened to include non-rheumatology providers. These findings allow investigators to select ICD-10 code-based algorithms in a Medicare SLE population to achieve the desired sensitivity and specificity of SLE and LN diagnosis for their specific study. The ability to export data at multiple geographic levels gives investigators the ability to overlay geographic environmental data with outcome data for health disparities research.
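As one hedged illustration of how an investigator might choose among such algorithms, the sketch below picks the most sensitive rule that meets a required specificity. The two rules use the sensitivity and specificity reported in the Results for one and two LN codes; the function name and the selection strategy are assumptions made for illustration, not a procedure described in the study.

```python
# (rule, sensitivity, specificity) as reported in the Results for LN codes.
reported_ln_rules = [
    ("one LN code", 0.798, 0.994),
    ("two LN codes", 0.596, 1.000),
]

def most_sensitive_rule(rules, min_specificity):
    """Return the rule with the highest sensitivity among those whose
    specificity is at least min_specificity, or None if none qualifies."""
    eligible = [rule for rule in rules if rule[2] >= min_specificity]
    if not eligible:
        return None
    return max(eligible, key=lambda rule: rule[1])

# A study tolerating a specificity of 0.99 keeps the single-code rule;
# one requiring no false positives in this sample needs two codes.
choice_relaxed = most_sensitive_rule(reported_ln_rules, min_specificity=0.99)
choice_strict = most_sensitive_rule(reported_ln_rules, min_specificity=1.0)
```

The same selection could be applied to the SLE coordinates in table 1A, which are not reproduced here.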
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.