Original research
Selection of indicators reporting response rate in pharmaceutical trials for systemic lupus erythematosus: preference and relative sensitivity
  1. Jingru Tian1,2,
  2. Shuntong Kang3,
  3. Dingyao Zhang4,5,
  4. Yaqing Huang6,
  5. Xu Yao1,
  6. Ming Zhao1,2 and
  7. Qianjin Lu1,2
  1. 1Institute of Dermatology, Chinese Academy of Medical Sciences and Peking Union Medical College, Nanjing, Jiangsu, China
  2. 2Key Laboratory of Basic and Translational Research on Immune-Mediated Skin Diseases, Chinese Academy of Medical Sciences, Nanjing, Jiangsu, China
  3. 3Department of Dermatology, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
  4. 4Graduate Program in Biological and Biomedical Sciences, Yale University, New Haven, Connecticut, USA
  5. 5Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
  6. 6Department of Pathology, Yale University, New Haven, Connecticut, USA
  1. Correspondence to Dr Qianjin Lu; qianlu5860{at}; Dr Jingru Tian; jingru.tian{at}; Dr Ming Zhao; zhaoming301{at}


Objective SLE is a common multisystem autoimmune disease with chronic inflammation. Many indicators have been proposed for evaluating efficacy in randomised clinical trials (RCTs) for SLE, but their comparability remains unknown. We aim to explore the preference and comparability of indicators reporting response rate and to provide a basis for primary outcome selection when evaluating the efficacy of SLE pharmaceutical treatment.

Methods We systematically searched three databases and three registries to identify pharmacological intervention-controlled SLE RCTs. Relative discriminations between indicators were assessed by the Bayesian hierarchical linear mixed model.

Results 33 RCTs met our inclusion criteria and we compared eight of the most commonly used indicators reporting response rate. SLE Disease Activity Index 4 (SLEDAI-4) and SLE Responder Index 4 were the most recommended indicators reporting response rate for discriminating pharmacological efficacy. Indicator preference was altered by disease severity, classification of drugs and outcome of trials, but SLEDAI-4 had robust discriminating ability for most interventions. Of note, BILAG Index-based Combined Lupus Assessment performed well in trials covering patients of all severities, as well as in non-biologics RCTs. The British Isles Lupus Assessment Group response and Physician’s Global Assessment response were more cautious in evaluating disease changes. Serious adverse event was often applied to evaluate the safety and tolerability of treatments rather than efficacy.

Conclusions The efficacy discrimination ability of indicators is readily influenced by trial characteristics, which highlights the importance of flexibility and comprehensiveness when choosing primary outcome(s). For trials that are evaluated by SLEDAI-4 alone, attention should be paid to outcome interpretation to avoid exaggerating treatment efficacy. Further subgroup analyses are limited by the number of included RCTs.

PROSPERO registration number CRD42022334517.

  • Systemic Lupus Erythematosus
  • Outcome Assessment, Health Care
  • Clinical Trial

Data availability statement

All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • The comparability between indicators reporting response rate in randomised clinical trials of SLE remains unknown. We apply a Bayesian hierarchical linear mixed model to this question and provide advice for primary endpoint selection.


  • Indicator preference is altered by disease severity, classification of drugs and outcome of trials. SLE Disease Activity Index 4 and SLE Responder Index 4 are considered the best recommended indicators reporting response rate.


  • Our findings determine the preference and relative sensitivity of indicators reporting response rate under different circumstances, and highlight the importance of evaluating trial validity using a multidimensional criterion.


Introduction

SLE is an autoimmune disease with diverse clinical manifestations and autoantibodies that predominantly affects females.1 2 The substantial prevalence and chronic course of SLE, combined with the adverse effects of corticosteroid use, have increased the disease burden globally.3–5 The goal of SLE management is remission of systemic symptoms and organ manifestations, which at the very least means the absence of significant symptoms and signs of SLE, but substantial therapeutic needs remain unmet.6 For regular treatment, hydroxychloroquine and glucocorticoids are recommended in all patients with lupus, and appropriate initiation of immunosuppressive agents can expedite the discontinuation of glucocorticoids. Additionally, calcineurin inhibitors, belimumab and rituximab should be considered as add-on therapy in persistently active disease.7 Recently, many innovative and targeted therapies have been proposed, showing promise in disease control even in patients with intractable complications.8 9 Nevertheless, the development and implementation of new SLE therapies have lagged behind those of other autoimmune rheumatic diseases. Change or improvement in the course of SLE is difficult to measure, due in large part to the heterogeneity of the disease, which involves multiple principal domains that vary over time.10 Indicators are important tools for monitoring the performance of drugs and identifying emerging problems. Indicators that accurately reflect intervention-derived benefits are therefore a foundation of the field.
In early 1996, the Systemic Lupus International Collaborating Clinics proposed the need to build a comprehensive assessment that includes disease activity, chronic damage and quality of life for patients with SLE.11 A set of quality indicators for SLE was then published by the European League Against Rheumatism, covering a number of aspects of patient assessment.12 The most frequently applied metrics in randomised clinical trials (RCTs) are the British Isles Lupus Assessment Group (BILAG) and the SLE Disease Activity Index (SLEDAI).13 Currently, composite indices are also used as primary endpoints in clinical trials.

Although many indices are widely used in clinical trials and research, criteria for evaluating efficacy in pharmaceutical clinical trials for SLE have not been unified and recognised yet.14 The preference (ranking of different indicators based on their weight) and relative sensitivity (ability to detect and reflect variations) of indicators between trials with different design, drug format and baseline characteristics may alter final results, mislead researchers and limit the comparability of trial results.15 16 The diversity in the usage of scales underscores the fact that no single indicator has been universally accepted so far. Furthermore, the sensitivity and specificity of these indicators remain uncertain. In addition, the failure of many drugs to meet their primary or secondary endpoints has led to the re-examination of the design of SLE trials.10 17 Accordingly, there is a need to compare indicators within the same population to determine their comparability and preference in different types of RCTs for SLE. Our results determine the relative sensitivity of the indicators reporting response rate under different circumstances and underline the importance of assessing the efficacy of interventions using a multidimensional criterion.

Materials and methods

Study design

This systematic review and meta-analysis was prospectively registered on PROSPERO (ID: CRD42022334517), and reported as per the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.18

Search strategy

Two investigators (JT and DZ) independently searched published articles and clinical trial registry records, appraised studies for eligibility and extracted data. The search for RCTs included published articles from peer-reviewed English-language journals and registered trials in clinical trial registries, from inception to 4 May 2021. Three databases, that is, PubMed, EMBASE and Cochrane Library Central Register of Controlled Trials (CENTRAL), were systematically searched, and search strategies were adjusted to meet the specifications of each database. The search was supplemented by manual review of the reference lists of included publications and relevant reviews. Records of registered RCTs were collected from three publicly available web-based clinical trial registries: the registry of the US National Library of Medicine, the International Standard Randomised Controlled Trial Number Register and the Australian and New Zealand Clinical Trials Registry. The keyword search term “lupus” was combined with other filtering options in the advanced search function, such as ‘Country’, ‘Study type’ and ‘Current status’, to identify eligible RCTs. Only studies that contained two or more specific outcome indices reporting response rate were included. Discrepancies were discussed and resolved by consensus. Detailed search strategies, study selection and screening and data extraction methods are provided in online supplemental appendices 1–4.


We studied the eight most commonly used SLE disease activity assessment tools reporting response rate: three indicators based on the SLE Responder Index (SRI), namely SRI-4, SRI-5 and SRI-6; BILAG Index-based Combined Lupus Assessment (BICLA); serious adverse events (SAE); SLEDAI-4 (≥4-point improvement from baseline using SLEDAI); BILAG response (no worsening in BILAG index from baseline); and Physician’s Global Assessment (PGA) response (no worsening in PGA from baseline). Details of these indicators are shown in online supplemental appendix 5. The outcome of interest was the percentage change between intervention and control groups.
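
As a minimal sketch of the outcome described above, the following Python snippet computes the difference in response rate between arms for a single indicator. The function names and the arm-level counts are hypothetical and are not taken from any included trial; the sign convention (control minus intervention) follows the one stated in the Data analysis section.

```python
def response_rate(responders: int, total: int) -> float:
    """Proportion of participants classified as responders in one arm."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= responders <= total:
        raise ValueError("responders must lie between 0 and total")
    return responders / total


def rate_difference(control: tuple[int, int], intervention: tuple[int, int]) -> float:
    """Control minus intervention response rate, in percentage points.

    Each argument is (responders, participants). With this sign convention,
    a more negative value means a higher response rate in the intervention arm.
    """
    return 100.0 * (response_rate(*control) - response_rate(*intervention))


# Hypothetical arm-level counts, for illustration only.
example = rate_difference(control=(40, 100), intervention=(55, 100))
```

Here `example` is about -15 percentage points, that is, the intervention arm has the higher response rate on this indicator.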

Data analysis

To remove the influence of other factors, a Bayesian hierarchical linear mixed model, regarded as the gold standard for sparse and heterogeneous data,19–21 was applied to estimate the difference between the control and intervention groups and thereby obtain the relative sensitivity and preference of outcome indicators in SLE. In the hierarchical model, we calculated the percentage change (control group response probability minus intervention group response probability) for discrete groups. The statistical analysis was implemented with the brms package in R (V.4.0.5), using 8000 iterations and four chains. This package uses Hamiltonian Monte Carlo, a Markov chain Monte Carlo method, to estimate the posterior distribution. The model had three predictor covariates with fixed effects: topical or systemic application, age and disease severity. The intervention and the type of intervention had a hierarchical relationship in our model. Although there was variation in the variables, the difference between each index variable was stable in each chain.
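
One way to write down a model of this kind is sketched below. This is an illustrative form under stated assumptions, not the authors' exact specification: the Gaussian likelihood, the symbols and the way the intervention is nested within intervention type are all assumptions introduced here. Let $y_{ik}$ be the control-minus-intervention difference in response rate for indicator $k$ in trial $i$, let $d(i)$ be the intervention tested in trial $i$ and let $t(d)$ be its type.

```latex
\begin{aligned}
y_{ik} &\sim \mathcal{N}\left(\mu_{ik},\ \sigma^{2}\right), \\
\mu_{ik} &= \alpha + \gamma_{k}
          + \beta_{1}\,\mathrm{application}_{i}
          + \beta_{2}\,\mathrm{age}_{i}
          + \beta_{3}\,\mathrm{severity}_{i}
          + u_{t(d(i))} + v_{d(i)}, \\
u_{t} &\sim \mathcal{N}\left(0,\ \tau_{u}^{2}\right), \qquad
v_{d} \sim \mathcal{N}\left(0,\ \tau_{v}^{2}\right).
\end{aligned}
```

Under this sketch, $\gamma_{k}$ is the indicator-specific effect, the $\beta$ terms are the three fixed-effect covariates, and $u$ and $v$ are group-level effects for intervention type and for intervention within type. The relative sensitivity of two indicators $k$ and $k'$ would then be summarised by the posterior distribution of $\gamma_{k} - \gamma_{k'}$.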

Subgroup analyses by topical or systemic application, age, disease severity and unsuccessful trials were conducted using the same Bayesian hierarchical linear mixed model. These analyses removed distortions in indicator preference caused by differences in participant characteristics, routes of application and intervention efficacy, and further demonstrated the sensitivity of the different SLE outcome indicators. Detailed methods and results are listed in online supplemental appendices 6 and 7.

Quality assessment

The risk of bias for individual studies was assessed independently by two investigators (JT and SK) according to the Cochrane Risk of Bias 2.0 tool22 for RCTs, and disagreements were resolved by discussion.

Patient and public involvement

No patients were involved in the design, conduct or reporting of this research owing to the nature of the study as a systematic review. Ethics approval was not required for this study.


Results

Overview of indicators in pharmacological intervention-controlled RCTs for SLE

The characteristics of the 33 included studies are summarised in online supplemental appendices 8 and 9; the most used indicator was SRI-4 (81.8%, 27). A total of 97.0% of the included studies were judged as having a low or medium risk of bias (online supplemental appendix 10). The network plot of indicator comparisons is presented in figure 1, with nodes representing competing indicators and edges representing RCTs for pairs of indicators. Trials were grouped in three ways: by disease severity, by type of intervention and by outcome of the trial. The majority of trials covered patients with moderate-to-severe disease (84.8%, 28), and only five RCTs (15.2%) included patients of all severities. By pharmaceutical intervention, 21 RCTs (63.6%) tested antibodies, 10 (30.3%) small molecules and 2 (6.1%) non-biologics. Moreover, 17 RCTs (51.5%) concluded that the pharmacological intervention was not effective and 16 RCTs (48.5%) yielded effective results, so the two proportions were similar. No obvious difference was found when assessing indicators among RCTs that examined less effective medications, different intervention types or participants with different characteristics.

Figure 1

Network of eligible comparisons for efficacy evaluation indicators. The size of the nodes (purple circles) corresponds to the number of trials. Comparisons are linked with a line, the thickness of which corresponds to the number of trials that assessed the comparison. BICLA, BILAG Index-based Combined Lupus Assessment; BILAG, British Isles Lupus Assessment Group; PGA, Physician’s Global Assessment; SAE, serious adverse event; SLEDAI, SLE Disease Activity Index; SRI, SLE Responder Index.
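
The quantities behind such a network plot can be derived from a simple table of which indicators each trial reports. The following Python sketch is an illustration only: the trial names and indicator sets are hypothetical, and `build_network` is a helper introduced here rather than part of the published analysis.

```python
from collections import Counter
from itertools import combinations


def build_network(trials: dict[str, set[str]]) -> tuple[Counter, Counter]:
    """Return (node_sizes, edge_weights) for an indicator co-reporting network.

    node_sizes[indicator] = number of trials reporting that indicator
    edge_weights[(a, b)]  = number of trials reporting both a and b
    """
    node_sizes: Counter = Counter()
    edge_weights: Counter = Counter()
    for indicators in trials.values():
        node_sizes.update(indicators)
        # Sort so that each unordered pair maps to a single key.
        for pair in combinations(sorted(indicators), 2):
            edge_weights[pair] += 1
    return node_sizes, edge_weights


# Hypothetical trials, for illustration only.
trials = {
    "trial_A": {"SRI-4", "SLEDAI-4", "BICLA"},
    "trial_B": {"SRI-4", "SLEDAI-4"},
    "trial_C": {"SRI-4", "SAE"},
}
nodes, edges = build_network(trials)
```

In this toy input, SRI-4 is the largest node (three trials) and the SLEDAI-4 to SRI-4 edge is the thickest (two trials).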

Relative sensitivity and preference of indicators reporting response rate in pharmacological intervention-controlled RCTs for SLE

The overall preference of indicators was evaluated by the Bayesian model, taking into account the influence of topical or systemic application, age and disease severity (online supplemental appendix 11). Since the estimate for each indicator was calculated as its control group probability minus its intervention group probability, a larger difference between two indicators represented a relatively better discrimination ability of the first indicator. All results are presented as weighted mean differences with corresponding 95% uncertainty intervals. A difference was considered statistically significant if its 95% uncertainty interval did not include the null value. On this basis, SLEDAI-4 was the best indicator, with a significantly higher response rate in intervention groups than in control groups compared with BILAG response, PGA response and SAE; that is, for the same participants, SLEDAI-4 was more likely than other indicators to uncover the effectiveness of pharmacological interventions. SRI-4 was the second preferred indicator, followed by SRI-6, SRI-5 and BICLA in descending order, all of which significantly outperformed SAE. In contrast, SAE performed worst, with statistically significant differences from BICLA, SLEDAI-4, SRI-4, SRI-5 and SRI-6, meaning that it could barely reflect the discrepancies between pharmacological interventions and controls. BILAG response was the second worst indicator and PGA response the third worst; both showed a significantly smaller advantage of intervention over control than SLEDAI-4 (figures 2 and 3).
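
The significance rule described above (a 95% uncertainty interval that excludes the null value) can be illustrated with posterior draws. The Python sketch below uses simulated draws as a stand-in for real posterior samples; the values, the unweighted mean and the helper names `quantile` and `compare` are assumptions made for illustration and do not reproduce the published estimates.

```python
import random


def quantile(sorted_values: list[float], q: float) -> float:
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def compare(draws_a: list[float], draws_b: list[float]) -> dict:
    """Summarise the draw-wise difference a - b with a 95% interval."""
    diffs = sorted(a - b for a, b in zip(draws_a, draws_b))
    low, high = quantile(diffs, 0.025), quantile(diffs, 0.975)
    return {
        "mean": sum(diffs) / len(diffs),
        "interval": (low, high),
        # The null value is 0: if it is excluded, the two indicators differ.
        "excludes_null": low > 0 or high < 0,
    }


# Simulated draws standing in for posterior samples (hypothetical values,
# on the control-minus-intervention scale in percentage points).
rng = random.Random(0)
indicator_a = [rng.gauss(-12.0, 2.0) for _ in range(4000)]
indicator_b = [rng.gauss(-1.0, 2.0) for _ in range(4000)]
result = compare(indicator_a, indicator_b)
```

With these simulated inputs the interval for the difference lies entirely below zero, so the two hypothetical indicators would be declared significantly different under the rule above.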

Figure 2

Bayesian hierarchical linear mixed model estimated effectiveness with 95% uncertainty intervals on indicators reporting response rate in pharmacological intervention-controlled randomised clinical trials (RCTs) for SLE. BICLA, BILAG Index-based Combined Lupus Assessment; BILAG, British Isles Lupus Assessment Group; PGA, Physician’s Global Assessment; SAE, serious adverse event; SLEDAI, SLE Disease Activity Index; SRI, SLE Responder Index.

Figure 3

Preference of indicators reporting response rate in pharmacological intervention-controlled randomised clinical trials (RCTs) for SLE. The rank of indicators reporting response rate. The sooner an indicator reaches 1, the stronger the ability to discriminate treatment efficacy. BICLA, BILAG Index-based Combined Lupus Assessment; BILAG, British Isles Lupus Assessment Group; PGA, Physician’s Global Assessment; SAE, serious adverse event; SLEDAI, SLE Disease Activity Index; SRI, SLE Responder Index.
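
One reading of this caption is that each curve shows the cumulative probability that an indicator is ranked at or above a given position. Under that assumption, such curves can be computed from posterior draws as in the Python sketch below. The draws are hypothetical, and the scores are assumed to be oriented so that larger values mean better discrimination (for example, intervention minus control, which is the reverse of the sign convention used in the model).

```python
def cumulative_rank_probabilities(draws: dict[str, list[float]]) -> dict[str, list[float]]:
    """For each indicator, P(rank <= r) for r = 1..K across posterior draws.

    Within each draw, rank 1 goes to the largest score, so scores must be
    oriented so that larger means better discrimination.
    """
    names = list(draws)
    n_draws = len(next(iter(draws.values())))
    if any(len(values) != n_draws for values in draws.values()):
        raise ValueError("all indicators need the same number of draws")
    counts = {name: [0] * len(names) for name in names}
    for i in range(n_draws):
        ordered = sorted(names, key=lambda name: draws[name][i], reverse=True)
        for rank_index, name in enumerate(ordered):
            counts[name][rank_index] += 1
    curves = {}
    for name in names:
        running, curve = 0, []
        for count in counts[name]:
            running += count
            curve.append(running / n_draws)
        curves[name] = curve
    return curves


# Hypothetical draws (three indicators, four draws), for illustration only.
draws = {
    "A": [5.0, 6.0, 4.0, 7.0],
    "B": [3.0, 7.0, 2.0, 1.0],
    "C": [1.0, 2.0, 3.0, 0.5],
}
curves = cumulative_rank_probabilities(draws)
```

In this toy example the curve for indicator A reaches 1 at the second rank, earlier than B or C, which corresponds to the strongest discrimination ability under the caption's rule.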

Subgroup analyses of relative sensitivity and preference of indicators reporting response rate in pharmacological intervention-controlled RCTs for SLE

The preference for indicators also changed when evaluating groups with different disease severity, intervention type or trial outcome. The sensitivity of SLEDAI-4 was attenuated when evaluating and comparing treatment efficacy in participants with moderate-to-severe SLE. SLEDAI-4 was comparable to SRI-4, SRI-5 and SRI-6 and significantly better than SAE, while SAE still showed limited discrimination ability and was significantly worse than all other indicators except BILAG response and PGA response. The remaining indicators were not significantly different (online supplemental figure S5). In trials enrolling patients of all severities, SLEDAI-4 tended to be more powerful than the other indicators, although without statistical significance. Notably, BICLA emerged as a recommended indicator along with SRI-4, and SAE still lagged behind (online supplemental figure S6).

Moreover, SLEDAI-4 was also the most powerful indicator in the evaluation of antibody interventions, being significantly superior to BILAG response and SAE. SRI-4 showed a slight, non-dominant advantage over SRI-5 and SRI-6, which tied with each other. BICLA ranked next and differed significantly from SAE. SAE remained significantly the least effective indicator compared with all other indicators except BILAG response. BILAG response was the second worst indicator and PGA response the third worst; they differed significantly from SLEDAI-4 and SAE, respectively (online supplemental figure S7). For small molecules, all indicators were comparable and no obvious difference was observed, although SLEDAI-4 and SRI-4 appeared to be preferred (online supplemental figure S8). For non-biologic interventions, there was also no clear superiority or inferiority among the indicators, but BICLA and SLEDAI-4 were relatively more sensitive (online supplemental figure S9).

When evaluating the efficacy of successful RCTs, SAE again performed significantly worse than the other indicators, and BILAG response was significantly less preferred for measuring intervention efficacy compared with SAE and SLEDAI-4. In contrast, SLEDAI-4 had significantly better discrimination ability than SAE and BILAG response, and SRI-4 was also significantly preferred over SAE. SLEDAI-4 and SRI-4 were comparable in successful SLE trials. SRI-5 and SRI-6 performed equivalently to each other when compared with SAE. Although the differences were not statistically significant, BICLA and PGA response were also comparable (online supplemental figure S10). Seventeen unsuccessful RCTs were further analysed, and none of the indicators had robust efficacy-discriminating ability for interventions that brought minor benefit. However, by rank of sensitivity, SLEDAI-4 was still the leading indicator for revealing minimal benefits of pharmacological interventions. SRI-4, SRI-5, SRI-6, BICLA, BILAG response and PGA response had comparable tendencies to uncover intervention effectiveness, although the differences were not significant (online supplemental figure S11).


Discussion

Precision and accuracy in defining SLE disease activity have improved over the past 30 years, and optimal indicators need to be cost-effective and robust in discriminating performance that correlates with the outcome of interest.10 23 For the first time, our study outlines the protocol for a Bayesian hierarchical linear mixed model designed to identify the most suitable indicators for SLE intervention assessment. SLEDAI-4 was the most valid indicator for nearly all types of pharmacological RCTs of SLE, and other indicators were recommended alongside it in specific subgroups defined by disease severity, intervention type or trial outcome. In contrast, SAE proved to be the least preferred indicator for efficacy discrimination under different circumstances. Our recommendations for the selection of primary outcome indicator(s) in future SLE RCTs are provided in table 1.

Table 1

Recommendations for the selection of response rate indicators as primary outcome of RCTs for SLE

Notably, the primary outcome plays a dominant role in the statistical determination of intervention efficacy in clinical trials.24 25 After SRI-related indices were proposed, numerous RCTs adopted them as the preferred primary outcome.15 Interestingly, the efficacy-reflecting ability of SRI-4 was not outstanding, while SLEDAI-4, a component of the SRI criteria, was found to be the most sensitive indicator in our study. Similarly, an analysis of the phase III belimumab trials found that the main contributor to SRI-4 was the improvement in SLEDAI alone, which was sufficient to discern improvement in most cases.26 Approximately one-third of included trials had SLEDAI-4 as a secondary outcome, but few took it as a primary outcome. We recommend that new trials focused on revealing drug efficacy consider SLEDAI-4 as a primary outcome indicator to avoid false negatives.27 Meanwhile, choosing SLEDAI-4 as the only outcome indicator might lead to overestimates of treatment benefits, so cautious interpretation is needed.28 Furthermore, reduction of background therapy (especially glucocorticoids) and rigorous requirements for trial sites would help to maximise the possibility of developing successful therapies.17

SRI-5 and SRI-6 were comparable most of the time, so only one of them needs to be selected as an outcome indicator to avoid redundancy in trial design. PGA response and BILAG response were less preferred, suggesting that they are more cautious in evaluating disease changes. Owing to their low discriminating ability and the complexity of their criteria, neither is suggested for routine use except as a supplement to SLEDAI-4 to obtain SRI-4. Though most trials demonstrated that the two composite response indices (SRI-4 and BICLA) were synergistic in terms of efficacy identification,29–31 a prior study noted that SRI-4 was more sensitive in patients with moderate-to-severe SLE.32 Similarly, based on our analysis, we recommend SRI-4 rather than BICLA in patients with moderate-to-severe SLE, while in trials covering patients of all severities these two indicators were comparable.

Further detailed subgroup analysis was limited by the insufficient number of trials, and the results need careful interpretation owing to the limitations of this study. As the most sensitive indicator is accompanied by more false positives, balanced indicator selection is always necessary. Immunological and clinical biomarkers also play an essential role in improving the diagnosis, assessment and control of SLE; combining those indices could provide a more comprehensive assessment of disease status in patients with SLE.33 Current indicators struggle to distinguish between responders and non-responders in SLE. Despite efforts in clinical trials such as the Exploratory Phase II/III SLE Evaluation of Rituximab (EXPLORER), Belimumab International Systemic Lupus Erythematosus (BLISS)-52 and BLISS-76, results have been inconsistent.34 In response, there is a shift towards alternative measures. ‘Treat to target’ endpoints focusing on low disease activity and remission have been introduced.35 The Treatment Response Measure for SLE Taskforce was formed to create a multidomain clinical outcome measure for SLE trials, which can cover organ-specific manifestations such as lupus nephritis, symptoms such as rashes and findings from laboratory tests.36 Additionally, the Lupus Foundation of America Rapid Evaluation of Activity in Lupus provides comprehensive lupus activity evaluations from both patient and clinician viewpoints.37 Moreover, SLE encompasses multidimensional issues such as physical, psychological and socioeconomic burden. Treatments for SLE are directed at prolonging patients’ survival, preventing organ damage and flares and optimising health-related quality of life (HRQoL). Therefore, HRQoL should be highlighted, as it offers the patients’ perspective on the disease and the impact of treatment on daily life. HRQoL can be measured by Lupus Patient-Reported Outcome, Lupus Quality of Life, EuroQol-5D, Short Form 36 Health Survey, etc.10 38 Additionally, the evaluation ability of indices reporting score change could be explored further.

In summary, given the problems encountered in previous unsuccessful clinical trials, it is imperative to evaluate and demonstrate the therapeutic advantages of pharmacological interventions. Our results provide evidence for selecting indicators reporting response rate as primary outcome(s) in SLE RCTs and will help to propose and adopt better trial designs. SLEDAI-4, which had the highest relative sensitivity, is the most objective indicator for this complex condition, and SRI-4 should also be considered. Comprehensive assessments together with other types of indicators are also essential. For trials that are evaluated by SLEDAI-4 alone, attention should be paid to the interpretation of outcomes to avoid exaggerating treatment efficacy.

Ethics statements

Patient consent for publication

Ethics approval

Ethics approval was not applicable, since patients were not involved in the design of this protocol.



  • JT and SK contributed equally.

  • Contributors QL and JT conceived the study. JT and QL developed the protocol. JT and DZ did the literature search. SK and JT appraised the study quality and extracted and analysed the data. DZ was in charge of computation and coding. JT, SK, DZ, MZ, XY and QL interpreted the data. JT and SK wrote the first draft of the article. XY and YH revised the first draft of the article. QL reviewed and critically evaluated the draft paper. QL is responsible for the overall content as the guarantor.

  • Funding CAMS Innovation Fund for Medical Sciences (2021-I2M-1-059), Non-profit Central Research Institute Fund of Chinese Academy of Medical Sciences (2021-RC320-001, 2020-RC320-003), National Natural Science Foundation of China (81830097, 82203933).

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.