SLE is a complex autoimmune disease with considerable unmet need. Numerous clinical trials designed to investigate novel therapies are actively enrolling patients, straining limited resources and creating inefficiencies that increase enrolment challenges. This has motivated investigators developing novel drugs and treatment strategies to consider innovative trial designs that aim to improve the efficiency of generating evidence; these strategies propose conducting fewer trials, involving smaller numbers of patients, while maintaining scientific rigour in safety and efficacy data collection and analysis. In this review we present the adaptive design of two innovative phase IIb studies investigating efavaleukin alfa and rozibafusp alfa for the treatment of SLE. This design, as applied to efavaleukin alfa, was selected as a case study in the Food and Drug Administration’s Complex Innovative Trial Design Pilot Program. The adaptive design approach includes prospectively planned modifications at predefined interim timepoints. Interim assessments of futility allow a trial to end early when the investigational therapy is unlikely to provide meaningful treatment benefits to patients, which can release eligible patients to participate in other, potentially more promising, trials or to seek alternative treatments. Response-adaptive randomisation allows randomisation ratios to change based on accumulating data, in favour of the more efficacious dose arm(s), while the study is ongoing. The placebo arm allocation ratio is held constant throughout the trial. These design elements can improve the statistical power in the estimation of treatment effect and increase the amount of safety and efficacy data collected for the optimal dose(s). Furthermore, these trials can provide the required evidence to potentially serve as one of two confirmatory trials needed for regulatory approval. This can reduce the need for multiple phase III trials, the total patient requirements, person-exposure risk, and ultimately the time and cost of investigational drug development programmes.
- Adaptive Clinical Trial Design
- Lupus Erythematosus, Systemic
- Autoimmune Diseases
Data availability statement
Data are available upon reasonable request. Qualified researchers may request data from Amgen studies. Complete details are available at the following: https://wwwext.amgen.com/science/clinical-trials/clinical-data-transparency-practices/
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
What is already known on this topic
This paper reviews current challenges in SLE drug development: a robust pipeline of potential therapies has created resource limitations, including a paucity of eligible patients, that hinder drug development.
Despite the well documented need for innovative clinical trial approaches in SLE, few demonstrative examples of adaptive design studies have been reported.
What this study adds
An adaptive clinical trial design is introduced that was developed for two phase II studies investigating two potential SLE therapies sponsored by Amgen.
This design was selected as a case study in the Food and Drug Administration’s Complex Innovative Trial Design Pilot Program and incorporates response-adaptive randomisation (RAR), where the randomisation ratio can adapt based on accumulating data to favour the more efficacious dose arm(s) while the trial is ongoing.
In describing this design, a few of the statistical simulations that were performed to inform the RAR methodology are explained and illustrated.
How this study might affect research, practice or policy
Innovative approaches to trial design may relieve resource limitations through more efficient use of trial data, provide substantial evidence of efficacy of an investigational product, and yield ethical benefits for participating patients.
SLE is a complex, chronic, systemic autoimmune disease that can be disabling and life-threatening.1–4 A few therapies have recently been approved for SLE despite a history of disappointing results from trials for many other agents.5–10 Not all patients with SLE respond to approved treatments, and the availability of new therapeutic options remains a substantial unmet need.11–13 To help address this need, there is a robust pipeline of innovative therapies for SLE currently in development.14–17 While this increases the likelihood of identifying more viable treatment options for patients, the parallel development programmes require formidable numbers of eligible patients with SLE. Patients often participate for many months or years, during which time they are ineligible for participation in other trials.18
Strict eligibility criteria in SLE trials result in limited numbers of trial-eligible patients. Competition for these patients leads to slow enrolment for all programmes.19 Recruitment is often restricted to participants with disease activity that is sufficient to permit demonstration of treatment efficacy, yet not severe enough to require excessive use of oral corticosteroids or other potent immunosuppressive therapies.20 In addition, disease activity must be sufficiently stable to minimise the need for medication adjustments, which can exacerbate the already problematic confounding of treatment effect estimates by background therapies.13 21 Entry criteria have evolved in an attempt to ensure safe participation of patients in a placebo-controlled randomised clinical trial. All this results in opposing factors at play in ensuring significant but stable disease in patients at trial entry. For example, use of adequate, but not excessive, background treatments is necessary so that patients receiving placebo, or those who do not respond to the study medication, are neither endangered nor overtreated. Adding to these recruitment challenges is the requirement for a significant time commitment from participants in order to reach scientific conclusions about both safety and efficacy of new treatment options.19 22 23 Likewise, slow recruitment due to a pandemic, war or political unrest further underscores the need for smaller and more efficient trials that require fewer patients.24 25
In addition to competition for patients, the large number of concurrent clinical trials also creates intense competition for sites with SLE trial experience to participate in these programmes. Measurement of disease activity, to evaluate eligibility for trial entry and response to therapy, is particularly complex in patients with SLE because of fluctuating disease activity over time and involvement of multiple organ systems.26 27 Experienced evaluators are needed to ensure accuracy in disease activity measurements. Sites with insufficiently trained staff can enrol patients who do not meet entry criteria; as a result, enrolment may not align with the expected number of qualified patients or may vary considerably between sites. When it becomes necessary to use trial sites where investigators’ training is cursory, trial data are bound to be compromised, profoundly affecting their interpretability. High demand can also overwhelm experienced sites to the extent that they cannot participate in all of the programmes that need them.13 22
Adaptive designs, Bayesian statistical models, and other novel trial design elements are efficient and resource-sparing strategies that, when appropriately introduced, do not compromise trial integrity or validity of trial results and may help to address some of the serious challenges faced in drug development in SLE.28 29 Adaptive clinical trial designs allow for prospectively planned modifications in one or more aspects of the design based on accumulating data and interim analyses.28–31 Leveraging data from early in the trial to adapt study conduct can lead to swift conclusions of futility (or success), enabling patients to participate in other trials, and researchers to modify the size of one or more dosing groups, or reduce or increase patients’ exposure to dosages based on emerging efficacy or tolerability data.
The US Food and Drug Administration (FDA) and European Medicines Agency both agree that adaptive trial designs can be used to generate meaningful data on safety and efficacy for investigational treatments.28 30 32 The FDA has initiated efforts focused on advancing complex innovative designs (CIDs), including adaptive trials, to modernise drug development, improve efficiency and promote innovation.33 Moreover, a large group of clinical investigators and treatment developers convened by the Lupus Foundation of America proposed adaptive trials as a potential strategy to overcome existing barriers to SLE drug development.13 Despite these calls for innovative trial designs to remedy unmet needs in treatment development in rheumatology, including SLE, few examples of adaptive clinical trials in SLE have been published.34–36
In this review we describe a phase IIb adaptive trial design selected as a case study in the FDA’s CID Pilot Program.37 Regulatory requirements and methodology adopted for evaluating the statistical rigour of this CID trial design as a registrational-quality study will also be presented.
Adaptive design of the phase IIb trials for rozibafusp alfa and efavaleukin alfa in SLE
An adaptive design is being used in two ongoing phase IIb double-blind, placebo-controlled, dose-ranging, multicentre studies at Amgen evaluating the safety and efficacy of rozibafusp alfa (AMG 570) (NCT04058028) and efavaleukin alfa (AMG 592) (NCT04680637). Rozibafusp alfa is a novel bispecific antibody-peptide conjugate that simultaneously blocks inducible costimulator ligand and B cell activating factor activity38 and efavaleukin alfa is an interleukin (IL) 2 mutein fragment crystalisable (Fc) fusion protein that induces selective expansion of regulatory T cells.39
The primary objective of each of these phase IIb studies is to assess the efficacy of three dose levels of the investigational product versus placebo (figure 1). The studies are also designed to support identification of the optimal dose for a subsequent phase III confirmatory trial. Each study plans to separately enrol 320 patients, 18–75 years of age, with active SLE despite standard of care (SOC) therapy (oral corticosteroids and other immunosuppressants and/or immunomodulators). The primary endpoint in each study is the SLE Responder Index (SRI)-4 response,40 a composite endpoint composed of hybrid SLE Disease Activity Index improvement of 4 points or more from baseline, no worsening of 0.3 points or more on the Physician’s Global Assessment, and no new severe organ score on the British Isles Lupus Assessment Group (BILAG) Index or more than one new BILAG moderate organ activity score. Traditional phase II dose-finding studies often evaluate treatment through 24 weeks; in these phase IIb studies, the primary endpoint is evaluated at week 52, the timepoint required by regulators. This will allow the study to potentially serve as one of two required registrational studies, reducing the overall development programme from three independent studies (one phase II and two phase III) to two. Interim analyses are iterative, beginning after the first 40 patients are randomised and have had the opportunity to complete the week 24 assessment (figure 1). Subsequent interim analyses are planned after every additional 32 patients are randomised and have had the opportunity to complete the week 24 assessment, until full enrolment is achieved or until futility is determined for all doses at an interim analysis. At each interim analysis, prior to the study being fully enrolled, available SRI-4 response data will be analysed to inform response-adaptive randomisation (RAR) and, beginning at the second interim analysis, futility.
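The composite SRI-4 definition given above can be written as a simple responder rule. The following is a minimal sketch only: the `Visit` record and its field names are hypothetical, and it assumes that changes from baseline and counts of new BILAG scores have already been derived for each patient.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Hypothetical per-patient summary at an assessment visit."""
    sledai_change: float     # hybrid SLEDAI change from baseline (negative = improvement)
    pga_change: float        # Physician's Global Assessment change from baseline
    new_bilag_severe: int    # number of new severe BILAG organ scores
    new_bilag_moderate: int  # number of new moderate BILAG organ scores


def is_sri4_responder(v: Visit) -> bool:
    """Apply the three SRI-4 components as described in the text."""
    improved = v.sledai_change <= -4                 # improvement of 4 points or more
    pga_stable = v.pga_change < 0.3                  # no worsening of 0.3 points or more
    bilag_ok = v.new_bilag_severe == 0 and v.new_bilag_moderate <= 1
    return improved and pga_stable and bilag_ok
```

All three components must hold for a patient to count as a responder, which is what makes the primary endpoint binary.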
At the last interim analysis, after all 320 enrolled patients have had the opportunity to complete week 24, the results based on available data may trigger early planning activities of the subsequent phase III studies. This final interim analysis has no impact on the conduct of the current phase IIb trials, but when there is convincing evidence of efficacy, there is opportunity to expedite planning and operational activities—reducing the time gap between phase II and phase III.
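The interim analysis cadence described above (a first look at 40 patients, further looks every 32 patients, and a last look at full enrolment of 320) can be enumerated with a short sketch. The function name and the handling of the last look are illustrative assumptions; each count refers to patients who have had the opportunity to complete week 24.

```python
def interim_schedule(first=40, step=32, total=320):
    """Return the patient counts that trigger each interim analysis."""
    looks = list(range(first, total, step))  # looks before full enrolment
    looks.append(total)                      # last interim analysis at full enrolment
    return looks


schedule = interim_schedule()
adaptive_looks = schedule[:-1]    # looks that can still inform RAR
futility_looks = schedule[1:-1]   # futility starts at the second interim analysis
```

Under these assumptions, the looks that precede full enrolment are the only ones that can change randomisation, and futility is first assessed at the 72-patient look.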
RAR is one of the adaptive features included in the phase IIb trials. With RAR, the initial treatment allocation ratio (1:1:1:1) can be modified after the study starts at predefined timepoints. These timepoints are prospectively chosen to ensure adequate data collection to provide differentiating information between doses. In these phase IIb trials, RAR is first implemented after 40 subjects have been enrolled and have had the opportunity to complete 24 weeks on study. It is subsequently implemented every additional 32 subjects until the study is fully enrolled. The randomisation ratio modifications are based on interim analyses of available clinical efficacy data to identify the most efficacious dose(s). RAR would then update the randomisation ratio to allocate subsequent active treatment patients to the more efficacious dose(s) while maintaining the placebo allocation constant at 25%. Anticipating that the placebo arm will have the lowest response rate, maintaining a 25% allocation ratio will preserve statistical power by preventing a reduction in the number of patients randomised to this arm.
The statistical model used to update the randomisation allocation probabilities, as well as the cadence by which the randomisation ratio will be modified, are prespecified in the protocol. RAR is implemented early to maximise the number of patients who benefit from an optimised randomisation scheme; it is also applied often to ensure that early changes continue to be appropriate throughout the course of the trial. An example, using simulated clinical trial data, illustrates how the observed efficacy response rates at each interim analysis inform updates to the randomisation ratios (figure 2). In this simulated trial, the high dose was assumed to be the most efficacious overall, and its early performance in the simulation reflected a linear dose response. Early interim analyses showed that the low dose had the lowest response rates, and the programme began randomising more patients to higher doses. As more data accrued, this conclusion was re-evaluated at subsequent interim analyses, and the high dose was consistently demonstrating the best efficacy and received the highest allocation ratio among the three active doses. Based on the results of the fifth interim analysis, almost all incoming subjects following this timepoint were allocated to the high dose or placebo, with minimal enrolment to the medium dose. The changes happen without the study team’s involvement or knowledge and can adapt, guided by data, according to a prespecified algorithm.
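To make the mechanics of a RAR update concrete, the sketch below recomputes allocation probabilities at one interim analysis. It is not the prespecified protocol model: it assumes independent Beta(1, 1) priors on each dose's response rate and allocates the active share in proportion to the posterior probability that each dose is the best, with placebo fixed at 25%. The function name, prior and allocation rule are illustrative assumptions.

```python
import random


def rar_allocation(responders, enrolled, placebo_share=0.25, draws=20000, seed=1):
    """Return updated allocation probabilities for each dose and placebo.

    responders / enrolled: dicts keyed by dose name (active arms only).
    """
    rng = random.Random(seed)
    doses = list(responders)
    wins = {d: 0 for d in doses}
    for _ in range(draws):
        # One posterior draw of each dose's response rate (Beta-Binomial model).
        sampled = {
            d: rng.betavariate(1 + responders[d], 1 + enrolled[d] - responders[d])
            for d in doses
        }
        wins[max(sampled, key=sampled.get)] += 1
    # Active share is split by the probability of being the best dose.
    alloc = {d: (1 - placebo_share) * wins[d] / draws for d in doses}
    alloc["placebo"] = placebo_share  # placebo allocation held constant
    return alloc


alloc = rar_allocation(
    responders={"low": 2, "medium": 4, "high": 7},
    enrolled={"low": 10, "medium": 10, "high": 10},
)
```

With these made-up interim counts, most of the active allocation moves to the high dose, while the placebo share is unchanged. Repeating the update at each interim analysis lets the ratio shift back if later data contradict an early signal.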
Increasing the number of patients randomised to the more efficacious treatment arm improves the power to detect a difference versus placebo by increasing the sample size assigned to that arm, and it also increases the available safety data for that dose. This results in a more robust data set to support the evaluation and the ultimate choice of dose advanced to the subsequent phase III trial.41 Unlike trials with fixed, equal randomisation, RAR can increase the probability that an individual patient is randomised to an efficacious treatment arm.41 42 Patient and physician perception of trial participation may therefore be more favourable, which may increase the incentive to enrol.28 43
A key objective of dose-ranging studies is to characterise the dose-response model across tested doses. It is recognised that RAR may identify a highly efficacious dose very early and allocate too few subjects to the other, less effective dose groups, creating challenges in fully characterising the dose-response curve. Therefore, in these two studies, the first RAR occurs after a sufficient run-in period, once the first 40 subjects have had the opportunity to complete week 24, which ensures that a minimum number of subjects are randomised to each of the three tested doses.
Another important feature of the proposed phase IIb study design is the evaluation of futility at the planned interim analyses. Futility is assessed beginning with the second interim analysis, and at all subsequent interim analyses until full enrolment (figure 1). Assessing the primary endpoint at week 52 makes adaptations and early decision making challenging, as it takes significant time to accumulate enough data to support a confident decision. To address this, a longitudinal model uses data from earlier timepoints (specifically weeks 16, 20 and 24, which have often been used to inform dose finding in lupus) to predict the week 52 response for subjects who have yet to reach the end of study. This approach allows interim analyses to start earlier and increases the benefits of both RAR and early assessments of futility. Futility assessments start at the second interim analysis, after 72 patients have been enrolled and have had the opportunity to complete week 24, 6 months earlier than if complete week 52 data were required. If the futility criterion is met, that is, if there is sufficient evidence that the investigational product has a low probability of achieving the efficacy target, further enrolment to the study may be stopped. Futility interim analyses increase the efficiency of a trial by supporting early decision making to terminate a futile trial. This can reduce the cost of failure and release patients and SLE-experienced sites for other development programmes that may show benefit,44 as well as reduce potential safety risks to participating patients.41 44 45
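A futility rule of the kind described above can be sketched as a posterior probability check. This is an illustration under stated assumptions, not the trial's prespecified rule: it uses independent Beta(1, 1) priors on observed response rates, omits the longitudinal prediction of week 52 outcomes, and the efficacy target (15 percentage points over placebo) and stopping threshold (10%) are hypothetical values.

```python
import random


def prob_meets_target(resp_dose, n_dose, resp_pbo, n_pbo,
                      target=0.15, draws=20000, seed=2):
    """Posterior probability that (dose rate - placebo rate) >= target."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        p_dose = rng.betavariate(1 + resp_dose, 1 + n_dose - resp_dose)
        p_pbo = rng.betavariate(1 + resp_pbo, 1 + n_pbo - resp_pbo)
        hits += (p_dose - p_pbo) >= target
    return hits / draws


def all_doses_futile(dose_data, pbo, threshold=0.10):
    """Futility is met only if every dose is unlikely to reach the target.

    dose_data: list of (responders, n) per active dose; pbo: (responders, n).
    """
    return all(
        prob_meets_target(r, n, pbo[0], pbo[1]) < threshold
        for r, n in dose_data
    )


# Made-up interim counts at the 72-patient look (18 per arm).
stop = all_doses_futile([(4, 18), (4, 18), (4, 18)], pbo=(8, 18))
go = all_doses_futile([(4, 18), (4, 18), (12, 18)], pbo=(8, 18))
```

Requiring every dose to fall below the threshold mirrors the text's condition that enrolment stops only when futility is determined for all doses.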
Bayesian hierarchical modelling
An innovative feature of the phase IIb study design is a prespecified analysis to evaluate the primary endpoint, which is based on a binary outcome (responder vs non-responder; figure 1). To support claims of efficacy and to serve as a registrational trial, substantial evidence is required from an adequate and well-controlled trial. When multiple statistical tests are evaluated simultaneously, statistical conclusions require control of multiplicity. Control of multiplicity may require an increase in overall sample size or prespecified assumptions regarding the dose-response relationship. For example, it is a common assumption that the dose-response relationship will be monotonically increasing, that is, the highest dose will have the greatest efficacy, followed by the middle and then the low dose. However, in SLE and other inflammatory diseases, a monotonic dose-response is often not observed (a phenomenon that has been documented for other therapeutics that modulate a complex immune system with many feedback loops and compensatory mechanisms).44–50 Instead of requiring a larger trial to accommodate the uncertainty as to the optimal dose, the innovative phase IIb design uses a Bayesian hierarchical model to compare each of the three dose levels to placebo.51 52 With this approach, the statistical model can borrow information across doses based on observed similarities between results (treatment effects) for the treatment groups. The more similar the response rates across treatment groups, the greater the amount of borrowing. Conversely, little or no borrowing occurs when the response rates are very different across doses. Thus, when all three doses have similar response rates, the treatment effect estimate for one dose will leverage information contained in the data for doses with similar effects, resulting in increased power and improved estimation of the treatment effect.
This increases the statistical efficiency and can reduce the overall number of patients necessary to adequately power the trial and improve estimation of the treatment effect.
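The dynamic borrowing described above can be illustrated with a simplified partial-pooling calculation. This sketch is an empirical-Bayes approximation on the risk-difference scale, not the Bayesian hierarchical model used in the trials: the between-dose variance is estimated with a method-of-moments estimator, and all input values are made up.

```python
def partial_pool(effects, ses):
    """Shrink each dose's observed effect towards a common mean.

    effects: observed treatment effects vs placebo, one per dose.
    ses:     their standard errors.
    Returns (pooled_effects, shrinkage_weights); a weight near 1 means
    heavy borrowing, a weight near 0 means almost none.
    """
    k = len(effects)
    w = [1 / s ** 2 for s in ses]
    ybar = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    # Heterogeneity statistic and method-of-moments between-dose variance.
    q = sum(wi * (y - ybar) ** 2 for wi, y in zip(w, effects))
    denom = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom)
    # Common mean re-estimated with total (within + between) variance.
    w_star = [1 / (s ** 2 + tau2) for s in ses]
    mu = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    shrink = [s ** 2 / (s ** 2 + tau2) for s in ses]
    pooled = [b * mu + (1 - b) * y for b, y in zip(shrink, effects)]
    return pooled, shrink


similar, w_similar = partial_pool([0.18, 0.20, 0.22], [0.08, 0.08, 0.08])
distinct, w_distinct = partial_pool([0.00, 0.00, 0.30], [0.08, 0.08, 0.08])
```

When the three observed effects are close, the estimated between-dose variance is zero and the estimates are fully pooled; when one dose stands apart, the shrinkage weight drops and that dose's estimate stays close to its own data.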
FDA’s CID Pilot Program
In August 2018, the FDA launched the CID Pilot Program to facilitate and advance the use of complex adaptive, Bayesian and other novel clinical trial designs to accelerate the development of therapies for unmet medical needs.33 This programme offers a unique opportunity for sponsors to obtain direct feedback from a large FDA multidisciplinary team on the study design, to align with the FDA on the registrational potential of the design, and to share knowledge on innovative tools to evaluate complex designs. Amgen submitted the efavaleukin alfa phase IIb study design for consideration by the FDA and was selected to participate in the CID programme. The efavaleukin alfa case study was subsequently published by the FDA on its CID website.33 37
The FDA has outlined four principles for clinical trials, including those with adaptive design, that must be satisfied to provide substantial evidence of efficacy of an investigational product.28 The following sections describe how these four principles were satisfied in the efavaleukin alfa phase IIb trial design.
Principle 1: ensure control of the chance of an erroneous conclusion
To reduce the chance for erroneous conclusions and support registration, it was important to first demonstrate that the adaptive design adequately controlled type I error below the nominal 5% level. Extensive simulations were conducted to assess the operating characteristics of this design under a multidimensional range of plausible values for uncontrollable nuisance parameters—for example, placebo response rate, enrolment speed, etc—under the scenario where none of the dose levels provided benefit over placebo alone. These simulations demonstrated that the proposed study design adequately controlled type I error across the plausible range of these parameters.
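The simulation approach described above can be sketched for a much simpler design. The code below is a stand-in, not the adaptive Bayesian design: it simulates a fixed, equally randomised four-arm trial in which no dose differs from placebo, tests each dose against placebo with a two-proportion z-test and a Bonferroni adjustment, and reports how often any dose is falsely declared effective. The arm size, test and placebo rates are illustrative assumptions.

```python
import math
import random


def two_sided_p(r1, n1, r0, n0):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (r1 + r0) / (n1 + n0)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n0))
    if se == 0:
        return 1.0
    z = (r1 / n1 - r0 / n0) / se
    return math.erfc(abs(z) / math.sqrt(2))


def type1_error(placebo_rate, n_per_arm=80, n_doses=3,
                alpha=0.05, sims=2000, seed=3):
    """Estimate the family-wise false positive rate under the null."""
    rng = random.Random(seed)
    false_pos = 0
    for _ in range(sims):
        # Every arm shares the placebo response rate (no true benefit).
        arms = [
            sum(rng.random() < placebo_rate for _ in range(n_per_arm))
            for _ in range(n_doses + 1)
        ]
        pbo, doses = arms[0], arms[1:]
        if any(two_sided_p(r, n_per_arm, pbo, n_per_arm) < alpha / n_doses
               for r in doses):
            false_pos += 1
    return false_pos / sims


# Sweep a nuisance parameter, as in the text: the placebo response rate.
rates = {p: type1_error(p) for p in (0.3, 0.4, 0.5)}
```

Sweeping nuisance parameters in this way shows whether error control holds across the plausible range rather than at a single assumed value; the actual evaluation also varied factors such as enrolment speed.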
Lack of control of type II error can lead to a false conclusion of lack of efficacy, and possibly to termination of a clinical development programme for a product that could be beneficial to patients. To assess this operating characteristic, simulations were used to evaluate seven efficacy scenarios based on the absolute difference of treatment response rate to placebo at week 52. These included a traditional linear response across doses where one dose meets the target efficacy and the other two doses have moderate or low efficacy relative to placebo. For the purposes of illustration, this dose-response assumption was labelled as the ‘Good’ result. Other scenarios that were evaluated varied the assumptions regarding relative efficacy between the doses. For one of these scenarios, labelled ‘Nugget’, only one dose meets the target efficacy while the remaining two doses are assumed to have no effect relative to placebo. As the Bayesian hierarchical model used for evaluation of the primary endpoint does not assume a dose-response relationship, evaluation of these efficacy scenarios does not require specification of which assumption applies to the low, medium or high dose. This is particularly advantageous in SLE trials, because, as noted earlier, greater efficacy is often not seen at the highest dose.
Evaluation of the operating characteristics in a study design involves comparison of the probability of success between the proposed design and a traditional fixed design, defined as a design without interim analyses or RAR and with traditional statistical evaluation of the primary endpoint. Comparisons between designs were made with respect to the following factors: the final sample size randomised to each treatment group (to evaluate the performance of RAR); the time to complete a trial; and the probability of selecting the correct dose. In the ‘Good’ scenario, with a linear dose-response, the final randomised sample size closely aligned with the assumed treatment effect, with a greater number of patients randomised to better-performing doses (figure 3A). For the ‘Nugget’ scenario, in which only one dose was efficacious, the greatest number of patients was assigned to this dose, with fewer patients randomised to the other doses. As expected, RAR effectively identified the doses showing early evidence of effect and adapted throughout the study to allocate patients to these doses. For both the ‘Good’ and ‘Nugget’ scenarios, the proposed adaptive design showed a higher probability of success (that is, greater power) than a fixed design, meaning that a design without adaptive features would require a larger trial to achieve the same power (figure 3B).
There are potential errors that can result from this type of adaptive design that are difficult to assess through simulation. For example, characteristics of patients enrolled early in the enrolment period may differ meaningfully from those enrolled later. In addition, differences in disease severity or changes in SOC medications over time may result in systematic imbalances across treatment arms.33 37 To assess these risks, patient characteristics and SOC medications can be evaluated throughout the study to identify potential drift, and resulting bias may be addressed with appropriate statistical methodology.
Principle 2: sufficiently reliable estimation of treatment effects
The estimated treatment effect from a clinical trial serves as the basis for evaluation of benefit-risk and product labelling. Therefore, it is important that study designs do not introduce statistical bias, that is, do not systematically overestimate or underestimate the benefits offered by a new therapy. The extensive statistical simulations conducted to support the evaluation of Amgen’s phase IIb design allow for assessment of bias by comparing the estimated treatment effect from each simulated trial against the assumed true effect. It was shown that the Bayesian hierarchical model used in both the interim and primary analyses of the SLE trial design, which involves dynamic data borrowing across the active treatment arms, reliably estimates the treatment effect and meets the requirements to support assessment of efficacy.
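The bias check described above, comparing each simulated estimate with the assumed true effect, can be sketched as follows. For simplicity, the estimator here is the unadjusted difference in response proportions in a fixed two-arm comparison, not the Bayesian hierarchical model; the response rates, arm size and number of simulations are illustrative assumptions.

```python
import random
import statistics


def simulated_bias(p_placebo=0.40, true_effect=0.20,
                   n_per_arm=80, sims=4000, seed=4):
    """Mean of (estimated effect - true effect) across simulated trials."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(sims):
        pbo = sum(rng.random() < p_placebo for _ in range(n_per_arm))
        act = sum(rng.random() < p_placebo + true_effect
                  for _ in range(n_per_arm))
        estimates.append((act - pbo) / n_per_arm)
    return statistics.mean(estimates) - true_effect


bias = simulated_bias()
```

A value close to zero indicates that the estimator neither systematically overestimates nor underestimates the effect. The same comparison can be applied to any estimator, including one that borrows information across doses, by swapping in its estimate for each simulated trial.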
Principle 3: complete prespecification of decision rules, timing of their evaluation and resulting planned adaptations
The adaptive features of the SLE trial design were prespecified and detailed in the study protocol, statistical analysis plan, data monitoring committee charter and comprehensive study simulation report.
Principle 4: maintenance of trial conduct and integrity
The adaptive design of the SLE trial ensures that the sponsor and study personnel (including patients and investigators) are blinded to treatment assignments and comparative interim analysis results. The trial uses an independent data monitoring committee, tasked with external review of interim analysis data, to maintain patient safety and trial integrity. In addition, an external independent statistical group, in support of the data monitoring committee, is responsible for conducting each interim statistical analysis, including evaluation of futility, and generating the updated randomisation probabilities. The data monitoring committee will have sole access to evolving efficacy data and will not disclose results of interim analyses to the sponsor unless study termination is recommended due to futility, according to the predefined stopping rules. In this scenario, a data access plan is used to document sponsor access to interim data and/or results, ensuring such access is limited to decisions relating to study termination.
Following discussion at two meetings granted through the CID programme, the FDA concluded that the phase IIb adaptive trial design demonstrated adequate trial operating characteristics and could potentially serve as one of two confirmatory trials in support of registration. This was based on adoption of a week 52 endpoint, rather than a 24-week endpoint as in traditional early phase trials, as well as confirmation that the study design demonstrated adequate control of type I error across the plausible parameter space, reliable estimates of treatment effect, and procedures in place to maintain trial integrity. Meeting the FDA CID principles and using study design features that promote efficiency may foster smaller trials that provide reliable data, thereby conserving overall clinical development resources. Although this approach increases the efficiency and feasibility of timely testing for an investigational drug and reduces patient-exposure risks, it has a disadvantage: dosing groups in which fewer patients respond are limited in size, even though optimal pharmacokinetic/pharmacodynamic (PK/PD) and clinical efficacy may have been achieved in a potentially definable subset of those patients. This possibility can be evaluated using comprehensive biomarker analysis to generate hypotheses for future testing without slowing down the development path for a treatment.
Despite the long-sought progress represented by several recently approved treatments for SLE, there is still substantial unmet need for new targeted therapeutic options for this complex, heterogeneous disease. The complexities inherent in SLE trials and the limited number of appropriate trial sites and eligible participants call for changes to traditional SLE trial design. The adaptive trial presented here, which is aligned with FDA CID programme requirements, was designed to be conducted with reduced use of resources and enhanced likelihood of detecting true positive treatment effects compared with traditional trials. It provides the additional ethical benefit of decreasing patient exposure to non-efficacious or harmful treatments and increasing the proportion of patients randomised to more generally efficacious doses in a clinical trial.
The authors thank Kate Smigiel, PhD, of Amgen Inc. and Kathryn Miles, PhD, of BioScience Communications, for medical writing support.
Contributors SG, EK, MM, CEM contributed to the conception and design of the study. All authors (SG, EK, JTM, ADA, KK, MM, CEM) contributed to the analysis and interpretation of the data. All authors contributed to the writing and reviewing of the manuscript and have given final approval for the version to be published. SG is the guarantor for this publication.
Competing interests SG, EK, MM, CEM are employees and stockholders of Amgen. JTM is a consultant for AbbVie, Alexion, Amgen, AstraZeneca, Aurinia Pharmaceuticals, BMS, EMD Serono, Gilead, Genentech, GSK, Lilly, Merck, Pfizer, Provention Bio, RemeGen, Sanofi, UCB, and Zenas, is a speaker for AbbVie, Biogen, Sanofi, and RemeGen, and has received grants (to institution) from AstraZeneca, BMS, and GSK. ADA is an investigator and consultant for AbbVie, Amgen, AstraZeneca, Aurinia Pharmaceuticals, BMS, Celgene, Idorsia, Genentech, GSK, Janssen, Lilly, Mallinckrodt Pharmaceuticals, Pfizer, and UCB. KK is a consultant for AbbVie, AstraZeneca, Biogen, BMS, Cabaletta Bio, EMD Serono, Equillium, Genentech, Gilead, GSK, Kangpu Biopharmaceuticals, Kezar Life Sciences, Kyowa Kirin Co., and Merck, and has received research support from Acceleron Pharma, Alexion, Alpine Immune Sciences, Amgen, Horizon, Idorsia, Kyowa Kirin Co., Lupus Research Alliance and the Wolfe Family Program in Lupus, NIH, Novartis, Provention Bio, UCB, and Vera Therapeutics.
Provenance and peer review Not commissioned; externally peer reviewed.