
# Clinical Trials for Disease-Modifying Therapies in Alzheimer’s Disease: A Primer, Lessons Learned, and a Blueprint for the Future

#### Abstract

Alzheimer’s disease (AD) has no currently approved disease-modifying therapies (DMTs), and treatments to prevent, delay the onset, or slow the progression are urgently needed. A treatment that delays onset by 5 years, if available by 2025, would decrease the total number of patients with AD by 50% in 2050. To meet the definition of DMT, an agent must produce an enduring change in the course of AD; clinical trials of DMTs have the goal of demonstrating this effect. AD drug discovery entails target identification followed by high throughput screening and lead optimization of drug-like compounds. Once an optimized agent is available and has been assessed for efficacy and toxicity in animals, it progresses through Phase I testing with healthy volunteers, Phase II learning trials to establish proof-of-mechanism and dose, and Phase III confirmatory trials to demonstrate efficacy and safety in larger populations. Phase III is followed by Food and Drug Administration review and, if appropriate, market access. Trial populations include cognitively normal at-risk participants in prevention trials, mildly impaired participants with biomarker evidence of AD in prodromal AD trials, and subjects with cognitive and functional impairment in AD dementia trials. Biomarkers are critical in trials of DMTs, assisting in participant characterization and diagnosis, target engagement and proof-of-pharmacology, demonstration of disease-modification, and monitoring side effects. Clinical trial designs include randomized, parallel group; delayed start; staggered withdrawal; and adaptive. Lessons learned from completed trials inform future trials and increase the likelihood of success.

## INTRODUCTION

Alzheimer’s disease (AD) is a progressive neurodegenerative disease that produces gradual decline in cognition and function [1, 2]. The most common form, late onset AD, becomes symptomatic in late life but biomarker studies show that the amyloid protein considered the major risk factor for the disease begins to accumulate in the brain up to 20 years before symptoms begin [3].

The total number of individuals with AD will double every 20 years [4]. The annual cost of AD currently exceeds $230 billion and the total annual cost will exceed $1 trillion by 2050 if means of preventing, delaying, slowing the progression, or improving the symptoms are not found [5].

There is a high rate of negative clinical trials in AD drug development programs; 99% of drugs tested between 2002 and 2014 showed no drug-placebo difference and only one drug was approved by the US Food and Drug Administration (FDA) during that period [6]. Drugs in the AD pipeline include agents intended to intervene in the basic biology of AD and modify disease progression, symptomatic cognitive enhancers, and drugs to treat neuropsychiatric symptoms [7, 8].

The greatest need in AD drug development is for disease-modifying therapies (DMTs) that will delay or slow the clinical course of AD by intervening in the processes leading to cell death [9]. Approximately two-thirds of the current AD drug development pipeline involves DMTs—either immunotherapies or small molecule agents administered orally [7, 8]. In this paper, we describe the methods for AD clinical trials of DMTs, review past failures to identify lessons for AD drug development, and look ahead to new approaches to improving AD drug development and optimizing success in bringing new treatments to patients with AD or those at high risk for the disorder.

## OVERVIEW

### Populations

Phases of AD are recognized; these are not distinct stages but represent a seamless progression from a high risk state in which amyloid is present in the brain in the form of neuritic plaques, to prodromal AD with episodic memory impairment (in the typical presentation of AD) and biomarker evidence of AD, to AD dementia with cognitive and functional impairment characterized as mild, moderate or severe [38] (Fig. 6). Although these phases represent progression along a seamless spectrum of severity, they are artificially divided for purposes of clinical trials. Tools and outcomes appropriate for one phase of disease (e.g., preclinical) are not the same as those one would choose for later phases (e.g., mild-moderate AD). Table 2 provides examples of cognitive and functional measures used as outcome measures for different phases of AD [39, 40, 63–70].

##### Fig.6

Phases of Alzheimer’s disease (AD) as defined by cognitive, functional, and biomarker observations. Trial goals for each phase are noted.

##### Table 2

Outcome tools used for the progressive phases of Alzheimer’s disease [39, 40, 63–70]

Clinical outcomes in AD dementia trials are well established and have been used to demonstrate efficacy of cholinesterase inhibitors and memantine. Cognitive measures for mild-moderate AD dementia include the ADAS-cog [39] and the Neuropsychological Test Battery [66]. Common secondary measures include the CDR-sb [40], Clinical Global Impression of Change (CGIC), and the Neuropsychiatric Inventory [71]. Dual outcomes are required in AD dementia trials and include a cognitive measure with a functional or global outcome.

Prodromal trials commonly use a composite endpoint composed of cognitive and functional elements or of cognitive elements derived from several scales. Composite endpoints include the CDR-sb [40], the AD Composite Scale (ADCOMS) [69], and the integrated AD Rating Scale (iADRS) [70]. The FDA has indicated that demonstration of both cognitive and functional benefit is necessary for drug approval in the prodromal phase of AD; a drug-placebo difference on a composite scale should not depend entirely on differences in cognition [72]. Some trials of DMTs include both patients with prodromal AD and those with mild AD dementia; the distinction between these populations is arbitrary, and the groups can be usefully combined to facilitate recruitment of a broader population and show benefit in patients who have more than minimal impairment.

Prevention trials include primary prevention studies involving participants with no cognitive symptoms and no state biomarker changes of AD or secondary prevention studies including participants who have no cognitive symptoms but in whom amyloid imaging or CSF amyloid measures show that amyloidosis is present. Studies of asymptomatic participants with autosomal dominant mutations often have mixtures of some patients with amyloid abnormalities and some without, offering the possibility of evaluating a DMT as either primary or secondary prevention [73, 74]. Highly sensitive cognitive measures are combined with biomarkers to determine the impact of anti-amyloid therapies [63, 67, 73, 74]. Participants in this stage of preclinical or presymptomatic AD show very mild cognitive decline that may provide an opportunity to establish a drug-placebo difference in cognitive change [75, 76]. Biomarkers reasonably likely to predict future cognitive decline include amyloid imaging and tau imaging. Tau PET correlates better with cognitive decline and MRI measures of brain atrophy and may provide more insight into DM than amyloid measures [77, 78].

### Clinical trial design

The most common Phase III design for DMT trials is the randomized, parallel group, placebo controlled, two or more arm, 18–24 month trial. The primary outcome is the drug-placebo difference at trial end on co-primary clinical and functional outcomes or clinical and global outcomes. Biomarker measures typically include MRI volumetrics; amyloid PET (if the agent has a mechanism expected to impact fibrillar amyloid); and CSF Aβ, total tau, and p-tau. Additional biomarkers might be chosen depending on drug MOA and specifics of the trial. Drug-placebo differences at trial end are analyzed for both clinical and biomarker outcomes. Analyses that offer supporting data expected in DM include change in slope of decline, increasing drug-placebo difference over time, and delay to milestones captured in the data (e.g., in a trial of prodromal patients, the percent of patients at each time point who have progressed to a diagnosis of dementia or advanced from a CDR score of 0.5 to a CDR score of 1). These supporting analyses can be affected by symptomatic agents and do not by themselves prove DM. Clinical and biomarker data are expected to be correlated if they are mediated by the same mechanism [79].

The delayed start and staggered withdrawal designs provide evidence of DM without depending on biomarkers. They demonstrate an enduring change in the course of the disease in comparison with a group begun on treatment earlier (in the case of the delayed start design) or withdrawn from therapy (in the case of the staggered withdrawal design) [80–82]. These trials have been difficult to implement and have had limited use in programs attempting to show DM. The switch from placebo to active therapy when a trial is terminated and participants enter an open label extension (all are on active therapy) provides an opportunity for a delayed start observation [83], although the absence of blinding at this stage of the trial could bias the observations. This open-label delayed start analysis could add support to a claim of DM without providing definitive evidence.
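The logic of the delayed start design can be illustrated with a deliberately simple linear-decline model. The sketch below is not taken from the cited trials: the function name `final_gap`, the linear model, and every numeric value are illustrative assumptions. It contrasts a hypothetical disease-modifying effect (a reduced slope of decline while on drug) with a hypothetical purely symptomatic effect (a fixed score offset while on drug).

```python
def final_gap(rate, delay, total, slope_reduction=0.0, symptomatic_boost=0.0):
    """Early-start minus delayed-start score at trial end (higher score = better).

    rate: points of decline per month without treatment (assumed linear)
    delay: months the delayed-start group spends on placebo
    total: total trial duration in months
    slope_reduction: fraction by which treatment slows decline (DM-type effect)
    symptomatic_boost: fixed score offset while on treatment (symptomatic effect)
    """
    def end_score(start_month):
        months_off = start_month
        months_on = total - start_month
        decline = rate * months_off + rate * (1 - slope_reduction) * months_on
        boost = symptomatic_boost if months_on > 0 else 0.0
        return boost - decline

    return end_score(0) - end_score(delay)


# Hypothetical numbers: 1 point/month decline, 6-month delay, 18-month trial.
dm_gap = final_gap(1.0, 6, 18, slope_reduction=0.2)
symptomatic_gap = final_gap(1.0, 6, 18, symptomatic_boost=2.0)
print(f"DM-type effect, gap at end: {dm_gap:.2f} points")
print(f"Symptomatic effect, gap at end: {symptomatic_gap:.2f} points")
```

Under these assumptions the delayed-start group never catches up when the drug changes the slope of decline, whereas the gap closes once both groups receive a purely symptomatic agent. A real trial would estimate this gap with its uncertainty rather than read it off a deterministic model.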

Adaptive clinical trial designs use data from the on-going trial to make decisions about trial conduct. For example, the Dominantly Inherited AD-Treatment Unit (DIAN-TU) uses an adaptive strategy for dose-selection of test agents [84]. Adaptive strategies can be used for dose, treatment duration, sample size, and entry criteria. The decision structure must be comprehensively pre-specified but adaptive designs have the advantage of responding to the in-trial observations and can save time and resources while optimizing the opportunity to demonstrate a drug-placebo difference [85].

Another resource-saving strategy in clinical trial design and analysis is the incorporation of futility analyses at a time when a sufficient number of patients have been exposed to treatment for a sufficiently long period of time to predict the possible outcomes. If the drug-placebo difference at the time of the analysis suggests that the study has a very low possibility of finding a drug-placebo difference at trial conclusion, the trial can be stopped [64, 86]. Futility analyses avoid exposing patients to agents and potential side effects when a positive conclusion of the trial is deemed highly unlikely. Criteria for futility are evolving; they must be liberal enough to ensure that potentially viable drugs are not terminated prematurely and conservative enough that trials with very little chance of success are not continued.
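One common way to quantify "a very low possibility of finding a drug-placebo difference at trial conclusion" is conditional power under the current trend. The sketch below uses the standard normal approximation for a single interim look; it is not necessarily the method used in the cited trials, and the function name, the critical value of 1.96, and the 20% stopping threshold are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def conditional_power(z_interim, info_fraction, z_crit=1.96):
    """Conditional power under the current trend.

    z_interim: interim test statistic (positive values favor drug)
    info_fraction: fraction of total statistical information accrued (0 < t < 1)
    z_crit: critical value the final statistic must exceed
    """
    if not 0 < info_fraction < 1:
        raise ValueError("info_fraction must be strictly between 0 and 1")
    # With the drift estimated from the interim data, the final statistic is
    # approximately normal with mean z_interim / sqrt(t) and variance 1 - t.
    shortfall = z_crit - z_interim / sqrt(info_fraction)
    return 1 - NormalDist().cdf(shortfall / sqrt(1 - info_fraction))


FUTILITY_THRESHOLD = 0.20  # hypothetical stopping rule, not from the source

for z in (0.0, 1.0, 2.0):
    cp = conditional_power(z, info_fraction=0.5)
    verdict = "stop for futility" if cp < FUTILITY_THRESHOLD else "continue"
    print(f"interim z = {z:.1f}: conditional power = {cp:.3f} -> {verdict}")
```

Raising the threshold stops more trials early, at the risk of terminating viable drugs; lowering it does the opposite. This is the trade-off between liberal and conservative futility criteria described above.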

The sample size of the trial is determined by the anticipated effect size of the intervention, the variability of the key measurements, and the desired length of the trial. Assuming that a slowing of 20% or more is clinically meaningful for participants and families, the typical trial for a DMT anticipates including 600–1000 subjects per arm and observing them for 18–24 months [87]. Individuals with more severe disease have faster rates of decline. Prodromal patients who are ApoE4 carriers decline more rapidly than those who are not carriers [88]. The decline in the placebo group is critical to assessing the efficacy of the intervention and decline on placebo is a critical determinant of the success of a trial.
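The dependence of sample size on effect size and measurement variability can be made concrete with the textbook formula for comparing two means. The sketch below is only an approximation under stated assumptions: a two-sided 5% significance level, 80% power, equal allocation, no dropout, and a standard deviation of change (`sd_change`) that is hypothetical rather than taken from the source. The 5.6-point placebo decline on the ADAS-cog over 18 months is the meta-analytic figure cited later in this paper [108].

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(placebo_decline, slowing, sd_change, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample comparison of mean change."""
    delta = placebo_decline * slowing  # absolute drug-placebo difference to detect
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd_change ** 2 / delta ** 2)


# Hypothetical standard deviations of 18-month change on the ADAS-cog.
for sd in (7.0, 8.0, 9.0):
    n = n_per_arm(placebo_decline=5.6, slowing=0.20, sd_change=sd)
    print(f"SD of change = {sd}: about {n} participants per arm")
```

Because the required sample size scales with the inverse square of the detectable difference, halving the anticipated slowing (or halving the placebo decline) roughly quadruples the number of participants needed.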

## LESSONS LEARNED FROM TRIALS OF DMTs

There have been frequent failures in attempts to develop new drugs for AD, and 100% of DMT development programs have failed [6]. Every trial, however, is a learning opportunity and many lessons have been learned that will assist in future drug development [89].

### Animal models of AD provide limited evidence of efficacy

Animal models of AD are an important means of investigating efficacy and toxicity in the preclinical state prior to exposing humans to possibly toxic or ineffective compounds. Many of the tg animal models overexpress the amyloid protein leading to cortical plaques similar to those observed in human AD [90]. These genetically engineered animals have abnormalities of amyloid metabolism but generally lack other aspects of human AD; they lack tau or cell death and have limited inflammatory changes [91]. The tg mice have mild cognitive changes but do not develop severe dementia equivalent to the human disease. Many types of therapy have been successful in reducing amyloid abnormalities in these animals and have often led to improved cognitive performance on tests such as Morris Water Maze or Novel Object Recognition [90]. None of these successes at the preclinical level has predicted success at the human level. The animals serve as important gateways in the drug development process, showing that candidate agents impact specific pathways; advancing a drug to human testing that did not succeed as expected in animals would be unwise. The models, however, recreate limited aspects of human AD such as amyloidosis and cannot be taken as models of the full spectrum of pathology of human AD or predictors of human benefit [23].

Another concern with regard to animal models is their reproducibility [92]. If an experiment cannot be reproduced within a single model or across related models then its ability to predict human outcomes is suspect. Strain, age, gender, handler behavior, diet, and light conditions may all influence animal behavior. Randomization and sample size are important aspects of animal trial design that have sometimes been ignored [93]. Lack of rigor with regard to these aspects of animal model testing may contribute to the lack of reproducibility both across models and in translating results from animals to humans.

### Establish BBB penetration in Phase I

BBB penetration is shown in preclinical studies by the effects of drug on behavioral studies and post-exposure necropsy. Differences between rodent and human BBB function, especially activity of the P-glycoprotein (P-gp) transporter, make extrapolation of animal model results to humans uncertain, requiring demonstration of BBB penetration in Phase I FIH studies [34]. Tarenflurbil is an example of an agent advanced as treatment for AD with in vivo activity in animal models but likely low entrance into the CNS in humans [94]. Before candidate agents exit Phase I, investigators should establish BBB penetration, the plasma/CSF ratio, and the relationship of predicted human brain exposure to concentrations associated with benefit in animal models.

### Determine a maximum tolerated dose in Phase I

Dose escalation studies in Phase I and dose refinement studies in Phase II should provide confidence in the dose(s) selected for Phase III. In particular, it is important to establish a MTD whenever possible to ensure that the highest possible doses have been explored. In some cases, occupancy studies may allow conclusions about dosing without an MTD if the receptor is fully occupied at lower doses. In other situations, solubility or physical features may limit the administered dose and the MTD cannot be determined. Beyond these exceptional circumstances, an MTD should be determined. Without an MTD, failure to show a drug-placebo difference in Phase II or Phase III will raise questions about the adequacy of the dose.

### The diagnosis of AD should be supported by biomarkers

An important lesson is the relatively large number of individuals who have a prodromal AD or AD dementia phenotype but are not amyloid-bearing when studied with amyloid PET [42]. These non-amyloid individuals have suspected non-Alzheimer pathology (SNAP) and are presumed not to have AD. They should be excluded from trials of agents for AD. Table 3 shows the percentage of patients meeting clinical criteria for prodromal AD or mild AD dementia who are amyloid-bearing [42]. Amyloid is more common in ApoE4 carriers, but genetic characterization is insufficient to ensure the presence of amyloid. To be confident that the trial population has AD, amyloid imaging or CSF evidence of the AD Aβ/tau signature should be collected (Fig. 4).

##### Table 3

Amyloid PET findings in patients meeting clinical criteria for prodromal AD or mild AD dementia (stratified by ApoE genotype) [42]

| Group | Amyloid Positive | Amyloid Negative |
|---|---|---|
| All | 61% | 39% |
| All prodromal AD | 50% | 50% |
| Prodromal ApoE4 carriers | 71% | 29% |
| Prodromal ApoE4 non-carriers | 31% | 69% |
| All mild AD dementia | 75% | 25% |
| Mild AD dementia ApoE4 carriers | 90% | 10% |
| Mild AD dementia ApoE4 non-carriers | 58% | 42% |
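The positivity rates in Table 3 translate directly into screening burden. As a rough sketch, the expected number of amyloid PET scans needed to enroll a given number of amyloid-positive participants is the enrollment target divided by the positivity rate. The function name and the target of 100 participants are illustrative assumptions, and the calculation ignores every other source of screen failure.

```python
from math import ceil

# Amyloid-positive fractions as reported in Table 3 [42].
AMYLOID_POSITIVE = {
    "All prodromal AD": 0.50,
    "Prodromal ApoE4 carriers": 0.71,
    "Prodromal ApoE4 non-carriers": 0.31,
    "All mild AD dementia": 0.75,
    "Mild AD dementia ApoE4 carriers": 0.90,
    "Mild AD dementia ApoE4 non-carriers": 0.58,
}


def pet_scans_needed(target_enrolled, positive_fraction):
    """Expected scans required to find the target number of amyloid-positive people."""
    return ceil(target_enrolled / positive_fraction)


TARGET = 100  # hypothetical enrollment target per group
for group, fraction in AMYLOID_POSITIVE.items():
    print(f"{group}: about {pet_scans_needed(TARGET, fraction)} scans")
```

On these figures, a clinically defined prodromal population requires roughly twice as many scans as participants enrolled, and the burden is higher still among ApoE4 non-carriers.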

### Assure target engagement in Phase II

DM is supported by an impact on “downstream” measures of cell death such as MRI atrophy, CSF tau, or possibly other biomarkers of neuronal degeneration such as neurofilament light chain protein [54]. These downstream consequences can reasonably be expected only if the “upstream” target of the pharmacologic intervention is successful. Target engagement measures will depend on the MOA of the candidate therapy. BACE inhibitors, gamma secretase inhibitors, and gamma secretase modulators will have an effect on amyloid production as measured by stable isotope-labeled kinetics (SILK) [95]. BACE inhibitors will also inhibit BACE activity as measured in the CSF and reflected in sAβPPβ, a by-product of BACE activity; gamma secretase modulators result in Aβ fragments of 15/16 amino acid lengths in the CSF which are not normally present in AD [95–97]. Proof of pharmacology is one goal of Phase II and compounds should not be advanced to Phase III without well documented support for a pharmacologic effect.

### Establish a dose-response relationship in Phase II

Dosing approaches in Phase II ideally establish a low dose that is ineffective, one or two mid-range doses that are effective, and a high dose that is not tolerated and not acceptable. A dose-response on clinical or biomarker measures increases confidence in the pharmacology of the molecule. Regulatory agencies usually seek assurance that patients are given the lowest effective dose to ensure that they are not being exposed to unnecessary side effects. Doses established in Phase II inform decisions of which dose should be advanced to Phase III. Drug formulation decisions should be completed in Phase II prior to Phase III.

### Collect multiple biomarkers to assess outcomes

Knowledge of the neurobiology of AD is incomplete. Systems biology studies demonstrate that AD biology is complex [98] and biomarkers provide limited windows onto this complex and ill-understood disease. Although working models of the order of events in AD have been constructed, none have been proven and none have guided successful DMT development. Agnostic approaches to biomarkers (e.g., amyloid; tau, neurodegeneration; A/T/N) are used to acknowledge the exploratory nature of our biomarker documentation of drug effects [99]. To support DM as the outcome of a therapy, trial sponsors should collect A/T/N biomarker data, emerging biomarkers, and biomarkers specifically linked to the mechanism of the intervention to gain a comprehensive view of the impact of treatment.

### Recruitment is a major challenge

Trial recruitment is a difficult process, and each population (cognitively normal at-risk participants for prevention trials, minimally impaired biomarker-positive participants for prodromal AD trials, and cognitively and functionally impaired participants for AD dementia trials) has unique requirements for identification, recruitment, informed consent, and retention in the trial. There are too few highly functioning trial sites in the world. The world’s populations are generally poorly educated about clinical trials and often have few opportunities to participate. Many trials spend more time in the recruitment phase of the trial than in the drug exposure phase. Slow recruitment slows the cycle time of trials and increases their cost. Many AD-concerned organizations are constructing responses to this challenge. The Global Alzheimer Platform (GAP) network of trial sites in the US and the European Prevention of Alzheimer’s Disease (EPAD) initiatives are among the leaders of the attempt to reduce recruitment times and accelerate trials [100, 101].

### Global trials have greater variability

One response to slow recruitment is to include many trial sites with each site recruiting only a few participants to the trial. In most trials, each site is expected to contribute 6–12 participants, but many sites contribute only 1 or 2 participants. This amplifies “noise” in the data and decreases the ability to demonstrate a drug-placebo difference.

Globalization of trials creates another set of challenges. Sites distributed around the world are culturally and linguistically diverse, have different standards of health care, and include participants with different histories of nutrition and levels of education. Trial sites are highly variable in terms of experience, expertise, training, and infrastructure. Local hospital and university institutional review boards (IRBs) may have limited experience with hosting and reviewing AD trials [102]. Global sites impose challenges in terms of drug manufacturing and distribution, supply lines, biomarker collection, laboratory availability, and data collection and quality assurance. The result of this complexity is that populations recruited into trials from around the world vary in terms of age, education, genotype, and other clinical characteristics, and they progress somewhat differently in clinical trials [103, 104]. North America and Western European trial populations are similar and results are likely to be most interpretable if these populations comprise the majority of the study population.

Efficacy and safety data are needed on all populations where the agents will be marketed; smaller trials in local populations may be the best way to address these needs.

### Comprehensive trial networks are needed to conduct AD trials

Conducting clinical trials is demanding and requires expertise, commitment, and infrastructure. Some academic medical centers support trials while others do not; industry sponsors support trials but tend not to support trial infrastructure. In the US, the National Center for Advancing Translational Sciences (NCATS) sponsors Clinical and Translational Science Awards (CTSAs) to provide trial infrastructure in major university medical centers [105]. Trial networks are currently re-created for each trial and raters are re-trained on the same outcomes for each trial. Each institution often has its own IRB for reviewing trials. Legal review of contracts further slows trial initiation. Construction of a highly efficient trial network with standing non-redundant training and a central IRB are goals of GAP and EPAD [100, 101].

### Negative trials may indicate an ineffective drug or a failed trial

The failure to show a drug-placebo difference at the end of a trial may be due to lack of efficacy of the candidate therapy or flawed conduct of the clinical trial. Table 4 summarizes the reasons for negative outcomes in trials. Drug-related reasons for negative trials include lack of efficacy and excessive toxicity [106]. In some cases, the dose range has not been adequately explored in early drug development and a negative trial opens the question of whether the agent might have been efficacious at higher doses. Such agents must return to Phase I for dose escalation trials and sponsors rarely have an interest in pursuing this alternative. Trial-related reasons for failed trials of DMTs include lack of decline in the placebo group, enrollment of non-AD patients, and excessive measurement variability.

##### Table 4

Reasons for failure to show a drug-placebo difference at the end of a clinical trial of a disease-modifying agent. AD, Alzheimer’s disease

| Category | Reason |
|---|---|
| Drug-related | Lack of efficacy of the agent |
| | Inappropriately low dosing of an effective agent |
| | Excessive toxicity or lack of tolerability leading to high discontinuation rates in the active treatment arms |
| | Excessive toxicity or lack of tolerability leading to early termination of the trial |
| Trial-related | Lack of decline in the placebo group |
| | Recruitment of non-AD patients into trials requiring an AD substrate for drug benefit to occur |
| | Excessive measurement variability |
| | Lack of measurable effect of active comparator drugs (if available) |

### Placebo decline determines drug-placebo difference

Successful DMTs will slow the course of decline in AD. Slowing of decline is established by contrasting the decline in the active treatment group with the trajectory of the placebo group. The placebo trajectory will determine the drug-placebo difference at end of trial. The rate of decline of the placebo group is a crucial consideration in understanding the treatment effect. Placebo groups with SNAP patients do not decline as rapidly as those with confirmed AD, emphasizing the importance of confirming the diagnosis of AD in trial participants [107]. Slow decline in the placebo group will minimize the drug-placebo difference and the agent will appear less efficacious than when compared with a more rapidly declining group. Similarly, an unusually rapidly declining placebo group may lead to an overestimation of drug efficacy since the drug-placebo difference will be exaggerated and this may not be reproduced in a later trial. A meta-analysis of placebo decline showed that patients with mild AD are expected to decline 5.6 points on the ADAS-cog or 3 points on the Mini-Mental State Examination in 18 months [108]. This figure is based on trials that included patients without biologically confirmed AD and may underestimate the decline in those confirmed with amyloid imaging or CSF studies to have AD.
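The dilution caused by enrolling SNAP participants can be sketched with simple arithmetic. The example below assumes, for illustration only, that the drug slows decline by a fixed fraction in amyloid-positive participants and has no effect in SNAP participants. The SNAP decline of 3.0 points is hypothetical, and the 5.6-point ADAS-cog figure [108] is used merely as a placeholder for the decline in confirmed AD, which the text notes may be larger.

```python
def diluted_difference(ad_fraction, ad_decline, snap_decline, slowing):
    """Return (placebo decline, drug-placebo difference) for a mixed AD/SNAP sample.

    ad_fraction: proportion of participants with biomarker-confirmed AD
    ad_decline: untreated decline in confirmed AD (points over the trial)
    snap_decline: untreated decline in SNAP participants (points over the trial)
    slowing: fractional slowing of decline, assumed to apply to AD participants only
    """
    snap_fraction = 1 - ad_fraction
    placebo = ad_fraction * ad_decline + snap_fraction * snap_decline
    treated = ad_fraction * ad_decline * (1 - slowing) + snap_fraction * snap_decline
    return placebo, placebo - treated


for fraction in (1.00, 0.75, 0.50):
    placebo, difference = diluted_difference(fraction, 5.6, 3.0, 0.20)
    print(f"{fraction:.0%} confirmed AD: placebo decline {placebo:.2f}, "
          f"drug-placebo difference {difference:.2f}")
```

Under these assumptions the observable drug-placebo difference shrinks in direct proportion to the fraction of participants who actually have AD, which is one reason biomarker confirmation of diagnosis matters for trial sensitivity.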

### Phase II subgroup analyses do not provide guidance for Phase III

Negative trials are often analyzed to detect treatment-responsive subgroups that can be exploited in future trials. This approach entails substantial risk of being misled by spurious trial specific results. Subgroups are not subject to the same recruitment or randomization as the original group, the sample sizes of subgroups are often small leading to underpowered results, and the outcome measures are typically not optimized for a specific subgroup. Basing a Phase III program on a subgroup analysis of a Phase II trial with a negative outcome has usually resulted in a negative Phase III trial.

To reduce the risk of being misled, one can apply guidelines for interpretation of Phase II subgroup analyses. Table 5 shows the principal recommendations for subgroup analysis [109–111]. Subgroup analyses suggesting benefit in one group of patients require conducting a Phase II trial for this subgroup to gain additional confidence in this treatment approach.

##### Table 5

Questions to ask to determine how much confidence can be placed in a subgroup analysis [109–111]

| Domain | Guide: Questions to Ask of Subgroup Claims (Supportive of Subgroup Claim if “Yes”) |
|---|---|
| Design | Was the subgroup variable a baseline characteristic? |
| | Was the subgroup variable a stratification factor at randomization? |
| | Was the subgroup hypothesis specified a priori? |
| | Was the subgroup analysis one of a small number of subgroup hypotheses tested (≤5)? |
| Analysis | Can chance explain the subgroup difference? Was the test of interaction significant (p < 0.05)? |
| | Was the significant interaction effect independent, if there were multiple significant interactions? |
| Context | Was the direction of the subgroup effect correctly pre-specified? |
| | Was the subgroup effect consistent with evidence from previous related studies? |
| | Was the subgroup effect consistent across related outcomes? |
| | Was there indirect evidence to support the apparent subgroup effect (for example, biological rationale, laboratory tests, animal studies)? |
| Systematic reviews | Is the subgroup difference suggested by comparisons within rather than between studies? |
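The recommendation to test only a small number of subgroup hypotheses reflects the arithmetic of multiple comparisons. As a minimal sketch, assuming the subgroup tests are independent and that the drug has no true effect in any subgroup, the chance of at least one nominally significant result grows quickly with the number of tests. The function name is illustrative.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability of at least one p < alpha among n independent null tests."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 5, 10, 20):
    print(f"{n:2d} subgroup tests: {chance_of_false_positive(n):.1%} "
          "chance of a spurious 'responsive' subgroup")
```

Real subgroup analyses are rarely independent, so the exact figures differ, but the direction of the effect is the same: the more subgroups examined after a negative trial, the more likely one will appear to respond by chance alone.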

## BLUEPRINT OF A DEVELOPMENT PROGRAM FOR A DMT

This primer of DMT trials plus the lessons learned from negative trials suggest a blueprint for future trials of DMTs. The key elements of success for a DMT development program include:

• Comprehensive understanding of target biology

• Selective, potent agents impacting a key element of AD biology leading to cell death

• Disciplined conduct of a drug development program organized around a TPP

• Success in preclinical models of AD

• Acceptable ADME and toxicity in preclinical studies

• Acceptable ADME and toxicity in FIH studies

• BBB penetration demonstrated with relevant extrapolated brain exposures achieved in Phase I

• MTD established in Phase I

• Use of biomarkers in Phase II and III to establish accurate diagnosis of AD

• POC established in Phase II with target engagement and proof-of-pharmacology

• Dose-response shown in Phase II

• Trials implemented in high functioning trial network

• Globalization-dependent variability minimized in Phases II and III

• Demonstration of robust clinical and correlated DM-type biomarker response in Phase III

• Report Phase II and III trials using CONSORT criteria

• Continued assessment of safety and clinical utility after market introduction

## SUMMARY

Development of DMTs for AD is a difficult, long, and expensive process. No development program has yet succeeded. A systematic approach to drug development advancing the scientific understanding of the candidate molecule from preclinical studies through Phases I, II, and III of clinical trials can increase the probability of success and de-risk development programs. Biomarkers for diagnosis, target engagement and proof-of-pharmacology, outcome assessment, and side effect monitoring assist in drug development. Excellent conduct of trials and awareness of the trial pitfalls are critical to development success. New therapeutic targets such as tau-related processes and the use of combination therapies may enhance the chances of successful DMT development. Quality development and trial strategies for drugs that are potent, selective, and impactful on the biology of AD are necessary to bring urgently needed new treatments to patients with AD and those at risk for the disease.

## ACKNOWLEDGMENTS

The authors acknowledge support of a COBRE grant from the NIH/NIGMS (P20GM109025) and Keep Memory Alive.

Authors’ disclosures available online (https://www.j-alz.com/manuscript-disclosures/17-9901).