Outcome Measures



Anne T. Berg

John T. Langfitt

Barbara G. Vickrey

Samuel Wiebe



Introduction

During the nearly 10 years that have elapsed since the first edition of this book, there have been several new studies and reports that have advanced our understanding of the outcomes of epilepsy surgery. Large retrospective, single-center studies and one large prospective, multicenter study have been reported, and the first randomized trial of epilepsy surgery was successfully completed. The methodologic standards for operationalizing and studying the various outcomes considered important to epilepsy surgery have improved and seem to be converging toward consensus. Increasingly, there is an emphasis on the long-term outcomes that, in the early literature, had been largely ignored.

The following discussion summarizes the current state of outcome assessment and provides a brief summary of the current understanding based on the strongest, most methodologically rigorous studies. The outcomes to be covered are seizures, mortality, neurocognitive deficits, psychiatric disorders, health-related quality of life, utility measures, and employment. Although consensus is developing for measurement of these outcomes, the measurements themselves are of relatively little value if they are not done in the context of an appropriately designed and analyzed study.


Study Design: Some Basic Concepts and Principles


The Hypothesis

At the heart of any good study is one or more clearly articulated study questions. These questions are then operationalized into testable hypotheses. The term “testable” in part refers directly to the process of subjecting the data to statistical tests appropriate to their form. Thus, one plans a study based on the final form the results will take.


The Design

The value of a study’s results also heavily depends on the quality of the study design (to ensure validity of results) and the size of the study sample (related to statistical power and stability of findings). There are a few common approaches used in studying epilepsy surgery.5,12


Retrospective Cohort Based on Chart Review

Such studies are easy and convenient to do. They can help to generate hypotheses and provide descriptive data or novel information useful in designing a rigorous prospective study. All else being equal, however, a retrospective chart review is the weakest design. It depends on information that was not recorded for research purposes in general, let alone with the investigator’s specific research questions in mind. Consequently, data are often inappropriate to the question being asked, incomplete, or missing entirely. Follow-up, which is central to any outcomes study, is of uneven intensity and completeness across patients, often as a function of outcome. This alone can invalidate a study’s findings.

Retrospective chart review studies can be strengthened by (a) performing consistent and well-documented follow-up through the charts and (b) limiting the study to information that is reliably and routinely recorded in the medical records. An important advantage of a well-conducted retrospective study is that long-term data can be obtained without the need for actually waiting many years. It is an appropriate first approach to a question not previously addressed and should be followed with more definitive studies if warranted. In some cases, a cohort can be assembled retrospectively, and then continued follow-up can be conducted prospectively as in a recent study from Australia.62,63


Prospective Studies

In contrast to relying on chart review, in a well-done prospective study, one can ensure that patients are appropriately identified and enrolled; that baseline evaluations are made and their results recorded in a consistent, meaningful, and analyzable format; that outcome measures are assessed in a standardized manner; and that follow-up is consistent for all patients. There are at least three ways in which one might do a prospective study. We note that the first, a randomized trial, can never be done retrospectively; however, the other two approaches often are.


Randomized Clinical Trial.

For comparing the outcomes of two treatments, a randomized trial is the single best design. Although excellent for a basic test of treatment efficacy, randomized trials have several important limitations: (a) for practical or ethical reasons, the eligibility criteria are often very restrictive, which limits the generalizability of the results outside of the highly selected trial population; (b) patients are typically less willing to participate in randomized clinical trials (RCTs) than in observational studies, especially in studies of highly invasive interventions; and (c) randomized trials are generally not powered to detect infrequent adverse events, which often surface only in large follow-up studies. All of these factors impose limits, sometimes severe, on the generalizability and scope of the trial’s results.

A recent randomized trial of epilepsy surgery done in Canada overcame the first two of these concerns.99 This was possible because there is a 1-year waiting period for surgery in Canada. Willing candidates were randomized either to immediate surgery (the waiting period was waived) or to the normal wait that they would have had were they not in the trial. Generalizability was not of concern either because the eligibility criteria captured typical temporal lobe epilepsy patients presenting for surgical evaluation.








Table 1 Summary of considerations in designing a clinical research study of outcomes

| Study considerations | Comments |
| --- | --- |
| Clear testable hypotheses | Formulated in terms of specific statistical tests and comparisons |
| Study design | Appropriateness depends on level of knowledge and degree to which information will represent a contribution to extant knowledge; well-powered, prospective study designs are preferred |
| Sample-size considerations | Must be planned in advance; larger samples provide better statistical power and allow detection of smaller but still meaningful effects |
| Standardized outcome measures | Should reflect real outcomes of interest; when they are available, accepted, well-vetted measures should be used (e.g., for neuropsychological, psychiatric, and quality-of-life outcomes) |
| Standardized follow-up protocol | Explicit predetermined plans for periodic contact with patients to collect data, carried out consistently across study participants; use of clinic visits alone is often suboptimal |
| Consistent approach to measuring and collecting all other relevant data | Ideally, the investigator has some control over this information prospectively; if dependent on clinical records, do some preliminary exploration in advance and determine what can reliably be recorded from these sources; have a standardized procedure for collecting and recording information |

For an intervention such as surgery, another significant limitation is that the outcomes of greatest interest today are those that take years, even decades, to realize. In fact, after 1 year, most of the patients in the Canadian trial who were randomized to the usual delay before surgery went on to have surgery.


Nonrandomized Control Group.

Although it is appealing, a nonrandomized study is much more difficult to execute well. The comparison of treated to untreated patients will almost always be confounded by fundamental differences between the two groups. Well-designed and carefully analyzed studies can provide results of comparable quality to that of randomized trials11,26; however, this requires knowing what the differences are and being able to measure them adequately. In the end, there is often a lingering doubt about the accuracy if not the outright validity of findings from nonrandomized studies.


Surgical Cohort (Series).

Most surgical outcome studies are limited to a surgical cohort or case series. Although inappropriate for assessing the efficacy of surgery, this type of study is reasonable for identifying pre- and early postoperative predictors of outcome. Such a design is also appropriate for studying the relationship between seizure control and other outcomes (e.g., employment, depression, health-related quality of life) after surgery. For example, “Does someone who stops having seizures after surgery experience improvements in employment status?”

There are several other considerations that cut across all study designs:



  • Sample size and statistical power. Statistical power refers to the ability of a study to detect a difference if one exists. It is a product of several factors, including sample size and the size of the effect to be detected.25 All other things being equal, the larger the effect to be detected, the greater the statistical power. Likewise, the larger the sample size for a given effect, the greater the statistical power. Any study should be planned with an adequate sample size to detect meaningful effects. This process is made quite easy today by the availability of software that helps to determine the necessary sample size under a variety of conditions and assumptions.32


  • A deliberate, standardized, and meaningful assessment of outcomes. Measures that reflect outcomes of real interest and that can be reliably assessed should be used. Use of published well-validated and accepted instruments, where they exist, to measure various outcomes is strongly encouraged. The use of idiosyncratic, unvetted instruments for factors and concepts for which well-developed methods already exist adds little to the literature.


  • Rigorous protocol for data collection. Whether the study is a randomized trial or not, all data should be collected according to a carefully designed rigorous protocol constructed to ensure accurate, complete, timely, unbiased collection of all data in the same manner for all study participants.
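The sample-size point in the list above can be illustrated with a minimal sketch. It uses the textbook normal-approximation formula for comparing two independent proportions with equal group sizes; the function name `sample_size_two_proportions`, the two-sided alpha of 0.05, the 80% power, and the seizure-free rates in the scenarios are all illustrative assumptions, not values or methods taken from the studies discussed in this chapter.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of patients needed per group to detect p1 vs. p2.

    Normal-approximation formula for a two-sided test of two independent
    proportions with equal group sizes (an assumed, textbook method).
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ; there is no effect to detect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2                          # pooled proportion under H0
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical scenarios: a large and a modest difference in seizure-free rates.
large_effect_n = sample_size_two_proportions(0.60, 0.10)
modest_effect_n = sample_size_two_proportions(0.60, 0.45)
print("per-group n, large effect:", large_effect_n)
print("per-group n, modest effect:", modest_effect_n)
```

Under these assumptions, the modest difference requires many more patients per group than the large one, which is the trade-off between effect size and sample size described above. Dedicated software would normally be used for an actual study plan.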

The basic considerations for study design are summarized in Table 1.


Seizure Outcome

In the earlier literature, seizure outcomes were assessed in a variety of idiosyncratic ways94 at varying points after surgery. A recent guideline emphasized the importance of complete seizure freedom and of assessing seizure outcome each year.100 The use of various outcome categories, although popular, leads to imprecision and difficulty in comparison and interpretation of results. With the widespread availability of fairly standard statistical techniques (known collectively as survival methods31), investigators are increasingly using remission (complete seizure freedom) and subsequent relapse as outcomes and studying the determinants and patterns of these outcomes over prolonged (multiyear) periods of follow-up.62,63,71,76,102 Thus, at least for resective surgery, the preferred and most consistently applied approach to studying seizure outcomes is to study remission and relapse over time since surgery. Other methods may still be preferred in drug trials or for studying the outcomes of palliative procedures (e.g., callosotomy or vagus nerve stimulation).
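As one illustration of the survival methods mentioned above, the following minimal sketch computes a Kaplan-Meier estimate of the probability of remaining seizure free over time since surgery. The choice of the Kaplan-Meier estimator, the function name `kaplan_meier`, and the five-patient data set are assumptions made for illustration only; the cited studies may have used other survival techniques.

```python
def kaplan_meier(times, relapsed):
    """Return [(time, probability of remaining seizure free)] at each relapse time.

    times    -- follow-up (e.g., years since surgery) for each patient
    relapsed -- True if the patient relapsed at that time; False if the
                patient was still seizure free at last contact (censored)
    """
    if len(times) != len(relapsed):
        raise ValueError("times and relapsed must have the same length")
    survival = 1.0
    curve = []
    # Only times at which at least one relapse occurred change the estimate.
    event_times = sorted({t for t, e in zip(times, relapsed) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still followed just before t
        events = sum(1 for u, e in zip(times, relapsed) if e and u == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohort of five patients (not data from any cited study).
follow_up_years = [1, 2, 2, 3, 4]
had_relapse = [True, True, False, True, False]
curve = kaplan_meier(follow_up_years, had_relapse)
for year, prob in curve:
    print(year, round(prob, 2))
```

The point of the method is that patients with shorter follow-up still contribute information up to their last contact, rather than being dropped or counted as failures.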

Only relatively recently was a randomized trial of temporal lobectomy successfully completed.99 Once and for all, this study demonstrated the efficacy of surgery with respect to seizure control. One year after surgery (or randomization), 58% of surgical versus only 8% of nonsurgical patients (p < .001) were completely seizure free. Large-scale observational studies, some retrospective, some prospective, have also appeared. In addition to demonstrating the high rate of seizure freedom following epilepsy surgery, they have provided insight into the course of the long-term seizure outcomes after surgery. These have turned out to be complex.34,63,80,101,102 In particular, patients who become immediately seizure free after surgery may not always remain so. Relapse rates can be high (25%–30%). To the extent that this has been studied, many patients who relapse after a remission regain remission fairly quickly. In fact, some relapses may occur in association with discontinuation of antiepileptic drugs (AEDs), although at least two large studies failed to find a strong effect, if any, of AED withdrawal on relapse.14,63
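The 1-year trial result quoted above (58% versus 8% completely seizure free) can be restated as standard effect measures. The sketch below is only arithmetic on those two reported percentages; `effect_measures` is a hypothetical helper name, and no confidence intervals are computed because the group sizes are not given in this section.

```python
def effect_measures(p_treated, p_control):
    """Summarize the difference between two proportions with a good outcome."""
    if p_control <= 0 or p_treated == p_control:
        raise ValueError("need a nonzero control rate and two different rates")
    risk_difference = p_treated - p_control
    return {
        "risk_difference": risk_difference,             # absolute benefit
        "relative_benefit": p_treated / p_control,      # ratio of seizure-free rates
        "number_needed_to_treat": 1 / risk_difference,  # patients per extra success
    }


# Proportions seizure free at 1 year, as reported in the text above.
trial = effect_measures(0.58, 0.08)
for name, value in trial.items():
    print(name, round(value, 2))
```

The absolute difference is 50 percentage points, so on these figures roughly two patients would need to undergo surgery for one additional patient to be seizure free at 1 year.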

Prediction of which patients deemed appropriate for surgery will be the most likely to become seizure free remains challenging. Many small reports pepper the literature with various findings, few of which are ever replicated. For an extensive review of the literature on this subject see McIntosh et al.64 For temporal lobe epilepsy, the recent large-scale studies confirm the presence of unilateral mesial temporal sclerosis as the most important and consistently demonstrated predictor of a good seizure outcome. A history of generalized tonic–clonic seizures also seems to predict a poorer chance of remission. Apart from these factors, presurgical prediction of seizure outcome after temporal lobectomy has not advanced much. Current findings must be interpreted, however, in the context of the fact that patients are already highly selected. Recent reports do not provide any firm basis for recommendations that could improve current selection procedures and criteria.

Information about outcomes after extratemporal resection is much scarcer. In general, extratemporal surgery tends to be associated with a lower chance of complete remission after surgery.89 Studies often report better outcomes if there is a discrete, resectable lesion and if electroencephalographic (EEG), clinical, and imaging findings are congruent. A recent meta-analysis exemplifies the difficulties in summarizing and pooling results from studies with differing methodologies, patients, and procedures and emphasizes the importance of adopting standard research methods. In this study, highly heterogeneous results precluded conclusions about the predictive value of congruence of EEG and magnetic resonance imaging (MRI), extent of surgical resection, and presence of interictal discharges. On the other hand, an abnormal MRI, history of febrile seizures, mesial temporal sclerosis or resectable tumors, and absence of intracranial monitoring augured seizure freedom.91


Mortality

Mortality, especially unexpected death, is higher in people with epilepsy than in the population at large.39,56,84 Among adults with epilepsy, it is highest in those with intractable seizures.90,97 Assessing death is generally not difficult in a study with rigorous follow-up methods in place. Determining cause of death can be more complicated, especially if one is interested in determining whether a death was a sudden unexpected death associated with epilepsy (SUDEP),90 and most especially when relying on death certificate data, a commonly used source of information.10

In separate studies, moderately to substantially lower mortality rates have been reported after surgery in patients who became seizure free compared with those who did not.65,82,84 Ryvlin provided a critique of the methodologic difficulties of assessing the impact of surgery on mortality risk.72 In particular, he suggested that the same factors that might render patients inappropriate for surgery or at highest risk of not becoming seizure free might also be associated with cardiac arrhythmias, thus independently placing them at increased risk of SUDEP. An RCT would address these concerns by balancing risk factors in medical and surgical groups. In the only surgical RCT, one patient died in the medical group and none in the surgical group at 1 year.99 In children, there is agreement that SUDEP is rare, mortality is higher only in those with symptomatic epilepsy, and it is often related to the underlying neurologic condition.13,19,21


Cognitive Outcomes

Specific cognitive functions (e.g., memory and sensory and motor functions) may be impaired before epilepsy surgery or may become impaired afterward as a result of the surgery itself. Assessment of such impairments involves several considerations, including adequacy of pre- and postsurgical assessments, rigorous study design, and appropriate methods for determining whether a real and meaningful change has occurred. In the neuropsychological outcome literature for surgery, weaknesses in basic methodologic considerations (discussed earlier) are common. These include small sample sizes, retrospective designs, loss to follow-up, and inadequate controls.


Methodologic Issues in Assessing Cognitive Change

There are additional concerns that are more specific to assessing real and meaningful changes in cognitive function.



  • A difference of means. Early studies reported cognitive outcome as group mean change scores from pre- to postsurgery. When group mean scores improved, this was taken as evidence for beneficial effects of surgery on cognition. Even without biases introduced by selective loss to follow-up, group mean changes do not always reflect changes at the individual level. Furthermore, not all changes are clinically relevant.


  • Practice effects. When testing study participants repeatedly over time, the study design and particularly the analysis must account for the fact that all cognitive tests are subject to practice effects and test–retest variability. Regression to the mean for unusually high or low initial test scores can be expected. Therefore, a change score can simply reflect the change expected due to these statistical characteristics and not necessarily a true change in cognitive function.


  • Reliable change. Concerns about practice effects have led a number of authors to develop methods of determining how much a cognitive test score must change for it to be considered likely to reflect true change. Reliable change indices (RCIs) and standardized, regression-based (SRB) change scores are two methods for determining the incidence and direction of true cognitive changes. An RCI provides a confidence interval around a retest score that reflects the variability that is to be expected on the basis of test–retest variability and practice. An SRB score is a standardized score (i.e., z-score) that reflects these factors along with regression to the mean. RCI and SRB scores are typically calculated from samples of clinically stable, medically managed patients tested on two separate occasions. For a given test, an individual change score outside the RCI/SRB-defined range is said to represent a “statistically reliable” change. Table 2 shows neuropsychological instruments used in two recent, large, multicenter studies of epilepsy surgery with references to the source of RCI/SRB data43,59,73 and estimated administration time for each test.
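The reliable-change ideas in the list above can be sketched in a few lines. The formulas used here (standard error of the difference derived from the baseline standard deviation and test–retest reliability, shifted by the mean practice gain, and a regression-predicted retest score for the SRB) are commonly used formulations and are assumed for illustration; they are not necessarily the exact computations used in the sources cited for Table 2. All function names and numeric values are hypothetical.

```python
import math
from statistics import NormalDist


def rci_interval(sd_baseline, retest_reliability, mean_practice_gain, confidence=0.90):
    """Range of change scores expected from measurement error and practice alone."""
    sem = sd_baseline * math.sqrt(1 - retest_reliability)  # standard error of measurement
    se_diff = math.sqrt(2) * sem                           # standard error of a difference
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return (mean_practice_gain - z * se_diff, mean_practice_gain + z * se_diff)


def classify_change(baseline, retest, interval):
    """Label an individual's change relative to the reliable-change interval."""
    change = retest - baseline
    low, high = interval
    if change < low:
        return "reliable decline"
    if change > high:
        return "reliable improvement"
    return "no reliable change"


def srb_z(baseline, retest, intercept, slope, se_estimate):
    """Standardized regression-based change: observed minus predicted retest score.

    intercept, slope, and se_estimate would come from a regression of retest on
    baseline scores in a stable comparison sample (hypothetical values below).
    """
    predicted = intercept + slope * baseline
    return (retest - predicted) / se_estimate


# Hypothetical test: SD 10, test-retest reliability 0.75, mean practice gain 2 points.
interval = rci_interval(sd_baseline=10, retest_reliability=0.75, mean_practice_gain=2.0)
print(classify_change(baseline=100, retest=88, interval=interval))
print(classify_change(baseline=100, retest=105, interval=interval))
print(srb_z(baseline=100, retest=85, intercept=10, slope=0.9, se_estimate=7.5))
```

Because the interval is centered on the expected practice gain rather than on zero, a small improvement at retest is not counted as a true gain, and a patient who merely fails to show the expected practice effect is judged against the right baseline.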








Table 2 Neuropsychological tests that have been used in assessing the cognitive effects of surgery

| Cognitive domain | Test and source of RCI/SRB | Administration time (min) |
| --- | --- | --- |
| General intelligence | WAIS-R [a,b]; WAIS-III [c] | 90 (ref. 20) |
| Motor function | Grooved Pegboard [b] | 10 (ref. 10) |
| | Grip Strength [d] | 5 (ref. 20) |
| Auditory attention | WAIS-R Digit Span [a]; WAIS-III Digit Span [c] | 5 [e] |
| Visual attention | Trail Making Test Part A [a] | 5 (ref. 20) |
| Confrontation naming | Boston Naming Test [b] | 15 (ref. 20) |
| Verbal fluency | Controlled Oral Word Association Test (COWAT) [a,b] | 5–10 (ref. 85) |
| Visuospatial analysis | WAIS-R Block Design [a]; WAIS-III Block Design [c] | 15 [e] |
| | Hooper Visual Organization Test [a] | 10 (ref. 20) |
| Verbal memory | Rey Auditory Verbal Learning Test (RAVLT) [b] | 30 (ref. 58) |
| | California Verbal Learning Test (CVLT) [a] | 30 (ref. 20) |
| | WMS-R Logical Prose [a,b]; WMS-III Logical Prose [c] | 15 [e] |
| Nonverbal memory | Brief Visuospatial Memory Test–Revised (BVMT-R) [d] | 15 (ref. 85) |
| | Rey-Osterrieth Complex Figure [d] | 15 (ref. 20) |
| Executive function | Trail Making Test Part B [a] | 5 (ref. 20) |

RCI, reliable change indices; SRB, standardized, regression-based change scores.
[a] RCI/SRB found in Hermann et al.43
[b] RCI/SRB found in Sawrie et al.73
[c] RCI/SRB found in Martin et al.59
[d] RCI/SRB not available.
[e] Author’s estimate (J. T. L.).


Aug 1, 2016 | Posted by in NEUROLOGY | Comments Off on Outcome Measures
