Retrospective design
Ill-defined study cohort
Subjective outcome measures
Poorly defined prognostic predictors
Follow-up of variable duration across patients
Small sample size
Absence of masking
Changing technologies and procedures
Inadequate control of confounding variables
Inappropriate statistical analyses
A retrospective design may bias the study results because of the lack of standardized data collection. The data collected in retrospective studies are generally retrieved from medical records and depend on the quality and completeness of the available information. In the absence of prespecified definitions, several baseline variables may reflect the accuracy of the treating physicians in completing the medical records and the differing interpretations given to each variable.
The representativeness of the inception cohort is another source of bias, more evident in (but not restricted to) retrospective studies, because “prevalent” cases (i.e., those still being followed at the time of the study) rather than “incident” cases (i.e., those included at the time of surgery) are preferentially included. “Prevalent” and “incident” cases tend to differ in that the former are most likely represented by survivors, by patients with persisting seizures or other epilepsy-related problems, or perhaps by individuals more satisfied with the center’s quality of care. In contrast, “incident” cases are represented by all patients enrolled at the time of surgery and as such include both patients who will achieve seizure remission and those who will continue to experience epilepsy-related problems.
In the absence of precise definitions of the outcome measures and the prognostic predictors, these variables are subject to differing interpretations. The Engel classification of seizure outcome [11], still the most widely used, has been criticized for including patients’ (and physicians’) subjective opinions and for using terms like “disabling seizures” or “rare”, which are open to widely different interpretations. The alternative use of quantitative measures to replace this terminology, as done in the ILAE Commission on Neurosurgery report [12], represents only a modest improvement because a 50 % reduction of baseline seizure frequency, as a measure of disease outcome, is strongly dependent on each individual’s baseline seizure frequency.
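The dependence of a percentage reduction on the baseline can be made concrete with a little arithmetic. The sketch below uses two hypothetical patients (the monthly seizure counts are invented for illustration) who both qualify as 50 % responders while being left with very different residual seizure burdens.

```python
def percent_reduction(baseline, postoperative):
    """Percentage reduction in seizure frequency relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline seizure frequency must be positive")
    return 100 * (baseline - postoperative) / baseline

# Hypothetical patients: (baseline, postoperative) seizures per month.
patients = {"A": (2, 1), "B": (60, 30)}

# For each patient: (percentage reduction, residual seizures per month).
summary = {
    name: (percent_reduction(before, after), after)
    for name, (before, after) in patients.items()
}

for name, (reduction, residual) in summary.items():
    print(f"Patient {name}: {reduction:.0f} % reduction, "
          f"{residual} seizures/month remaining")
```

Both patients meet the same relative threshold, yet one is left with a single seizure per month and the other with thirty.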
The duration of follow-up varies across patients depending on the timing of enrolment. As the prognosis of a disease is a time-dependent variable, studies with long-term follow-up and a high drop-out rate may be biased towards more positive results. Patients lost to follow-up might have discontinued their visits because of seizure-related problems or, more generally, because of the outcome of the disease. Conversely, patients with short-term follow-up may have had insufficient time to show the (lack of) benefits of surgery. Assessing the outcome of surgery by counting seizure-free patients at last follow-up does not take into account the length of follow-up and the drop-out rate and, as such, may be a source of biased results. A minimum follow-up period should be required and appropriate statistical methods (see below) should be used to adjust for the length of follow-up.
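The problem with counting seizure-free patients at last follow-up can be illustrated with a minimal sketch. The cohort below is hypothetical (invented for illustration, not taken from any study), and the Kaplan-Meier product-limit estimator is used only as one common example of a method that accounts for unequal follow-up; the text does not prescribe a specific method.

```python
from collections import Counter

def kaplan_meier(records):
    """Return [(time, estimated probability of remaining seizure-free)].

    records: (months_of_follow_up, relapsed) pairs. relapsed is True if the
    first postoperative seizure was observed at that time, and False if the
    patient was still seizure-free when follow-up ended (censored).
    """
    events = Counter(t for t, relapsed in records if relapsed)
    exits = Counter(t for t, _ in records)  # all patients leaving at time t
    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: patients censored at t still count as at risk.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

# Hypothetical cohort with unequal follow-up (months, relapse observed?).
cohort = [
    (3, True), (6, True), (6, False), (12, True), (12, False),
    (18, False), (24, True), (24, False), (36, False), (60, False),
]

# Naive estimate: proportion seizure-free at last follow-up.
naive = sum(1 for _, relapsed in cohort if not relapsed) / len(cohort)
curve = kaplan_meier(cohort)

print(f"Seizure-free at last follow-up: {naive:.0%}")
for t, s in curve:
    print(f"Estimated seizure-free probability at {t} months: {s:.0%}")
```

In this invented cohort the naive proportion is higher than the product-limit estimate at 24 months, because patients censored early are counted as successes even though they were observed only briefly.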
The majority of surgical series are based on small numbers of patients. A small sample size may lead to imprecise estimates (as reflected by wide confidence intervals) of the proportions of patients meeting predefined outcome measures and is a source of sampling bias.
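The effect of sample size on precision can be sketched numerically. The example below assumes two hypothetical series with the same observed seizure-free proportion (65 %) but different sizes, and uses the Wilson score interval as one reasonable choice of confidence interval for a proportion; the counts are illustrative only.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95 %)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical series: (seizure-free patients, total patients).
small_series = wilson_interval(13, 20)
large_series = wilson_interval(130, 200)

for label, (low, high) in (("n = 20", small_series), ("n = 200", large_series)):
    print(f"{label}: 95 % CI {low:.1%} to {high:.1%}")
```

With the same point estimate, the interval for the smaller series is roughly three times as wide, which is the imprecision referred to above.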
In the absence of masking, outcome measures (i.e., seizure counts) may reflect subjective interpretations, which vary across patients and investigators. In the absence of a prospective design and planning, the way patients keep track of their seizures may lead to under-ascertainment, which is most likely to occur with minor events (“auras”). Likewise, the treating physicians and the investigators may give differing interpretations of events likely to be recorded as seizures. In an open setting, these events may be less likely to be diagnosed as seizures in patients who received surgery than in patients treated conservatively.
Changes in technological and procedural approaches may explain the increasing proportion of surgical successes over time, which may in turn preclude comparisons of studies done in different epochs.
A correct interpretation of the results of a study on the outcome of a disease in relation to a given intervention (in this case, epilepsy surgery) should rely on adequate control of the most important sources of bias, which are particularly relevant in an uncontrolled setting like that of observational studies. In most studies, adequate control of confounding variables was not built into the study design at the planning stage. In studies where control of confounders at the planning stage was unfeasible (which is mostly the case in retrospective studies), a correct statistical analysis, including the use of multivariate models, was infrequently planned. In addition, despite the differing periods of observation after surgery, appropriate statistical methods have rarely been used to adjust for the different lengths of follow-up across patients.
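How an uncontrolled confounder can distort an apparent prognostic association may be sketched with a stratified analysis. The counts below are entirely hypothetical, and the Mantel-Haenszel pooled odds ratio is used here as a simple stand-in for the multivariate models mentioned above; the choice of intracranial monitoring as the predictor and MRI findings as the confounder is an assumption made only for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: exposed patients with / without the outcome
    c, d: unexposed patients with / without the outcome
    """
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio pooled over strata of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical counts. Exposure: intracranial monitoring; outcome: seizure
# freedom; strata: levels of the assumed confounder (MRI findings).
strata = {
    "abnormal MRI": (8, 2, 64, 16),
    "normal MRI": (20, 30, 4, 6),
}

# Collapse the strata to obtain the crude (unadjusted) 2x2 table.
pooled_table = tuple(sum(column) for column in zip(*strata.values()))
crude = odds_ratio(*pooled_table)
adjusted = mantel_haenszel_or(list(strata.values()))

print(f"Crude odds ratio: {crude:.2f}")
print(f"Mantel-Haenszel odds ratio: {adjusted:.2f}")
```

In these invented data the crude odds ratio suggests that monitoring predicts a worse outcome, whereas within each MRI stratum there is no association at all: the crude effect arises only because monitored patients more often have a normal MRI.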
In their practice parameter on temporal lobe and localized neocortical resections, Engel et al. [13] identified several major methodological deficiencies in the published studies, including, among others, the retrospective design, the scarcity of data on preoperative seizures, and the absence of masking in seizure outcome assessment.
Data Pooling and Meta-analyses and a Critical Appraisal of Prognostic Indicators
The heterogeneity of the study populations and outcome measures and, most of all, the pitfalls in the study designs (see previous section) have a strong influence on data pooling. In a comprehensive review of the literature on seizure outcome after temporal lobectomy, McIntosh et al. [14] reported that good surgical outcome could be predicted by hippocampal sclerosis or abnormal MRI, prolonged febrile seizures, anterior temporal localization of interictal epileptiform activity on scalp EEG, extent of mesial resection, absence of perioperative generalized seizures, and absence of acute postoperative seizures. However, the authors identified several sources of variability across studies which prevented data pooling because of the risk of biased results. These included, among others, the differing settings and the variability of the definitions used for the prognostic predictors.
A meta-analysis of 47 articles was also performed by Tonini et al. [15], who used strict inclusion and exclusion criteria to minimize the heterogeneity of the study designs. Febrile seizures, mesial temporal sclerosis, abnormal MRI, EEG/MRI concordance, and extensive surgical resection were the strongest prognostic indicators of seizure remission (positive predictors), whereas postoperative discharges and the use of intracranial monitoring predicted an unfavorable prognosis (negative predictors). Firm conclusions could not be drawn for extent of resection, EEG/MRI concordance, and postoperative discharges because of the heterogeneity of study results. Neuromigrational defects, CNS infections, vascular lesions, interictal spikes, and side of resection did not affect the chance of seizure remission after surgery. However, the authors indicated several limitations in the interpretation of the results. The first major limitation was intrinsic to all studies based on data pooling and meta-analysis: data were extracted from studies using different criteria for seizure outcome. A second limitation was the variable length of follow-up across studies. The authors limited the review to studies with an expected follow-up of at least 1 year (although the actual duration of follow-up was less than 12 months in a few patients). Although this interval can be considered crucial for the prediction of surgical outcome, the chance of seizure remission after surgery could be significantly affected by the differing duration of an extended follow-up (i.e., more than 12 months). A third limitation was inherent to the forced dichotomization of each putative prognostic predictor, which means that each factor could be contrasted with a variety of other factors or could include a number of different conditions, each associated with differing clinical outcomes. A fourth limitation was that each variable was examined without considering the role of other (known or unknown) confounders, which might have been the true prognostic predictors. This limitation applies to factors like the use of intracranial recording and febrile seizures. Last, the changing techniques and operative procedures might have affected seizure outcome, preventing comparisons of studies done in different epochs.
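The heterogeneity of study results that limits data pooling can be quantified. The sketch below assumes five hypothetical studies reporting a log odds ratio and its standard error for one prognostic predictor (all values invented for illustration), and computes Cochran's Q and the I² statistic around a fixed-effect, inverse-variance pooled estimate. These are standard meta-analytic quantities, not necessarily those used in the reviews cited above.

```python
import math

def heterogeneity(effects, std_errors):
    """Return (pooled effect, Cochran's Q, I² in percent).

    effects: per-study effect estimates (here, log odds ratios)
    std_errors: their standard errors
    """
    weights = [1 / se**2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I²: share of total variation attributable to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2

# Hypothetical studies: log odds ratios and standard errors.
log_ors = [0.2, 1.1, -0.3, 0.9, 1.6]
ses = [0.25, 0.30, 0.35, 0.20, 0.40]

pooled, q, i2 = heterogeneity(log_ors, ses)
print(f"Pooled odds ratio: {math.exp(pooled):.2f}")
print(f"Cochran's Q: {q:.1f} on {len(log_ors) - 1} degrees of freedom")
print(f"I²: {i2:.0f} %")
```

A large I², as in these invented data, indicates that most of the variation between studies exceeds what sampling error would explain, so a single pooled estimate should be interpreted with caution.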
The Effects of Epilepsy Surgery Beyond Seizure Control
To provide comprehensive counseling to surgical candidates and their families, the impact of surgery should be assessed beyond seizure control and should include the effects of treatment on cognitive, psychosocial and behavioral functions, quality of life, mortality, and costs. All these outcome variables were assessed by Spencer and Huh [16] in a comprehensive review of epilepsy surgery. The authors found that up to 4 % of adults with anterior mesial temporal lobectomy and up to 10 % of children with focal resection had neurological complications. Mortality related to surgery and late postoperative deaths were estimated at 0–2 % and were higher in children than in adults. Most studies described a significant decline in verbal memory, mostly after dominant temporal resections. The methods and timing of the neuropsychological assessments documenting these changes were, however, heterogeneous. Some studies reported improvements in verbal memory and full-scale IQ after resection of the non-dominant temporal lobe; however, the degree of contribution of the retest effect and the longevity of these findings were unclear. Between 40 and 50 % of children with epilepsy had comorbid learning disabilities, developmental delay, psychiatric and behavioral difficulties, and psychosocial problems. After surgery, 4–30 % of patients developed new affective disorders and 1–5 % developed psychosis, although reports from the past decade showed lower incidences than earlier studies. Most studies trying to define the effects of surgery on psychiatric disturbances and the risk factors for sequelae did not include both presurgical and postsurgical assessments. Most studies of developmental and cognitive results of medial temporal resections in children suggested a lack of significant change in IQ or verbal memory. Seizure-free outcome was not always clearly related to cognitive outcome.
Although studies provide valuable information on the improvement of quality of life after surgery and indicate seizure freedom as the strongest and most consistent predictor of quality of life improvement, the lack of preoperative comparisons and the absence of true control populations in most studies limit the validity of the results. The review by Spencer and Huh (2008) [16] did not compare surgical to medical outcomes. A number of studies have, however, compared surgical to medical therapy in terms of seizure control, cognitive status, psychosocial complications, quality of life, and mortality. These studies were carefully examined by Perry and Duchowny [17], who concluded that, based on limited evidence, successful epilepsy surgery, defined by complete seizure freedom, has the potential to improve cognitive functions, quality of life, and mortality, and may even prove cost-effective. However, the authors acknowledged that in these studies the surgical and medical groups were significantly different at baseline (the latter often being unsuitable surgical candidates), the findings were frequently inconsistent, and the purported benefits of surgery on cognitive and psychosocial functions and on quality of life were mostly mediated by seizure freedom and reduction of the pharmacological burden.
One must also consider the use of different outcome measures. A critical appraisal of the quality of evidence on neuropsychological outcomes after epilepsy surgery was recently published [18] and can be summarized here. The authors examined 147 articles and verified the application of 45 items used as measures of the neuropsychological outcome of surgery. Among the socio-demographic characteristics of the patients, education was reported by only 54 % of studies, and ethnicity and employment only rarely. Only 16 % of studies reported on presurgical and 14 % on postsurgical drug treatment. Blinding of clinicians and/or assessors was almost never performed and an independent blind outcome assessment was never performed. Reasons for loss to follow-up were given in only 31 % of studies. Postsurgical deficits were indicated in only 25 % of reports. Quantitative measures of change were applied in only 64 % of studies. Validated measures of change were used in 26 % of studies. The study results could be generalized in only 23 % of cases. Randomization of treatment was rare and a predefined sample size calculation was almost never performed. A quality assessment of psychosocial outcome measures was not performed.
In summary, the assessment of surgical outcome in epilepsy beyond seizure control has received little attention in terms of rating the quality of evidence according to the principles of evidence-based medicine. A correct determination of the neuropsychological outcomes after surgery is relevant in helping to identify patients at risk of postoperative cognitive decline.