Chapter 211 Meaningful Retrospective Analysis
Retrospective Clinical Studies
Retrospective studies account for a large share of the available evidence in neurologic surgery. Although these studies constitute only level III or IV evidence (Oxford Centre for Evidence-based Medicine Levels of Clinical Evidence [Table 211-1]), they supply most of the neurosurgical evidence published to date. There are three major types of nonrandomized clinical studies:
1. Case-control studies (Oxford Centre Level III Evidence): Case-control studies are used to identify factors that might lead to a particular outcome by looking retrospectively and comparing cases with that outcome to controls without it. The case-control method is particularly useful when the end point is infrequent, because a prospective study would likely require a large number of patients to have enough power to detect a difference between cohorts, if one exists. Haines used the case-control approach for studying craniotomy infections.1
2. Nested case-control studies: One variation of the case-control study is the nested case-control study, in which cases of a condition are identified from a defined cohort and, for each case, a specified number of matched controls is selected from the same cohort. This strategy can limit not only the costs associated with a large prospective trial but also the confounding biases typical of traditional case-control studies, in which the case and control populations may differ substantially in ways that are not apparent to the investigators evaluating the results.2
3. Case series (Oxford Centre Level IV Evidence): A case series is a study of one group of patients without a comparison group. These studies tend to be descriptive and are best used to provide outcomes data for new techniques or for the treatment of rare disorders.
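The comparison at the heart of a case-control study is usually summarized as an odds ratio. The following is a minimal sketch in Python, assuming a simple unmatched 2x2 table and a Woolf (logit) confidence interval; the counts are hypothetical and are not taken from any study cited in this chapter.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio with an approximate 95% CI (Woolf logit method).

    Assumes an unmatched 2x2 case-control table with no zero cells.
    """
    estimate = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    z = 1.96  # normal quantile for a two-sided 95% interval
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts: 50 cases (30 with the risk factor), 150 controls (40 with it).
or_estimate, ci_lower, ci_upper = odds_ratio(30, 20, 40, 110)
print(f"OR = {or_estimate:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

A matched or nested design would normally call for a matched analysis (for example, conditional logistic regression) rather than this unmatched calculation.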
Level of Clinical Evidence | Description |
---|---|
I | Well-executed randomized controlled trial |
II | Prospective cohort study with controls |
III | Case-control study |
IV | Case series (without control group) |
V | Expert opinion or theory |
Modified from Oxford Centre for Evidence-based Medicine Levels of Evidence and Grades of Recommendation, 2001. http://www.cebm.net/index.aspx?0=5513.
In some situations, it is advantageous to compare the outcomes from one cohort with those from a cohort treated previously. Fessler et al.’s classic paper comparing outcomes (Nurick grade) after corpectomy with those after cervical laminectomy is an example.3
Limitations of Randomized Clinical Trial Methodology
Although the randomized clinical trial (RCT) represents the highest level of clinical evidence, significant barriers exist to performing an RCT in spine surgery. The heterogeneity of spine diseases, the requirement for equipoise, and the learning curve associated with novel procedures are commonly cited challenges. Even when an RCT is completed, its results may be difficult to interpret or may not provide a clear answer to the research question (see Chapter 210). For these and many other reasons, nonrandomized clinical studies, including retrospective studies, remain an important research tool for spine surgeons. In fact, the number of published retrospective clinical studies continues to increase (Fig. 211-1). Prospective registries may represent an alternative to the RCT, although without clearly defined inclusion and exclusion criteria and control groups they risk generating data that are difficult to interpret.
Advantages of Retrospective Methodology
Retrospective studies may, in some situations, evaluate more diverse patient populations and, as a result, provide data that more closely inform actual clinical practice. A study by Glassman et al. evaluated the effect of sagittal imbalance on health status by retrospectively reviewing 752 patients with various degrees of adult deformity. The study demonstrated that increasing sagittal imbalance correlates with worsening health status.4 In a companion study, Glassman et al. further identified sagittal imbalance as a reliable predictor of clinical symptoms relative to a number of other patient characteristics.5 Although these retrospective studies identified associations and could not establish cause and effect, the findings are very useful for practicing spine surgeons who treat spine deformity.
Retrospective studies and prospective clinical trials can be viewed as working cooperatively to provide comprehensive evidence. Retrospective studies represent the observational first step that is critical for uncovering patterns among a vast array of patient factors and outcomes.6 From these patterns, hypotheses can be generated to identify potential causal relationships. RCTs, in turn, represent the experimental step that either confirms or refutes these causal relationships. RCTs are well suited to testing the effectiveness of an intervention by controlling for differences in baseline patient characteristics, whereas retrospective studies are often very useful for identifying which patient characteristics are the most relevant in predicting outcomes. Both the observational and the experimental steps are essential for scientific advancement.
The primary utility of the retrospective study is in identifying patterns among patient characteristics (e.g., risk factors, prognostic indicators) and their potential effect on clinical outcomes. This advantage is particularly evident in studying infrequent or delayed outcomes. For example, Cammisa et al. retrospectively reviewed 2144 patients treated over a 9-year period to identify factors associated with incidental durotomy. They identified a 3.1% rate of durotomy in spine surgery patients.7 They found that incidental durotomies occurred more frequently in patients with prior surgery and that, overall, with appropriate repair, patients with durotomies did not suffer any long-term sequelae compared with patients without durotomy. An RCT to address this particular question would have required 10 years to perform and likely would have cost millions of dollars.
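The cost of studying an infrequent event prospectively can be illustrated with a standard two-proportion sample-size approximation. The sketch below takes the 3.1% durotomy rate quoted above as the baseline and, purely as an assumption for illustration, asks how many patients per arm a trial would need to detect a halving of that rate with 80% power at a two-sided alpha of 0.05. The comparison rate, power, and alpha are hypothetical choices, not values from the cited study.

```python
import math
from statistics import NormalDist

def patients_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to distinguish event rates p1 and p2.

    Uses the normal approximation for two independent proportions with
    equal allocation and unpooled variance (two-sided test).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

baseline_rate = 0.031             # durotomy rate reported in the text
assumed_rate = baseline_rate / 2  # hypothetical comparison rate (assumption)
n = patients_per_arm(baseline_rate, assumed_rate)
print(f"Approximately {n} patients per arm")
```

Under these assumptions the estimate is on the order of 1500 patients per arm, and it rises steeply as the assumed difference between the two rates shrinks.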
Adjacent-level disease is another example of a relatively low-frequency event following spine surgery that has been studied retrospectively. The concept, although controversial, has fueled the development of motion-preservation techniques in spine surgery. In a landmark paper, Hilibrand et al. retrospectively evaluated 374 patients for the delayed incidence of adjacent-segment degeneration up to 10 years after anterior cervical fusion.8 They determined that the annual incidence of symptomatic adjacent-segment disease was 2.9% in these patients. The retrospective approach allowed the frequency of adjacent-level disease to be quantified, but it could not establish its cause. Although further prospective studies are still needed to truly understand whether fusion causes adjacent-level disease, retrospective studies help to frame questions for further study and identify rates of various complications or other clinical events.
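An annual incidence can be translated into an approximate cumulative risk over a follow-up period. The sketch below assumes, as a simplification, that the 2.9% annual risk is constant and independent from year to year; it does not reproduce the time-to-event analysis of the cited study.

```python
def cumulative_incidence(annual_rate, years):
    """Cumulative risk after `years`, assuming a constant, independent annual risk."""
    return 1.0 - (1.0 - annual_rate) ** years

annual_rate = 0.029  # annual incidence of symptomatic disease reported in the text

for years in (1, 5, 10):
    risk = cumulative_incidence(annual_rate, years)
    print(f"{years:2d} years: {risk:.1%}")

ten_year_risk = cumulative_incidence(annual_rate, 10)
```

Under this simplification, a 2.9% annual risk accumulates to roughly one patient in four over 10 years.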
Retrospective studies can evaluate questions that might be impossible or unethical to address with an RCT. To study the effect of quitting smoking on spine fusion, Glassman et al. retrospectively reviewed 357 patients who underwent lumbar instrumented fusion. They found that the nonunion rate was 14.2% for nonsmokers. To assess the potential value of smoking cessation after lumbar spine fusion surgery, they used follow-up telephone surveys to determine whether smokers had quit for at least 6 months after surgery. Cessation of smoking for at least 6 months after surgery was associated with a nonunion rate of 17.1%, whereas the nonunion rate was 26.5% in patients who continued to smoke after surgery.9
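The reported nonunion rates can be compared with simple arithmetic. The sketch below computes the relative risk and the absolute risk difference between groups using only the three percentages quoted above; because the group sizes are not given here, no confidence intervals or significance tests are attempted, and the variable names are illustrative.

```python
# Nonunion rates as quoted in the text (proportions, not counts).
nonunion_rate = {
    "nonsmoker": 0.142,
    "quit_at_least_6_months": 0.171,
    "continued_smoking": 0.265,
}

def compare_risks(risk_group, risk_reference):
    """Return (relative risk, absolute risk difference) of group vs. reference."""
    return risk_group / risk_reference, risk_group - risk_reference

rr_vs_quit, diff_vs_quit = compare_risks(
    nonunion_rate["continued_smoking"], nonunion_rate["quit_at_least_6_months"]
)
rr_vs_non, diff_vs_non = compare_risks(
    nonunion_rate["continued_smoking"], nonunion_rate["nonsmoker"]
)

print(f"Continued vs. quit:      RR = {rr_vs_quit:.2f}, difference = {diff_vs_quit:.1%}")
print(f"Continued vs. nonsmoker: RR = {rr_vs_non:.2f}, difference = {diff_vs_non:.1%}")
```

These figures describe an association only; patients who quit smoking may differ from those who continued in ways that also affect fusion.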
Retrospective methodologies avoid a specific type of bias, known as the Hawthorne effect, in which the behavior of the patient, the physician, or both is altered as a direct result of being observed, thereby influencing the apparent treatment effect.10 This phenomenon is relevant in spine surgery RCTs, particularly medical device studies, because both patients and physicians may show heightened enthusiasm for newer, potentially more attractive technology. Resnick et al. identified an example of this phenomenon in the RCT comparing the Prestige (Medtronic, Memphis, TN) cervical disc arthroplasty with single-level anterior cervical discectomy and fusion (ACDF).11 The investigators found that patients randomized to the novel device (Prestige arthroplasty) had better neurologic outcomes than those undergoing conventional ACDF.12 Resnick et al. noted that the surgical neural decompression was identical for the Prestige and ACDF procedures, so a true neurologic advantage for the disc replacement seemed implausible. The observed differences in neurologic outcome may instead have been attributable to greater attention to detail during the arthroplasty procedure as participating surgeons gained experience and facility with the device, or to a more optimistic view expressed by patients who were pleased to have received the novel technology. Retrospective studies inherently eliminate this bias because, by definition, the investigation begins only after the treatment and its outcome have occurred.