Summary of Key Points
Biases in clinical investigation are systematic errors that cause a tendency toward erroneous results. The two major forms of bias in retrospective studies are selection bias and missing data.
Case-control studies are used to identify factors that might lead to a particular outcome by looking retrospectively and comparing cases with a particular outcome to controls without that outcome. These types of studies are ideal when a particular outcome is rare (e.g., infection).
Large administrative databases containing vast amounts of patient data have increased our opportunities to generate meaningful evidence using retrospective analysis.
The State Inpatient Databases (SIDs) are part of the Healthcare Cost and Utilization Project (HCUP) sponsored by the Agency for Healthcare Research and Quality. A SID contains demographic data, admission and discharge dates, and procedure and diagnosis codes. Some of the SIDs contain patient identifiers and permit longitudinal analysis. For example, SIDs have been used to study reoperation rates following spine surgery.
The Nationwide Inpatient Sample (NIS) is part of the HCUP sponsored by the Agency for Healthcare Research and Quality and represents a weighted sample from the SIDs that is readily generalized to the entire U.S. nonfederal hospital universe. The NIS has been used extensively to study complication rates following surgery and has also been used to assess comparative hospital charges for various spinal procedures.
The NIS contains no patient identifiers and can only be used to study single hospital admissions. The NIS contains no patient-reported outcomes data.
Today, there is a growing need to supplement our education with a systematic understanding of the principles of evidence-based practice. This chapter explores the value of the retrospective approach and explains how to interpret clinical evidence when the clinical data are not randomized, or even prospective. New approaches using modern statistical methods with administrative databases have improved our ability to extract meaningful conclusions from retrospective data. By understanding both the strengths and weaknesses of the retrospective study, we can learn to harness the power of the approach without falling prey to false conclusions. Indeed, there is a role for retrospective studies, and it is likely that the approach will have a lasting presence in understanding outcomes from spine surgery.
Retrospective Clinical Studies
Retrospective studies account for the majority of the available evidence in neurologic surgery to date, even though they constitute only level III or IV evidence (Oxford Center for Evidence-based Medicine Levels of Clinical Evidence [ Table 214-1 ]). There are three major types of nonrandomized clinical studies:
Case-control studies (Oxford Center Level III Evidence): Case-control studies are used to identify factors that might lead to a particular outcome by looking retrospectively and comparing cases with a particular outcome to controls without that outcome. When the end point is infrequent, the case-control method is particularly useful because a prospective study would likely require a large number of patients to have enough power to detect a difference between cohorts, if such a difference exists. Haines used the case-control approach for studying craniotomy infections.
Nested case-control studies: One variation of the case-control study is the nested case-control study, in which cases of a condition are identified from a defined cohort and, for each case, a selected number of matched controls are chosen. This strategy has the advantage of potentially limiting not only the costs that might be associated with doing a large prospective trial but also the confounding biases typically associated with traditional case-control studies, in which the case and control populations differ substantially in ways that might not be apparent to the investigators evaluating the results.
Case series (Oxford Center Level IV Evidence): A case series refers to a study of one group of patients without a comparison group. These studies tend to be descriptive and are best used to provide outcomes data for new techniques or for the treatment of rare disorders.
|Level of Clinical Evidence||Description|
|I||Well-executed randomized controlled trial|
|II||Prospective cohort study with controls|
|III||Case-control study|
|IV||Case series (without control group)|
|V||Expert opinion or theory|
In some situations, it is advantageous to compare the outcomes from one cohort with those from a cohort treated previously. Fessler and colleagues’ classic paper comparing corpectomy outcomes to cervical laminectomy outcomes (Nurick grade) is an example.
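The case-control comparison described in this section is usually summarized as an odds ratio. The following is a minimal Python sketch of that calculation, not a method taken from any of the studies cited here. The function name `odds_ratio`, its parameter names, and the counts in the usage example are hypothetical. The confidence interval uses the standard log (Woolf) approximation, which assumes an unmatched 2 x 2 table with no empty cells; a matched or nested design would call for a conditional analysis instead.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2 x 2 table.

    The interval is the log-based (Woolf) approximation; z=1.96
    corresponds to an approximate 95% confidence interval.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this approximation")

    # Odds of exposure among cases divided by odds of exposure among controls.
    estimate = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)

    # Standard error of log(odds ratio).
    se_log = math.sqrt(sum(1.0 / n for n in cells))

    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from any cited study):
    # 20 of 50 cases and 10 of 50 controls had the risk factor.
    est, lo, hi = odds_ratio(20, 30, 10, 40)
    print(f"OR = {est:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An odds ratio above 1 whose interval excludes 1 suggests an association between the factor and the outcome; it does not by itself establish cause and effect.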
Limitations of Randomized Clinical Trial Methodology
Although the randomized clinical trial (RCT) represents the highest level of clinical evidence, significant barriers exist to performing an RCT in spine surgery. The heterogeneity of spine diseases, the requirement for equipoise, and the learning curve associated with novel procedures are often cited as common challenges in performing RCTs. Even when RCTs are performed, their results can be difficult to interpret or may not provide a clear answer to the research question (see Chapter 213 ). For these and many other reasons, nonrandomized clinical studies, including retrospective studies, remain an important research tool for spine surgeons. In fact, the number of published retrospective clinical studies continues to increase ( Fig. 214-1 ). Prospective registries may represent an alternative to the RCT, although they pose the risk of generating data that can be difficult to interpret without clearly defined inclusion and exclusion criteria and control groups.
Advantages of Retrospective Methodology
Retrospective studies may, in some situations, evaluate more diverse patient populations and, as a result, provide data that more closely inform actual clinical practice. A study by Glassman and colleagues evaluated the effect of sagittal imbalance on health status by retrospectively reviewing 752 patients with various degrees of adult deformity. The study demonstrated that increasing sagittal imbalance correlates with worsening health status. In a companion study, Glassman and coworkers further identified sagittal imbalance as a reliable predictor of clinical symptoms relative to a number of other patient characteristics. Although the associations identified in these retrospective studies could not establish cause and effect, they are very useful for practicing spine surgeons interested in treating spine deformity.
Retrospective studies and prospective clinical trials can be viewed as working cooperatively to provide comprehensive evidence. Retrospective studies represent the observational first step that is critical for uncovering patterns among a vast array of patient factors and outcomes. From these patterns, hypotheses can be generated to identify potential causal relationships. RCTs, in turn, represent the scientific experimentation step that either confirms or refutes these causal relationships. RCTs are well suited to test the effectiveness of an intervention by controlling for differences in baseline patient characteristics, whereas retrospective studies are often useful for identifying which patient characteristics are the most relevant in predicting outcomes. Both the observation and experimentation steps are essential for scientific advancement.
The primary utility of the retrospective study is in identifying patterns among patient characteristics (e.g., risk factors, prognostic indicators) and their potential effect on clinical outcomes. This advantage is particularly evident in studying infrequent or delayed outcomes. For example, Cammisa and associates retrospectively reviewed 2144 patients treated over a 10-year period to identify factors associated with an incidental durotomy. They identified a 3.1% rate of durotomy in spine surgery patients. They found that incidental durotomies occurred more frequently in patients with prior surgery and that overall, with appropriate repair, patients with durotomies did not suffer any long-term sequelae compared with patients without durotomy. An RCT to address this particular question would have required 10 years to perform and likely would have cost millions of dollars.
Adjacent-level disease is another example of a relatively low-frequency event following spine surgery that has been studied using retrospective studies. The concept (although it is controversial) has fueled the development of motion-preservation techniques in spine surgery. In a landmark paper, Hilibrand and colleagues retrospectively evaluated 374 patients for delayed incidence of adjacent-segment degeneration following anterior cervical fusion up to 10 years postoperatively. They determined that the annual incidence of symptomatic adjacent-segment disease was 2.9% in these patients. The retrospective approach allowed for the quantification of the frequency of adjacent-level disease, but it could not establish its cause. Although further prospective studies are still needed to truly understand whether fusion causes adjacent-level disease, retrospective studies help to frame questions for further study and identify rates of various complications or other clinical events.
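To give a sense of scale for an annual incidence such as the 2.9% quoted above, the short Python sketch below converts a constant annual incidence into a cumulative incidence over several years. This is an illustrative simplification, not the analysis used in the cited paper: it assumes the annual risk is the same every year and independent between years, and it ignores loss to follow-up and competing events. The function name `cumulative_incidence` is hypothetical.

```python
def cumulative_incidence(annual_incidence, years):
    """Cumulative probability of at least one event over `years`,
    assuming a constant, independent annual incidence (a simplification).
    """
    if not 0.0 <= annual_incidence <= 1.0:
        raise ValueError("annual_incidence must be a proportion between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    # Probability of remaining event-free each year, compounded over time.
    event_free = (1.0 - annual_incidence) ** years
    return 1.0 - event_free


if __name__ == "__main__":
    # 2.9% per year (the figure reported in the text) compounded over 10 years.
    print(f"{cumulative_incidence(0.029, 10):.1%}")
```

Under these assumptions, a 2.9% annual incidence compounds to roughly one patient in four over 10 years, which illustrates why a low-frequency event can still matter clinically over long follow-up.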
Retrospective studies can evaluate questions that might be impossible or unethical to address with an RCT. To study the effect of quitting smoking on spine fusion, Glassman and colleagues retrospectively reviewed 357 patients who underwent instrumented lumbar fusion. They used follow-up telephone surveys to determine whether smokers had quit smoking for at least 6 months after surgery and used this approach to assess the potential value of smoking cessation after lumbar spine fusion surgery. The nonunion rate was 14.2% for nonsmokers. Cessation of smoking for at least 6 months after surgery was associated with a nonunion rate of 17.1%, whereas the nonunion rate was 26.5% in patients who continued to smoke after surgery.
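The two nonunion rates reported for smokers can be compared as an absolute risk difference and a risk ratio. The Python sketch below shows only that arithmetic on the percentages quoted above; the function name `compare_risks` is hypothetical. Because the text does not give the number of patients in each group, no confidence interval or significance test is attempted, and the result describes an association rather than a causal effect.

```python
def compare_risks(risk_exposed, risk_reference):
    """Return the absolute risk difference and the risk ratio
    for two proportions (each between 0 and 1).
    """
    if not (0.0 <= risk_exposed <= 1.0 and 0.0 < risk_reference <= 1.0):
        raise ValueError("risks must be proportions; reference risk must be > 0")
    return {
        "risk_difference": risk_exposed - risk_reference,
        "risk_ratio": risk_exposed / risk_reference,
    }


if __name__ == "__main__":
    # Rates reported in the text: 26.5% nonunion in patients who continued
    # to smoke versus 17.1% in those who quit for at least 6 months.
    result = compare_risks(0.265, 0.171)
    print(f"risk difference: {result['risk_difference']:.3f}")
    print(f"risk ratio: {result['risk_ratio']:.2f}")
```

The risk difference is expressed on the same scale as the reported rates, so 0.094 corresponds to 9.4 percentage points.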
Retrospective methodologies avoid a specific type of bias, known as the Hawthorne effect, in which the patient's or the physician's behaviors are altered as a direct result of being observed, thereby influencing the treatment effect. This phenomenon is relevant in spine surgery RCTs, particularly medical device studies, because both patients and physicians may demonstrate heightened enthusiasm for potentially more attractive newer technology. Resnick and associates identified an example of this phenomenon in the RCT comparing the Prestige (Medtronic, Memphis, TN) cervical disc arthroplasty to single-level anterior cervical discectomy and fusion (ACDF). The investigators found that patients randomized to the novel device technology (Prestige arthroplasty) had better neurologic outcomes than those undergoing conventional ACDF. Resnick and associates noted that the surgical neural decompression for the Prestige and ACDF procedures was identical and that, as such, better neurologic outcomes with the disc replacement were seemingly implausible. The observed differences in neurologic outcome may instead have been attributable to closer attention to detail during the arthroplasty procedure as participating surgeons gained experience and facility with the device, or to a more optimistic view expressed by patients who were pleased to have received the novel technology. Retrospective studies avoid this bias because the investigation begins only after the treatment and its outcome have already occurred.