
Chapter 9. Case–Control Studies


KEY CONCEPTS

A case–control study is an observational study in which subjects are sampled based upon presence or absence of disease and then their prior exposure status is determined.

Case–control studies are statistically efficient and cost-effective for the study of rare diseases, and multiple risk factors can be investigated in a case–control study.

Newly diagnosed persons with disease are referred to as incident cases, whereas previously existing cases are referred to as prevalent cases.

Ideally, the controls should have a prevalence of exposure that is the same as the population of unaffected persons.

A population-based study is one in which cases and controls are sampled from a defined population, such as a metropolitan area.

A hospital-based sample of cases and controls may be convenient and inexpensive to collect, but may be biased by factors that affect the likelihood of hospitalization for cases and controls.

If sampling of cases, controls, or both is influenced by prior exposure history, then a selection bias may be present.

Confounding occurs when the apparent effect of the exposure of interest is attributable in whole or in part to some other factor.

Matching in a case–control study involves sampling of controls to parallel selected characteristics of cases in order to reduce the likelihood of confounding by the matched features.

The odds ratio is a measure of association between the exposure and disease that can be calculated in case–control studies.

PATIENT PROFILE
A 55-year-old woman was in excellent health until 2 weeks before admission, when she developed malaise, low-grade fever, cough, and generalized muscle pain. Although she took aspirin, her symptoms became worse over the next several days, in particular increased muscle pain, which made it very difficult for her to rise from a chair. She then consulted her personal physician, who performed a thorough evaluation. The patient's history was unremarkable except for insomnia over the previous year, which she treated with self-prescribed L-tryptophan. On physical examination, she had mild, diffuse muscle tenderness and a mild, erythematous maculopapular rash over much of her body. Laboratory examination was remarkable for an elevated blood eosinophil count (2000 cells per mm3; normal, < 250 cells per mm3) and mildly elevated aldolase levels. Eosinophilia-myalgia syndrome (EMS) was diagnosed.

CLINICAL BACKGROUND
In November 1989, researchers from the Centers for Disease Control and Prevention (CDC) and local health departments published the first description of EMS. This newly recognized syndrome is characterized by incapacitating myalgias (muscle pains), elevated eosinophil counts, and in some patients, arthralgias (joint pains), skin thickening, hair loss, and interstitial lung disease. EMS was first recognized in October 1989, when astute physicians determined that three people with unexplained myalgias and eosinophilia had consumed L-tryptophan, an essential amino acid available without prescription in drug and health food stores. Prompt response by health departments quickly led to case–control studies, the results of which suggested that ingestion of L-tryptophan was the cause of EMS. L-Tryptophan–containing products were taken off the market in November 1989. EMS occurs predominantly in women and is relatively rare. Nationwide disease surveillance conducted by the CDC led to identification of about 1500 cases of EMS, including 40 fatalities; nearly all cases occurred between mid-1988 and the end of 1989, although the actual number of cases was probably several times higher than the reported number. In 1990, after the recall of L-tryptophan, the number of reported cases fell to near zero. Further case–control studies showed that of the people studied, nearly everyone with EMS (cases) but only about half of those without EMS (controls) had consumed L-tryptophan produced by one particular manufacturer. Further inquiry disclosed that this company had changed manufacturing conditions prior to and during the period of the epidemic. Risk of developing EMS among those consuming L-tryptophan from this manufacturer was estimated to be 20 to 40 times higher than the risk among those consuming L-tryptophan from other sources. The epidemic was attributed to contamination of the L-tryptophan during production by the implicated manufacturer. 
Although many contaminants have been identified chemically in L-tryptophan produced by that manufacturer, the search to identify the specific contaminant or contaminants that cause EMS has been hampered by the lack of an animal model that can reproduce the full spectrum of EMS seen in humans. The history of EMS illustrates the importance of astute clinical observations and the value of a rapid public health response, which led to the timely recall of L-tryptophan and the prevention of an even larger outbreak of disease. The initial study linking consumption of L-tryptophan with the occurrence of EMS, as well as most of the subsequent investigations, was based on a case–control design. In this chapter, case–control studies are described in detail.

CASE–CONTROL STUDIES: INTRODUCTION
As with cohort studies, case–control investigations typically are designed to assess the association between occurrence of disease and an exposure suspected of causing (or preventing) that disease. In many situations, however, a case–control study is more efficient than a cohort study because a smaller sample size is required.

The primary feature that distinguishes a case–control study from a cohort study is selection of subjects based on their disease status. The investigator selects cases from among those persons who have the disease of interest and controls from among those who do not. In a well-designed case–control study, cases are selected from a clearly defined population, sometimes called the source population, and controls are selected from the same population that yielded the cases. The histories of prior exposure for both cases and controls are examined to assess relationships between exposure and disease. The basic design of a case–control study is shown in Figure 9–1.

Figure 9–1.

Schematic diagram of the design of a case–control study. Shaded areas represent subjects who were exposed to the risk factor of interest.

The approach to the design of a case–control study can be illustrated by one study, conducted in Minnesota (Belongia et al, 1990), designed to assess the association between the use of L-tryptophan and the risk of developing EMS. In this study, investigators contacted physicians in an attempt to identify all cases of EMS in the metropolitan area of Minneapolis–St. Paul. To select controls, they called randomly selected telephone numbers in the same area. Researchers interviewed subjects and asked about potential risk factors and about their use of L-tryptophan. They asked cases about use of L-tryptophan immediately prior to onset of illness, and asked controls about recent use. For each subject who reported use of L-tryptophan, the investigators obtained the brand of L-tryptophan and lot number, so that the manufacturer could be traced.

L-Tryptophan was taken significantly more frequently by cases than by controls: 61 of 63 case subjects (97%) but only 101 of 5188 control subjects (2%). Because previous studies had already demonstrated a strong association between developing EMS and use of L-tryptophan in general, the main contribution of this study was the finding that risk was strongly associated with use of L-tryptophan from a particular manufacturer. Among subjects who used L-tryptophan and for whom the manufacturer could be determined, 29 of 30 cases (97%) but only 5 of 9 controls (56%) had used L-tryptophan from the implicated manufacturer. The design of this study is illustrated schematically in Figure 9–2.

Figure 9–2.

Schematic diagram of a case–control study of use of L-tryptophan and subsequent risk of developing eosinophilia-myalgia syndrome (EMS).
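As a rough worked illustration, the counts reported for the Minnesota study can be arranged as 2 × 2 tables and summarized by the odds ratio named under Key Concepts. The sketch below assumes the standard cross-product formula. The function name and the derived counts of nonusers (63 − 61, 5188 − 101, and so on) are introduced here for illustration and are not quoted from the published report.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2 x 2 case-control table."""
    if unexposed_cases == 0 or exposed_controls == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Any use of L-tryptophan: 61 of 63 cases, 101 of 5188 controls.
any_use = odds_ratio(61, 63 - 61, 101, 5188 - 101)

# Implicated manufacturer, among users with a traceable source:
# 29 of 30 cases, 5 of 9 controls.
manufacturer = odds_ratio(29, 30 - 29, 5, 9 - 5)

print(round(any_use), round(manufacturer, 1))
```

The manufacturer comparison gives an odds ratio of about 23, which falls within the 20- to 40-fold range cited in the Clinical Background. Both estimates rest on very small cells (2 unexposed cases, 1 case using another manufacturer), so they should be read as imprecise.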

This investigation illustrates several important features of case–control studies. First, the design provides an efficient means to study rare diseases such as EMS. Case–control studies tend to be more feasible than other types of epidemiologic investigations, such as cohort studies, because fewer subjects are required. The smaller sample size is accompanied by a reduction in cost. Second, case–control studies allow researchers to investigate several risk factors. In this example, the investigators evaluated L-tryptophan and other factors as possible causes of EMS. Third, as with other nonexperimental or observational studies, a single case–control investigation does not "prove" causality, but it can provide suggestive evidence of a causal relationship that warrants intervention by public health officials to reduce exposure to the implicated risk factor. In this context, the removal of L-tryptophan-containing products from the market resulted in the virtual elimination of reported cases of EMS, although anecdotal reports of EMS-like illness still rarely occur in the apparent absence of L-tryptophan usage. These rare reports raise the speculative possibility that a contaminant that could cause EMS-like illness might still be present in some over-the-counter supplement or other source.

DESIGN OF CASE–CONTROL STUDIES
In this section, several aspects of case–control design are discussed, including sources of cases, sources of controls, and collection of information.

Cases
One of the first steps in a case–control study is to identify and select cases—a step that also determines the source population. Case identification should be complete, and the source population—the population from which cases arise—should be well defined. For example, cases might be sampled at random from all patients who are diagnosed with EMS during the study period and who reside within a certain geographic region, such as a state of the United States, or from all cases that occur among subscribers to a health maintenance organization. The source population consists of state residents in the first instance, and subscribers to the health maintenance organization in the second instance. In the previously cited study of EMS, the source population consisted of residents of the metropolitan area of Minneapolis–St. Paul, Minnesota. These cases may be identified by a surveillance system or by reviewing hospital records, other medical records, or death certificates available through institutional or population-based disease registries. In some situations, complete identification of cases in a well-defined source population may be too time consuming or otherwise infeasible. If so, a common alternative involves use of a "convenience sample." Cases might be sampled from patients admitted to particular hospitals or from those seen in certain clinics. Although such cases can often be identified easily, the underlying source population may not be well defined, thus making it difficult to generalize results confidently.

The investigator typically studies newly diagnosed or incident cases, although it is sometimes necessary to include previously existing or prevalent cases. Prevalent cases should be excluded primarily because the exposure may affect the prognosis or the duration of the illness. When this effect occurs, the exposure status of existing prevalent cases tends to differ from that of all cases. For example, suppose that prior use of L-tryptophan either prevents death or prolongs the duration of EMS. Prevalent cases of EMS might then have a higher reported use of L-tryptophan than would all cases with this disease. Consequently, a case–control comparison of use of L-tryptophan would tend to be distorted by an inflated estimate of use for cases. The general principle involved is that the likelihood of a case being included in the study must not depend on whether that case was exposed to the risk factor of interest.

Another important step in designing a case–control study is to specify the definition of a case. The criteria should minimize the likelihood that an affected person (true case) is missed (ie, the criteria must be sensitive) or that a nonaffected person is falsely classified as a case (ie, the criteria must be specific). In general, there is a trade-off between the desire to include all cases (particularly when the disease is extremely rare, as is EMS) and the desire to prevent dilution of the case group with nonaffected persons. Moreover, restrictive criteria may require information that is unavailable for some subjects, making it impossible for such subjects to be classified fully. In practice, inclusion criteria are chosen to minimize misclassification yet promote feasibility. For example, in the previously cited study of EMS in Minneapolis–St. Paul, cases met specific criteria including the following: elevated eosinophil counts, myalgia or muscle weakness, and residence in the study area.

Controls
The next key step in a case–control study is to identify and select controls. Ideally, controls are chosen at random from the source population. If the source population is a state, city, or other well-defined area, controls in that area might be contacted by dialing telephone numbers at random (random-digit dialing), by visiting residences, by mailing letters soliciting participation, or by other means. An important goal is to select controls so that participation does not depend on exposure. That is to say, the sample of controls should have the same prevalence of exposure as the source population of unaffected persons. If participation does depend on exposure, the case–control comparison may be distorted.

In the previously cited study of EMS, the investigators selected controls by random-digit dialing in the Minneapolis–St. Paul area (the source population). Because the population of Minneapolis–St. Paul has fairly complete telephone coverage, this approach to selecting controls is unlikely to be influenced by use of L-tryptophan (the exposure) or, among users, by the manufacturer of L-tryptophan. Accordingly, within the control group selected by random-digit dialing, the distribution of manufacturers among L-tryptophan users should be comparable to that in the source population.

Determination of Exposure
Once cases and controls are selected, the next step is to obtain information that is as accurate as possible about each individual's prior exposure to the risk factor of interest, as well as to other exposures. The information concerning other exposures is used to determine whether association of disease with a risk factor is due to the exposure of interest or to other characteristics of exposed persons. Because factors cannot affect risk after the disease occurs, the timing of exposures is critical. With slowly developing diseases that lack early evidence of involvement, establishing the temporal sequence of exposure and onset of disease can be difficult or impossible.

Interviews and questionnaires are the most common means of determining a subject's exposure history. Interviews can be conducted in person or by telephone. To ensure that information from cases and controls is obtained in the same manner, interviews should be standardized, monitored, and conducted by trained interviewers. Interviews are useful for collecting data because (1) questions may cover a wide range of potential risk factors, (2) costs are relatively low, and (3) information can be obtained on exposures that occurred years prior to the onset of illness.

Occasionally, there is concern that cases and controls may recall exposures differently, perhaps distorting case–control comparisons. For example, cases, perhaps in an attempt to explain their illnesses, may overreport exposures. This is of particular concern when there has been a great deal of publicity about the association between the exposure and the disease of interest. For instance, after the association of L-tryptophan with EMS was first identified and publicized, knowledge of this association could have affected the reported exposures of cases in subsequent investigations. To minimize problems associated with subject recall, attempts can be made to verify exposures through other methods. In the context of the association between use of L-tryptophan and development of EMS, for example, the interviewer might request that the subject produce the L-tryptophan package. By inspecting the package, the interviewer can confirm that it was opened (and therefore the product presumably was used); the manufacturer and the lot number can also be identified.

Information concerning risk factors may also be obtained from medical, occupational, or other records. These methods of obtaining information are not based on self-reporting and consequently should avoid the reporting bias that may occur when information is obtained through interviews. However, the amount of information found in records is often limited, so that all of the data of interest may not be available. Furthermore, this information may not be recorded in a standardized manner, leading to variability in subject classification.

An objective means of characterizing exposure is through the use of a biological marker, such as measurement of an agent, or an indicator of an agent, in blood or other specimens. However, there are several difficulties inherent in the use of biological markers. First, obtaining the specimens can involve an invasive procedure that discourages subject participation. Second, many exposures do not have known biological markers. Third, even if a marker exists, it may be transient and thus not present when the measurement is taken. For example, levels of L-tryptophan in blood would reflect only relatively recent exposure and would decline rapidly after exposure is stopped. Finally, the disease state may alter metabolism, thereby distorting case–control comparisons.

The type of case–control study described in the Minneapolis–St. Paul investigation of use of L-tryptophan and development of EMS, in which newly diagnosed cases and controls are sampled from a source population, is used quite commonly. It is often called a population-based study because cases and controls are sampled from a defined population, in this instance, by virtue of place of residence.

HOSPITAL-BASED CASE–CONTROL STUDIES
Other types of case–control studies differ from the population-based study primarily in the way the samples of cases and controls are selected. Variations include the use of prevalent rather than incident cases and sampling of controls from a readily available, convenient group such as hospital inpatients. The hospital-based case–control study is used so often that it merits mention. In this type of study, the investigator typically selects cases from persons with the disease of interest who are admitted to a particular hospital or hospitals; controls are selected from persons admitted with other conditions but with no evidence of the disease of interest. The researcher then obtains information from cases and controls, often by interviewing them in the hospital.

The hospital-based approach can be illustrated by a case–control study of Reye's syndrome, a condition characterized by acute encephalopathy associated with fatty degeneration of the liver. This illness occurs almost exclusively in children and typically follows a viral illness. To study the association between Reye's syndrome and use of various medications during an antecedent viral illness, researchers in this study selected cases from children admitted with Reye's syndrome to any of a preselected group of referral hospitals.

Investigators selected controls from children admitted to these same hospitals with an antecedent illness, presumably of viral origin. Parents were interviewed to assess prior exposure to aspirin. Twenty-six of the 27 cases—but only 6 of the 22 hospitalized controls—had been exposed to a salicylate-containing medication. In nearly every instance, the salicylate was aspirin. The hospital-based case–control study can be very convenient, as cases and controls are found in the same institutions. Moreover, potential subjects, if not too ill, may be particularly willing to participate. For example, they may have more time than would normally be available. Within a hospital-based case–control study, factors that might influence hospitalization at a particular facility, such as socioeconomic status, tend to be balanced between cases and controls. Although hospital-based studies can be convenient, they also are susceptible to distorted results. First, cases and controls in a hospital-based study may not arise from a single, well-defined population—in contrast to the population-based case–control studies described previously. This could happen, for example, if referral patterns to particular hospitals varied across different diagnoses. Moreover, controls in a hospital-based case–control study are in a hospital because they are ill, and the condition or conditions for which they are hospitalized may be associated with—and even caused by—the exposure of interest. If so, the exposure histories of controls may differ from those of nonaffected persons in the source population, and a distorted case–control comparison may result. Several selection criteria for hospital-based controls may help to reduce this type of distortion; those criteria are listed in Table 9–1.

Table 9–1. Approaches to Sampling of Controls in Hospital-Based Case-Control Studies.

Selection Criteria for Hospital-Based Controls

To Do

Select controls from various diagnostic groups so no particular risk factors will be overrepresented

Select controls from patients with acute conditions so earlier exposures could not have been influenced by the condition

To Avoid

Do not select patients who have multiple concurrent conditions

Do not select patients with diagnoses known to be related to the risk factor of interest

These difficulties probably account for a decline in popularity of hospital-based case–control studies. Despite these problems, however, hospital-based case–control studies are still performed. Typically, they are easier and quicker to conduct than population-based studies, because cases and controls are identified efficiently. Consequently, hospital-based case–control studies may be less expensive. Furthermore, the collection of information on exposure from medical records and biological markers is easier in the hospital environment. Subjects in the hospital are more accessible for interview than persons in the community. As already noted, hospital-based controls may be more cooperative with investigators because they are ill and may want to advance medical knowledge.

Despite these differences in approach and the subtle differences in interpretation that may result, the basic case–control design remains intact. Cases are selected from those with the disease of interest and controls from those without that disease. The relative strengths of population-based and hospital-based case–control studies are summarized in Table 9–2. In brief, the hospital-based approach offers logistical advantages, whereas the population-based approach tends to characterize more accurately the history of prior exposure of the source population.

Table 9–2. Relative Strengths of Population-Based and Hospital-Based Case-Control Studies.

Population-Based

Source population is better defined

Easier to make certain that cases and controls derive from the same source population

Exposure histories of controls more likely to reflect those of persons without the disease of interest

Hospital-Based

Subjects are more accessible

Subjects tend to be more cooperative

Background characteristics of cases and controls may be balanced

Easier to collect exposure information from medical records and biological specimens

SELECTION BIAS
Bias is a systematic error in a study that distorts the results and limits the validity of conclusions. Bias will be discussed in more detail in Chapter 10: Variability & Bias. Bias can occur for a variety of reasons, most of which can affect any type of study. One form—selection bias—poses a particular threat to case–control studies. This form of bias, as suggested by its name, reflects systematic errors that arise from the way in which subjects are selected.

If selection of cases, controls, or both is influenced by prior exposure, this bias may be present. In particular, if the prior exposure of the cases studied differs from that of all cases arising from the source population, or if prior exposure of controls differs from that of persons in the source population without the disease of interest, selection bias may be present. The particular susceptibility of case–control studies to selection bias arises because of the need to obtain two samples: a sample of cases and a sample of controls. Unless each sample is obtained without regard to exposure, results may be biased.

The development of selection bias is illustrated schematically in Figure 9–3. The shaded figures represent persons who were exposed and the unshaded figures represent persons who were not exposed. In the source population, one third of persons with the disease of interest were exposed. Among the cases included in the study, however, two thirds were exposed. That is to say, exposed persons with disease were more likely than unexposed persons with disease to be selected for the study. In this illustration, an opposite sampling pattern is displayed for persons without disease. In this group, exposed persons were less likely to be selected for study than were unexposed persons. Obviously, in this study, comparison of exposure histories of sampled cases with controls would yield a result different from that achieved by contrasting exposure of persons with and without the disease of interest in the source population.
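The distortion described above can be made concrete with a small numerical sketch. All counts and sampling fractions below are hypothetical, chosen only so that one third of cases in the source population and two thirds of sampled cases are exposed, as in the illustration; the text does not give figures for persons without disease, so those are assumed as well. Exact fractions are used to keep the arithmetic free of rounding.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio: a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return Fraction(a * d, b * c)


# Hypothetical source population (assumed numbers, not from the text).
source = {"exposed_cases": 100, "unexposed_cases": 200,
          "exposed_noncases": 1000, "unexposed_noncases": 2000}

# Hypothetical sampling fractions that depend on exposure.
sampling = {"exposed_cases": Fraction(4, 5), "unexposed_cases": Fraction(1, 5),
            "exposed_noncases": Fraction(1, 20), "unexposed_noncases": Fraction(1, 10)}

# Expected number of subjects sampled from each group.
sampled = {group: source[group] * sampling[group] for group in source}

source_or = odds_ratio(source["exposed_cases"], source["unexposed_cases"],
                       source["exposed_noncases"], source["unexposed_noncases"])
sample_or = odds_ratio(sampled["exposed_cases"], sampled["unexposed_cases"],
                       sampled["exposed_noncases"], sampled["unexposed_noncases"])

# Share of sampled cases who were exposed.
exposed_share = sampled["exposed_cases"] / (sampled["exposed_cases"] + sampled["unexposed_cases"])

print(source_or, sample_or, exposed_share)
```

Under these assumed numbers there is no association in the source population (odds ratio of 1), yet the sampled cases and controls yield an odds ratio of 8, purely because selection depended on exposure.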

Figure 9–3.

Schematic diagram of the origin of selection bias in a case–control study. The shaded figures represent persons who were exposed, and the unshaded figures indicate persons who were unexposed.

There are at least three ways in which this type of bias could arise in case–control studies of EMS:

1. Preferential diagnosis of exposed cases may lead to selection bias. After the initial publicity concerning the suspected association of EMS with use of L-tryptophan, physicians may have been more inclined to suspect the diagnosis of EMS among those who were known to have used L-tryptophan. If so, subjects with EMS who did not take L-tryptophan could have been underrepresented within the case group, thus leading to selection bias.

2. Low participation may lead to selection bias. For example, eligible subjects may refuse to participate, or physicians may advise their patients not to participate. If prior use of L-tryptophan among those who did not participate differed from prior use among participants, selection bias must be suspected.

3. Errors in sampling controls from the source population can also create selection bias. For example, if sampled controls had a condition such as insomnia that would make them more likely than other people to use L-tryptophan, selection bias could occur.

Studies of EMS and use of L-tryptophan raise an interesting point concerning susceptibility of different studies to selection bias due to preferential diagnosis of exposed cases. Selection bias could have affected any case–control study of the association between development of EMS and any use of L-tryptophan that was conducted after the extensive media publicity. Nearly all the published case–control studies, however, were designed to investigate the association between risk of developing EMS and the manufacturing source of the L-tryptophan. These studies were less susceptible to selection bias. For example, consider the possibility that preferential diagnosis of EMS in exposed cases led to underrepresentation of unexposed subjects in the case series. This type of problem was unlikely in the studies concerning the source of L-tryptophan because most physicians would not have known the source, nor would they have been aware of the link between a particular manufacturer and risk of EMS. Thus preferential diagnosis of EMS in exposed cases (those who used L-tryptophan from the implicated manufacturer) should not have occurred to any significant extent, and this group should not have been overrepresented in the case group.

MATCHING

Confounding is a distortion of results that occurs when the apparent effects of the exposure of interest are attributable entirely or in part to the effects of an extraneous variable. Confounding is likely to occur when persons exposed to the risk factor of interest differ from nonexposed persons with respect to the prevalence of other risk factors for the disease of interest. Confounding is discussed in more detail in Chapter 10: Variability & Bias. In this chapter, several possible ways to control confounding are presented, including matched sampling.

Matching is a popular approach to control confounding in case–control studies. Its popularity reflects the belief that matching cases and controls forces these groups to be similar with respect to important risk factors, and thereby makes case–control comparisons less subject to confounding. This perception about matching is valid, provided the appropriate matched analysis is conducted.

The first step in matching is to identify a case. Investigators then select from the source population one or more potential controls who have the same values that the case has for each matching factor. The process of matching by race and sex is illustrated schematically in Figure 9–4. To match on a continuous variable such as age, it is typically necessary to form categories, such as 5-year intervals (years 10–14, 15–19, 20–24, etc). In a study with matching on race, sex, and age in 5-year intervals, a 17-year-old African-American female case would be matched to an African-American female control aged 15–19 from the source population. As in an unmatched study, these controls would come from the defined source population. More than one control can be matched to each case, but the ratio of controls to cases rarely exceeds 4:1 because additional controls beyond this ratio add relatively little to the statistical power of the study.
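The matching procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular study's protocol: the record layout, the identifiers, and the function names are hypothetical, and the sketch simply takes the first eligible candidates, whereas in practice controls would normally be drawn at random from those eligible.

```python
def age_band(age, width=5):
    """Return the (low, high) bounds of the age category containing `age`."""
    low = (age // width) * width
    return (low, low + width - 1)


def matching_key(person):
    """Combine the matching factors: race, sex, and 5-year age band."""
    return (person["race"], person["sex"], age_band(person["age"]))


def match_controls(case, candidates, ratio=1):
    """Pick up to `ratio` candidates that share the case's matching key."""
    key = matching_key(case)
    eligible = [c for c in candidates if matching_key(c) == key]
    return eligible[:ratio]


# Hypothetical case and pool of potential controls from the source population.
case = {"id": "case-1", "race": "African-American", "sex": "F", "age": 17}
pool = [
    {"id": "ctl-1", "race": "African-American", "sex": "F", "age": 22},
    {"id": "ctl-2", "race": "African-American", "sex": "M", "age": 17},
    {"id": "ctl-3", "race": "African-American", "sex": "F", "age": 19},
    {"id": "ctl-4", "race": "White", "sex": "F", "age": 16},
    {"id": "ctl-5", "race": "African-American", "sex": "F", "age": 15},
]

matched = match_controls(case, pool, ratio=4)
print([c["id"] for c in matched])
```

Only two of the five candidates fall in the same race, sex, and age category as the case, so the requested 4:1 ratio cannot be met here. This is the practical cost of matching noted later in the section: subjects who cannot be matched are lost to the study.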

Figure 9–4.

Schematic diagram of the process of matching controls to cases by race and sex.

The use of matching is common in clinical studies, particularly when the disease of interest is extremely rare, as is EMS. In this situation, there are a small number of potential cases and a large number of potential controls. Matching can increase the statistical efficiency of case–control comparisons and thus achieve a particular level of statistical power with a smaller sample size. The matching protocol often simplifies decisions about how to sample controls. In addition, matching tends to ensure that case–control differences in the risk factor of interest cannot be explained by reference to the matched variables.

However, these advantages of matching must be weighed against a number of disadvantages. As indicated in Table 9–3, matching can be time consuming and therefore expensive. Any potential cases or controls that cannot be matched must be discarded, which can be viewed as a wasteful process. Any variable that is matched in a study cannot be evaluated as a risk factor in that investigation. Finally, matching on ordinal or continuous variables may result in categories that are too broad to remove completely the effects of the matched variables from the exposure–disease relationship.

Table 9–3. Advantages and Disadvantages of Matching in Case-Control Studies.

Advantages

May increase the precision of case-control comparisons and thus allow a smaller study

The sampling process is easy to understand and explain

If analyzed correctly, provides reassurance that matched variables cannot explain case-control differences in the risk factor of interest

Disadvantages

May be time-consuming and expensive to perform

Some potential cases and controls may be excluded because matches cannot be made

The matched variables cannot be evaluated as risk factors in the study population

For continuous or ordinal variables, matching categories may be too broad, and residual case-control differences in these variables may persist

Another investigation of development of EMS and use of L-tryptophan conducted by researchers in Minnesota illustrates the use of a matched case–control study design (CDC, 1989). In that study, researchers contacted area physicians to identify people with unexplained eosinophilia and severe myalgia. They also required that a muscle biopsy, if done, show eosinophilic perimyositis or perivasculitis. For each case, they identified a control who matched the case on age, sex, and telephone exchange. They interviewed subjects and asked about prior use of L-tryptophan, prior use of selected other medications, and diet. For each of the 12 case–control pairs, the case had consumed L-tryptophan prior to onset of illness, but none of the matched controls had consumed L-tryptophan. Results of this study indicated a strong association between ingestion of L-tryptophan and risk of developing EMS.

ANALYSIS
The type of analysis employed in a case–control study depends on whether subjects were sampled in an unmatched or in a matched approach. These two analytic strategies are described in the following sections.

Unmatched Design
The data obtained in an unmatched case–control study can be summarized as indicated in Table 9–4. For simplicity, only two levels of exposure are discussed here, although the basic methods can be expanded to include multiple levels of exposure. Each subject can be classified into one of four basic groups defined by disease and prior exposure status:

Table 9–4. Summary of Data Collected in an Unmatched Case-Control Study.

            Exposed   Unexposed   Total
Cases       A         B           A+B
Controls    C         D           C+D
Total       A+C       B+D         A+B+C+D

A. Cases who were exposed
B. Cases who were not exposed
C. Controls who were exposed
D. Controls who were not exposed

The format of Table 9–4 should appear familiar, since it resembles that of Table 8–6. Although the summary tables for cohort and case–control studies appear similar, it is important to remember that the underlying approaches to sampling differ, and the analysis must account for these differences. In a cohort study, sampling is based on exposure status, and the investigator thus determines the total numbers of exposed (A + C) and unexposed subjects (B + D) included in the study. Then risk of disease development can be estimated separately for exposed and unexposed groups, and these two risks can be compared in a risk ratio (RR).

A case–control study, on the other hand, begins with sampling of persons with the disease of interest and individuals without the disease (A + B and C + D, respectively). With this approach, the proportion of persons in the study who have the disease is no longer determined by the risk of developing the disease in the source population, but rather by the choice of the investigator. That is, a disease that occurs infrequently in the source population can be oversampled, so that affected individuals constitute a large proportion of the study sample. This ability to oversample affected individuals is the reason case–control studies are statistically efficient for the study of rare diseases. In a case–control study the investigator determines the ratio of persons with the disease to persons without it, and thus the proportion of study subjects who have the disease does not provide an estimate of the risk of developing the disease. As shown in the following section, however, an indirect estimate of the incidence rate ratio can still be obtained in a case–control study.
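The point that the investigator, not the disease risk, fixes the proportion of diseased subjects can be checked with simple arithmetic. The sketch below uses an entirely hypothetical source population and expected counts (no random sampling), and assumes that controls are drawn from the non-cases independently of exposure. It also computes the cross-product AD/(BC) of Table 9–4, the quantity that the following section defines as the odds ratio.

```python
# Hypothetical source population (expected counts, no random sampling).
exposed_n, unexposed_n = 10_000, 90_000
cases_exposed, cases_unexposed = 30, 90   # risks of 0.003 and 0.001

noncases_exposed = exposed_n - cases_exposed        # 9,970
noncases_unexposed = unexposed_n - cases_unexposed  # 89,910

total_cases = cases_exposed + cases_unexposed
population_risk = total_cases / (exposed_n + unexposed_n)


def case_control_sample(controls_per_case):
    """Take all cases and sample non-cases, independent of exposure, at the chosen ratio."""
    total_noncases = noncases_exposed + noncases_unexposed
    fraction = controls_per_case * total_cases / total_noncases
    a, b = cases_exposed, cases_unexposed
    c, d = fraction * noncases_exposed, fraction * noncases_unexposed
    proportion_diseased = (a + b) / (a + b + c + d)
    cross_product = (a * d) / (b * c)
    return proportion_diseased, cross_product


for ratio in (1, 4):
    proportion, cross_product = case_control_sample(ratio)
    print(ratio, round(proportion, 3), round(cross_product, 3))
print(round(population_risk, 4))
```

In this illustration the proportion of diseased subjects is one half with one control per case and one fifth with four controls per case, far from the population risk, whereas the cross-product is the same for both sampling ratios.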

Odds Ratio
With the notation introduced in Table 9–4, the probability that a case was exposed previously is estimated by

\[ \frac{A}{A + B} \]

The odds of exposure for cases represent the probability that a case was exposed divided by the probability that a case was not exposed. The odds then are estimated by

\[ \frac{A/(A + B)}{B/(A + B)} = \frac{A}{B} \]

Similarly, the odds of exposure among controls are estimated by

\[ \frac{C/(C + D)}{D/(C + D)} = \frac{C}{D} \]

The odds of exposure for cases divided by the odds of exposure for controls are expressed as the odds ratio (OR). Substituting from the preceding equations, the OR is estimated by

\[ \mathrm{OR} = \frac{A/B}{C/D} = \frac{AD}{BC} \]

The OR is sometimes termed the exposure odds ratio or the cross-product ratio of Table 9–4, because it results from dividing the product of entries on one diagonal of this table by the product of entries on the cross-diagonal. When incident cases and controls are sampled from the same source population (with selection independent of prior exposure), the exposure OR provides a valid estimate of the incidence rate ratio (see Appendix C). In other words, if properly designed, a case–control study can yield a measure of association between exposure and disease that approximates the incidence rate ratio.

The calculation of the OR can be illustrated by data from a case–control study of EMS in which risk associated with use of particular brands of L-tryptophan was studied. Among those who took L-tryptophan, 22 of 58 cases took one particular retail lot (lot A), compared with 7 of 93 controls, as summarized in Table 9–5. The OR for these data is as follows:

\[ \mathrm{OR} = \frac{22 \times 86}{36 \times 7} = \frac{1892}{252} \approx 7.5 \]

Table 9–5. Summary of Data from the Study of Eosinophilia-Myalgia Syndrome (EMS) and Use of Lot A.

            Used Lot A   Used Other Lot   Total
Cases       22           36               58
Controls    7            86               93
Total       29           122              151

In other words, the odds for use of lot A for patients with EMS were over seven times greater than the odds for use of lot A among controls in this study. To the extent that the OR provides a valid estimate of the incidence rate ratio, it could be concluded from this investigation that use of lot A increased the likelihood of developing EMS more than sevenfold.

As with the risk ratio, a 95% confidence interval around the point estimate of the OR can be calculated. A formula to calculate an approximate 95% confidence interval is given in Appendix D. With the data presented in Table 9–5, the approximate 95% confidence interval for the OR is 2.9 to 19.1. That is, the data from this study are consistent with a moderately to strongly positive association between the use of a particular lot of L-tryptophan and the development of EMS. This association is unlikely to have occurred by chance alone, since the null value of the OR (null value = 1) is well outside the 95% confidence interval. The point estimate and confidence interval for this odds ratio are illustrated in Figure 9–5.
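The calculation for Table 9–5 can be reproduced with a short Python sketch. The formula of Appendix D is not shown in this chapter, so the interval below assumes the common logit (Woolf) approximation based on the standard error of ln(OR); that assumption happens to reproduce the interval quoted above, but Appendix D may use a different method. The function names are illustrative.

```python
import math


def odds_ratio(a, b, c, d):
    """Cross-product ratio AD/(BC) using the cell labels of Table 9-4."""
    return (a * d) / (b * c)


def woolf_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI from the standard error of ln(OR) (Woolf's logit method)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Table 9-5: A = cases on lot A, B = cases on other lots,
#            C = controls on lot A, D = controls on other lots.
a, b, c, d = 22, 36, 7, 86
or_estimate = odds_ratio(a, b, c, d)
lower, upper = woolf_ci(a, b, c, d)
print(f"OR = {or_estimate:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
# prints: OR = 7.5, 95% CI 2.9 to 19.1
```

This approximation relies on reasonably large counts in all four cells; with very small or zero cells an exact method is usually preferred.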

Figure 9–5. Point estimate and 95% confidence interval for the odds ratio comparing use of lot A L-tryptophan among patients with eosinophilia-myalgia syndrome (EMS) and controls.

Matched Design
In a matched case–control study, the analysis must account for the matched sampling scheme. When one control is matched to each case, summary data can be presented in the format shown in Table 9–6. An extension of this basic format can be employed for situations in which the ratio of controls to cases differs from 1:1. Although there are four cells in Table 9–6, the entries into this format are quite different from what we find in previous tables. Each entry into Table 9–6 represents not one subject but two (a matched case–control pair). That is, each case–control pair can be classified into one of the four basic combinations of exposure status:

Table 9–6. Summary Data Format for a Matched Case-Control Study with One Control Per Case.

                  Control Exposed   Control Unexposed   Total
Case exposed      W                 X                   W+X
Case unexposed    Y                 Z                   Y+Z
Total             W+Y               X+Z                 W+X+Y+Z

W: Both case and control exposed
X: Case exposed but control unexposed
Y: Case unexposed but control exposed
Z: Both case and control unexposed

Case–control pairs that are entered into cells W and Z are referred to as concordant pairs, because in these pairs, the exposure status of cases and controls is the same. Case–control pairs that are entered into cells X and Y, in contrast, are referred to as discordant pairs because in these pairs, the exposure status of cases and controls differs. The OR for a pair-matched case–control study is given by a simple ratio of the two discordant cells:

\[ \mathrm{OR} = \frac{X}{Y} \]

This odds ratio can be interpreted in the same manner as the OR for unmatched studies. To illustrate the calculation of the OR from a matched study, the results of a hypothetical matched study with 200 matched case–control pairs are shown in Table 9–7. The OR from this study is as follows:

\[ \mathrm{OR} = \frac{X}{Y} = \frac{57}{5} = 11.4 \]

Table 9–7. Summary Data from a Hypothetical Matched Case-Control Study of Use of L-Tryptophan and Risk of Developing Eosinophilia-Myalgia Syndrome (EMS).

                  Control Exposed   Control Unexposed   Total
Case exposed      132               57                  189
Case unexposed    5                 6                   11
Total             137               63                  200
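As a rough illustration of how pair-level data lead to Table 9–7, the following Python sketch classifies each matched pair into cells W, X, Y, and Z and forms the ratio of discordant pairs. The pair list is rebuilt from the cell counts of the hypothetical study, and the variable names are assumptions made for this example.

```python
from collections import Counter

# Each pair is (case_exposed, control_exposed). The pairs are rebuilt from the
# cell counts of Table 9-7; a real data set would list one record per pair.
pairs = (
    [(True, True)] * 132     # W: both exposed
    + [(True, False)] * 57   # X: case exposed, control unexposed
    + [(False, True)] * 5    # Y: case unexposed, control exposed
    + [(False, False)] * 6   # Z: both unexposed
)

counts = Counter(pairs)
x = counts[(True, False)]
y = counts[(False, True)]
concordant = counts[(True, True)] + counts[(False, False)]

# Only discordant pairs enter the estimate; guard against Y = 0 (as in Table 9-8).
matched_or = x / y if y > 0 else float("inf")
print(len(pairs), concordant, x, y, matched_or)
```

The 138 concordant pairs contribute nothing to the estimate, which is why the matched OR here rests on only 62 of the 200 pairs.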

A 95% confidence interval around the point estimate of the matched OR can be calculated. A formula to calculate an approximate 95% confidence interval is given in Appendix D. With the data presented in Table 9–7, the approximate 95% confidence interval for the OR is 4.6 to 28.3. That is, the data from the hypothetical matched case–control study are consistent with a strong to a very strong positive association between the use of L-tryptophan and the development of EMS. This association is highly unlikely to have occurred by chance, as the null value of the OR (null value = 1) is far outside the 95% confidence interval.

To further illustrate analysis of matched case–control studies, consider again the matched case–control study of L-tryptophan conducted by the researchers in Minnesota (CDC, 1989). The results of that study, summarized in Table 9–8, indicate that for every case–control pair, the case but not the control had taken L-tryptophan. Therefore, the OR from this study is as follows:

\[ \mathrm{OR} = \frac{X}{Y} = \frac{12}{0} \]

Table 9–8. Results of a Matched Case-Control Study of Use of L-Tryptophan and Risk of Developing Eosinophilia-Myalgia Syndrome (EMS).

                  Control Exposed   Control Unexposed   Total
Case exposed      0                 12                  12
Case unexposed    0                 0                   0
Total             0                 12                  12

This odds ratio is undefined or infinite, since the denominator is zero. This suggests a very strong association between use of L-tryptophan and risk of developing EMS. The 95% confidence interval for the odds ratio is 2.8 to infinity. It is highly unlikely that this association occurred by chance, as the null value of the OR (null value = 1) is well outside the 95% confidence interval. (In this example, the small number of case–control pairs in the Y category violates the large sample assumptions of the approximate confidence interval formula in Appendix D. Therefore, a more complicated exact formula was used to estimate this 95% confidence interval.)
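The text does not say which exact formula was used. One plausible reading, offered here only as an assumption, is an exact binomial (Clopper-Pearson) interval for the proportion of discordant pairs that fall in cell X, converted to the odds scale. Under that assumption, the lower limit for 12 discordant pairs with none in cell Y can be sketched as follows; the function name is illustrative.

```python
import math


def exact_lower_bound_all_discordant(n_pairs, alpha=0.05):
    """Lower confidence limit for the matched OR when all n discordant pairs are type X.

    With Y = 0, the Clopper-Pearson lower limit for p = P(pair is type X)
    solves p**n = alpha / 2, and the odds ratio limit is p / (1 - p).
    """
    p_lower = (alpha / 2) ** (1 / n_pairs)
    return p_lower / (1 - p_lower)


lower = exact_lower_bound_all_discordant(12)
print(f"95% CI: {lower:.1f} to {math.inf}")
# prints: 95% CI: 2.8 to inf
```

The agreement with the reported lower limit of 2.8 is consistent with this reading but does not confirm that it is the method the authors used.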

SUMMARY
In this chapter, the basic approach to the design and analysis of case–control studies is presented, with illustrations drawn primarily from the literature on the relationship between the use of contaminated L-tryptophan and the risk of developing EMS. A case–control study is a type of observational investigation in which subjects are enrolled on the basis of the presence or absence of a particular disease (eg, EMS) and are then evaluated to determine their history of prior exposure to risk factors of interest (eg, use of L-tryptophan).

The advantages and disadvantages of the case–control approach are summarized in Table 9–9. The advantages of this design are primarily logistical. In particular, rare diseases (EMS is an example) and those with long latency periods can be studied efficiently. The sample size required for a case–control study tends to be smaller than would be needed for an alternative design, such as a cohort study. As a result, the expense of conducting a case–control study may be substantially less than the cost of conducting a cohort study. Furthermore, reliance on historical information allows rapid completion of a case–control study. The ability to reach a prompt conclusion is particularly important if the disease of interest is potentially life-threatening, as is EMS, because future cases might be prevented if preventive action is taken to limit exposure to a suspected risk factor.

Table 9–9. Advantages and Disadvantages of Case-Control Studies.

Advantages
1. Efficient for the study of rare diseases
2. Efficient for the study of chronic diseases
3. Tend to require a smaller sample size than other designs
4. Less expensive than alternative designs
5. May be completed more rapidly than alternative designs

Disadvantages
1. Risk of disease cannot be estimated directly
2. Not efficient for the study of rare exposures
3. More susceptible to selection bias than alternative designs
4. Information on exposure may be less accurate than that available in alternative designs

The disadvantages of case–control studies relate primarily to their susceptibility to systematic errors. Because cases and controls are sampled separately, it is possible that these groups may not arise from the same source population. Bias can be introduced into the study results if exposure status is associated with the likelihood of including cases or controls into the study. Reliance on subject recall of earlier exposures or the use of historical records can lead to imprecise or inaccurate classification of exposure. The decision to conduct a case–control study typically is motivated by a desire to explore the relationship between prior exposure to a specific risk factor and the likelihood of developing a particular disease. Ideally, the cases and the controls should derive from a single well-defined source population, such as a state or metropolitan area (a population-based sampling scheme). An attempt may be made to identify all newly diagnosed cases (incident cases) within the source population, particularly when the disease is rare or the source population is modest in size. Cases may be identified from hospital records, surveillance systems, death certificates, or other sources. Careful criteria for the presence of disease must be established to minimize false inclusions or exclusions. Controls typically are sampled from the population that gave rise to the cases. Occasionally, for purposes of convenience, hospital-based samples of cases and controls are selected. The hospital-based approach tends to have the advantages of accessibility to the subjects and cooperative study participants. On the other hand, cases and controls may derive from dissimilar source populations in a hospital-based study, and prior exposure status might influence the likelihood of inclusion in this type of investigation.

Matching of controls to cases on the basis of known risk factors for the disease of interest is a common practice in case–control studies. The intent of matching is usually to decrease the possibility of confounding, or mixing of the effect of exposure to the risk factor of interest with the effects of exposure to other risk factors. Matching can increase the statistical precision of estimates and thereby allow a smaller sample size. On the other hand, matching can be time consuming, and subjects who are not successfully matched must be discarded from the analysis.

The process of selection of subjects in a case–control study precludes the estimation of risks (or rates), and the risk ratio therefore cannot be calculated directly from case–control data. An indirect estimate of the risk ratio, however, can be calculated in a case–control study. This measure is referred to as the odds ratio and is defined as the odds of exposure among cases divided by the odds of exposure among controls. The approach to calculating the odds ratio depends on whether cases and controls were sampled in an unmatched or matched fashion. In either instance, a point estimate and 95% confidence interval for the odds ratio can be calculated as a measure of association between prior exposure to the risk factor and occurrence of disease.

A number of case–control studies of EMS are discussed. In those studies, cases and controls were sampled using various approaches. The most consistent risk factor that emerged from the studies was the prior use of L-tryptophan produced by one manufacturer. The strength of the association between prior use of L-tryptophan from the implicated manufacturer and development of EMS, the dose–response, and the consistency of results across studies, as well as other considerations such as biological plausibility, suggest the possibility that a cause-and-effect relationship exists between the exposure in question and the occurrence of disease. The decline in the incidence of reported cases of EMS after withdrawal from the market of L-tryptophan–containing products further supports this explanation.

FURTHER READING
Schulz KF, Grimes DA: Case–control studies: research

REFERENCES
Clauw DJ, Pincus T: The eosinophilia-myalgia syndrome: What we know, what we think we know, and what we need to know. J Rheumatol 1996;23(Suppl 46):2.

Clinical Background

CDC: Eosinophilia-myalgia syndrome and L-tryptophan-containing products—New Mexico, Minnesota, Oregon, and New York, 1989. MMWR 1989;38:785.

Kilbourne EM et al: Tryptophan produced by Showa Denko and epidemic eosinophilia-myalgia syndrome. J Rheumatol 1996;23(Suppl 46):81.

Mori Y et al: Scleroderma-like cutaneous syndromes. Curr Rheumatol Reports 2002;4:113. [PMID: 11890876]

Swygert LA et al: Eosinophilia-myalgia syndrome. Results of national surveillance. JAMA 1990;264:1698. [PMID: 2398610]

Introduction

Belongia EA et al: An investigation of the cause of the eosinophilia-myalgia syndrome associated with tryptophan use. N Engl J Med 1990;323:357. [PMID: 2370887]

Design of Case–Control Studies

Cummings P, Koepsell TD, Weiss NS: Studying injuries with case–control methods in the emergency department. Ann Emerg Med 1998;31:99. [PMID: 9437350]

Essebag V et al: The nested case–control study in cardiology. Am Heart J 2003;146:581. [PMID: 14564310]

Hospital-Based Case–Control Studies

Cummings P, Koepsell TD, Weiss NS: Studying injuries with case–control methods in the emergency department. Ann Emerg Med 1998;31:99. [PMID: 9437350]

Hurwitz ES et al: Public Health Service study of Reye's syndrome and medications. JAMA 1987;257:1905. [PMID: 3820509]

Selection Bias

Horwitz RI, Daniels SR: Bias or biology: Evaluating the epidemiologic studies of L-tryptophan and the eosinophilia-myalgia syndrome. J Rheumatol 1996;23(Suppl 46):60.

Matching

Gold EB: Case–control studies and their application to endocrinology. Endocrinol Metab Clin North Am 1997;26(No. 1):1.

Analysis

CDC: Eosinophilia-myalgia syndrome and L-tryptophan-containing products—New Mexico, Minnesota, Oregon, and New York, 1989. MMWR 1989;38:785.

Hertzman PA et al: Association of the eosinophilia-myalgia syndrome with the ingestion of tryptophan. N Engl J Med 1990;322:869. [PMID: 2314421]

Slutsker L et al: Eosinophilia-myalgia syndrome associated with exposure to tryptophan from a single manufacturer. JAMA 1990;264:213. [PMID: 2355442]

E-PIDEMIOLOGY Further Reading

http://www.nemsn.org/

Clinical Background

http://www.nemsn.org/

Design of Case–Control Studies

http://www.bmj.com/epidem/epid.8.shtml

http://www.pitt.edu/~super1/lecture/lec10281/index.htm

http://www.pitt.edu/~super1/lecture/lec8591/index.htm
