A Crash Course in Stats
- M. Sue Kirkman, MD
“There are three kinds of lies: lies, damned lies, and statistics,” wrote Mark Twain in his autobiography.1 The New England Journal of Medicine's May online (ahead of print) publication of the meta-analysis suggesting that rosiglitazone is associated with increased risk of myocardial infarction (MI) reminded me of Twain's line and of how little I (and most physicians) know about statistics and how to interpret the medical literature.2 After publication of the paper by Cleveland Clinic's Steven Nissen, MD, and Kathy Wolski, MPH, I found myself trying to explain meta-analyses, odds ratios, confidence intervals, and the concepts of absolute versus relative risk to reporters (sometimes looking things up in statistics books between calls).
Subsequent publication of the interim analysis of the Rosiglitazone Evaluated for Cardiac Outcomes and Regulation of Glycemia in Diabetes (RECORD) study, two additional meta-analyses that came to conflicting conclusions about rosiglitazone and MI risk, and the seemingly conflicting analyses presented by GlaxoSmithKline and the Food and Drug Administration at the advisory committee hearing in July prolonged the media frenzy and added to the general confusion (DOC News, June 2007, page 1, and August 2007, page 3).3–6 At the risk of overstepping my training (I'm not a statistician; I just play one in DOC News), I offer here a crash course in statistics, using the Nissen and Wolski paper as the case study.
The results of the paper were reported as follows: “In the rosiglitazone group, compared to the placebo group, the odds ratio for MI was 1.43, with 95% confidence interval 1.03–1.98, p=0.03.” Let's leave aside the issue of meta-analyses for now, and imagine that Nissen and Wolski were reporting results of a single large clinical trial of rosiglitazone compared with placebo.
The odds ratio is a measure of the strength of the association between an exposure (in this case randomization to the rosiglitazone group) and an outcome (MI). In this case, it compares the odds of MI (number with MI divided by number without MI) in the rosiglitazone group with the odds of MI in the placebo group. The paper estimates the odds of MI with rosiglitazone are 1.43 times the odds of MI with placebo.
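The arithmetic behind an odds ratio can be sketched in a few lines of Python. This is only an illustration: the function names and the patient counts are made up for the example and are not taken from the Nissen and Wolski paper.

```python
def odds(events: int, total: int) -> float:
    """Odds of an event: number with the event divided by number without."""
    return events / (total - events)


def odds_ratio(events_exposed: int, total_exposed: int,
               events_control: int, total_control: int) -> float:
    """Odds in the exposed group divided by odds in the control group."""
    return odds(events_exposed, total_exposed) / odds(events_control, total_control)


# Hypothetical counts for illustration only (not data from the paper):
# 30 MIs among 1,000 treated patients, 20 MIs among 1,000 on placebo.
example = odds_ratio(30, 1000, 20, 1000)
print(f"odds ratio = {example:.2f}")
```

With these invented counts the odds of MI are about one and a half times as high in the treated group; identical counts in both groups would give exactly 1.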
If the odds ratio were 1, that would mean there was no difference in the odds of MI between the two groups. That brings us to confidence intervals, those things that seem to clutter up abstracts with lots of numbers. Confidence intervals are somewhat analogous to the more familiar P values, in that they are a measure of how sure we are that our results are not just due to chance.
In clinical research, we fundamentally make estimates of what is “true” but difficult or impossible to measure. Imagine you have a paper bag filled with 1,000 red and white marbles, but you don't know how many are red and how many are white (in actuality 600 are red). Rather than pouring out all the marbles and counting them, you decide to reach into the bag and pull out 10 marbles, count how many are red, then throw the 10 marbles back in and shake the bag. If you do this once, you might by chance get the “true” proportion (6 red and 4 white), or you might get any of a number of other combinations ranging from all 10 red to all 10 white (with those extremes less likely but still possible). If you do this repeatedly and average the results, your estimate should edge ever closer to the “true” value, but it may never come out to be exactly 600 red. Instead, you might come up with a number like 588, with a 95% confidence interval of 570–606. The more times you do the experiment, the closer your estimate is likely to be to 600 and the narrower your confidence interval.
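For readers who would rather simulate the marbles than imagine them, here is a rough sketch of the experiment. The function name, the fixed random seed, and the use of a normal approximation (the 1.96 multiplier) for the 95% interval are my assumptions for the illustration, not part of the analogy itself.

```python
import random
import statistics


def estimate_red(n_draws, seed=0, n_red=600, n_white=400, handful=10):
    """Repeat the handful-of-marbles experiment and estimate the red count.

    Returns (estimate, low, high): the estimated number of red marbles in
    the bag and an approximate 95% confidence interval around it.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    bag = ["red"] * n_red + ["white"] * n_white
    # Fraction of red marbles in each handful; every handful is drawn from
    # the full bag, which mimics throwing the marbles back in each time.
    fractions = [
        rng.sample(bag, handful).count("red") / handful for _ in range(n_draws)
    ]
    mean = statistics.mean(fractions)
    # Standard error of the mean times the normal-approximation multiplier.
    margin = 1.96 * statistics.stdev(fractions) / n_draws ** 0.5
    bag_size = n_red + n_white
    return bag_size * mean, bag_size * (mean - margin), bag_size * (mean + margin)


for n in (20, 200, 2000):
    est, low, high = estimate_red(n)
    print(f"{n:>5} handfuls: about {est:.0f} red (95% CI {low:.0f} to {high:.0f})")
```

Running it with more handfuls should show the interval tightening around the estimate, which is the point of the analogy.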
In the case of an odds ratio, for the result to be statistically significant, the 95% confidence interval should not include 1, the no-difference point (i.e., the odds ratios within the confidence interval should all be >1 or all be <1). In the Nissen and Wolski paper, we can be 95% confident that the “true” odds ratio lies between 1.03 and 1.98, a range that includes only increased odds of MI with rosiglitazone. In the same paper, the authors estimate the odds ratio for death associated with rosiglitazone to be 1.64, a bigger odds ratio than for MI. However, the confidence interval for the death odds ratio is 0.98–2.74; this is not a statistically significant result because the confidence interval includes 1 (no difference) and even includes a small possibility of lower odds with rosiglitazone.
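The link between a confidence interval and a P value can be made concrete with a small sketch. It assumes the reported interval was built as the estimate plus or minus 1.96 standard errors on the log-odds scale, which is a common convention but not something stated here about the paper, so the recovered P values are approximations. The function names are mine.

```python
import math


def excludes_one(ci_low: float, ci_high: float) -> bool:
    """True when the whole interval lies on one side of 1 (no difference)."""
    return ci_low > 1 or ci_high < 1


def p_value_from_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided P value recovered from a reported 95% CI.

    Assumption: the interval is symmetric on the log-odds scale, i.e.
    log(estimate) +/- 1.96 standard errors.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area


# Odds ratios and intervals quoted in the text.
for label, or_, low, high in [("MI", 1.43, 1.03, 1.98), ("death", 1.64, 0.98, 2.74)]:
    print(label, excludes_one(low, high), round(p_value_from_ci(or_, low, high), 3))
```

Under that assumption, the MI interval excludes 1 and gives a P value close to the reported 0.03, while the death interval includes 1 and gives a P value just above the conventional 0.05 cutoff.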
What about this whole meta-analysis issue? A meta-analysis is one way to try to find relatively rare outcomes that otherwise might require enormous and prohibitively expensive trials to detect. A number of physicians (and lay people) noticed that if you lump all the studies together in the Nissen and Wolski paper, the percentage of patients with MI looks greater with placebo than with rosiglitazone. However, in a meta-analysis you can't just lump all the patients in all the studies together in one pot and then calculate the results as if there had been a single study. Studies might include very different types of patients, and it would not be appropriate to compare people with diabetes to those without, for example. Rather, you compare each study's individual result (in this case, each study's odds ratio) with those of the other studies, weighting the odds ratios to account for things like study size (an odds ratio of 10 in a study of 20 patients shouldn't “count” as much as one of 1.05 in a study of 10,000 patients). In the Nissen and Wolski paper, the authors lumped only the very small industry trials of people with diabetes, as the patients and protocols were similar, then compared them with several large and very different studies. Similarly, some have criticized the authors for excluding studies with no MIs from the meta-analysis, but this is actually appropriate: If a study has no MIs, there are no odds of MI to compare, and you cannot calculate an odds ratio.
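To show what weighting by study size does, here is a minimal sketch of one common scheme, inverse-variance weighting of log odds ratios. I am not claiming this is the method used in the paper, and the standard errors below are invented to stand in for a tiny study and a huge one.

```python
import math


def pooled_odds_ratio(studies):
    """Inverse-variance weighted average of log odds ratios.

    `studies` is a list of (odds_ratio, standard_error_of_log_odds_ratio).
    Larger studies have smaller standard errors and therefore larger weights.
    """
    weights = [1 / se ** 2 for _, se in studies]
    log_ors = [math.log(or_) for or_, _ in studies]
    pooled_log = sum(w * x for w, x in zip(weights, log_ors)) / sum(weights)
    return math.exp(pooled_log)


# Hypothetical inputs: a tiny study with an odds ratio of 10 and a huge
# study with an odds ratio of 1.05. The standard errors are made up.
studies = [(10.0, 1.2), (1.05, 0.05)]
pooled = pooled_odds_ratio(studies)
print(f"pooled odds ratio = {pooled:.3f}")
```

The pooled value lands very close to 1.05, because the imprecise small study contributes almost nothing to the weighted average.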
For those still bothered by the perception that the rate of MI looks lower with rosiglitazone if you lump all the raw data together, several things may reassure you. If you look carefully at the results, you will see that the odds ratio for MI was greater than 1 (rosiglitazone was associated with higher odds for MI) within each of the three studies. And although it seems counterintuitive that the raw lumped numbers might suggest something opposite to the meta-analysis finding, this is actually a phenomenon known as Simpson's paradox, in which results obtained when comparing several groups seem to be reversed when the groups are inappropriately combined.
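Simpson's paradox is easier to believe once you see it in numbers. The two studies below are entirely hypothetical and were chosen so that the treated and control groups are of very unequal size in studies with very different baseline risk; they are not the trials in the paper.

```python
def odds_ratio(ev_treated, n_treated, ev_control, n_control):
    """Odds of an event with treatment divided by odds with control."""
    return (ev_treated / (n_treated - ev_treated)) / (ev_control / (n_control - ev_control))


# Hypothetical studies: (treated events, treated n, control events, control n)
studies = {
    "low-risk study": (10, 1000, 1, 200),
    "high-risk study": (12, 200, 40, 1000),
}

# Within each study, treatment is associated with higher odds of an event.
per_study = {name: odds_ratio(*counts) for name, counts in studies.items()}
for name, value in per_study.items():
    print(f"{name}: odds ratio = {value:.2f}")

# Lumping the raw counts together reverses the apparent direction.
treated_events = sum(c[0] for c in studies.values())
treated_n = sum(c[1] for c in studies.values())
control_events = sum(c[2] for c in studies.values())
control_n = sum(c[3] for c in studies.values())
treated_rate = treated_events / treated_n
control_rate = control_events / control_n
print(f"lumped event rate: treated {treated_rate:.2%}, control {control_rate:.2%}")
```

Both studies show an odds ratio above 1, yet the lumped event rate is lower in the treated group, simply because most treated patients came from the low-risk study and most control patients from the high-risk one.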
A final point of confusion, certainly for the media and for our patients, is the concept of absolute versus relative differences in risk. After the Nissen and Wolski paper came out, the media trumpeted the relative risk increase (43%, from the odds ratio of 1.43). However, the actual difference in event rates was very small: 0.36% of placebo patients had an MI, and 0.43% of rosiglitazone patients had one in the small trials grouping—an absolute increase of <0.1%. This is not to downplay the absolute risk increase (which occurred in relatively healthy study subjects in only 6 months of treatment), but to say that in terms of public alarm there is a big difference between the concept of a 43% increase and that of a 0.1% increase in heart attacks.
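The gap between absolute and relative differences can be worked through with the two raw event rates quoted above. The function name and the per-10,000 framing are mine, added for illustration.

```python
def risk_summary(control_rate: float, treated_rate: float, per: int = 10_000) -> dict:
    """Absolute and relative differences between two event rates."""
    absolute = treated_rate - control_rate
    return {
        "absolute_difference": absolute,              # as a proportion
        "extra_events_per_group": absolute * per,     # extra events per `per` patients
        "relative_increase": treated_rate / control_rate - 1,
    }


# Raw event rates quoted in the text for the small-trials grouping.
summary = risk_summary(control_rate=0.0036, treated_rate=0.0043)
for key, value in summary.items():
    print(f"{key}: {value:.4f}")
```

In these terms the difference is about 7 extra MIs per 10,000 patients. Note that the relative increase computed from these raw rates is not the 43% headline figure, which came from the odds ratio in the meta-analysis rather than from lumped percentages, so the two relative numbers need not agree.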
Rather than lumping statistics in with “lies and damned lies” (Twain actually attributed the quote to Benjamin Disraeli), we should see them as tools to describe how close our estimates are to “truth,” and we need to try to understand them. Doing so is not easy, but we owe it to our patients, who look to us for explanations and reassurance. ▪
Footnotes
- American Diabetes Association, Inc.















