Healthcare, etc.: Reviewing medical literature part 3 continued: threats to validity

Thursday, January 13, 2011

As promised, today we talk about confounding and interaction.

A confounder is a factor related to both the exposure and the outcome. Take for example the relationship between alcohol and head and neck cancer. While we know that heavy alcohol consumption is associated with a heightened risk of head and neck cancer, we also know that people who consume a lot of alcohol are more likely to be smokers, and smoking in turn raises the risk of H&N CA. So, in this case smoking is a confounder of the relationship between alcohol consumption and the development of H&N CA. It is virtually impossible to get rid of all confounding in any study design, save possibly for a well-designed RCT, where randomization presumably assures equal distribution of all characteristics; and even there you need an element of luck. In observational studies our only hope of dealing with confounding is through the statistical manipulation we call "adjustment", as it is virtually impossible to chase it away any other way. And in the end we still sigh and admit to the possibility of residual confounding. Nevertheless, going through the exercise is still necessary in order to get closer to the true association of the main exposure and the outcome of interest.

There are multiple ways of dealing with the confounding conundrum. The techniques used are matching, stratification, regression modeling, propensity scoring and instrumental variables. By far the most commonly used method is regression modeling. This is a rather complex computation that requires much forethought (in other words, "Professional driver on a closed circuit; don't try this at home"). The frustrating part is that the mere fact that the investigators ran a regression does not mean that they ran it right. Yet word limits for journal articles often preclude authors from giving enough detail on what they did. At the very least they should tell you what kind of regression they ran and how they chose the terms that went into it. Regression modeling relies on all kinds of assumptions about the data, and it is my personal belief, though I have no solid evidence to prove it, that these assumptions are not always met.

And here are the specific commonly encountered types of regressions and when each should be used:
1. Linear regression. This is a computation used for outcomes that are continuous variables (i.e., variables represented by a continuum of numbers, like age, for example). This technique's main assumption is that the exposure and outcome are related to each other in a linear fashion. The resulting beta coefficient is the slope of this relationship if it is graphed.
2. Logistic regression. This is done when the outcome variable is categorical (i.e., one of two or more categories, like gender, for example, or death); in its most common form the outcome has just two categories. The result of a logistic regression is an adjusted odds ratio (OR). It is interpreted as an increase or a decrease in the odds of the outcome occurring in the presence of the main exposure. Thus, an OR of 0.66 means that there is a 34% reduction in the odds (often used interchangeably with risk, though this is not quite accurate, and the two drift apart as the outcome becomes more common) of the outcome in the presence of the exposure. Conversely, an OR of 1.34 means the opposite, or a 34% increase in the odds of the outcome if the exposure is present.
3. Cox proportional hazards. This is a common type of model developed for time-to-event data, also known as "survival analysis" (even if not done for survival per se as the outcome). The resulting value is a hazard ratio (HR). For example, if we are talking about a healthcare-associated infection's impact on the risk of remaining in the hospital longer, an HR of, say, 1.8 means that an HAI increases the risk of being in the hospital by 80% at any time during the hospitalization. To me this tends to be the most problematic technique in terms of assumptions, as it requires that the ratio of the hazards in the groups being compared stays constant throughout the time frame of the analysis (hence "proportional hazards"), and how often does this hold true? For this reason the investigators should be explicit about whether or not they tested for the assumption of proportional hazards and whether this was met.
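
To make the odds-versus-risk caveat in the logistic regression item concrete, here is a minimal sketch in Python. The function names and the baseline risks are my own assumptions for illustration; none of this comes from a real study. It applies the OR of 0.66 from the example above to a few baseline risks and shows how much the risk itself drops.

```python
def odds(p):
    """Convert a probability (risk) into odds."""
    return p / (1.0 - p)


def risk_from_odds(o):
    """Convert odds back into a probability (risk)."""
    return o / (1.0 + o)


def apply_odds_ratio(baseline_risk, odds_ratio):
    """Risk in the exposed group implied by a baseline risk and an OR."""
    return risk_from_odds(odds(baseline_risk) * odds_ratio)


def relative_risk_reduction(baseline_risk, odds_ratio):
    """Fractional drop in risk (not odds) implied by the OR."""
    return 1.0 - apply_odds_ratio(baseline_risk, odds_ratio) / baseline_risk


# Hypothetical baseline risks: a rare, a moderately common and a very common outcome.
for baseline in (0.01, 0.10, 0.50):
    reduction = relative_risk_reduction(baseline, 0.66)
    print(f"baseline risk {baseline:.2f}: risk reduction {reduction:.1%}")
```

With a rare outcome the reduction in risk comes out close to the 34% reduction in odds; with a baseline risk of 50% it is closer to 20%. That is why treating odds and risk as interchangeable is only safe when the outcome is uncommon.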

Let's now touch upon the other techniques that help us to unravel confounding. Matching is just that: it is a process of matching subjects with the primary exposure to those without in a cohort study, or subjects with the outcome to those without in a case-control study, based on certain characteristics, such as age, gender, comorbidities, disease severity, etc.; you get the picture. By its nature, matching reduces the amount of analyzable data, and thus reduces the power of the study. So, it is most efficiently applied in a case-control setting, where it actually improves the efficiency of enrollment.

Stratification is the next technique. The word "stratum" means "layer", and stratification refers to describing what happens to the layers of the population of interest with and without the confounding characteristic. In the above example of smoking confounding the alcohol and H&N CA relationship, stratifying the analyses by smoking (comparing the H&N CA rates among drinkers and non-drinkers in the smoking group separately from the non-smoking group) can divorce the impact of the main exposure from that of the confounder on the outcome. This method has some distinct intuitive appeal, though its cognitive effectiveness and efficiency dwindle the more strata we need to examine.
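
Here is a small sketch of what stratifying by smoking looks like in numbers. All counts are invented for illustration, and the Mantel-Haenszel pooled odds ratio is my choice of a standard way to summarize across strata; the text above does not prescribe it. Each stratum is a 2x2 table written as (a, b, c, d): drinkers with cancer, drinkers without, non-drinkers with cancer, non-drinkers without.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across a list of 2x2 tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts: smokers drink more often and have more cancer.
smokers = (80, 360, 10, 90)
non_smokers = (4, 98, 10, 490)

# Crude table: ignore smoking and simply add the two strata together.
crude = tuple(s + n for s, n in zip(smokers, non_smokers))

print("crude OR:       ", round(odds_ratio(*crude), 2))
print("smokers OR:     ", round(odds_ratio(*smokers), 2))
print("non-smokers OR: ", round(odds_ratio(*non_smokers), 2))
print("pooled (M-H) OR:", round(mantel_haenszel_or([smokers, non_smokers]), 2))
```

In these made-up data the crude odds ratio is above 5, while within each smoking stratum it is 2. The gap between the two is the part of the apparent alcohol effect that really belongs to smoking.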

Propensity scoring is gaining popularity as an adjustment method in the medical literature. A propensity score is essentially a number, usually derived from a regression analysis, giving the propensity of each subject for a particular exposure. So, in terms of smoking, we can create a propensity score based on other common characteristics that predict smoking. Interestingly, some of these characteristics will be present also in people who are not smokers, yielding a similar propensity score in the absence of this exposure. Matching smokers to non-smokers based on the propensity score and examining their respective outcomes allows us to understand the independent impact of smoking on, say, the development of coronary artery disease. As in regression modeling, the devil is in the details. Some studies have indicated that most papers that employ propensity scoring as the adjustment method do not do this correctly. So, again, questions need to be asked and details of the technique elicited. There is just no shortcut to statistics.
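
The matching step can be sketched as follows. I am assuming the propensity scores have already been estimated (for instance from a regression); the subject IDs, the scores and the caliper of 0.05 are all hypothetical, and greedy 1:1 nearest-neighbor matching is only one of several matching schemes in use.

```python
def greedy_match(exposed, unexposed, caliper=0.05):
    """Pair each exposed subject with the closest unused unexposed subject.

    exposed and unexposed map a subject ID to a propensity score.
    A pair is kept only if the two scores differ by no more than the caliper.
    """
    available = dict(unexposed)
    pairs = []
    # Work through exposed subjects in order of increasing score.
    for e_id, e_score in sorted(exposed.items(), key=lambda item: item[1]):
        if not available:
            break
        u_id = min(available, key=lambda k: abs(available[k] - e_score))
        if abs(available[u_id] - e_score) <= caliper:
            pairs.append((e_id, u_id))
            del available[u_id]  # each unexposed subject is used at most once
    return pairs


# Hypothetical propensity scores for being a smoker.
smokers = {"S1": 0.81, "S2": 0.62, "S3": 0.35, "S4": 0.95}
non_smokers = {"N1": 0.33, "N2": 0.60, "N3": 0.79, "N4": 0.10, "N5": 0.58}

matched = greedy_match(smokers, non_smokers)
print(matched)
```

Note that S4 finds no partner within the caliper and drops out of the analysis. How many subjects are lost this way, and whether the matched groups actually end up balanced, are exactly the kinds of details a paper should report.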

Finally, a couple of words about instrumental variables. This method comes to us from econometrics. An instrumental variable is one that is related to the exposure but affects the outcome only through that exposure. One of the most famous uses of this method was published by a fellow you may have heard of, Mark McClellan, where he looked at the proximity to a cardiac intervention center as the instrumental variable in the outcomes of acute coronary events. Essentially, he argued, the randomness of whether or not you are close to a center randomizes you to the type of treatment you get. Incidentally, in this study he showed that invasive interventions were responsible for a very small fraction of the long-term outcomes of heart attacks. I have not seen this method used that much in the literature I read or review, but am intrigued by its potential.
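
For a feel of the arithmetic, here is a sketch of the simplest instrumental variable estimator for a yes/no instrument, usually called the Wald estimator. The numbers are invented and are not McClellan's; the function and parameter names are mine. The idea is to divide the difference in outcomes between those near and far from a center by the difference in how often each group gets the intervention.

```python
def wald_estimate(outcome_near, outcome_far, treated_near, treated_far):
    """Wald instrumental variable estimate for a binary instrument.

    outcome_*: mean outcome (e.g., proportion surviving) in each group.
    treated_*: proportion receiving the intervention in each group.
    """
    effect_on_outcome = outcome_near - outcome_far
    effect_on_treatment = treated_near - treated_far
    if effect_on_treatment == 0:
        raise ValueError("the instrument does not change treatment rates")
    return effect_on_outcome / effect_on_treatment


# Hypothetical inputs: living near a center raises the chance of the
# intervention from 20% to 30%, and survival from 78.0% to 78.5%.
estimate = wald_estimate(
    outcome_near=0.785,
    outcome_far=0.780,
    treated_near=0.30,
    treated_far=0.20,
)
print(f"estimated effect of the intervention on survival: {estimate:.3f}")
```

The estimate only means something if proximity truly has no route to the outcome other than through the treatment, which is an assumption that cannot be fully tested.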

And now, to finish out this post, let's talk about interaction. "Interaction" is a term mostly used by statisticians to describe what epidemiologists call "effect modification" or "effect heterogeneity". It is just what the name implies: there may be certain secondary exposures that either potentiate or diminish the impact of the main exposure of interest on the outcome. Take the triad of smoking, asbestos and lung cancer. We know that the risk of lung cancer among smokers who are also exposed to asbestos is far higher than among those who have not been exposed to asbestos. Thus, asbestos modifies the effect of smoking on lung cancer. So, to analyze those smokers exposed to asbestos together with those who were not will result in an inaccurate measure of the association of smoking with lung cancer. More importantly, it will fail to recognize this very important potentiator of tobacco's carcinogenic activity. To deal with this, we need to be aware of the potentially interacting exposures, and either stratify our analyses based on the effect modifier or work the interaction term (usually constructed as a product of the two exposures, in our case smoking and asbestos) into the regression modeling. In my experience as a peer reviewer, interactions are rarely explored adequately. In fact, I am not even sure that some investigators understand the importance of recognizing this phenomenon. Yet our pathetic lack of understanding of heterogeneous treatment effect (HTE), and of its impact on our current bleak therapeutic landscape, is the result of this very lack of awareness. The future of medicine truly hinges on understanding interaction. Literally. Seriously. OK, at least in part.
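
A sketch of both approaches, with a made-up cohort: stratifying the effect of smoking by asbestos exposure, and building the product term that would go into a regression. The counts, names and the layout of the design row are all assumptions for illustration.

```python
# Hypothetical cohort: (smoker, asbestos) -> (lung cancer cases, persons).
cohort = {
    (0, 0): (10, 10000),
    (1, 0): (100, 10000),
    (0, 1): (25, 5000),
    (1, 1): (450, 5000),
}


def risk(smoker, asbestos):
    """Proportion developing lung cancer in one exposure group."""
    cases, persons = cohort[(smoker, asbestos)]
    return cases / persons


def rr_smoking_within(asbestos):
    """Relative risk of smoking inside one asbestos stratum."""
    return risk(1, asbestos) / risk(0, asbestos)


def rr_smoking_pooled():
    """Relative risk of smoking when asbestos exposure is ignored."""
    def pooled_risk(smoker):
        cases = sum(cohort[(smoker, a)][0] for a in (0, 1))
        persons = sum(cohort[(smoker, a)][1] for a in (0, 1))
        return cases / persons
    return pooled_risk(1) / pooled_risk(0)


def design_row(smoker, asbestos):
    """Regression inputs: intercept, both exposures and their product."""
    return [1, smoker, asbestos, smoker * asbestos]


print("RR of smoking, no asbestos:", round(rr_smoking_within(0), 1))
print("RR of smoking, asbestos:   ", round(rr_smoking_within(1), 1))
print("RR of smoking, pooled:     ", round(rr_smoking_pooled(), 1))
print("design row, both exposures:", design_row(1, 1))
```

In these invented data the pooled relative risk lands between the two stratum-specific values and describes neither group well. The product term is 1 only for subjects with both exposures, which is what lets a regression estimate the extra effect of the combination. Keep in mind that whether an interaction shows up can depend on the scale (additive or multiplicative) on which it is measured.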

In the next installment(s) of the series we will start tackling study analyses. Thanks for sticking with me.
