Friday, July 20, 2012

Early radical prostatectomy trial: Does it mean what you think it means?

Another study this week added to the controversy about early prostate cancer treatment. The press, as usual, stopped at citing the conclusion: Early prostatectomy does not reduce all-cause mortality. But the really interesting stuff is buried in the paper. Let's deconstruct.

This was a randomized controlled trial of early radical prostatectomy versus observation. The study was done mostly within the Veterans Affairs system and took 8 years to enroll a little over 700 men. This alone should give us pause. Figure 1 of the paper gives the breakdown of the enrollment process: 5,023 men were eligible for the study, yet 4,292 declined participation, leaving 731 (15% of those who were eligible) to participate. This is a problem, since there is no way of knowing whether these 731 men are actually representative of the 5,023 who were eligible. Perhaps there was something unusual about them that made them and their physicians agree to enroll in this trial. Perhaps they were generally sicker than those who declined and were apprehensive about the prospect of observation. Or perhaps it was the opposite, and they felt confident in either treatment. We can make up all kinds of stories about those who did and those who did not agree to participate, but the reality is that we just don't know. This creates a problem with the generalizability of the data, raising the question of exactly which patients these data apply to.

The next issue was what might be called "protocol violation," though I don't believe the investigators actually called it that. Here is what I mean: of the 364 men randomized to the prostatectomy group, only 281 actually underwent a prostatectomy, leaving nearly one-quarter of the group without the main exposure of interest. Similarly, of the 367 men randomized to observation, 36 (10%) underwent a radical prostatectomy. We might call this inadvertent cross-over, which does tend to happen in RCTs, but it needs to be minimized in order to get at the real answer. What this type of cross-over does, pretty intuitively, is blend the two groups' exposures, resulting in a smaller difference in the outcome, if there is in fact a difference. So, when you don't get a difference, as happened in this trial, you don't know if it is because of these protocol violations or because the treatments are essentially equivalent.
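To make the dilution concrete, here is a back-of-the-envelope sketch in Python. The 50% baseline mortality and the 25% relative benefit among men actually operated on are assumptions for illustration only, not trial results; only the adherence numbers (281/364 and 36/367) come from the paper:

```python
# Sketch: how treatment cross-over dilutes an intention-to-treat (ITT) effect.
# Assumed numbers: 50% mortality without surgery, and a hypothetical 25%
# relative reduction (to 37.5%) among men who actually undergo prostatectomy.

def itt_mortality(n_randomized, n_actually_treated, risk_untreated, risk_treated):
    """Average mortality in an arm, weighted by who actually got surgery."""
    p_treated = n_actually_treated / n_randomized
    return p_treated * risk_treated + (1 - p_treated) * risk_untreated

risk_untreated = 0.50    # assumed mortality without prostatectomy
risk_treated = 0.375     # assumed mortality with prostatectomy (25% lower)

# With perfect adherence, the arms would differ by the full 12.5 points.
ideal_diff = risk_untreated - risk_treated

# With the trial's actual adherence: 281 of 364 operated in the surgery arm,
# 36 of 367 operated in the observation arm.
surgery_arm = itt_mortality(364, 281, risk_untreated, risk_treated)
observation_arm = itt_mortality(367, 36, risk_untreated, risk_treated)
observed_diff = observation_arm - surgery_arm

print(f"Difference with perfect adherence: {ideal_diff:.3f}")
print(f"Difference as randomized (ITT):    {observed_diff:.3f}")
```

Even with a real 12.5-point benefit among the treated, the as-randomized comparison would show only about an 8-point difference: the cross-overs eat roughly a third of the effect before any statistics are done.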

And indeed, the study results indicated that there is really no difference between the two approaches in terms of the primary endpoint: all-cause mortality over a substantially long follow-up period was 47% in the prostatectomy group and 50% in the observation group (hazard ratio 0.88, 95% confidence interval 0.71 to 1.08, p=0.22). This means that the 12% relative difference in this outcome between the groups was more likely due to chance than to any benefit of the surgery. "But how can cancer surgery impact all-cause mortality?" you say. "It only claims to alter what happens to the cancer, no?" Well, yes, that is true. However, can you really call a treatment successful if all it does is give you the opportunity to die of something else within the same period of time? I thought not. And anyway, looking at prostate cancer mortality, there really was no difference there either: 5.8% attributable mortality in the surgery group compared to 8.4% in the observation group (hazard ratio 0.63, 95% confidence interval 0.36 to 1.09, p=0.09).
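For readers who like to see the arithmetic, here is a quick sketch converting the reported cumulative mortality figures into absolute and relative terms. Note that the crude risk ratio computed this way (0.94 for all-cause) is not the same thing as the reported hazard ratio (0.88), because a hazard ratio compares event rates over the whole follow-up rather than end-of-study proportions:

```python
# Sketch: absolute vs. relative differences implied by the reported
# cumulative mortality figures. The risks are quoted from the trial;
# the conversions below are just arithmetic, not anything from the paper.

risks = {
    "all-cause":       {"surgery": 0.47,  "observation": 0.50},
    "prostate-cancer": {"surgery": 0.058, "observation": 0.084},
}

summary = {}
for endpoint, r in risks.items():
    arr = r["observation"] - r["surgery"]   # absolute risk reduction
    rr = r["surgery"] / r["observation"]    # crude risk ratio
    summary[endpoint] = (arr, rr)
    print(f"{endpoint}: ARR = {arr:.1%}, crude risk ratio = {rr:.2f}")
```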

The editorial accompanying this study raised some very interesting points (thanks to Dr. Bradley Flansbaum for pointing me to it). He and I both puzzled over this one particularly unclear statement:
...only 15% of the deaths were attributed to prostate cancer or its treatment. Although overall mortality is an appealing end point, in this context, the majority of end points would be noninformative for the comparison of interest. The expectation of a 25% relative reduction in mortality when 85% of the events are noninformative implies an enormous treatment effect with respect to the informative end points.
Huh? What does "noninformative" mean in this context? After thinking about it quite a bit, I came to the conclusion that the editorialists are saying that, since prostate cancer caused such a small proportion of all deaths, one cannot expect this treatment to impact all-cause mortality (certainly not the 25% relative reduction that the investigators targeted), the majority of the causes being non-prostate cancer related. Yeah, well, but then see my statement above about the problematic aspects of disease-specific mortality as an outcome measure.
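If my reading is right, the editorialists' arithmetic can be made concrete in a few lines. Treating the 15% share of prostate-cancer deaths as fixed, here is what the trial's 25% all-cause target implies:

```python
# Back-of-the-envelope version of the editorial's "noninformative endpoints"
# point. If surgery can only affect prostate-cancer deaths, the achievable
# all-cause reduction is capped by the share of deaths the cancer causes.

share_prca_deaths = 0.15         # ~15% of deaths attributed to prostate cancer
target_overall_reduction = 0.25  # the relative reduction the trial targeted

# Relative reduction in prostate-cancer deaths alone needed to hit the target:
required_prca_reduction = target_overall_reduction / share_prca_deaths
print(f"Required reduction in PrCa deaths: {required_prca_reduction:.0%}")
# i.e., more than 100%: even eliminating every prostate-cancer death would
# only cut all-cause mortality by 15%.
```

That impossible-in-principle 167% is presumably what the editorial means by "an enormous treatment effect with respect to the informative end points."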

The editorial authors did have a valid point, though, when it came to evaluating the precision of the effects. Directionally, there certainly seemed to be a reduction in both all-cause and prostate cancer mortality in the group randomized to surgery. On the other hand, the confidence intervals both crossed unity (I have an in-depth discussion of this in the book). On the third hand (erp!), the portion of the 95% CI below 1.0 was far greater than the portion above 1.0. This may imply that a study with greater precision (that is, narrower confidence intervals) might have found a statistical difference between the groups. But to get higher precision we would have needed either 1) a larger sample size (which the investigators were unable to obtain even over an 8-year enrollment period), or 2) fewer treatment cross-overs (which is clearly a difficult proposition, even in the context of an RCT), or 3) both. On the other hand (the fourth?), the 3% absolute reduction in all-cause mortality amounts to a number needed to treat of roughly 33, which may be clinically acceptable.
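Both calculations in that paragraph are easy to reproduce. A sketch, using the convention that hazard ratios are symmetric on the log scale:

```python
import math

# Two quantities discussed above: the number needed to treat (NNT) implied by
# the 3-point absolute difference in all-cause mortality, and how much of the
# 95% CI for the all-cause hazard ratio lies below vs. above 1.0.

arr = 0.50 - 0.47            # absolute risk reduction, all-cause mortality
nnt = 1 / arr                # number needed to treat
print(f"NNT = {nnt:.0f}")

lo, hi = 0.71, 1.08          # reported 95% CI for the hazard ratio
below = -math.log(lo)        # log-scale distance from HR = 1 down to 0.71
above = math.log(hi)         # log-scale distance from HR = 1 up to 1.08
print(f"CI width below 1: {below:.2f}; above 1: {above:.2f}")
```

On the log scale, about four-fifths of the interval sits below 1.0, which is the asymmetry the paragraph above is pointing at.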

So what does this study tell us? Not a whole lot, unfortunately. It throws an additional pinch of confusion into the cauldron already boiling over with contradiction and uncertainty. Will we ever get the definitive answer to the question raised in this work? I doubt it, given the obvious difficulties implementing this RCT.  
If you like Healthcare, etc., please consider a donation (button in the right margin) to support development of this content. But just to be clear, it is not tax-deductible, as we do not have a non-profit status. Thank you for your support!


  1. One thing that the trial points out, as did the much-criticized PLCO and ERSPC trials of PrCa screening, is that it's really difficult to recruit patients for, and maintain randomized groups in, a test of an intervention that is already in widespread practice. No doubt most of the men who declined participation in this study said: "So you want me to be in your study when there's a 50/50 chance I'll be given NO treatment for a potentially lethal cancer? The heck with that!" And unfortunately, by making screening and treatment so widespread before there was evidence, we physicians reinforced the notion (now very much refuted) that most if not all prostate cancers would eventually be lethal.

  2. Excellent post.
    Re: Low participation rate.
    Our group is currently researching reasons why people decline participation in randomised trials of surgery versus non-operative treatment. While it depends on the specific trial (the disease, the intervention, the control, etc.), it appears that one of the main reasons is that people do not want their treatment to be decided by chance. This is despite their understanding that the two treatments are similar, and despite being happy to have either treatment.
    I think that they don't realise that the treatments they normally receive depend on the (random) nature of which particular doctor they see.
    Re: Crossover.
    Even in real life, not all people offered a prostatectomy will go ahead, and patients who have been advised against an operation may shop around to get one. It might be informative to know why they crossed over. Do you think an "as treated" analysis would have helped or hindered?

  3. Kenny, yes, you are absolutely right about that -- practice ahead of evidence is a recipe for never getting the answer.

    Dr. Skeptic, very interesting observation, thank you. In some ways the doctor or the medical center the patients use acts as an instrumental variable and essentially randomizes them to one treatment or another. This is an idea that I had the good fortune to discuss once with Jack Wennberg, and would be fascinating to test.
    As for an as-treated analysis, that would be prone to insurmountable biases, I am afraid, and would be even more confusing than enlightening.

    Thanks to both of you for your comments.

  4. There's another important take home message here, which is rarely expressed this way, but may matter more to patients than the minor differences that matter to researchers. If prostatectomy offered a large, slam dunk mortality benefit for men with early stage prostate cancer, we would see the effect fairly easily. Sadly, it is not. If it does offer a mortality benefit (which is looking less and less likely) that benefit is a small one at best, and it must be weighed by the individual patient against the very significant side effects of the treatment. The very fact that we have to spend so much time and money and effort to find out what the benefit might be is a signal that it's not very big.

    There are people who will argue that even if the mortality benefit to the individual is small, the population based benefit could be large because so many men are diagnosed with prostate cancer. Leaving aside for a moment the fact that prevalence is being driven in large measure by a lousy screening test, we should question this "population multiplier" idea. Yes, there may be many lives "saved" by a procedure that has a very small individual benefit, but populations don't die, and populations don't suffer treatment side effects. Individuals do. What matters to the patient is what is most likely to happen to him.

  5. Shannon, you are absolutely right on both points! I have often thought that much of our medical treatment research is about detecting minute differences at the margins. And though these differences get magnified at the population level, it is difficult to know the real impact, since treatments behave differently in the wilds of clinical practice than in RCTs. In my post yesterday I referred to the Mandelblatt paper on mammography, where it looks like we have to screen 1,000 40-yo women for 30 years to avoid 8 deaths from breast CA. At the same time there are false alarms and lots of biopsies, and we don't even know the extent of complications from these biopsies.