
Friday, July 20, 2012

Early radical prostatectomy trial: Does it mean what you think it means?

Another study this week added to the controversy about early prostate cancer treatment. The press, as usual, went no further than the conclusion: early prostatectomy does not reduce all-cause mortality. But the really interesting stuff is buried in the paper. Let's deconstruct.

This was a randomized controlled trial of early radical prostatectomy versus observation. The study was done mostly within the Veterans Affairs system and took 8 years to enroll a little over 700 men. This alone should give us pause. Figure 1 of the paper gives the breakdown of the enrollment process: 5,023 men were eligible for the study, yet 4,292 declined participation, leaving 731 (15% of those eligible) to participate. This is a problem, since there is no way of knowing whether these 731 men are actually representative of the 5,023 who were eligible. Perhaps there was something unusual about them that made them and their physicians agree to enroll in this trial. Perhaps they were generally sicker than those who declined and were apprehensive about the prospect of observation. Or perhaps it was the opposite, and they felt confident in either treatment. We can make up all kinds of stories about those who did and those who did not agree to participate, but the reality is that we just don't know. This limits the generalizability of the data and raises the question of exactly which patients these results apply to.

The next issue is what might be called "protocol violation," though I don't believe the investigators actually called it that. Here is what I mean: 364 men were randomized to the prostatectomy group, yet only 281 of them actually underwent a prostatectomy, leaving nearly one-quarter of the group free of the main exposure of interest. Similarly, of the 367 men randomized to observation, 36 (10%) underwent a radical prostatectomy. We might call this inadvertent cross-over, which does tend to happen in RCTs but needs to be minimized in order to get at the real answer. What this cross-over does, intuitively enough, is blend the exposures across the groups, shrinking whatever true difference in the outcome exists. So, when you don't get a difference, as happened in this trial, you don't know whether it is because of these protocol violations or because the treatments really are equivalent. The sketch below illustrates the dilution.
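To make that concrete, here is a minimal numerical sketch. The cross-over proportions are the trial's; the "true" mortality risks are invented purely for illustration:

```python
# A minimal sketch of how cross-over dilutes an intention-to-treat
# comparison. The adherence proportions are from the trial (77% of the
# surgery arm actually operated on; 10% of the observation arm crossed
# over to surgery); the "true" mortality risks are hypothetical.

true_risk_surgery = 0.44      # hypothetical risk if actually operated on
true_risk_observation = 0.50  # hypothetical risk if actually observed

# Difference we would see with perfect adherence:
ideal_diff = true_risk_observation - true_risk_surgery

# As randomized, each arm is a mixture of the two exposures:
itt_surgery = 0.77 * true_risk_surgery + 0.23 * true_risk_observation
itt_observation = 0.90 * true_risk_observation + 0.10 * true_risk_surgery
itt_diff = itt_observation - itt_surgery

print(f"difference with perfect adherence: {ideal_diff:.3f}")  # 0.060
print(f"difference as randomized:          {itt_diff:.3f}")    # 0.040

# The observed difference is (0.77 + 0.90 - 1) = 67% of the true one:
# a third of the effect is washed out by the cross-over alone.
```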

And indeed, the study results indicated no real difference between the two approaches in the primary endpoint: all-cause mortality over a substantially long follow-up was 47% in the prostatectomy group and 50% in the observation group (hazard ratio 0.88, 95% confidence interval 0.71 to 1.08, p=0.22). In other words, the 12% relative difference between the groups was statistically indistinguishable from chance. "But how can cancer surgery impact all-cause mortality?" you say. "It only claims to alter what happens to the cancer, no?" Well, yes, that is true. But can you really call a treatment successful if all it does is give you the opportunity to die of something else within the same period of time? I thought not. And anyway, looking at prostate cancer mortality, there really was no difference there either: 5.8% attributable mortality in the surgery group compared to 8.4% in the observation group (hazard ratio 0.63, 95% confidence interval 0.36 to 1.09, p=0.09).

The editorial accompanying this study raised some very interesting points (thanks to Dr. Bradley Flansbaum for pointing me to it). He and I both puzzled over this one particularly unclear statement:
...only 15% of the deaths were attributed to prostate cancer or its treatment. Although overall mortality is an appealing end point, in this context, the majority of end points would be noninformative for the comparison of interest. The expectation of a 25% relative reduction in mortality when 85% of the events are noninformative implies an enormous treatment effect with respect to the informative end points.
Huh? What does "noninformative" mean in this context? After thinking about it quite a bit, I came to the conclusion that the editorialists are saying that, since prostate cancer caused such a small proportion of all deaths, one cannot expect this treatment to move all-cause mortality much (certainly not by the 25% relative reduction that the investigators targeted), because the majority of deaths had nothing to do with prostate cancer. Yeah, well, but then see my statement above about the problematic aspects of disease-specific mortality as an outcome measure.
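If my reading is right, the arithmetic bears the editorialists out. Here is the back-of-the-envelope version (the 15% and 25% figures come from the editorial and the paper; the rest is simple division):

```python
# Why powering for a 25% relative reduction in ALL-CAUSE mortality is an
# enormous ask when only 15% of deaths are prostate-cancer-related: a
# treatment acting only on cancer deaths can at most remove that slice.

informative_fraction = 0.15      # share of deaths due to prostate cancer
target_overall_reduction = 0.25  # relative reduction the trial targeted

# Roughly, overall reduction = informative_fraction * cancer reduction,
# so the required relative reduction in cancer deaths is:
required_cancer_reduction = target_overall_reduction / informative_fraction

print(f"required reduction in cancer deaths: {required_cancer_reduction:.0%}")
# -> 167%: the surgery would have to prevent more cancer deaths than
# actually occur. Hence the editorialists' "enormous treatment effect."
```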

The editorial authors did have a valid point, though, when it came to evaluating the precision of the effects. Directionally, there certainly seemed to be a reduction in both all-cause and prostate cancer mortality in the group randomized to surgery. On the other hand, both confidence intervals crossed unity (I have an in-depth discussion of this in the book). On the third hand (erp!), the portion of the 95% CI below 1.0 was far greater than that above 1.0. This may imply that a study with greater precision (that is, narrower confidence intervals) might have produced a statistically significant difference between the groups. But to get higher precision we would have needed either 1) a larger sample size (which the investigators were unable to obtain even over an 8-year enrollment period), or 2) fewer treatment cross-overs (a difficult proposition even in the context of an RCT), or 3) both. On the other hand (the fourth?), the 3% absolute reduction in all-cause mortality amounts to a number needed to treat of roughly 33, which may be clinically acceptable.
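For completeness, here is the arithmetic behind that NNT, along with the CI asymmetry just described (all inputs as reported in the trial):

```python
# Number needed to treat from the reported all-cause mortality (50% in
# the observation group vs. 47% in the surgery group), plus the
# asymmetry of the reported 95% CI around the hazard ratio.

risk_observation = 0.50
risk_surgery = 0.47

arr = risk_observation - risk_surgery  # absolute risk reduction
nnt = 1 / arr                          # number needed to treat
print(f"ARR: {arr:.0%}, NNT: {nnt:.0f}")  # ARR: 3%, NNT: 33

ci_lower, ci_upper = 0.71, 1.08        # 95% CI for the hazard ratio
print(f"CI width below unity: {1 - ci_lower:.2f}")  # 0.29
print(f"CI width above unity: {ci_upper - 1:.2f}")  # 0.08
# Most of the interval sits below 1.0 -- consistent with benefit, but
# a CI that crosses unity cannot rule chance out.
```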

So what does this study tell us? Not a whole lot, unfortunately. It throws an additional pinch of confusion into the cauldron already boiling over with contradiction and uncertainty. Will we ever get the definitive answer to the question raised in this work? I doubt it, given the obvious difficulties implementing this RCT.  
                  

Tuesday, November 22, 2011

Lessons from Xigris

I have been wanting to write about the demise of Xigris for a while, but work and other commitments have stalled my progress. It is time.

Here is my disclosure: I have received research funding from BioCritica, a spinoff of Eli Lilly, the manufacturer of Xigris. I also know well, and hold in high esteem for their depth of knowledge and integrity, several colleagues who worked on Xigris internally at Lilly.

But on to the story. Xigris had a short and bumpy life. When PROWESS, the phase III Xigris trial, was first published in the NEJM in 2001 [1], Xigris became the first therapy to succeed in sepsis, reducing mortality by 6 percentage points, from about 31% to about 25%, for a number needed to treat of 16. This was huge: so many trials to date had failed, and no progress had been made in sepsis management for years. These data opened the door to FDA approval despite a hung advisory committee, in which equal numbers of members voted for and against approval. The controversy centered on concerns about bleeding complications, as well as some protocol changes during the trial and a switch in the manufacturing process. The latter concern was allayed by the Agency's detailed analysis and its finding of equivalence. There was a signal in a subgroup analysis that the drug might be most effective among the sickest patients with a high probability of death, but not in their less ill counterparts. And even though the pivotal trial was not specifically performed in these patients, the approval specified just such a population.

So, despite the controversy, the drug was approved, though several post-marketing commitment studies were mandated. ENHANCE [2, 3] was an international study whose findings broadly confirmed the safety and efficacy of the drug, while the ADDRESS study [4], done in patients at low risk for death, was terminated early for lack of efficacy.

It seemed that PROWESS ushered in an era of positive results in sepsis. Shortly after its publication, other studies on the use of early goal-directed therapy [5], low-dose steroids [6] and tight glucose control [7] appeared in high impact journals, and the years of failure in sepsis management seemed to be over.  

In the meantime, and amid further controversy [8], Lilly supported the creation of the Values, Ethics and Rationing in Critical Care (VERICC) Task Force [9, 10], in addition to funding the international Surviving Sepsis Campaign (SSC), which produced the evidence-based practice guideline for sepsis management [11, 12] and an implementation program for the sepsis bundles, jointly sponsored by the SSC and the Institute for Healthcare Improvement [13]. The latter 2-year program enrolled over 15,000 patients worldwide and achieved a doubling of bundle compliance, from 18% to 36%, with a concurrent drop in adjusted mortality of 5%. Because of several methodological issues and the lack of transparency about what it took to implement the bundle, it has never been clear to me (a) whether the bundle actually caused the mortality reduction, and (b) whether the effort was cost-effective.

But that aside, Xigris continued to stir up controversy, and safety concerns persisted. Some very well done observational studies, however, continued to confirm its effectiveness and safety in the real-world setting [14]. Yet the final trial, PROWESS-SHOCK (undertaken because of fears of an increase in bleeding complications), in which patients in septic shock received Xigris as a part of their early management, brought doom. It was this study, whose preliminary results appeared in a press release on October 25, 2011, that prompted Lilly to pull the drug off the world market: no difference in 28-day mortality was detected between the placebo and Xigris arms. Ironically, the preliminary reports indicate that no excess bleeding was noted in the treatment arm.

So, after roughly 10 years and millions of dollars, Xigris disappeared. But what can we learn from its story? There are many lessons to carry away, some about the way we do research, some about marketing practices, but all of them about the need for a higher level of conversation and partnership. The biggest elephant in this room is whether a manufacturer should be allowed to fund guideline development. It is a complicated issue, particularly given our native proneness to cognitive bias, but my answer is yes. It certainly cannot be done in a quid pro quo way. Perhaps this is naïve, but should it not simply be a question of good data? And why wouldn't a manufacturer fund the development of sensible guidelines without strings attached when the data are good?

Unfortunately, to me, Xigris is the poster child for how broken our research enterprise is, as I have discussed in this JAMA commentary [15]. Until all stakeholders start talking to each other and arriving at common, useful and achievable goals, this is a story that will repeat itself again and again. The fact that regulatory trials, with all of their expensive and flashy internal validity, concern themselves only with statistical issues and care nothing about what happens in the real world is a travesty on many levels. The fact that it costs nearly $1 billion to bring a drug to market means that only big Pharma can bankroll such a gamble, and in return it must demand big profits. The fact that this $1 billion fails to buy us studies that help clinicians and policy makers understand how to optimize the use of a drug once it is on the market is inexcusable. What we need are more intellectually honest discussions leading to novel, pragmatic ways to answer the relevant questions in a timely manner and without bankrupting the system.

So, does the obvious financial interest mean that manufacturers should stay out of these discussions? I happen to think that they need a prominent place at the table. In fact, I think the current fiasco is largely the result of too little interaction and too little cross-pollination of ideas: when we all sit around the table and nod in agreement, there is little progress. Deeper and novel understanding is built on disagreement and debate. To leave the manufacturers out would invite further irrelevance. The bottom line is that we are all conflicted, and, according to the editors of PLoS Medicine, non-financial conflicts of interest, though more subtle and difficult to discern, may present an even bigger threat to much of what we do [16]. Elbowing out a party with an obvious conflict may have the unintended consequence of leaving some of the more insidiously conflicted parties to run the show. And although we can argue about whether profit is the healthiest driver of performance in healthcare, the reality is that our entire healthcare "system" is built around profit-making. It is therefore disingenuous to single out one player over others.

On the positive side, the halo effect around Xigris brought a ton of attention to sepsis and its management. As Wes Ely conjectured in this piece, our improved understanding of sepsis (largely due to all the attention Xigris brought to it, in my opinion) is probably what rendered the drug useless in PROWESS-SHOCK. So, after all the hype, the noise and the hoopla, what is left is a company less one drug and hundreds of millions of dollars, and a disease area that received what amounted to a major public health investment, along with a vastly improved understanding of the disease state. How much is that benefit worth?

References

[1] Bernard GR, Vincent JL, Laterre PF, et al. Efficacy and safety of recombinant human activated protein C for severe sepsis. N Engl J Med 2001;344:699-709
[2] Bernard GR, Margolis BD, Shanies HM, et al. Extended evaluation of recombinant human activated protein C United States Trial (ENHANCE US): a single-arm, phase 3B, multicenter study of drotrecogin alfa (activated) in severe sepsis. Chest 2004;125:2206-16
[3] Vincent JL, Bernard GR, Beale R, et al. Drotrecogin alfa (activated) treatment in severe sepsis from the global open-label trial ENHANCE: further evidence for survival and safety and implications for early treatment. Crit Care Med 2005;33:2266-77
[4] Abraham E, Laterre PF, Garg R, et al. Drotrecogin alfa (activated) for adults with severe sepsis and a low risk of death. N Engl J Med 2005;353:1332-41
[5] Rivers E, Nguyen B, Havstad S, et al. Early goal-directed therapy in the treatment of severe sepsis and septic shock. N Engl J Med 2001;345:1368-77
[6] Annane D, Sébille V, Charpentier C, et al. Effect of treatment with low doses of hydrocortisone and fludrocortisone on mortality in patients with septic shock. JAMA 2002;288:862-71
[7] van den Berghe G, Wouters P, Weekers F, et al. Intensive insulin therapy in critically ill patients. N Engl J Med 2001;345:1359-67
[8] Eichacker PQ, Natanson C, Danner RL. Surviving sepsis: practice guidelines, marketing campaigns and Eli Lilly. N Engl J Med 2006;355:1640-2
[9] Sinuff T, Kahnamoui K, Cook DJ, et al. Rationing critical care beds: a systematic review. Crit Care Med 2004;32:1588-97
[10] Truog RD, Brock DW, Cook DJ, et al. Rationing in the intensive care unit. Crit Care Med 2006;34:958-63
[11] Dellinger RP, Carlet JM, Masur H, et al. Surviving Sepsis Campaign guidelines for management of severe sepsis and septic shock. Crit Care Med 2004;32:858-73
[12] Dellinger RP, Levy MM, Carlet JM, et al. Surviving Sepsis Campaign: international guidelines for management of severe sepsis and septic shock: 2008. Crit Care Med 2008;36:296-327. Erratum in: Crit Care Med 2008;36:1394-6
[13] Levy MM, Dellinger RP, Townsend SR, et al. The Surviving Sepsis Campaign: results of an international guideline-based performance improvement program targeting severe sepsis. Crit Care Med 2010;38:367-74
[14] Lindenauer PK, Rothberg MB, Nathanson BH, et al. Activated protein C and hospital mortality in septic shock: a propensity-matched analysis. Crit Care Med 2010;38:1101-7
[15] Zilberberg MD. The clinical research enterprise: time for a course change? JAMA 2011;305:604-5
[16] The PLoS Medicine Editors. Making sense of non-financial competing interests. PLoS Med 2008;5(9):e199. doi:10.1371/journal.pmed.0050199


Wednesday, February 2, 2011

Intervention in ICU reduces hospital mortality, but by how much?

Addendum #2, 12:09 PM EST, 2/2/11:
So, here is the whole story. Stephanie Desmon, the author of the JH press release, e-mailed me back and pointed me to Peter Pronovost as the source for the 10% reduction information. I e-mailed Peter, and he got back to me, confirming that 
"The 10 percent is the rounded differences in differences in odds ratios"
Moral of the story: The devil is in the details.


And speaking of details, I must admit to an error of my own. If you look at the figure reproduced below, I called out the wrong points. For the adjusted data, you need to look at the open circles (for the intervention group) and squares (for the control group). In fact, the adjusted mortality went from about 20% at baseline to 16% in the 13-22 month interval for the Keystone cohort, while in the control group it went from a little over 20% to a little under 18%. This makes the absolute reduction a tad more impressive, though there is still less than a 2% absolute difference between the reductions seen in the intervention and control groups, leaving all of my other points still in need of addressing.


Addendum #1, 11:00 AM EST, 2/2/11:
I just found what I think is the origin of the 10% mortality reduction rumor in this press release from Johns Hopkins. I just e-mailed Stephanie Desmon, the author of the release, to see where the 10% came from. Will update again should I hear from either Maggie Fox or Stephanie Desmon.  

Remember the Keystone project? A number of years ago, when we started to pay close attention to healthcare-associated infections (HAI) and hospitals started taking introspective looks at their records, it turned out that the ICUs in the state of Michigan, for one reason or another, had very high rates of HAIs. As this information percolated through our collective consciousness, the stars aligned in such a way as to release funding from the AHRQ in Washington, DC, for a group of ICU investigators at the Johns Hopkins University School of Medicine in Baltimore, MD, headed by Peter Pronovost, to design and implement a study employing IHI-style (Boston, MA) bundled interventions to prevent catheter-associated bloodstream infections (CABSI) and ventilator-associated pneumonia (VAP) across a consortium of ICUs in Michigan. Whew! This poly-geographic collaboration resulted in a landmark paper in 2006 in the New England Journal of Medicine, wherein the authors showed that checklist-directed bundled interventions aimed at CABSI were indeed associated with a satisfying reduction in CABSI. Since 2006 the ICU community has been eagerly awaiting the results of the VAP intervention from Keystone, but none have come out. When there is a void of information, rumors fill it, and plenty of rumors have circulated about the alleged failure of the VAP trial.

I do not want to belabor here what I have written before about VAP and its prevention, what makes the latter so difficult, and how little evidence there really is that the IHI bundle actually does anything. You can find at least some of my thoughts on that here. But why am I bringing up the Keystone project again anyway? Because Pronovost's group has just published a new paper in BMJ, and this time their aim was even more ambitious: to show the impact of this state-wide QI intervention on hospital mortality and length of stay. This is a reasonable question, mind you, since we could argue that, if the intervention reduces HAIs, it should also do something to the important downstream events driven by those HAIs, namely mortality and LOS. But here are a couple of issues that I found of great interest.

First, as we have discussed before, whether VAP itself causes death in the ICU population (that is, patients die from VAP), or whether VAP tends to attack those who are sicker and therefore more likely to die anyway (patients die with VAP), remains unclear in our literature. There is some evidence that late VAP, but not early VAP, may be associated with an attributable increase in mortality, though these data need to be confirmed. Why is this important? Because if VAP does not impart an increase in mortality, then trying to decrease mortality by reducing VAP is just tilting at windmills.

So, let's talk about the study and what it showed, as reported in the BMJ paper. You will be pleased to hear that I will not go through the traditional list of potential threats to validity here, but will take the data at face value (well, almost). The authors took the interesting approach of comparing the performance of all eligible ICUs, regardless of whether they actually chose to take part in the project. Of all the admissions examined in the intervention group, 88% came from Keystone participants. This is a sound way to define the intervention cohort, and it actually biases the data away from showing an effect. So, kudos to the investigators. The comparator cohort came from ICUs in hospitals in the region surrounding Michigan, which were not eligible for Keystone participation. One point about these institutions requires clarification: I did not see in the paper whether the authors looked at the control hospitals' own QI initiatives. Why is this important? Well, if many of the comparator hospitals had successful QI initiatives of their own, then one would expect to see even less difference between the Keystone intervention and the control group. So, again, good on them for biasing the data against themselves.

This is the line of thinking that brings me to my second point. Reuters' Maggie Fox covered this paper in an article a couple of days ago, an article whose lede (thanks for the correction, @ivanoransky) floored me:
(Reuters) - A U.S. program to help make sure hospital staff maintain strict hygiene standards lowered death rates in intensive care units by 10 percent, U.S. researchers reported on Monday.
Mind you, I read the article before delving into the peer-reviewed paper, so my surprise came out of just knowing how supremely difficult it is to reduce ICU mortality by 10% with any intervention. In the ICU we celebrate when we see even a 2% absolute mortality reduction. So, it became obvious to me that something got lost in translation here. And indeed, it did. Here is how I read the data.

There are multiple places to look for the mortality data. One is found in this figure:

Now, look at the top panel and focus on the solid circles -- these depict the adjusted mortality in the Keystone intervention group. What do you see? I see mortality going from about 14% at baseline to about 13.5% during the implementation phase to about 13% at 13-22 months post-implementation. I do not see a 10% reduction, but at best about a 1% mortality advantage. What is also of interest is that the adjusted mortality in the control group (solid squares) also went down, albeit not by as much. Yet at almost every measurement point it was already lower than in the intervention group.
Then there is this table, where the adjusted odds ratios of death are given for the two groups at various time points:
And this is where things get interesting. If you look at the last line of the table, the adjusted odds ratios indeed look impressive, and, furthermore, the AOR for the intervention group is lower than that for the control group. This is pleasing to any investigator. But what does it mean? It means that the odds of death went down by roughly 24% in the intervention group (give or take the 95% confidence interval) and by 16% in the control group, each compared to itself at baseline. This is impressive, no?

Well, yes, it is. But not as impressive as it sounds. A relative reduction of 24% on a baseline mortality of 14% implies an absolute reduction in mortality of 14% x 24% = 3.4%. But, you will notice, we did not actually observe even this magnitude of mortality reduction in the graph. What gives? There is an excellent explanation. It is a fact little known to the average reader (and only slightly better known to the average researcher and peer reviewer) that the odds ratio, while a fairly solid way to express risk when the absolute risk is small (say, under 10%), tends to overestimate the effect when the risk is higher than that. I know we have not yet covered the ins and outs of odds ratios, relative risks and the like in the "reviewing literature" series, but let me explain briefly. The difference between odds and risk is in the denominator. While the denominator for risk is the entire cohort at risk for the event (here, all patients at risk of dying in the hospital), the denominator for odds is only the part of the cohort that did not experience the event. See the difference? By definition, the denominator for the odds is smaller than that for the risk, yielding a more impressive, yet inaccurate, apparent reduction in mortality.
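For the quantitatively inclined, there is a standard formula for converting an odds ratio back into a risk ratio once the baseline risk is known. Here it is applied to these data (the ~14% baseline is my reading of the figure):

```python
# Converting an odds ratio to a risk ratio at a given baseline risk,
# using the standard formula RR = OR / (1 - p0 + p0 * OR). Inputs: the
# adjusted OR of ~0.76 from the table and the ~14% baseline mortality
# read off the figure.

def or_to_rr(odds_ratio: float, baseline_risk: float) -> float:
    """Translate an odds ratio into a risk ratio at a baseline risk."""
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)

p0 = 0.14   # baseline hospital mortality, intervention group
aor = 0.76  # adjusted odds ratio vs. baseline

rr = or_to_rr(aor, p0)
print(f"risk ratio:              {rr:.2f}")             # ~0.79
print(f"relative risk reduction: {1 - rr:.0%}")         # ~21%, not 24%
print(f"absolute risk reduction: {p0 * (1 - rr):.1%}")  # ~3.0%

# At low baseline risks the OR and RR nearly coincide; the higher the
# baseline risk, the more the OR flatters the apparent effect.
```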

Bottom line? Interesting results. It is not clear that the intervention itself is what produced the 1% mortality reduction -- it could have been secular trends, regression to the mean or the Hawthorne effect, to name just a few alternatives. But regardless, preventing death is good. The question is whether these improvements in mortality were sustained after hospital discharge, or whether these patients were merely kept alive long enough to die elsewhere. Also, what is the value balance here between the resources expended on the intervention and results that may not even be due to the intervention in question?

All of this is to say that I am really not sure what the data are showing. What I am sure of is that I did not find any evidence of the 10% reduction in mortality reported by Reuters (I did e-mail Maggie Fox and am still awaiting a reply at this time; I will update if and when I get one). In this time of aggressive efforts to bend the healthcare expenditure curve, we need to pay attention to what we invest in and the return on that investment, even if the intervention is all "motherhood and apple pie."

Thursday, November 11, 2010

More thoughts on lung cancer screening

Today I want to continue in the vein of yesterday's post and discuss a slightly different aspect of the NLST study. But first, a story. I was at a meeting a couple of weeks ago where we were discussing antibiotics in pneumonia. We were talking about mortality, and one of the docs said something that became a constant mosquito buzz in my brain, one I have been unable to swat away. He said that antibiotics never claimed to reduce all-cause mortality, only mortality associated with infection. Now, on the face of it, the statement makes perfect sense, particularly if you think mechanistically. But it also makes an important point from the perspective of population epidemiology. And it just happens to be related to the NLST study.

Yesterday we talked about the characteristics of the screening test and the populations in which it may make sense. But all of these current and future decisions rest on the results for the primary outcome, mortality from lung cancer, which showed an impressive 20% relative reduction. But what about all-cause mortality, and why should we care? Well, saving people from cancer deaths is great, and this is what early screening can mechanistically do. But it is only great if those people are not at high risk of dying of something else and can derive a meaningful benefit from a life not terminated by an undetected cancer. Unfortunately, as you well know, life does not work this way, and there are many other reasons people die -- old age, cardiovascular disease, and the like. In epidemiology we call these "competing risks", and it is these risks that can dilute the excitement over the effectiveness of our interventions into a Pyrrhic victory.
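A toy simulation makes the competing-risks point concrete. All of the risks below are hypothetical, chosen only for illustration: even a respectable cut in cancer deaths barely dents all-cause mortality when other causes dominate.

```python
# A toy competing-risks simulation: everyone faces a small risk of
# dying of lung cancer and a larger risk of dying of something else.
# Screening removes 20% of the cancer deaths. All risks hypothetical.
import random

random.seed(42)
N = 100_000
P_CANCER = 0.015      # 5-year risk of lung cancer death
P_OTHER = 0.050       # 5-year risk of death from any other cause
SCREEN_EFFECT = 0.20  # relative reduction in cancer deaths

def all_cause_mortality(p_cancer: float, p_other: float) -> float:
    """Fraction of a simulated cohort dying of either cause."""
    deaths = sum(
        1 for _ in range(N)
        if random.random() < p_cancer or random.random() < p_other
    )
    return deaths / N

print(f"unscreened: {all_cause_mortality(P_CANCER, P_OTHER):.1%}")
print(f"screened:   {all_cause_mortality(P_CANCER * (1 - SCREEN_EFFECT), P_OTHER):.1%}")

# A 20% relative cut in cancer deaths moves all-cause mortality by only
# ~0.3 percentage points, because the competing risks swamp the benefit.
```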

Let us apply this to the NLST, or at least to what we know today. The results reported are so cursory at this time that we have to make a lot of educated guesses. But when has that ever stopped us? The report, among other data, states that "all-cause mortality was 7% lower" in the CT than in the CXR screening group. The report fails to say whether this is a relative or an absolute reduction, and, as you know, this is a critical distinction. It is critical because, if my calculations are correct, the 20% relative reduction in cancer mortality represents only a 0.3% absolute reduction in deaths from cancer. Given a couple of other tidbits peppering the report, a few numbers can be back-calculated:
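Here is a sketch of that back-calculation. The 20% relative reduction comes from the report; everything else is my estimate, not trial data:

```python
# Back-calculating what the report's numbers imply. The 20% relative
# reduction in lung cancer mortality is reported; the 0.3% absolute
# reduction and the all-cause rates are my estimates, not trial data.

rrr_cancer = 0.20   # reported relative reduction in cancer deaths
arr_cancer = 0.003  # estimated absolute reduction in cancer deaths

# Baseline cancer mortality implied by those two numbers:
baseline_cancer = arr_cancer / rrr_cancer
print(f"implied cancer mortality, CXR arm: {baseline_cancer:.1%}")  # 1.5%

all_cause_cxr = 0.064  # my estimate
all_cause_ct = 0.055   # my estimate
print(f"absolute all-cause reduction: {all_cause_cxr - all_cause_ct:.1%}")  # 0.9%

# An ABSOLUTE 7% reduction is arithmetically impossible against a
# baseline of ~6.4%; the reported "7% lower" must be a relative figure.
```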


So, again, if my calculations are correct, the all-cause mortality is ~6.4% in the CXR group and ~5.5% in the CT group; there is no way there can be an absolute 7% reduction in mortality, and the absolute reduction turns out to be 0.9% over the 5-year follow-up. And here is what it looks like graphically:


There actually is a reduction in non-cancer deaths in the CT arm as well. Why this is remains unclear so far, but it may well be due to observation bias, as the study was not blinded. If that is the case, the differences should smooth out with further follow-up.

But here is my bottom line. Since we live in a world of competing risks, the question becomes: "What is the value of screening with CT scans to prevent 0.3% of cancer-related deaths over 5 years?" And how much of this 0.3% benefit is due to lead-time bias? This question cannot be answered yet, as we do not have an accounting of the adverse events related to screening and follow-up, of the financial costs, or of the patient utilities for either strategy. What we can guess is that the risks have to be minuscule in order not to overwhelm the very small benefit detected, even if that benefit is real.

I am personally not that happy that this report was released the way it was -- no mention of absolute risk reductions, no explicit disclosure of the denominators, no hint as to the comparative risks of the strategies (both long- and short-term), and, most importantly, no word on what implications either strategy has for patients' quality of life. I guess all of these questions will be answered in the peer-reviewed publications that will result from this study. Fully and transparently. Without any politics or obfuscation. I am waiting with bated breath.