Does disproving a statistical null automatically render the clinical null disproved?

A good friend of mine lost her mother to pancreatic cancer recently. The whole process from diagnosis to her death took six weeks. And despite wonderful care from a palliative medicine team, the process proved grueling for her family. And no wonder: how do you assimilate a loved one's going from healthy to dead in six weeks? Of course, my friend's family made all the right choices, forgoing aggressive treatment in favor of maximizing their mother's comfort and quality of life. Their experience made me think of the new generation of cancer treatments, experienced by my father in his dying days, and how it all fits into the healthcare debate.

Much of the debate around healthcare reform has centered on the unsustainable cost trajectory, as well as the value of evidence in improving the quality and effectiveness of care. Congress has appropriated a substantial sum for comparative effectiveness research (CER), intended to provide data comparing one treatment to another, rather than just a treatment to a placebo. This in turn has sparked a debate as to whether to include cost-effectiveness in this comparison. I am definitely in the camp that believes it should be included, for the reasons so eloquently outlined by Weinstein and Skinner. At the same time, this MSNBC article made me re-evaluate its need, at least in some special cases. Let's peel back the cabbage, specifically looking at Tarceva in advanced pancreatic cancer, exactly what my friend's mother was diagnosed with.

Tarceva, or erlotinib, is a kinase inhibitor manufactured by Genentech, indicated for the treatment of some cases of non-small cell lung carcinoma, and most recently approved by the FDA for advanced pancreatic cancer. Reading the package insert, it becomes clear that the FDA-approved 100 mg dose of this drug, if given in combination with gemcitabine, prolongs median survival by <2 weeks, from 6 months in the gemcitabine+placebo arm to 6.4 months in the gemcitabine+Tarceva arm, with p=0.028. This difference reaches statistical significance at the conventional threshold of p<0.05, and therefore renders Tarceva better than placebo. Period.

Delving a tad more deeply into the peer-reviewed publication of the phase 3 trial, one gleans a few other facts. I quote:
Survival and Response

The final analysis was conducted after 486 deaths (239 on erlotinib and gemcitabine and 247 on placebo and gemcitabine). Overall survival was significantly longer in the erlotinib and gemcitabine arm with an estimated HR of 0.82 (95% CI, 0.69 to 0.99; P = .038; log-rank test stratified for performance status, extent of disease, and pain score at baseline; Fig 1A). Median survival times were 6.24 months versus 5.91 months for the erlotinib and gemcitabine versus placebo and gemcitabine groups with 1-year survival rates of 23% (95% CI, 18% to 28%) and 17% (95% CI, 12% to 21%), respectively (P = .023). A multivariate Cox regression analysis showed that erlotinib treatment (HR, 0.82; 95% CI, 0.69 to 0.99; P = .04) and female sex (P = .03) were significantly associated with longer overall survival. While there was an imbalance in male:female ratio between the arms, the treatment effect remains significant when adjusted for sex.
 Results of subgroup analyses of survival by baseline stratification factors and other factors such as sex, race, pain intensity score, and age are displayed in Figure 2.

Progression-free survival was significantly longer in the erlotinib and gemcitabine arm than the placebo and gemcitabine arm with an estimated HR of 0.77 (95% CI, 0.64 to 0.92; P = .004; log-rank test stratified for performance status, extent of disease, and pain score at baseline; median, 3.75 months v 3.55 months; Fig 1B).
So, what do we have overall? We have a hazard ratio for death whose 95% confidence interval very nearly reaches 1.0, thus coming perilously close to not disproving the null; we have a prolongation of median survival by 1/3 of a month, and a prolongation of median progression-free survival by 1/5 of a month.
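
To put those fractions of a month into more tangible units, here is a quick sketch that converts the reported median differences into days. The medians come straight from the package insert and the trial publication quoted above; the conversion factor of about 30.4 days per month is my own assumption, and the function name is just illustrative.

```python
# Assumption: an average calendar month, 365.25 / 12, or about 30.4 days.
DAYS_PER_MONTH = 365.25 / 12


def gain_in_days(treated_months, control_months):
    """Difference between two median times, converted from months to days."""
    return (treated_months - control_months) * DAYS_PER_MONTH


# Median overall survival from the phase 3 publication (months).
overall_survival_days = gain_in_days(6.24, 5.91)

# Median progression-free survival from the same publication (months).
progression_free_days = gain_in_days(3.75, 3.55)

# Median survival as stated in the package insert (months).
package_insert_days = gain_in_days(6.4, 6.0)

print(f"Overall survival gain (publication): {overall_survival_days:.1f} days")
print(f"Progression-free survival gain:      {progression_free_days:.1f} days")
print(f"Overall survival gain (insert):      {package_insert_days:.1f} days")
```

Under that assumption the gains work out to roughly 10 days, 6 days, and 12 days respectively, all under the 2-week mark.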

But given my fondness for Gould's "The Median is not the Message" essay, let's practice full disclosure and look at the tail of the Kaplan-Meier curves above. As the text points out,
...1-year survival rates of 23% (95% CI, 18% to 28%) and 17% (95% CI, 12% to 21%), respectively (P = .023).
Indeed, these are significant differences, both statistically and clinically. Within the trial this represents a difference in favor of survival for 18 additional patients, putting the cost at roughly $1.5 million for 1 year of life saved by my back-of-the-envelope calculation. Of course, this difference is not adjusted for confounders, so it is difficult to say if the number is real or under- or over-estimated. Because the adjusted analysis is given as a hazard ratio of death, I cannot calculate the corresponding adjusted cost.
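
For anyone who wants to retrace this kind of back-of-the-envelope arithmetic, here is a rough sketch. It takes the unadjusted 1-year survival rates from the publication, computes the absolute difference and the number needed to treat for one additional 1-year survivor, and then works backward from the $1.5 million figure to the per-patient spending it would imply. The big assumption, flagged in the code as well, is that cost per additional survivor equals the per-patient treatment cost multiplied by the number needed to treat. The actual inputs behind the $1.5 million are not spelled out here, so treat the implied per-patient number as illustrative only.

```python
def number_needed_to_treat(treated_rate, control_rate):
    """Patients who must be treated for one additional survivor at 1 year."""
    absolute_difference = treated_rate - control_rate
    if absolute_difference <= 0:
        raise ValueError("treated rate must exceed control rate")
    return 1.0 / absolute_difference


# Unadjusted 1-year survival rates quoted from the trial publication.
survival_erlotinib = 0.23
survival_placebo = 0.17

absolute_difference = survival_erlotinib - survival_placebo
nnt = number_needed_to_treat(survival_erlotinib, survival_placebo)

# Figure from the text: roughly $1.5 million per 1 year of life saved.
cost_per_life_year = 1_500_000

# ASSUMPTION (not from the source): cost per additional survivor is the
# per-patient treatment cost times the number needed to treat. Solving for
# the per-patient cost gives the spending this figure would imply.
implied_cost_per_patient = cost_per_life_year / nnt

print(f"Absolute difference in 1-year survival: {absolute_difference:.0%}")
print(f"Number needed to treat: {nnt:.1f}")
print(f"Implied spending per treated patient: ${implied_cost_per_patient:,.0f}")
```

With these rates, about 17 patients need to be treated for one of them to gain that extra year, which is why a modest per-patient cost balloons so quickly.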

So, for me this raises the following question (and I would love to hear the thoughts of my colleagues who proudly proclaim being science-based and thus eschew the placebo effect as a valid way to get a therapeutic response): Is what we are seeing here real, or is it in fact equivalent to a placebo effect? Is the median survival prolongation of <2 weeks indeed a real effect that means something to the patient and the clinician, or is it just clinical noise, if you will? In other words, should "disproving" the statistical null by default disprove the clinical null? Or does the bar for disproving the clinical null need to be set just a tad higher than a statistically significant increase in life expectancy of 2 weeks?

Obviously, no one knows a priori who will do better and who will not. So, without a crystal ball, it is one's values and preferences that have to drive these decisions. But does society have a say in any of this, as we struggle with equitable distribution of a limited resource? Is $1.5 million for 1 year of life a good societal investment? And what is the quality of this life? And given that at least 1/2 of all treated patients get far less than 2 extra weeks of life, how do we strike the balance between a reasonable expectation of a response and a false hope?

My friend's family based their choices on their mother's wishes and their values and utilities for her comfort. The choice they made was very different from that made by my parents with regard to my father's palliation. Both were right for the respective families. But one may have been far too costly, both financially and emotionally.    

