Uncertainty in Kaplan-Meier curves

This post is indirectly related to things Bayesian.  One of the nodes in a Bayes net I have been working on is the tumor control probability (TCP) for oropharyngeal cancer.  Where do we get the TCP values?  One place is from Kaplan-Meier (KM) curves.  If you do a KM curve for each of several doses and then focus on a particular time point, the TCP values at that time can be obtained.  Now KM curves have a confidence interval which is +/- 1.96*sqrt(Variance).

Well, what is the variance?  According to the textbook, it is roughly the square of the survival probability at time t, multiplied by a sum over the prior failure times of the conditional risk at each failure time weighted by 1/number_at_risk.  In other words, it depends only on the counts of patients at risk and of failures; it does not take into account the biological variability.
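To make the formula concrete, here is a minimal sketch that assumes "the textbook" variance is Greenwood's formula, Var[S(t)] = S(t)^2 * sum over failure times t_i <= t of d_i / (n_i * (n_i - d_i)), where d_i is the number of failures and n_i the number at risk at t_i.  The function name and the toy data are mine, purely for illustration.

```python
import math
from itertools import groupby

def km_greenwood(times, events):
    """Kaplan-Meier estimate with Greenwood variance.

    times  : follow-up time for each patient
    events : 1 if the patient failed at that time, 0 if censored
    Returns a list of (time, survival, variance) at each failure time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, gsum, rows = 1.0, 0.0, []
    for t, group in groupby(data, key=lambda pair: pair[0]):
        group = list(group)
        d = sum(e for _, e in group)          # failures at time t
        if d > 0:
            surv *= 1.0 - d / n_at_risk       # KM product-limit step
            if n_at_risk > d:                 # term is undefined once everyone has failed
                gsum += d / (n_at_risk * (n_at_risk - d))
            rows.append((t, surv, surv ** 2 * gsum))
        n_at_risk -= len(group)               # failures and censored both leave the risk set
    return rows

# Hypothetical toy data: five patients, two of them censored.
for t, s, var in km_greenwood([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]):
    half_width = 1.96 * math.sqrt(var)
    print(f"t={t}: S={s:.3f} +/- {half_width:.3f}")
```

Notice that the only inputs are failure counts and numbers at risk.  Nothing about who the patients are enters the calculation.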

Take the example of two experiments.  The first (the “naive”) experiment does not stratify patients by cancer stage, i.e. you draw from a patient pool that includes all patients with that type of cancer regardless of whether it is Stage I or IV.  Pick some number “n” of patients and perform the KM analysis.  Note that there is no measurement uncertainty: at any given time you know how many patients you have and how many fail.  The variance is a measure of dispersion based only on the number at risk.  The second (“stratified”) experiment only chooses patients with one particular stage, e.g. II.  Choose the same number “n” of patients.  It is not unlikely that both experiments will give you the same KM curve, since the naive experiment is an average effect over all stages.  Now if you were to repeat these two experiments a number of times in order to measure the variance at different times, you should get different variances.  In the naive experiment, the distribution of stages within any selected sample will vary somewhat, with a resultant difference in survival.  In the stratified experiment, the population distribution will be narrower.  (If you don’t think that is true, then figure out how they defined “stage” to begin with.)  In this thought experiment, there are two very different variances based on the biology.  Now, since we are measuring a mean, which arguably is normally distributed, the variance of the mean survival is relatively narrow, but there should still be a difference that is mathematically, if not clinically, significant.

My conclusion is that either the KM confidence interval doesn’t contain the whole story or I don’t understand the statistics as well as I think I do.  In any case, it has helped sharpen my thinking about the confidence limits on TCP curves.

Uncertainty and decision making

Contemplating a project of some colleagues regarding decision making under the uncertainty of where a tumor will be with respect to the radiation field during breathing led me to wonder about the whole range of uncertainties that should be considered. Traditionally in radiation oncology we are concerned whether the tumor will always be in the radiation field when you set up the patient on a daily basis for weeks.  When the tumor position is affected by respiration or bowel gas, then it is even harder to know this.  Recently, on-board imaging has helped us to understand (and sometimes manage) the motion.  Increasing the size of the radiation field (a.k.a. using a PTV) is one approach to reducing uncertainty.

But what about other uncertainties?  Take any cubic millimeter.  What is the number of tumor cells?  Classic radiation biology uses Poisson statistics to calculate the uncertainty in radiation’s ability to sterilize the tumor.  So uncertainty exists and can be accounted for.  What about the more recent realization that tumors do not contain a single clonogen but, rather, many genetically different cells?  Hopefully, genetic characterization would give us some insight as to how the differences affect the cells’ radiation sensitivity.  Epigenetic factors, too, play a role in establishing a phenotypic radiation response.  However, even in this optimistic case where we have some mechanistic understanding, we can only alter the probabilities.  So here we have an understanding that uncertainty exists, and in some cases we may be able to characterize it, but at this point even accurate estimates of the probabilities are hard to come by.  Near the margins of the tumor, we talk about the clinical target volume, which consists of “microscopic disease”, by which we mean possible tumor cells whose existence we have no solid knowledge of.  What we know comes from surgical/biopsy specimens or from clinical outcomes with regard to treating such a region in other patients.  Here our uncertainty is complete with regard to the particular patient and our only knowledge comes from population averages.
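As a reminder of what the Poisson argument looks like in practice, here is a minimal sketch.  It assumes the usual textbook form, TCP = exp(-N * SF), with the surviving fraction SF taken from the linear-quadratic model.  The clonogen number and the alpha and beta values are hypothetical placeholders, not fitted to any data.

```python
import math

def poisson_tcp(n_clonogens, total_dose_gy, dose_per_fraction_gy=2.0,
                alpha=0.3, beta=0.03):
    """Poisson tumor control probability.

    The surviving fraction uses the linear-quadratic model for a
    fractionated course: SF = exp(-alpha*D - beta*d*D).
    alpha (1/Gy) and beta (1/Gy^2) are illustrative values only.
    """
    sf = math.exp(-alpha * total_dose_gy
                  - beta * dose_per_fraction_gy * total_dose_gy)
    expected_survivors = n_clonogens * sf
    # Probability that a Poisson variable with this mean equals zero.
    return math.exp(-expected_survivors)

# Hypothetical tumor with 10 million clonogens.
for dose in (40, 50, 60, 70):
    print(f"{dose} Gy: TCP = {poisson_tcp(1e7, dose):.4f}")
```

The uncertainty that this model handles is only the chance that the last few clonogens survive.  It says nothing about whether N, alpha, or beta are right for the patient in front of you.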

Much of radiation oncology (and medicine in general) is devoted to reducing the uncertainty by techniques such as recursive partitioning analysis and classification algorithms, e.g. support vector machines and logistic regression.  Concepts such as stage, grade, and TNM classification are all ways of predicting outcomes as a function of therapies, thereby reducing our uncertainty.  Such musing leads us to consider the confluence of medical decision making and uncertainty.  On one side, we can say that the minimum uncertainty is when we know for sure that the treatment will effect a cure or will surely fail.  Then we have a probability of 1.0 or 0.0 and, hence, no uncertainty.  The most uncertainty we have is when there is a 50% chance of cure.  Surely it is better in the decision making realm to have no uncertainty.  However, in the real world (that is, the world of the patient and doctor) a 50% chance of cure is better than 0%.  So we can conclude that uncertainty in these types of decisions is not necessarily a bad thing.  Therefore, we are left to continue our quest for better strategies for making decisions under uncertainty.  The question of the day is: do we want to continue understanding the biology to the point that we know exactly what will happen to a person, when we know that in some fraction of the cases we will be depriving the patient of hope?
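One way to put a number on "most uncertainty at 50%" is the Shannon entropy of a yes/no outcome.  That choice of measure is my assumption; the argument above does not depend on it.  A minimal sketch:

```python
import math

def binary_entropy(p_cure):
    """Shannon entropy, in bits, of a cure / no-cure outcome."""
    if p_cure in (0.0, 1.0):
        return 0.0  # outcome is certain, so there is no uncertainty
    q = 1.0 - p_cure
    return -(p_cure * math.log2(p_cure) + q * math.log2(q))

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"P(cure) = {p:.1f}: uncertainty = {binary_entropy(p):.3f} bits")
```

The entropy is identical for a 0% and a 100% chance of cure, which is exactly why uncertainty alone is a poor guide to which situation the patient would rather be in.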

When prior beliefs interfere…

I have written before (here) about the influence prior beliefs have on how people act.  Two recent events (one lecture, one book) have led me to further contemplate this.

Daniela Witten, a biostatistician at UW, gave a bioethics lecture today on the Duke saga (her words) regarding genomics research and related clinical trials, and the consequent scandal when it turned out that the genomics science was not correct.  Although statisticians led the discovery, they themselves did not use any Bayesian arguments; what follows are my own ramblings.  One of the points that I got from Witten’s talk was that the very persistent reluctance of authorities at Duke (and elsewhere) to realize that the published papers were wrong came, at least partly, from their belief that the work had to be correct.  There had been such hype about the potential of genomics to guide cancer therapies, and the PI was a good scientist, and good journals accepted the papers.  It took a lot of evidence before their prior probabilities were modified by data to the more correct posteriors.

This tale seems to me to be a good example of the type of thinking described by Jonathan Haidt in “The Righteous Mind: Why Good People Are Divided by Politics and Religion.”  He uses the metaphor of an elephant with a rider on its back.  The rider is the rational mind whereas the elephant is everything else about us.  More often than not, the rider doesn’t guide the elephant, but rather spends a lot of time justifying the direction the elephant is going in of its own volition.  To stretch the analogy, I think that part of the elephant’s momentum (in the vector sense) has to do with prior beliefs, without saying where they came from.  The rider is new data, and the inability of our rational mind to change the elephant’s course reflects the difficulty we have in overcoming prior beliefs.  One way of thinking of it is that the processes that guide our “elephant” multiply the importance or frequency of a few early data so that once that prior has been established, it takes an awful lot to influence it later.
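The "multiplied early data" idea can be sketched with a Beta prior on the probability that a claim is right, where the prior behaves like a pile of pseudo-observations.  This is my own toy illustration with hypothetical numbers, not anything taken from Witten or Haidt.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a Beta(prior_a, prior_b) belief after binomial data."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

# Hypothetical numbers: 20 independent checks, only 2 of which support the claim.
entrenched = posterior_mean(95, 5, 2, 20)   # prior worth 100 pseudo-observations
open_minded = posterior_mean(1, 1, 2, 20)   # flat prior

print(f"entrenched: {entrenched:.2f}, open-minded: {open_minded:.2f}")
```

With the same 20 observations, the entrenched believer still sits at about 0.81 while the open-minded one has dropped to about 0.14.  The elephant has a lot of momentum.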

Medical physicists are Bayesians?

A look at whether medical physicists are Bayesians through the example of maintenance of certification (MOC).

Abstract: Though few will admit it, many physicists are Bayesians in at least some situations. This post discusses how the world looks through a Bayesian eye. This is accomplished through a concrete example, the Maintenance of Certification (MOC) by the American Board of Radiology. It is shown that a priori acceptance of the value of MOC relies on a Bayesian attitude towards the meaning of probabilities. Applying Bayesian statistics, it is shown that a reasonable prior greatly reduces any possible gain in information by going through the MOC, as well as providing some numbers on the possible error rate of the MOC. It is hoped that this concrete example will result in a greater understanding of the Bayesian approach in the medical physics environment.


For several decades, a debate has raged regarding the nature of probabilities. On one side of the debate are “frequentists”. They hold that probabilities are obtained by repeated identical observations with the probability of any given outcome being the ratio of the number of events with that outcome to the total number of events. The classical example is the probability of observing a “heads” or “tails” when flipping a coin. On the other side of the debate are “Bayesians” (more on that name in a bit). They hold that probabilities can also represent the degree of belief in relative frequency of a given outcome. While there are many paths by which one can reach this point, there are several common ones. The influence of prior knowledge on one’s belief that a certain event will happen is certainly one ingredient. Another path by which people reach the Bayesian viewpoint is recognition of the fact that probabilities are often useful even when it is impossible to reproduce precisely the situation so that multiple measurements can be made, such as in the field of medicine.

For those of us in the medical field, randomized controlled trials (RCTs) are our effort to achieve the frequentist goal of measuring outcomes in identical situations. However, we are usually more interested in discovering the differences in probabilities for different situations, namely when an element of a therapeutic procedure has been changed. The frequentist approach lies behind the statistical tests that are used to determine whether our observations warrant the conclusion that the therapeutic modification has resulted in a true difference or not. In other words, the frequentist view is one that seeks to determine whether the data observed are consistent with a given hypothesis. This is to be contrasted with the Bayesian view, in which one seeks to determine the probability of a certain hypothesis given the data.

All of this still leaves us with the question: Why do we care whether medical physicists are Bayesians or frequentists? One good reason has been in the news recently, namely, personalized medicine. How will we ever obtain the required numbers of patients if everything is personal? Even if we take “personal” to mean harboring one or several (nearly) identical genes, recent developments are demonstrating that biological processes are nearly always the result of a large set of genes. In addition, the role of epigenetic factors reduces the homogeneity in any group selected for their genetic homogeneity.

In general, medical physicists tend to be a bit under-educated with respect to probabilities and statistics, especially in a medical environment. A very good reference for the Bayesian statistical approach is “Bayesian Approaches to Clinical Trials and Health-Care Evaluation” by DJ Spiegelhalter et al. This post is a brief attempt to highlight some of the issues, but should be considered a very faint ghost of a complete discussion. To make it more concrete, I have looked at a specific situation.


Standard Gamble is now becoming real life

An article in the NY Times highlights an interesting development in which a theoretical construct is becoming an actual reality.  The article describes how immunotherapy is being tested in patients with melanoma.  Several drugs seem to result in some improvement in survival, and combinations of drugs result in more drastic improvements.  However, these combinations can be lethal (in one test 3 of 46 patients died from the drugs), as well as causing myriad side-effects.

In decision theory, there is a concept of “utility” which is basically a quantitative measure of how much one values something.  In economic terms, this value is relatively easy to assess since you are usually dealing with either actual money or something with monetary value.  In health care, it is not so straightforward.  Would you rather live 10 years with pain or 5 years without?  One way of trying to assess these utilities when the outcome is uncertain (as it always is in medicine) is called the Standard Gamble.  Imagine trying to assess how someone values a given health state, for example living with “dry mouth” which may limit how well you can swallow and eat certain foods.  In this test, the person is given a choice: (a) live the rest of your life with dry mouth, or (b) take a pill which has a probability, P, of curing you but also a probability, 1-P, of killing you instantly.  Let’s start off with P = 90%.  That is, if you take the pill, there is a 90% chance you will be cured of dry mouth for the rest of your life.  Do you take the pill with its 10% chance of dying or do you live the rest of your life with your condition?  The test consists of varying the probability, P, until the person cannot choose between them.  That value of P is then called the “utility for the condition of dry mouth.”
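The "vary P until the person cannot choose" step can be sketched as a simple bisection.  This assumes, hypothetically, that the respondent answers consistently, and it uses the usual convention that a cure has utility 1 and death has utility 0, so the gamble's expected utility is just P.  The function names and the simulated respondent are mine.

```python
def standard_gamble(prefers_pill, tol=1e-3):
    """Find the indifference probability P by bisection.

    prefers_pill(p) -> True if the respondent would take a pill that
    cures with probability p and kills with probability 1 - p.
    Returns the estimated utility of the health state.
    """
    lo, hi = 0.0, 1.0          # pill refused at lo, accepted at hi
    while hi - lo > tol:
        p = (lo + hi) / 2
        if prefers_pill(p):
            hi = p             # still accepts: try a riskier pill
        else:
            lo = p             # refuses: offer a safer pill
    return (lo + hi) / 2

# Hypothetical respondent who privately values life with dry mouth at 0.85.
hidden_utility = 0.85
estimate = standard_gamble(lambda p: p > hidden_utility)
print(f"estimated utility of dry mouth: {estimate:.3f}")
```

Real respondents are rarely this consistent, which is part of why the real-life version of the gamble is interesting to study.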

We now have the situation where patients (and physicians) can choose between a situation where the outcome is pretty well-known or trying a new therapy which has the promise of making things much better but can also kill you.  Now the situation is not exactly the same as the Standard Gamble since the new therapy brings with it some new complications even if they are not fatal.  But it does make one think of whether we can use data from these real life situations to study how utilities are measured.

Cromwell’s rule

From Think Bayes–Bayesian statistics made simple (Allen Downey,  Green Tea Press)

“Also, notice that in a Bayesian update, we multiply each prior probability
by a likelihood, so if p(H) is 0, p(H|D) is also 0, regardless of D. In the
Euro problem, if you are convinced that x is less than 50%, and you assign
probability 0 to all other hypotheses, no amount of data will convince you otherwise.

This observation is the basis of Cromwell’s rule, which is the recommendation
that you should avoid giving a prior probability of 0 to any hypothesis
that is even remotely possible (see http://en.wikipedia.org/wiki/
Cromwell’s rule is named after Oliver Cromwell, who wrote, “I beseech
you, in the bowels of Christ, think it possible that you may be mistaken.”
For Bayesians, this turns out to be good advice (even if it’s a little overwrought).”
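A minimal sketch of why a zero prior can never recover.  The three candidate values of x and the data (90 heads in 100 spins) are hypothetical numbers I chose for illustration; they are not taken from the book.

```python
def bayes_update(prior, likelihood):
    """Multiply each prior by its likelihood and renormalize."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

# Hypotheses: x is the probability of heads.
heads, tails = 90, 10                      # hypothetical data
likelihood = {x: x ** heads * (1 - x) ** tails for x in (0.4, 0.5, 0.6)}

dogmatic = {0.4: 0.5, 0.5: 0.5, 0.6: 0.0}  # convinced that x cannot exceed 50%
open_minded = {0.4: 1 / 3, 0.5: 1 / 3, 0.6: 1 / 3}

print("dogmatic:   ", bayes_update(dogmatic, likelihood))
print("open-minded:", bayes_update(open_minded, likelihood))
```

The dogmatic posterior for x = 0.6 stays at exactly 0 however lopsided the data become, while the open-minded posterior moves almost entirely onto it.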

Bacterial decision making (really!)

There is a fascinating article in the latest edition of Physics Today (February, 2014):

Bacterial Decision Theory by Jané Kondev

I was initially attracted to the article by its provocative title.  Like a bee to honey, I could not resist looking at it.  There are a number of reasons I enjoyed it so much; here’s a synopsis followed by a slightly more digressive discussion:

  • It is interesting for its own sake as an account of how genes are expressed.
  • It provides a good example of how a Bayesian model (my interpretation–not the author’s) can be expanded from simple observed probabilities to include predictions from sophisticated and mathematical models.
  • It highlights the importance of modeling when doing statistical analyses.
  • It provides some background for thinking about the processes that might result in differential response of cancer cells (or normal cells, for that matter) to radiation and/or chemo.

In a quick synopsis, the article describes in great detail how E. coli cells can switch between using glucose and lactose for energy.  The system operates as a function of several variables: presence/absence of lactose and glucose and the presence/absence of two molecules, namely Lac-repressor and CRP.  The CRP increases the likelihood that RNA polymerase will bind to the lac promoter portion of the DNA; the Lac-repressor sits on the promoter portion, thereby inhibiting the RNA polymerase binding.  Lac-repressor tends to be present when lactose is absent; CRP is present when glucose is absent.  Binding of RNA polymerase for making the protein that digests lactose follows the basic rules:

  • lactose +, glucose +, no RNA polymerase
  • lactose -, glucose +, no RNA polymerase
  • lactose -, glucose -, no RNA polymerase
  • lactose +, glucose -, RNA polymerase can bind

That makes a nice 2×2 matrix that is easily encoded into a Bayesian network (BN).  However, these are chemical reactions, not logical theses, so statistical mechanics is actually a better operational model.  A little calculation might give you probabilities that are a little different from 0/1 depending on concentrations.  A little more work gets you a full-blown model based on the free energies and entropies of the two states (bound/not bound).  In the BN world, you can now have a much more sophisticated, quantitative, continuous model without really modifying the BN in any substantial way.
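To show how the 0/1 table softens into probabilities, here is a minimal sketch of a thermodynamic occupancy model.  It is my own simplification, not the article's model: the promoter is empty, occupied by polymerase, or blocked by repressor, and each state gets a statistical weight.  All three weights are hypothetical numbers chosen only to reproduce the pattern of the rules above.

```python
def p_polymerase_bound(lactose, glucose,
                       pol_weight=0.1, rep_weight=50.0, crp_boost=100.0):
    """Probability that RNA polymerase occupies the lac promoter.

    pol_weight : statistical weight of polymerase binding without CRP help
    rep_weight : weight of Lac-repressor binding when lactose is absent
    crp_boost  : factor by which CRP (present when glucose is absent)
                 increases the polymerase weight
    All values are illustrative, not measured.
    """
    w_pol = pol_weight * (crp_boost if not glucose else 1.0)
    w_rep = rep_weight if not lactose else 0.0
    # The partition function sums the weights of the three promoter states.
    return w_pol / (1.0 + w_pol + w_rep)

for lactose in (True, False):
    for glucose in (True, False):
        p = p_polymerase_bound(lactose, glucose)
        print(f"lactose {'+' if lactose else '-'}, "
              f"glucose {'+' if glucose else '-'}: P(bound) = {p:.3f}")
```

Each of the four numbers can be dropped straight into the conditional probability table of the BN node, so the network structure stays the same while the physics behind the entries gets richer.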

The author also does a nice job of describing how having such a model can really help form the experimental setup required and the statistical analyses that should be performed to test this model.  This mirrors the discussion on the use of modeling strategies in performing appropriate statistical tests in the first chapter of “Regression Modeling Strategies” by F E Harrell (Springer, 2001).

Finally, the article also describes how other aspects of the cellular mechanism, such as transporter activity across the cell membrane, can lead to positive feedback and a switching between states.  As another example of bacterial free will, the article describes how E. coli can switch phenotypes between antibiotic-sensitive and antibiotic-resistant and back again.  These interactions between cellular activity and the environment bring to mind some of the issues with differences between tumor cell responses to radiation.  The simple LQ model, while pretty good, might be greatly improved by including something like the antibiotic resistance mechanism described.  Genetic instability might explain part of the heterogeneity of response, but so may environmental factors and cellular feedback processes.