Uncertainty and decision making

Contemplating a project of some colleagues on decision making under uncertainty, specifically uncertainty about where a tumor will be relative to the radiation field during breathing, led me to wonder about the whole range of uncertainties that should be considered. Traditionally in radiation oncology we are concerned with whether the tumor will always be in the radiation field when the patient is set up daily over several weeks. When the tumor position is affected by respiration or bowel gas, this is even harder to know. Recently, on-board imaging has helped us to understand (and sometimes manage) the motion. Increasing the size of the radiation field (a.k.a. using a PTV) is one approach to dealing with this uncertainty.

But what about other uncertainties? Take any cubic millimeter. What is the number of tumor cells? Classic radiation biology uses Poisson statistics to calculate the uncertainty in radiation’s ability to sterilize the tumor. So uncertainty exists and can be accounted for. What about the more recent realization that tumors do not consist of a single clonal population but, rather, of many genetically different cells? Hopefully, genetic characterization will give us some insight into how the differences affect the cells’ radiation sensitivity. Epigenetic factors, too, play a role in establishing a phenotypic radiation response. However, even in this optimistic case where we have some mechanistic understanding, we can only alter the probabilities. So here we have an understanding that uncertainty exists, and in some cases we may be able to characterize it, but at this point even accurate estimates of the probabilities are hard to come by. Near the margins of the tumor, we talk about the clinical target volume, which includes “microscopic disease”, by which we mean possible tumor cells whose existence we have no solid knowledge of. What we know comes from surgical/biopsy specimens or from clinical outcomes of treating such a region in other patients. Here our uncertainty is complete with regard to the particular patient and our only knowledge comes from population averages.
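The Poisson argument above can be made concrete with a minimal sketch. It assumes that each clonogenic cell survives independently, so that the number of survivors is Poisson distributed and the probability of sterilizing the tumor is the probability of zero survivors. The function name, the clonogen number, and the surviving fractions are illustrative assumptions, not measured values.

```python
import math

def tumor_control_probability(n_clonogens: float, surviving_fraction: float) -> float:
    """Poisson probability that zero clonogens survive treatment."""
    expected_survivors = n_clonogens * surviving_fraction
    return math.exp(-expected_survivors)

# Hypothetical numbers, chosen only to show how steep the relationship is.
n_clonogens = 1e9
for surviving_fraction in (1e-8, 1e-9, 1e-10):
    tcp = tumor_control_probability(n_clonogens, surviving_fraction)
    print(f"surviving fraction {surviving_fraction:.0e}: control probability {tcp:.3f}")
```

Under these assumptions, a tenfold change in the surviving fraction moves the control probability from roughly 0.37 to roughly 0.90, which is why even a modest uncertainty in cell number or radiosensitivity translates into a large uncertainty in outcome.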

Much of radiation oncology (and medicine in general) is devoted to reducing the uncertainty by techniques such as recursive partitioning analysis and classification algorithms, e.g. support vector machines and logistic regression. Concepts such as stage, grade, and TNM classification are all ways of predicting outcomes as a function of therapies, thereby reducing our uncertainty. Such musing leads us to consider the confluence of medical decision making and uncertainty. On one side, we can say that the minimum uncertainty is when we know for sure that the treatment will effect a cure or will surely fail. Then we have a probability of 1.0 or 0.0 and, hence, no uncertainty. The most uncertainty we have is when there is a 50% chance of cure. Surely it is better in the decision making realm to have no uncertainty. However, in the real world (that is, the world of the patient and doctor), a 50% chance of cure is better than 0%. So we can conclude that uncertainty in these types of decisions is not necessarily a bad thing. Therefore, we are left to continue our quest for better strategies for making decisions under uncertainty. The question of the day is: do we want to keep refining our understanding of the biology to the point that we know exactly what will happen to a person, knowing that in some fraction of the cases we will be depriving the patient of hope?
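One way to put numbers on "no uncertainty" versus "the most uncertainty" is the Shannon entropy of a two-outcome (cure or no cure) event. The choice of entropy as the measure is my assumption here; nothing above depends on it, and the function name is illustrative.

```python
import math

def binary_entropy(p_cure: float) -> float:
    """Shannon entropy, in bits, of a cure/no-cure outcome with P(cure) = p_cure."""
    if p_cure in (0.0, 1.0):
        return 0.0  # outcome is certain, so there is no uncertainty
    p_fail = 1.0 - p_cure
    return -(p_cure * math.log2(p_cure) + p_fail * math.log2(p_fail))

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"P(cure) = {p:.1f}: uncertainty = {binary_entropy(p):.3f} bits")
```

The measure is symmetric: a 10% and a 90% chance of cure are equally uncertain, yet obviously not equally desirable. That is the distinction between uncertainty and outcome drawn above.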


Virtual trials

Physicists are fond of conducting “virtual trials”, in which they select a number of random or representative cases, compute treatment plans for them using two different methods, and then compare the results. Usually these are done to show the differences (or lack thereof) between two different methods of radiation delivery or, sometimes, of optimization.

In general, this is a reasonable and cost-effective means of coming to some conclusion about the appropriate uses of new technology. However, as they are most often conducted, these trials do little to answer any relevant questions. They meet few, if any, of the criteria for a clinical trial. Instead, it seems as though physicists have defined their own standards for a virtual trial. What are these standards? How do they compare with the norms in clinical medicine?

Clinical trials are grouped into four phases, ranging from determination of the intervention’s safety, to its efficacy in a controlled group, to its efficacy in the population at large. Do our physics-oriented virtual trials fall into any of these categories? At one end of the spectrum, physicists are concerned with safety, namely a Phase I trial. They wish to avoid initiating a technology or procedure that will lead to patient harm. At the other end, one could argue (though physicists never do) that they are also conducting Phase IV-like trials, since the cases are selected with little regard for the biological and physiological variables that can mediate the response to the intervention. Most often, cases are selected because they are dosimetrically “interesting” or, on the other hand, “tractable”. The latter characteristic underpins the continued popularity of virtual trials of prostate cancer with its two significant organs-at-risk. Once those cases are dealt with and the technology has been shown to handle the simple situations, then interesting cases are selected based on their dosimetric complexity.

Does this way of viewing the issue lead to any worthwhile considerations? In the sense that clinical trials are now the gold standard for progress in medical practice, the answer is yes. If we, as physicists, wish to lead the field forward by definitively answering questions, then we need to meet the same standards as others in the field. So what are the characteristics of clinical trials that translate directly to a medical physics approach?

First, the endpoint of the trial must be described and justified at the beginning.  Too often, physicists merely pile up metrics at the end of the project, calculate statistical significance of differences, and then make some pronouncement based thereon.  Clinical trials do not have the luxury of waiting until the end of the trial to define their endpoints for several reasons, chief among them the ethics of human research.  Physicists are free of that limitation, but then suffer the possibility of being accused of cherry-picking the results.  More importantly, however, the failure to declare and justify the endpoints at the beginning vitiates the impact of the results since others are less likely to be convinced by this method of conducting the trial.  Providing a convincing rationale at the beginning of the work puts the results on a firm footing and helps structure the entire virtual trial.

Elucidation of a clear set of metrics by which to judge the trial’s efficacy must be done in conjunction with the relevant clinicians. It is sometimes the case in current comparisons that dose metrics are tested for statistical significance, with the somewhat absurd result that dose differences of less than 1 Gy are reported as significant. Statistically, maybe (although it can certainly be argued that any set of cases in which dosimetric parameters are so close in value, yet significantly different, exhibits a homogeneity that hardly reflects clinical practice); clinically, no. [e.g. DC Weber, et al, Int J Radiat Oncol Biol Phys, 75(5): 1578-86, 2009] It is important to determine up front what is going to be conclusive evidence of improvement for the application being studied. It is at this point that determination of the phase of the trial is important. Evaluating safety is likely to result in a different set of trial metrics than would be used in a Phase III trial.
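The gap between statistical and clinical significance is easy to reproduce. The sketch below uses invented paired dose differences for ten cases and a hypothetical threshold of 2 Gy that, in a real trial, would be agreed with clinicians before any plans were computed. None of the numbers come from the cited paper or from any dataset.

```python
import math
import statistics

# Hypothetical paired differences (Gy) in one organ-at-risk dose metric,
# technique A minus technique B, one value per case. Invented for illustration.
dose_differences_gy = [0.5, 0.4, 0.6, 0.3, 0.5, 0.4, 0.6, 0.5, 0.4, 0.3]

# Hypothetical pre-specified threshold below which a difference is not
# considered clinically meaningful.
CLINICAL_THRESHOLD_GY = 2.0

# Two-sided 5% critical value of Student's t for 9 degrees of freedom.
# It matches the ten cases above and must change if the sample size changes.
T_CRITICAL_DF9 = 2.262

n_cases = len(dose_differences_gy)
mean_diff = statistics.mean(dose_differences_gy)
std_error = statistics.stdev(dose_differences_gy) / math.sqrt(n_cases)
t_statistic = mean_diff / std_error  # paired t statistic

statistically_significant = abs(t_statistic) > T_CRITICAL_DF9
clinically_significant = abs(mean_diff) >= CLINICAL_THRESHOLD_GY

print(f"mean difference: {mean_diff:.2f} Gy, t = {t_statistic:.1f}")
print(f"statistically significant: {statistically_significant}")
print(f"clinically significant:    {clinically_significant}")
```

With these made-up numbers the paired test is overwhelmingly "significant" while the mean difference is under half a gray. The small spread across cases is what drives the t statistic, which is the homogeneity problem noted above.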

Rigor in methods is also an important component of a virtual trial. Given the complexities of modern treatment plans, optimization algorithms are often used in virtual trials. However, the algorithms in the current generation of treatment planning software are very operator-dependent. In other cases, such as comparisons of protons and x-rays, different planning systems and dose calculation algorithms must be used. Great care must be taken in designing methods that provide a fair comparison for plans. In the case of optimization algorithms, user options must be constrained. When different planning systems are used, some effort at judging their relative differences (outside the parameters of the trial) must be made.

There is an additional burden that the use of optimization places on virtual trials that is not usually a part of clinical trials. That is, clinical trials do not usually look at the correspondence between normal tissue outcomes (complications) and tumor response. In some cases, there may be reason to believe that tumor response is related to or coupled with normal tissue response, which would justify reporting the correlation, but this is rarely done. In inverse planning, the algorithm searches for some ideal solution and, when it cannot find one that meets all the objectives, finds a plan that incorporates a trade-off between the competing (tumor vs normal tissue) objectives. For this reason, it is imperative that virtual trials incorporating inverse planning report results for each individual, not just aggregate measures such as averages of single metrics.
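A small sketch shows how averages of single metrics can hide the trade-offs made case by case. The four cases and their numbers are invented for illustration; the field layout (change in target coverage, change in organ-at-risk mean dose) is an assumption about what such a report might contain.

```python
import statistics

# Hypothetical per-case changes, method B minus method A. Invented numbers.
# Each tuple: (case id, change in target coverage in %, change in OAR mean dose in Gy)
cases = [
    ("case-1", +4.0, +5.0),  # better coverage bought with more OAR dose
    ("case-2", -4.0, -5.0),  # OAR spared at the cost of coverage
    ("case-3", +3.0, +4.0),
    ("case-4", -3.0, -4.0),
]

# Aggregate view: each metric averaged on its own across all cases.
mean_coverage_change = statistics.mean(c[1] for c in cases)
mean_oar_change = statistics.mean(c[2] for c in cases)
print(f"aggregate: coverage {mean_coverage_change:+.1f} %, OAR dose {mean_oar_change:+.1f} Gy")

# Per-case view: the coupled trade-off in each individual plan stays visible.
for case_id, d_coverage, d_oar in cases:
    print(f"{case_id}: coverage {d_coverage:+.1f} %, OAR dose {d_oar:+.1f} Gy")
```

In this constructed example both averages are zero, suggesting the two methods are equivalent, while every individual plan differs by several percent in coverage and several gray in organ-at-risk dose.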

Finally, to make the connection between clinical trials and virtual trials, it is interesting to consider Phase III and IV trials. One definition is: “Phase IV studies are conducted after the intervention has been marketed. These studies are designed to monitor effectiveness of the approved intervention in the general population and to collect information about any adverse effects associated with widespread use.” [Gates Foundation] If we replace the word “marketed” with the words “clinically implemented”, then we have a good description of the introduction of new technologies and methods into clinical use, including the performance of a virtual trial at the beginning of the process. To those who argue that such trials are unlikely to achieve statistical significance because of a lack of sufficient numbers of patients, one may ask whether there is justification in spending the money to purchase the new technology.

For those institutions that conduct virtual trials and, based (at least partly) on the results, take the next step to clinical use, it would be very worthwhile to collect data and report back on the correspondence between the trial and the clinical outcomes. In many cases, e.g. IMRT versus VMAT, the differences are so small that it is certainly not unethical to randomize patients between the two and measure the differences (if any) in outcomes, thereby conducting a Phase III trial. For those who are convinced that using the old technology is no longer justified, reporting on the outcomes and comparing them with the historical results and the conclusions of the virtual trial would certainly be of great value.

In conclusion, it behooves the medical physics community to meet the standards that we and society in general (particularly given the Patient Protection and Affordable Care Act) expect of medical research. These changes will enhance the usefulness of medical physics research, reassure the public that careful measures are being taken to ensure the safe and efficacious introduction of new therapies, and hopefully also lead to the more rational use of our health care dollars.