Friday, December 23, 2011

H. pylori, Peptic Ulcer Disease, Bayes' Theorem, and treatment thresholds

I was slightly surprised today to read the authors' discussion in the Clinical Crossroads in JAMA (December 7th issue; original case November 1st issue - when the reader responses are published, there may be even more to say about this topic).  I was more thoroughly surprised when I cross-referenced UpToDate and the references that both used to defend the suspect practice.  Apparently Bayes' Theorem does not apply to H. pylori (HP), and neither does the concept of treatment thresholds.  Time to apply some uncommon sense.

At issue is whether a patient with peptic ulcer disease (PUD) should receive empiric treatment for HP, or whether "confirmatory" testing is indicated.  Early estimates suggested that 95% of patients with PUD will have HP (I recall that figure from the 13th edition of Harrison's Principles of Internal Medicine, now in its 18th edition), but three epidemiological studies in the 1990s challenged this estimate as too high, with new estimates of 73% (study n=2394), 61% (study n=144), and 36% (study n=95).  The binomial confidence intervals (CIs) for those numbers are 71-75%, 53-69%, and 26-45%.  The first study, with an estimate of 73% and n=2394, is actually a summary statistic derived from a combination of six other studies.  So, which estimate do we choose?  We could do a meta-analysis and combine all of these studies, but we already know what the result will be because the first study will trounce the others when weights are assigned.  So I'm going to just conclude that while the prevalence of HP in PUD might not be 95%, it is likely close to 73%.
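For readers who want to check the intervals and the "trouncing", here is a minimal Python sketch.  It assumes the simple normal-approximation (Wald) interval and a crude sample-size weighting rather than the inverse-variance weighting of a formal meta-analysis; the study labels and function names are mine, not from the papers.  The Wald interval for the smallest study comes out within about a point of the figure quoted above.

```python
import math

# (hypothetical label, reported HP prevalence in PUD, sample size) as quoted in the post
studies = [
    ("study A (six-study summary)", 0.73, 2394),
    ("study B", 0.61, 144),
    ("study C", 0.36, 95),
]

def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) 95% CI for a binomial proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def pooled_prevalence(rows):
    """Sample-size-weighted average of the study prevalences (an assumption,
    not a formal meta-analytic model)."""
    total_n = sum(n for _, _, n in rows)
    return sum(p * n for _, p, n in rows) / total_n

total_n = sum(n for _, _, n in studies)
for label, p, n in studies:
    low, high = wald_ci(p, n)
    print(f"{label}: {p:.0%}, 95% CI {low:.1%} to {high:.1%}, weight {n / total_n:.0%}")
print(f"pooled estimate: {pooled_prevalence(studies):.1%}")
```

With roughly nine-tenths of the weight on the first study, the pooled figure barely moves away from 73%.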

This is important because you have to decide what threshold of probability you will use to trigger treatment of HP.  If that threshold is 75%, well then you might as well just treat everybody with PUD because the prior probability is so close to your treatment threshold.  About one in four of the patients you treat will have been treated "unnecessarily", on average.  If the prior probability falls too far below your threshold, then you need to raise the probability with testing.  But testing is not without its own set of pitfalls, and the gains in reducing false positives may be much less than you think.

The risk with testing, using one of various available assays (choosing among them presents its own set of challenges), is that you will falsely exclude HP given the high pre-test probability.  Using data from Table 1 of the JAMA article, I calculated the negative likelihood ratios (LRs) for the available tests and they range from 0.07 with urea breath testing to 0.34 with rapid urease testing.  Adjusting the prior probability of 73% by those likelihood ratios for negative tests reduces the probability of HP from 73% to 17-50%.  Faced with such a scenario, you now have to decide:  in this patient with PUD, is a probability of 17-50% low enough that you will not treat?  If it's not, you can treat in spite of the negative test, in which case you should not have gotten the test in the first place.  Alternatively, you test again.  Two negative tests with LRs in that range will reduce the posterior probability to about 10%, and three to around 5% or less.  Is it worth it to do two or three tests or would you be better off with a policy of just treating without testing?  I don't know.  But, you can use this same line of reasoning to see what your life will be like if you employ a policy of testing and compare that life to one with a policy of not testing.
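The adjustment is just Bayes' Theorem in odds form: convert the prior to odds, multiply by the likelihood ratio, and convert back.  A minimal sketch follows, assuming repeated tests are conditionally independent so that their LRs multiply (an idealization); the function name is mine.  Exact arithmetic on the rounded LRs lands within a few points of the rounded figures quoted above.

```python
def posterior_after_negatives(prior, lr_negative, n_tests=1):
    """Probability of HP after n_tests negative results.

    Assumes the tests are conditionally independent, so one negative
    likelihood ratio is applied per negative result.
    """
    odds = prior / (1 - prior)        # probability -> odds
    odds *= lr_negative ** n_tests    # one LR per negative test
    return odds / (1 + odds)          # odds -> probability

prior = 0.73
for lr in (0.07, 0.17, 0.34):         # best, serology, and worst (-)LR from the post
    probs = [posterior_after_negatives(prior, lr, k) for k in (1, 2, 3)]
    print(f"(-)LR {lr}: " + ", ".join(f"{p:.1%}" for p in probs))
```

Each printed row shows the probability of HP after one, two, and three negative tests for that assay.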

Suppose we're treating a population of patients with PUD where the true prevalence of HP is 73% and that we select serology (perhaps the most convenient of the available tests; its (-)LR is 0.17, right about in the middle of the range) as our testing modality with a sensitivity of 88% and a specificity of 70%.  If we test 100 patients with PUD (number chosen for simplicity), we will get 72 positive results (64 true positives and 8 false positives) and 28 negative ones (19 true negatives and 9 false negatives).
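Those counts come from an ordinary 2x2 calculation, sketched here in Python.  The function name and dictionary keys are my own labels; the inputs are the prevalence, sensitivity, and specificity stated above.

```python
def two_by_two(n, prevalence, sensitivity, specificity):
    """Expected 2x2 cell counts for n patients tested once."""
    with_hp = n * prevalence
    without_hp = n - with_hp
    tp = with_hp * sensitivity       # HP present, test positive
    fn = with_hp - tp                # HP present, test negative
    tn = without_hp * specificity    # HP absent, test negative
    fp = without_hp - tn             # HP absent, test positive
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}

cells = two_by_two(100, 0.73, 0.88, 0.70)
for name, value in cells.items():
    print(f"{name}: {value:.2f} (about {round(value)})")
print(f"test positive: {round(cells['TP'] + cells['FP'])}, "
      f"test negative: {round(cells['FN'] + cells['TN'])}")

# The negative likelihood ratio follows from the same two test properties.
print(f"(-)LR: {(1 - 0.88) / 0.70:.2f}")
```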

So, we've already done 100 tests, and we now need to deal with those 28 negatives (because, while we can't know which ones, some of them are false negatives).  Note that there are 9 false negatives and 8 false positives - if we stop testing here, we will miss treating 9 actual cases of HP and we will treat 8 cases which don't have HP.  Treating 8 that don't need it is better than treating 27 that don't need it with the no-testing strategy, but it comes at the cost of doing 100 tests.  To [try to] get out of this mess, we run another test on the 28 negatives.

In this round of testing, we get 14 positive tests out of 28, but 6 of those are false positives.  The tally of false positives is now 8+6=14, compared to 27 with a no-testing policy.  We have cut our false positives roughly in half (from 27 to 14), but we had to order 128 tests to do it.  If we did a third round of testing, we would capture the last false negative, but we would also capture 4 more false positives, and the false positive tally is now 8+6+4=18.  With a third round of testing, we will have done 142 serologic tests to reduce our false positives from 27 to 18.  Thanks to those 142 tests, there are now a whopping nine (9) patients who escaped unnecessary HP treatment.  Was it worth it?  Here is a summary of our options:

No testing, treat everyone: 0 tests, 27 treated unnecessarily, 0 cases of HP missed.
One round of testing: 100 tests, 8 treated unnecessarily, 9 cases of HP missed.
Two rounds of testing: 128 tests, 14 treated unnecessarily, 1 case of HP missed.
Three rounds of testing: 142 tests, 18 treated unnecessarily, 0 cases of HP missed.

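The round-by-round tallies can be re-derived with a short Python sketch.  It assumes every negative is retested with the same serology, that repeat tests are independent of earlier results, and that counts are rounded to whole patients after each round, as in the walk-through; the function name and tuple layout are mine.

```python
def retest_negatives(n=100, prevalence=0.73, sensitivity=0.88,
                     specificity=0.70, rounds=3):
    """Cumulative tallies when everyone who tests negative is tested again.

    Returns one tuple per round:
    (round, tests ordered so far, false positives treated so far, HP cases still missed).
    """
    with_hp = round(n * prevalence)   # untreated patients who have HP
    without_hp = n - with_hp          # untreated patients who do not
    tests = false_positives = 0
    rows = []
    for r in range(1, rounds + 1):
        tests += with_hp + without_hp                # test everyone still untreated
        tp = round(with_hp * sensitivity)            # correctly treated this round
        fp = round(without_hp * (1 - specificity))   # treated unnecessarily this round
        with_hp -= tp
        without_hp -= fp
        false_positives += fp
        rows.append((r, tests, false_positives, with_hp))
    return rows

print("no testing: 0 tests, 27 treated unnecessarily, 0 missed")
for r, tests, fps, missed in retest_negatives():
    print(f"{r} round(s): {tests} tests, {fps} treated unnecessarily, {missed} missed")
```

Changing the sensitivity, specificity, or prevalence arguments shows how quickly the trade-off shifts with a better assay or a lower-prevalence population.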
It all depends on the value you ascribe to things.  But it's clearly overly simplistic to think that additional testing just magically sorts things out and saves you from treating false positives.  With each additional round of testing, you incur more false positives, eroding the perceived benefit of additional testing.  And you do a whole lot more testing.

If it were me, I would prefer to cut to the chase and just be treated.  Especially if I were a patient paying for testing and treatment.
