### January 26th, 2012

## Sizing Up Clinical Trials — Quickly and Intuitively

### John E Brush, MD

A pharmaceutical sales rep comes to your office bringing lunch. He shows you a graphic stating that Multaq (dronedarone) reduced the primary endpoint in the ATHENA trial by 24%. The fine print shows an impressive *P* value: <0.0001. You come away satisfied that this drug looks good. You may not realize it, but you also feel a sense of obligation to the rep for the lunch.

Nevertheless, you go to the *New England Journal of Medicine* paper to look into this for yourself. There you find that the primary endpoint (first hospitalization due to a cardiovascular event, or death) occurred in 31.9% of the treatment group and 39.4% of the placebo group. That is a 7.5-percentage-point absolute difference, or a 19% relative reduction (7.5 divided by 39.4); apparently, the 24% figure was initially reported at a national meeting but is not reflected in the published data. The difference between the treatment groups was mainly due to the reduction in the rate of hospitalizations. A 7.5% absolute risk reduction for the combined endpoint is starting to look less impressive.

The inverse of this absolute risk reduction, the number needed to treat (NNT), is 13. The drug was discontinued because of adverse events in 12.7% of the treatment group and 8.1% of the placebo group, yielding a number needed to harm (at least enough to discontinue the drug) of 22. The average follow-up was 21 months, and the drug costs $276 per month.
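The arithmetic behind these figures can be sketched in a few lines of Python. The function names here are illustrative, not from the trial report, and the sketch rounds to the nearest whole patient to match the figures quoted in this post; some authors prefer to round the NNT up instead.

```python
def absolute_risk_reduction(control_pct, treatment_pct):
    """Difference in event rates, in percentage points."""
    return control_pct - treatment_pct


def relative_risk_reduction(control_pct, treatment_pct):
    """Absolute difference as a fraction of the control-group rate."""
    return (control_pct - treatment_pct) / control_pct


def number_needed(difference_pct):
    """Inverse of an absolute difference given in percentage points."""
    if difference_pct <= 0:
        raise ValueError("difference must be positive")
    return 100.0 / difference_pct


# Primary endpoint: 39.4% with placebo, 31.9% with the drug.
arr = absolute_risk_reduction(39.4, 31.9)
rrr = relative_risk_reduction(39.4, 31.9)
nnt = number_needed(arr)

# Discontinuation for adverse events: 12.7% with the drug, 8.1% with placebo.
nnh = number_needed(12.7 - 8.1)

print(f"ARR: {arr:.1f} percentage points")
print(f"RRR: {rrr:.0%}")
print(f"NNT: {round(nnt)}")
print(f"NNH: {round(nnh)}")
```

Rounded to whole numbers, this reproduces the 19% relative reduction, the NNT of 13, and the NNH of 22.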

You think to yourself, “I would need to treat 13 patients over an average of 21 months, at a total cost of about $75,000 (13 patients × 21 months × $276), to prevent one hospitalization due to a cardiovascular event or one death. If I treat an additional 9 patients, I would do enough harm that one patient discontinues the drug.” Hmmm, not so good after all.
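As a rough check on that reasoning, the cost per event prevented is just the NNT times the months of treatment times the monthly price. The sketch below uses the rounded figures from this post; the variable names are illustrative, and the total shifts somewhat depending on how the NNT is rounded.

```python
NNT = 13                   # patients treated to prevent one primary-endpoint event
NNH = 22                   # patients treated to cause one discontinuation
FOLLOW_UP_MONTHS = 21      # average follow-up in the trial
COST_PER_MONTH_USD = 276   # drug price quoted in this post

# Cost of treating one patient for the average follow-up period.
cost_per_patient = FOLLOW_UP_MONTHS * COST_PER_MONTH_USD

# Cost of treating enough patients to prevent one event.
cost_per_event_prevented = NNT * cost_per_patient

# Extra patients beyond the NNT before one discontinuation is expected.
extra_patients_to_one_harm = NNH - NNT

print(f"Cost per patient: ${cost_per_patient:,}")
print(f"Cost per event prevented: ${cost_per_event_prevented:,}")
print(f"Additional patients until one discontinuation: {extra_patients_to_one_harm}")
```

With these inputs the cost per event prevented comes to $75,348, and the gap between NNH and NNT is 9 patients.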

Psychologists, such as Gerd Gigerenzer and colleagues, tell us that we can be fooled by the relative risk reduction, which exaggerates the observed difference between treatment groups. Our example bears that out. The simple statement using the NNT is much more intuitive.

In his recent book *Thinking, Fast and Slow*, Daniel Kahneman summarizes years of research on how we use intuition to make decisions. According to Kahneman and others, we have two modes of thinking: a fast, intuitive mode, called System 1 thinking, and a slower, analytical mode called System 2 thinking.

System 1 — intuitive and almost automatic — looks for quick, easy answers and is a sucker for a story. System 2 — deliberate and analytical — is consciously effortful and plodding. System 1 sees the forest; System 2 sees the trees. System 1, intuitive as it is, misses details. But slow System 2 can’t handle uncertainty. We need System 1 to quickly size up situations and to integrate complex but incomplete data. Despite its proneness to specific errors, System 1’s gut reaction is often right on.

System 1 can be set up to fail or succeed. Kahneman gives this example: If you tell a person that a bat and a ball cost $1.10 in total and the bat costs $1.00 more than the ball, and then ask how much the ball costs, System 1 wants to quickly answer, “10 cents.” But with a little extra time and thought, System 2 comes up with the correct answer of 5 cents. The original question is a setup for a System 1 failure. But if you frame the question differently — by stating that the bat and ball cost $1.10 and the bat costs $1.05 — System 1 won’t get fooled.

By using NNTs, we can set up System 1 to succeed, as we try to integrate the results of complex clinical trials into our everyday practice. Kaul and Diamond suggest that a good NNT is less than 50, and the number needed to harm (NNH) should be much greater than the NNT. Of course, you should also consider the cost and clinical impact of the treatment in question.

The NNT for post-MI aspirin is only 14. For heparin to treat unstable angina, it’s 40. The NNT for clopidogrel is 48, for prasugrel 46, for warfarin 48, for glycoprotein IIb/IIIa antagonists 77, and for TAVR 5 (with an NNH of 7 for vascular complications and 26 for stroke). The NNT and NNH summarize the evidence in a format that our System 1 thinking can easily handle.

So the next time a sales rep shows you a new study, turn down the lunch and ask to see the absolute risk reduction. Divide that number into 100 to calculate the NNT. Then turn that number into a simple declarative sentence that allows your intuition to better appreciate the true value of the new drug or intervention.
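For readers who like to see the recipe spelled out, here is a minimal sketch of turning an absolute risk reduction into that declarative sentence. The function `nnt_sentence` and its parameters are hypothetical, invented for illustration, and it rounds the NNT to the nearest whole patient.

```python
def nnt_sentence(arr_percent, months, outcome, monthly_cost_usd=None):
    """Build a plain-language NNT statement from an absolute risk reduction.

    arr_percent is the absolute risk reduction in percentage points.
    """
    if arr_percent <= 0:
        raise ValueError("absolute risk reduction must be positive")

    # Divide the absolute risk reduction into 100 to get the NNT.
    nnt = round(100 / arr_percent)

    sentence = (
        f"I would need to treat {nnt} patients over {months} months "
        f"to prevent one {outcome}"
    )
    if monthly_cost_usd is not None:
        total = nnt * months * monthly_cost_usd
        sentence += f", at a total cost of about ${total:,.0f}"
    return sentence + "."


example = nnt_sentence(7.5, 21, "cardiovascular hospitalization or death", 276)
print(example)
```

Leaving out `monthly_cost_usd` drops the cost clause, which is handy when the price is unknown.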

**How often do you focus on the NNT when you assess clinical trials?**

John,

I think this is a great start, but we have to recognize the ways that even NNT and NNH can mislead us. I don’t think that rough guides like “NNT of less than 50” are all that helpful.

First of all, we have to ask what exactly the benefit is. If the NNT is 50 to save a life, that’s pretty impressive. If it’s to prevent an episode of restenosis or a groin hematoma, probably less so. So the NNT needs to have units of benefit in order for it to be meaningful. And of course, the composite endpoints that are so prevalent in our clinical trials can make things even more challenging.

Second, as you note in passing in your example above, you need to think about the timeframe. An NNT of 50 to prevent an MI over a 6-month treatment period is a lot more impressive than the same NNT over a 5-year timeframe. Although most people intuitively understand this, the simple concept of NNT starts to become a bit less simple when one considers these other factors.

Nonetheless, I do use NNT all the time in sizing up trials. It’s just that you have to keep your wits about you and make sure you understand the hidden conditions when you use this rubric.

David,

Thanks for the comments. I agree that the NNT all by itself is not enough. So, as I said in the post, you need to put the NNT into a sentence. Example: “I would treat x patients (NNT) over y months or years (timeframe) to prevent one event or combined event (primary outcome).” That simple declarative sentence actually summarizes a lot of important information about the trial. You can add the cost of treatment to your sentence, and you could even add modifiers that signal what you think about the plausibility of the treatment in the first place.

I agree that the number 50 is just a rough guide and a starting point. Even more helpful is knowing the NNTs from trials of other drugs. After you use this for a while, the numbers start to make sense. With practice, the NNT will give busy practitioners an intuitive feel for the strength of a trial and for how they should incorporate the new information into their treatment strategies.

Competing interests pertaining specifically to this post, comment, or both: None.

I agree completely, John. The only problem is that in my experience, almost no one ever says the whole sentence you suggest. Everything after the number itself is completely implicit.

Great discussion. Thank you, Dr Brush and Dr Cohen, for your insights. I will be looking forward to more such posts.

Competing interests pertaining specifically to this post, comment, or both: None.