Dean Ornish Likes Small Studies

Dr. Dean Ornish is a physician famed for his very low-fat diet and lifestyle changes as a way to halt and even reverse heart disease. His first study was sufficient to convince many insurance companies to pick up the tab for his therapies. Although his studies have often been small, there is good evidence that he has helped many people, including Bill Clinton, to deal better with their heart problems.

So, should we all go on the Ornish diet? Probably not, because there is little evidence that it will help ordinary people, and even less evidence that most people can tolerate it. The amount of fat allowed – about 10% of calories – is so small that you have to eliminate all kinds of food, starting with meat, but including egg yolks, butter and cream. Pastries are banished to the ninth circle of hell, which must be killing Mr. Clinton.

Elimination diets like this are notoriously hard to maintain for any length of time. The instant you say I can’t have a doughnut, I can think of nothing else. Nevertheless, if you have a bad ticker or clogged arteries, you might give it a whirl. Imminent death can really concentrate the mind.

The problem some people have with Ornish is that his studies lack statistical rigor. His first study, published in Lancet in 1990, claimed that lifestyle changes could reverse heart disease. He compared 22 coronary patients on his lifestyle program to 19 patient controls. He showed that, after a year, coronary arteries opened up in the lifestyle group (presumably reversing heart disease), but closed up in the controls.

It was a good result, indicating that heart disease could be reversed without drugs. But the study was small. Why does that matter? Well, the main job of statistics is to separate real effects from the noise of the messy real world, and that takes big numbers to smooth things out. To take an extreme example, if you pick just one person at random and put him or her on an experimental protocol, such as eating Twinkies all day in order to prove that Twinkies cause pink-eye, how do you know you didn't just have the bad luck to pick someone who is prone to pink-eye? So researchers use more people to reduce the odds of accidentally picking an unrepresentative group. The more people you enlist in your study, the less likely it is that your results are caused by random chance.
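If you would rather see that than take my word for it, here is a minimal simulation sketch in Python. Every number in it is made up for illustration; the 10% "pink-eye-prone" base rate, the study sizes, and the "twice as bad as reality" cutoff have nothing to do with any real study.

import random

random.seed(42)
BASE_RATE = 0.10      # assume 10% of the population is prone to pink-eye
TRIALS = 10_000       # number of simulated studies per sample size

def fluke_rate(n_subjects, threshold=0.20):
    """Fraction of simulated studies whose observed rate is at least
    double the true rate, purely because of who happened to get picked."""
    flukes = 0
    for _ in range(TRIALS):
        cases = sum(random.random() < BASE_RATE for _ in range(n_subjects))
        if cases / n_subjects >= threshold:
            flukes += 1
    return flukes / TRIALS

for n in (5, 20, 200):
    print(f"n = {n:>3}: {fluke_rate(n):.1%} of studies look at least twice as bad as reality")

Run it and the point makes itself: with 5 subjects, plain bad luck makes the sample look twice as pink-eye-prone as the real population roughly 40% of the time; with 20 subjects it happens about 13% of the time; with 200 subjects it essentially never happens.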

Everything you need to know about p values in one paragraph: In statistics, to analyze your data, you make the perverse assumption that your treatment won't work. If it doesn't work, then at the end of your study everything will look just like it did at the beginning, and you will be totally bummed. That depressing, perfectly ordinary outcome corresponds to a p value of 1, and you will not be publishing your paper if all of your patients fail to respond to your treatment. If you see anything different from ordinary, however, perhaps your treatment is actually doing something. The p value is the probability of seeing results at least as striking as yours if the treatment truly does nothing; the less expected your results are, the lower the p value and the more likely it is that your treatment is responsible. If that probability is less than 5% (p < .05), you can get your paper published, because most scientists will be satisfied. Even more will be satisfied if you can get p < .01. Low p values mean your results are significant. That doesn't mean a large effect, and it doesn't mean dramatic. It just means the finding is less likely to be an accident.
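Here is the same logic as a minimal sketch, again with made-up numbers (a hypothetical 20-patient study, 15 hypothetical responders, and a coin-flip "null" where each patient responds half the time no matter what you do): assume the treatment does nothing, simulate thousands of do-nothing studies, and count how often chance alone looks as impressive as what you observed. That fraction is, roughly speaking, your p value.

import random

random.seed(1)
N_PATIENTS = 20           # hypothetical study size
OBSERVED_RESPONDERS = 15  # hypothetical result you saw in your study
NULL_RESPONSE_RATE = 0.5  # under the null, each patient "responds" by coin flip

def simulate_null_study():
    """One do-nothing study: count how many patients respond by chance alone."""
    return sum(random.random() < NULL_RESPONSE_RATE for _ in range(N_PATIENTS))

SIMS = 100_000
as_extreme = sum(simulate_null_study() >= OBSERVED_RESPONDERS for _ in range(SIMS))
p_value = as_extreme / SIMS
print(f"Simulated p value: {p_value:.3f}")   # comes out around 0.02: chance alone rarely piles up 15 responders out of 20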

How do you get the p value down? One way is to add more people to your study. For instance, let's say you want to see if a coin is fair. When you flip a fair coin, you expect an equal number of heads and tails. So if you get 5 heads in a row, which seems unlikely, you might have a rigged coin, right? But the p value for that sequence is only .06, not enough to go to press. Just one more head – 6 heads in a row – cuts your p value in half to .03. Now you're golden. Another two flips, for 8 heads in a row, gives you a p of .008, which is pretty convincing evidence that your coin isn't fair. It makes sense that the more heads you get in a row, the less likely it is that your coin is fair. Similarly, every time you add a patient who responds to your treatment, it lowers the p value and increases the significance of your findings. At some point (p < .05, or better yet p < .01), everyone is convinced that the success of your treatment is unlikely to be a fluke and no more proof is needed. Until then, the more the merrier!
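If you want to check that coin-flip arithmetic yourself, a couple of lines will do it. This little sketch assumes a fair coin and uses the two-sided p value (the chance of a run that lopsided in either direction, all heads or all tails), which is how the figures above were computed.

def p_all_same(n_flips: int) -> float:
    """Two-sided p value for a run of n identical flips from a fair coin."""
    return 2 * (0.5 ** n_flips)   # P(all heads) + P(all tails)

for n in (5, 6, 8):
    print(f"{n} heads in a row: p = {p_all_same(n):.3f}")
# 5 heads in a row: p = 0.062  -> not below .05, no press release
# 6 heads in a row: p = 0.031  -> now you're golden
# 8 heads in a row: p = 0.008  -> pretty convincing the coin is rigged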

Dr. Ornish's study is plagued with high p values. HDL cholesterol levels have p > .8. Triglyceride levels have p > .24. Apolipoprotein levels have p > .46. Blood pressures have p > .7 (systolic) and p > .8 (diastolic). None of these differences comes anywhere near statistical significance; the study just wasn't large enough for them to mean anything. These are all important measures of heart health, but Ornish may as well have skipped them.

Although none of the controls in the study died, one of the patients on the Ornish protocol did. Since there were only 22 patients in the experimental group, that one death represents a 4.5% death rate for the year. For normal 50-somethings (the age of this group), the death rate is more like 1% a year, plus or minus .1%. That makes this death a big, nasty outlier. A large dataset can absorb outliers like this more easily than a small one. Here it looks quite bad, but Dr. Ornish says it was due to a patient who was too exuberant with his exercise. Interestingly, in other parts of the study, Ornish shows that the more exuberant the patients were about the diet, the better their healing. Apparently there is an exuberance cut-off, but it isn't delineated in the study. At any rate, because he died before a final measurement could be taken, he was left out of the analysis.
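The back-of-the-envelope arithmetic here uses only the figures quoted above (one death among 22 patients, and the rough 1% annual baseline for 50-somethings); neither number comes from the paper's own statistics, so treat this as a sketch of the comparison, nothing more.

deaths_observed = 1
group_size = 22
baseline_annual_rate = 0.01   # rough annual death rate for healthy 50-somethings, per the text

observed_rate = deaths_observed / group_size          # 1/22, about 4.5%
expected_deaths = group_size * baseline_annual_rate   # about 0.22 deaths expected in a group this size

print(f"Observed death rate: {observed_rate:.1%}")
print(f"Expected deaths in a group of {group_size}: {expected_deaths:.2f}")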

Another problem with the size of the study has to do with women. Specifically, there was only one woman in the experimental group. That part of the study is about as significant as my Twinkie study above. To his credit, Ornish recognizes this serious limitation. We are still waiting to find out about the other half of the population.

So, in our crash course in statistics, we've learned why studies should be big. The Framingham Heart Study, with 5,200 people, is big. The Nurses' Health Study, with 238,000 people, is big. The Ornish study, with 22 patients, is not. And maybe Ornish is just tired of hearing about how small it is, because in a recent article for Medline, he says:

“It is a common belief that the larger the number of patients, the more valid a study is. However, the number of patients is only one of many factors that determine the quality of a study. Judging a study by the number of patients is like judging a book by the number of pages.”

That’s a nice metaphor, but it is wrong. Significance will always be bound up with size, and no magic can release us from those statistical shackles.

Ornish quotes Dr. Attilio Maseri, an Italian cardiologist, to buttress this claim:

“Very large trials with broad inclusion criteria raise grounds for concern for practicing physicians and for the economics of healthcare. The first is the fact that the larger the number of patients that have to be included in a trial in order to prove a statistically significant benefit, the greater the uncertainty about the reason why the beneficial effects of the treatment cannot be detected in a smaller trial.”

To which I say, “What?”

This may have suffered a little in translation from the Italian, but it's fairly unintelligible as it stands. You have to go to Maseri's original article to realize that he is really saying that large studies are expensive, and that you can get more bang for your research buck by selecting a smaller but more carefully chosen, homogeneous sample. That is a more defensible stance, but it is not the same as Dr. Ornish's strangely unscientific claim.

Big studies are better, because they offer more reliable conclusions. Don't let Dr. Ornish's fame and clout convince you otherwise.


Don’t miss my pithy diet tweets! Follow me @NotchByNotch

6 thoughts on “Dean Ornish Likes Small Studies”

  1. In Ornish's case, I'm not convinced that it was the diet, but rather the exercise, that widened coronary arteries – doesn't this happen when you increase VO2 max? Lungs widen, arteries open, blood pressure falls, and blood sugar falls as a result. I don't think the diet had anything to do with it, frankly.

    What happens when you fill your body with a vegan diet, coupled with the AHA's recommendation to eat sugar itself (in the form of grains and candy)? You INCREASE blood sugar, and isn't that the thing that causes heart disease IN THE FIRST PLACE? Not to mention you increase concentrations of goitrogens, bad fats, Omega-6, phytoestrogens, and lectins, to name a few. And an organic vegan diet is not much better than a commercial one; all that's missing is the huge pesticide load.

  2. Wenchy,

    You’re correct. Ornish was studying “lifestyle” changes, which included yoga, exercise, meditation & counselling. Those confounding factors make it impossible to say anything about the diet per se.

  3. Yes, we need to do a bigger study with the other lifestyle changes, with one group low-fat vegan and the other low-carb Paleo or ketogenic a la Jimmy Moore.

    The fact that one of his intervention group croaked and none of the control group did, while perhaps not statistically significant, is certainly clinically significant considering the small size of the study. I mean, that is the only hard end-point result of the study; the others are only alleged surrogate markers.

    My take is that the yoga, meditation, etcetera helped and the low-fat vegan diet hindered. But of course, I await experimental clinical evidence.

  4. The statistics resolve it, as illustrated. Small study, modest effect, poor p value, nothing to see – move on. I don't think we should *only* do massive studies, because they either don't happen for cost reasons or become weak epidemiological inferences. A well-conducted small study that achieves statistically and clinically significant outcomes is a useful contribution to our knowledge – especially if the authors allow open access to the datasets or publish individual subject responses rather than just the mean and s.d. (in some cases the s.d. is bigger than the mean!!).

    Nutritional studies would be better done in small homogeneous groups like “women” or “athletic black men” as taking a mix of such groups produces such a huge variance in parameters that statistical significance becomes a challenge to achieve at all. Do a few different groups separately and see what we learn, rather than mixing them up into a mile-wide distribution and learning nothing.

  5. That was Maseri’s point: more homogeneous samples can yield better results, even with small N. Diet or “lifestyle” studies are notoriously difficult, and this study was too small and heterogeneous to be very useful. Worse yet, the only measure that showed efficacy, angiogram measurement, was abandoned by Ornish in his subsequent studies, presumably because it isn’t very accurate.

    But we really need better studies. With 2/3 of the population obese or overweight, we clearly took the wrong path somewhere!
