By Professor Doom

Last time around, I discussed the “go to” method for research nowadays, data mining. You simply take a large data set and slice it as many ways as needed until you get lucky. So many of today’s “hot” results come from this method, and the key to its success is a quirk in our research system: there’s no mechanism reinforcing the core idea of the scientific method, reproducibility.

Many scientific studies can’t be replicated. That’s a problem.

I don’t care how great your theory is, how testable your hypothesis is, or how “significant” your result is. If I can’t reproduce your result under controlled conditions, your result is crap, and is not science. Seriously.

Many of our biggest ideas in science today, like Global Warming (capitalized in the same way I capitalize Christianity or Buddhism), go one step beyond non-reproducibility, by being non-testable. Testability is another key idea of the scientific method, but I digress.

Today I want to focus more on medical-style statistical testing. It seems every day there are “promising” results regarding cancer treatment, but the gentle reader needs to understand that almost all cancer studies are nearly impossible to reproduce—sometimes by design, but also by reality. How can you get 100 patients, or even 10, that are exactly like the patients in the “promising” study? It’s a tough business, but your cancer treatment drugs will still cost $5,000 a day, even when based on such shaky science.

That's partly due to a few spectacular instances of fraud, such as when Dutch psychologist Diederik Stapel admitted in 2011 that he’d been fabricating his data for years.

--in addition to the statistical methods I use, you could always, you know, just make it all up. Nobody will check.

Anyway, nobody checks for reproducibility in scientific studies, and this is the primary reason modern science is basically broken—there are many great things being learned, but there is much hokum being published as well.

The only weakness of the data mining research method is that sometimes you can’t get a good result (in other words, you get really, really unlucky), or you get a result that you know just won’t fly, such as my “Republican males over 30 can eliminate headaches by drinking water” result of the previous essay. Ok, today’s mainstream media is so terrible that even a result that ridiculous might still get legs, but bear with me.

Suppose I want the result “Drinking water is clinically proven to eliminate headaches,” and I don’t want to data mine so that I’m restricted to some silly subset of the population like “Republican males over 30.”

I can still do it, and I can still do it in a way that, as far as peer review is concerned, is perfectly legitimate. Key, as always, is nobody’s going to try to reproduce my results because there is absolutely no grant money for that, and no rewards in any event (unless the result has huge economic implications, like cold fusion).

Recall that in many fields a result is publishable if it has a low p-value, and that a p-value below 0.01 (a 1% chance, one in a hundred) is all you need. This suggests a second method to lie with statistics:

Method 2: Repetition.

Wait, one in a hundred? Well, then, I’ll just run the study 100 times, or as many times as it takes until I get lucky. Now, I grant that this requires deep pockets (hi pharmaceutical companies!), but as far as peer review goes, they’ll never see the 99 studies that failed, because I won’t show them those. I’ll show them the one study that got the result I wanted.

*BAM* I’ve just “clinically proven” that water cures headaches.
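To put a number on "until I get lucky," here is a minimal sketch of the arithmetic. It assumes the 100 studies are independent and that water truly does nothing, in which case each study's p-value is (for a continuous test statistic) uniformly distributed between 0 and 1. The names `ALPHA`, `N_STUDIES`, and `lucky_run` are mine, purely for illustration.

```python
import random

ALPHA = 0.01      # the publication threshold from the text
N_STUDIES = 100   # how many times the study is quietly repeated

# Analytic chance that at least one of N_STUDIES no-effect studies "succeeds".
p_at_least_one = 1 - (1 - ALPHA) ** N_STUDIES


def lucky_run(rng):
    """Return True if any of N_STUDIES no-effect studies gets p < ALPHA."""
    # Assumption: with no real effect, a p-value is uniform on [0, 1],
    # so each study "succeeds" with probability ALPHA.
    return any(rng.random() < ALPHA for _ in range(N_STUDIES))


rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 10_000
simulated = sum(lucky_run(rng) for _ in range(trials)) / trials

print(f"analytic:  {p_at_least_one:.3f}")
print(f"simulated: {simulated:.3f}")
```

Under those assumptions, 100 tries give roughly a 63% chance of at least one "clinically proven" result, and every additional try pushes that chance higher.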

I don’t think this second method is popular, but short of a whistleblower, this type of statistical lying is impossible to detect. The simple fact that so many scientific studies are not reproducible under controlled conditions suggests that it may be much more common than I think.

There’s another way I can “clinically prove” that drinking water cures headaches, and it’s about as popular as the first:

Method 3: Data Manipulation

The key “honest” method of data manipulation is the removal of “outliers,” data points that you think don’t belong in the data. For example, suppose I ask some college students their age, and get the following numbers:

{19, 19, 18, 20, 21, 19, 20, 21, 22, 1,918}

Now, that last number seems suspicious, right? There probably isn’t a college student that’s 1,918 years old. Maybe my finger slipped, and I have two college students here, ages 19 and 18…or maybe there just happened to be a nearly two thousand year old student in my class that day.

Homework assignment from one of my classes: "Go to a mall or something and ask 20 people their heights, and identify any outliers in your data set."

One student's answer: "According to my dictionary, an outlier is not in the usual place of business, so all my data are outliers."

--The student was serious.

Rather than take chances, I just call that 1,918 an “outlier” and toss it completely.
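For the curious, one common textbook way to flag an outlier (not necessarily what any given researcher uses) is the 1.5 × IQR rule: anything more than one and a half interquartile ranges outside the middle half of the data gets flagged. A rough sketch on the ages above, using only Python's standard library:

```python
import statistics

ages = [19, 19, 18, 20, 21, 19, 20, 21, 22, 1918]

# Quartiles of the data; the middle cut point (the median) is not needed here.
q1, _, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1

# The 1.5 * IQR fences are a convention, not a law of nature.
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [a for a in ages if a < low or a > high]
kept = [a for a in ages if low <= a <= high]

print("outliers:", outliers)
print("mean with outlier:   ", statistics.mean(ages))
print("mean without outlier:", round(statistics.mean(kept), 1))
```

On this data the rule flags only the 1,918, and tossing it moves the average age from 209.7 to about 19.9. That is the honest use of the technique: one obviously bad number, removed for a reason you can state.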

This may seem odd, but removing outliers is perfectly legitimate. It’s also a little risky. For example, civilization freaked out when we suddenly discovered a “hole in the ozone” when we launched new satellites. The old satellites didn’t see such a hole, because we had decided that when the instruments on the old satellites said “0 ozone” that the “0” was an outlier due to equipment malfunction, and we tossed it. Oopsie. The ozone hole is a natural event, and we missed it because of outliers.

Anyway, the point is, you can honestly remove outliers and risk missing important information…or you can just take out whatever data you don’t like until you get the result you want, and thereby generate fake “important information.”

So, if I want to show water can cure headaches, I just get a lot of people with headaches, make them drink water, ask about their pain reduction, and measure the results.

Then I toss all the people whose pain didn’t drop, calling those people outliers. There will be some whose pain level did drop, as headaches go away on their own sometimes, and I’ll keep those for my data. I get a low p-value, and submit my result to the peer review committee.
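Here is a small simulation of that trick, as a sketch under loud assumptions: the pain changes are made-up random numbers centered on zero (water does nothing, and some headaches get better while others get worse), and the p-value comes from a simple normal approximation to a one-sample test rather than from any particular journal's required method. The function name `two_sided_p` and the numbers 200 and 2 are hypothetical choices of mine.

```python
import math
import random
import statistics


def two_sided_p(sample):
    """Approximate two-sided p-value for 'the mean change is zero'."""
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    z = mean / std_err
    # Normal approximation; reasonable for samples this large.
    return 2 * (1 - statistics.NormalDist().cdf(abs(z)))


rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated change in pain score for 200 people; negative means less pain.
# Assumption: water has no effect, so changes are centered on zero.
changes = [rng.gauss(0, 2) for _ in range(200)]

# The "outlier removal": toss everyone whose pain did not drop.
kept = [c for c in changes if c < 0]

p_honest = two_sided_p(changes)
p_cherry = two_sided_p(kept)

print(f"everyone kept ({len(changes)} people): p = {p_honest:.3g}")
print(f"'outliers' tossed ({len(kept)} people): p = {p_cherry:.3g}")
```

With everyone included, the p-value is whatever chance hands you. With the inconvenient people tossed, every remaining number points the same direction, so the p-value collapses far below 0.01 even though the "treatment" did nothing at all.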

Aaannd once again I have a publishable result. All the committee sees is the data and results I show them; they don’t have to hear about outliers if I don’t tell them…and since our current system gives no rewards for trying to reproduce a study, I’m golden. Again.

Again, the weakness in this method is a whistleblower, but if I destroy all the data I’ve removed, a whistleblower will be hard pressed to prove his claim. Whistleblowers sure pop up, but the system has no decent way to deal with their claims.

Glaxo chief: Our drugs do not work on most patients

Look, there are many honest scientists, even honest psychologists and honest doctors out there, but the bottom line is we have very clear evidence that much of our modern “science” is fraud. It’s fair to ask how this is happening.

I suggest the fact that it’s stupid-easy to get a publishable result using dishonest statistics has much to do with the problems of “science” today.

Confessions of a College Professor

Wednesday, July 6, 2016

Two More Easy Statistical Lies
