Mark Twain, who died 100 years ago last month, was never known to be soft-spoken about his opinions. He popularized the phrase, “There are three kinds of lies: lies, damned lies, and statistics,” caustically expressing the view that numbers can be used to disguise the truth. While Twain’s classic snarkiness may elicit frustrated delight from Stats 60 students everywhere, his story is only half told. Although statistics can be a tool for ill-intentioned academics to perpetuate falsehoods, a society well educated in statistics is the best defense against this kind of intellectual trickery. Today we’ll look at a recent case in which statistics were not used honestly, and at how to guard against such misuse in the future.
HIV is the definitive cause of AIDS, which kills over 3 million people every year worldwide. This finding has been replicated by study after study, and, thanks to scientific research, antiretroviral treatment can now extend the lives of HIV-positive patients by decades. One of the most vocal scientists disputing it has been Peter Duesberg, who was once a bright young cancer researcher at UC Berkeley (I would gibe at the Golden Bear, but what follows is too grave). Duesberg published non-peer-reviewed articles throughout the 1980s and ’90s expressing doubt that HIV causes AIDS, and ultimately secured publication in the Journal of Biosciences in 2003, claiming that AIDS is a chemical problem caused by recreational drugs and that HIV is merely a common passenger virus. His paper has several serious problems that can appear innocuous at face value. For one of his main defenses, he cites a handful of case studies in which people with HIV did not develop AIDS, and attempts to use them to counterbalance the millions of cases a year in which people do. By cherry-picking, Duesberg sows doubt by implying that a few cases in his favor should carry as much weight as the millions of cases to the contrary. Another dastardly maneuver is to observe a correlation between AIDS and drug use and then conclude that drug use causes AIDS. Correlation alone does not establish causation, and presenting a simple correlation as a causal link is another trick that can be used to imply a conclusion that simply isn’t true.
While Duesberg’s “research” was quickly dismantled, South African President Thabo Mbeki cited it as scientific proof that HIV did not cause AIDS, which led to an enormous national delay in testing for HIV and distributing antiretrovirals. Mbeki’s successor, Kgalema Motlanthe, was largely elected on the platform of addressing HIV/AIDS, and while the situation is improving, more people now die of AIDS each year in South Africa than in any other country. With lives on the line, it makes no difference whether AIDS denialists’ faulty science resulted from incompetence or malice. Berkeley is currently investigating Duesberg for academic misconduct, specifically for misrepresenting information and failing to disclose a conflict of interest. It is imperative to scrutinize every aspect of any data you are presented with: who funded the study, how large the sample was, whether the authors analyzed all of the available information and whether they used valid statistical methods. Only then should you accept a conclusion as fact, and a strong background in statistics will greatly help you in this pursuit.
In addition to debunking faulty research, statistics can do a tremendous amount of good in the world, and Stanford has been the global leader in statistical research for at least the last half century. The bootstrap resampling method developed by Brad Efron has enabled unprecedented predictive power and statistical inference, particularly in the growing fields of biocomputation and statistical genetics. The Classification and Regression Trees (CART) algorithm, developed largely at Stanford by Breiman, Friedman, Olshen and Stone, provided a foundation for modern computational algorithms. David Siegmund’s change-point research gave clinical researchers the tools to determine whether overwhelming evidence early in a trial could be sufficient to end the trial early, and has thus saved many lives. The other contributions from Stanford to the field of statistics are truly too numerous to list, but rest assured that if you are looking to learn more about the field, you’re in the right place.
A century onward, perhaps Mark Twain’s adage is half correct: statistics can be used by ne’er-do-wells, but they are also the last line of defense against lies and damned lies. The sentiment that numbers can’t lie is simply misguided, and the mathematical know-how to tell when someone is trying to lie to you with numbers is as important as, if not more important than, the intellectual know-how to tell when someone is lying to you with words. Scrutinize every number, every figure and every error bar as closely as you would a word, a claim or a statement. When used properly, statistical analysis is the best resource we have to winnow truth from uncertainty through the scientific method. In the coming decades, each of you will have the power to change the world in your field, and I can only hope that you use the power of statistics for good and for truth. For the few of you who don’t, the rest of us will be watching.
If you still don’t think you need a firm grasp of statistics to be an informed citizen, Jack has a fence for you to whitewash. You can send a check for the privilege to [email protected].