Tuesday, December 11, 2007

A Lancet Doubter

How did I come to participate in this dispute? I have no background in survey research and no expertise in Iraq. But I do make a habit of reading blogs written by smart people with different views from my own. Indeed, there is no better way to test your beliefs than to confront the best proponents of alternate theories. To that end, I read Tim Lambert's blog Deltoid. Lambert has defended the Lancet surveys extensively and, for the most part, his defense is correct. Indeed, the Lancet authors have benefited from the mistakes of many of their critics.

Yet just because many of the Lancet critics have been wrong does not mean that the articles themselves are correct. My first reaction to L1 was that its confidence interval for excess deaths, 8,000 to 194,000, was suspiciously close to zero. Every good statistician is skeptical of a result which just barely rejects what researchers call the "null hypothesis," in this case that mortality in Iraq was unchanged after the invasion. A small change in the model assumptions could easily make the effect go away. Since it was obvious that the Lancet authors had political motivations and ambitions (Roberts ran for Congress in 2006), I thought that they were probably guilty of cherry-picking their model, at least to some extent. They would not be the first researchers to do so.

Yet these suspicions were, for me, overwhelmed by my disgust with the behavior of the authors and their supporters. Although they provided some summary data for L1, they refused to divulge the household-level data and computer code that would help outside researchers (like me) to replicate their results. This is not the way that scientists ought to act.

In August, I made a presentation at the annual meeting of the American Statistical Association which argued that the results of the first Lancet survey were internally inconsistent. The technical details are opaque at best but the implication is that the authors purposely presented their data (by including outlier data from Falluja in some parts of the analysis and excluding it elsewhere) to mislead. Specifically, the 8,000 to 194,000 confidence interval is claimed to be "conservative" because it excludes the carnage in Falluja. I show that including Falluja would have widened the confidence interval enough to include zero, thereby not allowing the authors to reject the null hypothesis of no increase in mortality.

In English, my claim is that the authors specifically refused to provide the confidence interval for excess deaths using all their data because they knew that doing so would provide too much ammunition to their critics. Even today, they stubbornly decline to tell me or anyone else what the confidence intervals would be with Falluja included.
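For readers who want to see the mechanics, here is a minimal Python sketch of how one extreme cluster can raise a point estimate and still widen the confidence interval enough to cover zero. Everything in it is an assumption for illustration: the numbers are made up (they are not the Lancet data), the helper mean_ci is hypothetical, and the plain normal-approximation interval is a stand-in, not the model the authors used.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`.

    Hypothetical helper for illustration; not the Lancet authors' method.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    centre = mean(values)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width


# Made-up per-cluster changes in death rate (after minus before).
# These are NOT survey data; they only illustrate the arithmetic.
ordinary_clusters = [1, 3, -2, 4, 0, 5, -1, 3, 2, 5]
outlier_cluster = 60  # one hypothetical, extremely violent cluster

ci_without = mean_ci(ordinary_clusters)
ci_with = mean_ci(ordinary_clusters + [outlier_cluster])

print("Without outlier: %.2f to %.2f" % ci_without)
print("With outlier:    %.2f to %.2f" % ci_with)
```

In this toy example the interval without the outlier sits entirely above zero, while the interval with the outlier straddles zero even though the average change is larger. That is the general pattern my critique relies on; whether it holds for the actual survey depends on data that only the authors have.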

That paper was discussed at Deltoid and then picked up from there by Michelle Malkin. Suddenly, I was part of the Right Wing Noise Machine, even invited as a guest on my local talk radio station. Alas, I misinterpreted the orders from my Rovian overlords and spent most of the time defending the Lancet authors from the innumerate complaints that the host was making. He chose not to keep me on the air long enough to get to the point of my actual critique.

Fortunately, other scientists are working on the topic. Colin Kahl writes that the Lancet estimates are "dubious." Fritz Scheuren, past president of the American Statistical Association, claims that the response rates from L2 are "not credible." Stephen Fienberg, one of the most respected statisticians in the country, insisted that Les Roberts' refusal to share data with Michael Spagat and his co-authors was:

"just the wrong response. I, as an editor, would not publish a study for which the data was not shared."

If scientists like Kahl, Scheuren, Fienberg, Spagat and others think that your results are flawed and your behavior suspect, then you likely have a problem.
