Friday, March 09, 2012

SEAL OF DISAPPROVAL



How Much of the Neuroimaging Literature Should We Discard?

Since sub-optimally designed and analyzed fMRI studies continue to influence the field, there should be some mechanism for identifying discredited publications. This discussion was initiated by Professor Dorothy Bishop's critical analysis of a flawed paper. In response, Dr Daniel Bor wrote a thoughtful post on The dilemma of weak neuroimaging papers. He covered corrected and uncorrected statistics, a culture of sloppy neuroimaging publications, and whether a bad paper can be harmful or should be retracted.


Does Discarding Mean Retraction?

By "discarding" I meant disregarding the results from flawed articles, not retracting them from the literature entirely. This point was misunderstood by some. It seems that all commenters on Bor's post agreed that retraction is problematic.

Bor: "...it’s almost certainly impractical to retract these papers, en masse."

Neuroskeptic: "Retracting them all… never going to happen, but even if it did, I don’t think it would help at all. Much better would be for readers to educate themselves or be educated to the point where they know how to spot sloppy stats."

Bishop: "I don’t think it’s realistic to expect a retraction, and certainly not just on the basis of reporting of uncorrected statistics. I think we should disregard the results of this study because of the constellation of methodological problems..."

Poldrack: "I think retraction of papers that report uncorrected statistics is a bit much to ask for; after all, most of the results that were published in the days before rigid statistical corrections were common have turned out to replicate, and indeed large-scale meta-analyses have shown a good degree of consistency..."


This is not a new debate. In cellular and molecular biology, there have been cases of technical artifacts which resulted in failures to replicate. Retraction Watch, your authoritative source for all retraction news in the scientific literature, discussed this issue at length last year (e.g., So when is a retraction warranted? The long and winding road to publishing a failure to replicate). In a subsequent post, Ivan Oransky concluded:
So to get back to those Retraction Watch threads to which we referred: Yes, failure to replicate seems a good reason to retract. And notices should explain what went wrong. Allowing authors to get away with failing to do so, in some well-intentioned but misguided attempt to lower the barrier to retractions — as the Journal of Neuroscience does, for example — is part of why some people seem to think that retractions mean fraud.

This prompted a response from Isis the Scientist, who asked What Warrants a Retraction?
This story and controversy is still evolving. To retract the paper now, without evidence of overt fraud or negligence, will mark it with the scarlet watermark of fraud. Because a retraction is the foremost boner-killer of science, the retraction is a weapon to be wielded carefully. It should be used in cases of fraud and negligence, but not in cases where there remains active debate or inconclusive evidence.

Even then, there is value to leaving errors and conflict in the literature and there must be a better tool than the retraction watermark...
...but no one seems to know what that better tool is. Until now.



Introducing STAMPS OF DISAPPROVAL, by Heather K. Phillips

Gone are the days of tearing work from the wall. These days, disapproval often takes the form of ambiguous encouragements. Put the language of critique in your hands with this series of 12 rubber stamps. Each stamp bears a fragment of abridged feedback associated with critique.

Now available for purchase at Schooled

But seriously, improvements in how the field corrects itself will require "a structured information overlay for all academic papers," according to Ben Goldacre. Databases of failures to replicate and post-publication assessment are needed. Some scientists would like to overturn the current system of peer review entirely.

Or maybe we can incorporate a creative system that uses the STAMPS OF DISAPPROVAL...

8 comments:

  1. Another option is meta-analyses that put the spotlight on non-optimal procedures.
    For example, see figure 4 in this review http://www.sciencedirect.com/science/article/pii/S1053811911014005

  2. Thanks for pointing out the importance of meta-analysis. There was an important and ambitious symposium at the Cognitive Neuroscience Society Meeting 2 yrs ago, a call for the neuroimaging community to advance beyond the current piecemeal single-study approach to produce comprehensive, structured, and searchable databases. The session was chaired by University of Colorado Boulder post-doc Dr. Tal Yarkoni, author of the excellent blog, [citation needed]. Here's part of the symposium summary:

    "The first speaker (Tal Yarkoni) will motivate the need for a cumulative approach by highlighting several limitations of individual studies that can only be overcome by synthesizing the results of multiple studies. The second speaker (David Van Essen) will discuss the basic tools required in order to support formal synthesis of multiple studies, focusing particular attention on SumsDB, a massive database of functional neuroimaging data that can support sophisticated search and visualization queries. The third and fourth speakers will discuss two different approaches to combining and filtering results from multiple studies. Tor Wager will review state-of-the-art approaches to meta-analysis of fMRI data, providing empirical examples of the power of meta-analysis to both validate and disconfirm widely held views of brain organization. Russell Poldrack will discuss a novel taxonomic approach that uses collaboratively annotated meta-data to develop formal ontologies of brain function."

    You can read more about it here: Motivating a Cumulative Cognitive Neuroscience.

  3. Poor statistical analysis is only a small part of the problem. A huge part of the problem is systematic error, most notably from subject motion, which by numerous mechanisms may introduce spatial-temporal correlations that are not due to the BOLD effect. Even when spatial-temporal correlations are not the focus of a study, motion adds lots of noise, and the means by which motion is corrected are shoddy, to say the least. Worse yet, the means by which motion correction and temporal interpolation are presently done may be adding systematic errors that give bogus results for even simple block-design paradigms.

    There is value in fMRI, but that won't matter soon if the field doesn't get serious about the science and forgo the impulse to publish something sexy rather than something rigorous.
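
    To make that concrete, here is a minimal Python sketch (my own illustration, not the field's standard pipeline) of one common way to quantify motion: framewise displacement, the sum of absolute volume-to-volume changes in the six realignment parameters, with rotations converted to millimeters of arc on an assumed head radius. The file name, rotation units, radius, and threshold below are illustrative assumptions only.

        import numpy as np

        def framewise_displacement(motion_params, head_radius_mm=50.0):
            # motion_params: (n_volumes, 6) array; columns 0-2 are translations in mm,
            # columns 3-5 are rotations assumed to be in radians (check your software).
            params = np.asarray(motion_params, dtype=float).copy()
            params[:, 3:] *= head_radius_mm            # radians -> mm of arc length
            diffs = np.abs(np.diff(params, axis=0))    # absolute volume-to-volume change
            return np.concatenate([[0.0], diffs.sum(axis=1)])

        # Illustrative use: flag high-motion volumes for censoring ("scrubbing").
        motion = np.loadtxt("motion.txt")              # hypothetical 6-column parameter file
        fd = framewise_displacement(motion)
        bad = np.where(fd > 0.5)[0]                    # 0.5 mm is just an example cutoff
        print(f"{bad.size} of {fd.size} volumes exceed the FD threshold")

    Even a crude flag like this only catches gross motion; it does nothing about the systematic errors that correction and interpolation themselves can introduce.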

  4. Unfortunately, there is no wall between pure science, where bad papers are allowed to live on in peace, and clinical practice, where bad papers are recycled into invalid survey articles, bad treatment guidelines, and nonsensical Cochrane syntheses.

    Bad neuroscience papers follow in the 30-year tradition of execrable psychiatric research. You can bet they will be used to formulate a neuroscience equivalent of the "chemical imbalance" fallacy and questionable treatments for real patients.

  5. Dear Neurocritic,

    Thank you for your continued focus on these issues. I feel a real sense of crisis in the neuroimaging community, as if the big bluff has finally been called. Many of us are suffering a tremendous blow to our reputation because of some high-profile nonsense. Much of the criticism is justified, but I think neuroimaging is being picked on much more than other fields that harbor similar atrocities. Perhaps it's the "sexy" factor that does it.

    I recently noticed that another odious practice is quite common: showing a whole brain image with everything but your ROI masked out. That gives a false impression of a spectacularly clean brain, except of course for the region that "matters".

    As I commented on your previous post, Ed Vul's approach of calling attention to specific faulty papers has been extremely effective. As long as people can get away with doing sloppy stats, they will.

    My opinion generally (and of course I'm not the first to say this) is that there should be study registries for basic science, just like clinicaltrials.gov. I find pre-registration extremely useful in evaluating a clinical trial publication. Multiple comparisons and p-value fishing are so rampant these days that you can take almost nothing seriously.

    End of rant.

  6. Thanks Neurocritic.

    A continuing problem is the later use by unsuspecting readers of papers which are flawed or retracted.

    The half-lives of these papers can be long.

    We don't yet have a flag for flawed/invalidated. In my own field of pheromones there are zombie papers which won't die because they get cited without the citer realising that later work shows the papers to be junk/unrepeatable/invalid.


    One good feature of MEDLINE is that it flags (I understand) retracted papers:
    'MEDLINE uses the words “Retracted Publication” as a subject heading alongside the offending piece', as mentioned in a paper appropriately titled '“Rubber stamping” retracted papers', Walter (2000) http://tinyurl.com/7o2zfhv [does MEDLINE go further and flag results later shown to be invalid/unrepeatable?]

    ISI Web of Science adds 'retracted article' to the title of retracted papers, e.g. "Odor maps in the olfactory cortex (Retracted article. See vol. 107, pg. 17451, 2010)" Zou et al 2005 PNAS 102:7724-7729.

    PNAS flags it as retracted in the sidebar.

    I'm not sure how we'll do this for flawed but unretracted papers. A flag saying something like "see also xx" would be a starting point.
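
    As a proof of concept, here is a minimal Python sketch (my own illustration, not an official MEDLINE tool) that checks the retraction flag programmatically via NCBI's E-utilities, searching PubMed for the "Retracted Publication" publication type; the added "fMRI" term is just an example restriction.

        import json
        import urllib.parse
        import urllib.request

        # Ask PubMed for records indexed with the "Retracted Publication" type.
        query = '"Retracted Publication"[Publication Type] AND fMRI'
        url = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
               + urllib.parse.urlencode({"db": "pubmed", "term": query,
                                         "retmax": 20, "retmode": "json"}))
        with urllib.request.urlopen(url) as response:
            result = json.loads(response.read().decode())["esearchresult"]

        print(result["count"], "retracted records found")
        print("PMIDs:", ", ".join(result["idlist"]))

    Nothing comparable exists for the flawed-but-unretracted papers, which is exactly the gap a "see also xx" flag would have to fill.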

  7. More on the lasting effects of bad science - long half-lives and continued impacts [sadly both behind £wall]:

    Neale, A, Dailey, R & Abrams, J (2010). Analysis of Citations to Biomedical Articles Affected by Scientific Misconduct. Science and Engineering Ethics, 16, 251-261.
    Abstr"We describe the ongoing citations to biomedical articles affected by scientific misconduct, and characterize the papers that cite these affected articles. The citations to 102 articles named in official findings of scientific misconduct during the period of 1993 and 2001 were identified through the Institute for Scientific Information Web of Science database. Using a stratified random sampling strategy, we performed a content analysis of 603 of the 5,393 citing papers to identify indications of awareness that the cited articles affected by scientific misconduct had validity issues, and to examine how the citing papers referred to the affected articles. Fewer than 5% of citing papers indicated any awareness that the cited article was retracted or named in a finding of misconduct. We also tested the hypothesis that affected articles would have fewer citations than a comparison sample; this was not supported. Most articles affected by misconduct were published in basic science journals, and we found little cause for concern that such articles may have affected clinical equipoise or clinical care." http://dx.doi.org/10.1007/s11948-009-9151-4

    Korpela, K (2010). How long does it take for the scientific literature to purge itself of fraudulent material?: the Breuning case revisited. Curr Med Res Opin, 26, 843-847. http://informahealthcare.com/doi/abs/10.1185/03007991003603804

  8. tristram - Thanks for the links, I wasn't familiar with those papers.

    So What - The best quote of all about head motion was from Steve Petersen: “It really, really, really sucks. My favorite result of the last five years is an artifact,” says lead investigator Steve Petersen, professor of cognitive neuroscience at Washington University in St. Louis. - from Movement during brain scans may lead to spurious patterns
