Friday, June 14, 2013

A New Biomarker for Treatment Response in Major Depression? Not Yet.


Is a laboratory test or brain scanning method for diagnosing psychiatric disorders right around the corner? How about a test to choose the best method of treatment? Many labs around the world are working to solve these problems, but we don't yet have such diagnostic procedures (despite what some might claim). A new study by McGrath et al. (2013) might be a step in that direction, but the results are very preliminary and await further validation.

The principal investigator of that study is Dr. Helen Mayberg, a leader in neuroimaging studies of major depression. She and her colleagues have pioneered the use of deep brain stimulation (DBS) as a treatment for severe, intractable depression, which was "the culmination of 15 years of research using brain imaging technology," says Dr. Mayberg.


Psychotherapy or Drugs?

The choice of treatment modality in depression, as in other psychiatric disorders, is a matter of trial and error. If one drug doesn't work, switch to another one. If your insurance covers it, a short course of evidence-based psychotherapy[1] might be in order.

The whole concept of a DSM-based classification scheme for mental illnesses has come under fire, especially with the release of the new Diagnostic and Statistical Manual. In the real world, psychiatric disorders don't always show such clear boundaries; overlap and co-morbidity are common. The National Institute of Mental Health has endorsed a new approach, the Research Domain Criteria project, that incorporates dimensions of observable behavior along with neurobiological measures.

Here's where the new work by McGrath et al. (2013) fits in. Their goal was...
To identify a candidate neuroimaging “treatment-specific biomarker” that predicts differential outcome to either medication or psychotherapy.

Fewer than 40% of depressed patients remit with their first course of treatment, so this would be an important advance. A more scientific way of choosing among possible treatment options would benefit patients and society at large.

The study (registered at clinicaltrials.gov, NCT00367341) enrolled a total of 82 depressed people. The neuroimaging method might surprise some of you: FDG-PET to measure glucose metabolism -- not the popular and trendy resting state fMRI to examine functional connectivity or any sort of fMRI activation study. However, the authors cite an established literature using this technique in studies of antidepressant treatment response.

Patients diagnosed with moderate to severe depression (a score of 18 or more on the Hamilton Depression Rating Scale, HDRS) received a PET scan and were randomized to receive 12 weeks of either cognitive behavioral therapy (CBT, n=41) or escitalopram (Lexapro, n=39), an SSRI antidepressant. Sixty-three patients completed this phase and also had a PET scan. The endpoint considered a successful response to treatment was remission (HDRS score of 7 or less), while non-response was a change in HDRS of 30% or less. Partial responders were omitted, leaving the final groups as follows:
  • CBT remission, n=12
  • escitalopram remission, n=11
  • CBT nonresponse, n=9
  • escitalopram nonresponse, n=6
Right away we see that the number of patients in each group is very small, particularly for a study designed to identify biomarkers that will generalize to a larger population. Let me repeat that: a successful biomarker must generalize to an independent population. We haven't seen that here, so any conclusions drawn from this paper must be considered very preliminary.
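To put a rough number on "very small," here is a minimal Python sketch that computes an approximate 95% Wilson score interval for the share of remitters within each analyzed arm. The group sizes come from the list above; the function name and the choice of the Wilson interval are mine, not the paper's. Note that these shares are not true remission rates, because partial responders and dropouts were already excluded.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Group sizes from the post (partial responders already excluded).
groups = {
    "CBT": {"remission": 12, "nonresponse": 9},
    "escitalopram": {"remission": 11, "nonresponse": 6},
}

for arm, counts in groups.items():
    n = counts["remission"] + counts["nonresponse"]
    low, high = wilson_interval(counts["remission"], n)
    print(f"{arm}: {counts['remission']}/{n} remitters, 95% CI {low:.2f} to {high:.2f}")
```

With groups this size, each interval spans a large part of the 0 to 1 range, which is the imprecision a replication in a larger independent sample would need to shrink.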

How was the biomarker identified? The PET images were co-registered with the corresponding structural MRIs. A whole brain analysis identified regions showing a treatment × outcome interaction (at a significance level of p<.001 uncorrected). Six regions met this uncorrected standard: right anterior insula, right inferior temporal cortex, left amygdala,[2] left premotor cortex, right motor cortex, and precuneus (medial superior parietal lobe). Most of these are pretty surprising, but even more surprising is that the rostral anterior cingulate (and subgenual cingulate, BA 25) were not involved:
Contrary to past published studies,63 the rostral anterior cingulate did not discriminate the outcome subgroups in either the main effect or interaction analyses. A post hoc examination of responder and nonresponder differences within each treatment arm did reveal a nonsignificant rostral cingulate activity difference, with metabolism in responders greater than nonresponders, but solely in the escitalopram group. While consistent with past reports, this finding did not meet the TSB [treatment-specific biomarker] criteria defined for the current study, ie, a region whose activity can differentiate both good and poor outcomes for both treatments.
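To see why the word "uncorrected" matters, here is a back-of-the-envelope sketch in Python. The voxel count is a made-up placeholder (no count is given here), and the calculation treats voxels as independent tests, which smoothed PET data are not. Read the numbers as a naive illustration, not a reanalysis of the paper.

```python
# Hypothetical voxel count: no figure is reported in the post, so this is a placeholder.
N_VOXELS = 200_000
ALPHA_UNCORRECTED = 0.001  # voxelwise threshold quoted in the post
ALPHA_FAMILYWISE = 0.05    # conventional whole-brain target

def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing the threshold if every null is true."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, familywise_alpha):
    """Per-test threshold that bounds the family-wise error rate (conservative)."""
    return familywise_alpha / n_tests

print(expected_false_positives(N_VOXELS, ALPHA_UNCORRECTED))
print(bonferroni_threshold(N_VOXELS, ALPHA_FAMILYWISE))
```

Spatial smoothness and any cluster-extent requirement both cut the number of spurious clusters well below this naive count, so an uncorrected voxelwise threshold is not automatically fatal. It does mean the effective whole-brain error rate has to be stated rather than assumed.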
[Image: table of effect sizes by brain region]


Effect sizes are shown in the table above. The brain regions were ranked in order of size of activation (which doesn't make sense for the amygdala), and the right anterior insula was chosen as the best potential biomarker because... it had the largest cluster size? Or because it did marginally better than the other regions in terms of effect size (although this was not shown statistically)? As a hub for interoceptive awareness, attention, and emotion, the anterior insula makes the most sense scientifically (Craig, 2009). Certainly, it would be odd if glucose metabolism in the right motor cortex could predict response to CBT or SSRI...

At any rate, right insula hypometabolism at baseline was associated with remission to CBT and poor response to SSRI, and vice versa for hypermetabolism. There was overlap between the groups as shown below, but increasing the chances of successful treatment (even with no guarantees) would be better than a completely trial-and-error approach.[3]

Figure 3A (modified from McGrath et al., 2013). Right anterior insula as the optimal treatment-specific biomarker candidate.  A. Scatterplot of insular activity from individual subjects in the remitter (REM) and nonresponder (NR) groups. Note: the anterior insula is the only region where the interaction subdivides patients into hypermetabolic (region/whole-brain mean >1.0) and hypometabolic (region/whole-brain mean <1.0) subgroups.


A Nature news story says that "Brain scan predicts best therapy for depression," but that would be a premature conclusion at best. Although this study might be considered promising, the results must be validated in larger independent samples of patients who are assigned to treatments according to their baseline insula PET scans.

With the newly prominent nattering nabobs of neuroimaging negativity, it's important to remember that it's not all neuroprattle and bunk. Some of this research is trying to alleviate human suffering.


Further Reading

The Sad Cingulate

Sad Cingulate on 60 Minutes and in Rats

The Sad Cingulate Before CBT

Deep Brain Stimulation for Bipolar Depression

Is CBT Worthless?

Where Are the Clinical Tests for Psychiatric Disorders?

The Dark Side of Diagnosis by Brain Scan


ADDENDUM (6/14/2013): David Dobbs has an excellent post on the same study, "Talk Therapy or Pill? A Brain Scan May Tell What’s Best." Dobbs has written extensively about Dr. Mayberg and her work, including "A Depression Switch?" (New York Times) and "Depression’s wiring diagram."


Footnotes

1 But read LawsDystopiaBlog by Professor Keith Laws to see how flimsy the "evidence base" can sometimes be.

2 An earlier experiment showed that the amygdala might be a region that could help predict CBT response, using fMRI and response to emotional words.

3 Not to be a pedantic stick in the mud, but the combination of drugs and therapy is often the most successful.


References

Craig AD. (2009). How do you feel--now? The anterior insula and human awareness. Nat Rev Neurosci. 10:59-70.

McGrath CL, Kelley ME, Holtzheimer PE, Dunlop BW, Craighead WE, Franco AR, Craddock RC, Mayberg HS. (2013). Toward a Neuroimaging Treatment Selection Biomarker for Major Depressive Disorder. JAMA Psychiatry. 1-9. PMID: 23760393


14 Comments:

At June 14, 2013 5:42 AM, Blogger David Dobbs said...

Very nice write-up. Worth noting: Mayberg is among the first to note that this needs wider replication, and uses the term "putative" in the paper; she agrees. When I spoke to her on the phone yesterday for my write-up about this study at Neuron Culture (link below), she said she's waiting to hear on a grant application to do the sort of study you say above is needed: a prospective study that puts patients in treatment according to the putative PET-scan biomarkers, and then sees if they do better (as a group) than a similar control population being assigned randomly.

My write-up at Neuron Culture: http://daviddobbs.net/smoothpebbles/study-brain-scans-may-predict-best-depression-treatment/

 
At June 14, 2013 10:41 AM, Anonymous Anonymous said...

One point of clarification -- you've bolded "uncorrected" (which seems to imply that this wasn't thresholded sufficiently?) but they also used a cluster threshold of 100 contiguous voxels. Depending on the smoothness of the data, this may (?) have been sufficient to maintain a whole brain threshold of p < .05.

 
At June 14, 2013 10:57 AM, Blogger The Neurocritic said...

Thanks, David. I'll put in a link to your excellent piece in the main part of the post.

Anonymous - Thanks for your comment. Let's see, they "smoothed to an in-plane resolution of 4.0 mm Full-Width Half-Maximum." At the other end of things, was the required cluster size too large to pick up some subcortical structures? The L amygdala cluster here was large, but might others have been missed?

 
At June 15, 2013 8:50 AM, Blogger Bernard Carroll said...


I am bothered by at least 2 aspects of this report. First, the authors cherry picked the data. They excluded analyses of nonremitters who nevertheless improved with treatment. Misleadingly, they labeled such cases partial responders, though for all we know many of these would meet the customary criterion of response (reduction of HDRS score by 50% and final HDRS score of 10 or less). They also excluded early terminators (dropouts), so this is not an ecologically valid predictive study. Overall, 40 of 82 randomized cases were excluded from consideration. The authors rationalized this approach by stating that these 40 cases were not included in the analyses “to avoid potential dilution of either the remission or the nonresponse groups.” JAMA Psychiatry should have required the authors to present the data on these 40 cases. We would wish to inspect those data for an intermediate position between the remitters and the nonresponders. Has mere regression to the mean been considered? After all, the mean effect size for the putative biomarker was identical in all remitters versus all nonresponders.

Second, these data are uncorrected for spontaneous improvement. We know all too well that spontaneous improvement rates (placebo response rates) can exceed 50% in depressed outpatients, and especially in symptomatic volunteer populations: an unstated number of these cases were recruited by advertising rather than being clinically referred. That also calls into question the ecological validity of the study. Before proceeding to speculate as these authors did about potential pathophysiology or association with genetic or immune variables, etc., there is a need to establish an association of the biomarker with specific or placebo-corrected treatment response. Until the authors address these matters their report will not be taken seriously.

 
At June 15, 2013 9:50 AM, Anonymous Psycritic said...

This is certainly a study aimed at generating, rather than testing, a hypothesis. I was going to post a longer comment, but Dr. Carroll beat me to it. I agree that excluding half the patients from analysis makes it hard to believe that this particular biomarker has much utility, even if the results are consistently replicated. (And I doubt they will be.)

 
At June 15, 2013 10:39 AM, Blogger The Neurocritic said...

Dr. Carroll - Thanks for your comments. I agree that it would be good to see the data from the partial responders (n=25 vs. the n=38 included), a considerable proportion of the patients. Perhaps the authors are planning to do this in a future publication. I went back to look at the clinicaltrials.gov protocol:

Primary Outcome Measures:
- remission defined as Hamilton Depression Rating Scale-17 score of less than or equal to 7 at 12 weeks [ Time Frame: Measured at weeks 10 and 12. ]

Secondary Outcome Measures:
- response defined as 50% change in Hamilton Depression Rating Scale-17 score at 12 weeks [ Time Frame: Measured at weeks 10 and 12. ]


Their primary outcome measure is remission as they described it in the paper, and the secondary outcome measure is the customary criterion of response as you stated. So they are probably looking at the latter.

But it is a bit bothersome that we don't know where the partial responders would fall on Figure 3A.

Psycritic - This brings us to your point about being a study aimed at generating, rather than testing, a hypothesis. The authors are likely aware of this, because they've submitted another grant to run a larger prospective study. But given the low numbers in the original four groups presented here, I'm not sure I would put all my eggs in the right anterior insula basket.

At any rate, the NIMH seems to be taking this report very seriously.

 
At June 15, 2013 11:16 AM, Blogger David Dobbs said...

As I remember Dr. Mayberg explaining to me (it's possible I have this wrong; my notes aren't before me), the study set aside partial responders because they sought to ID pts who reached full remission (Hamilton score less than 8), since partial responders in short trials like this — people who reduce Hamilton score but don't get under 8 — often drift back up. In other words, they were out to ID how well the scans could predict reliable, full remission.

Again, it's worth noting that the authors stress that this needs prospective, double-blind testing at bigger scale; thus their use of the phrase "putative" marker as description of their finding.

Small is naturally how one starts with such studies, no? Which is fine as long as a study doesn't get overhyped. This one got a lot of attention; but I don't feel either Mayberg or the NIMH (at least those I talked to there) have hyped it as a done deal.

 
At June 15, 2013 11:50 AM, Blogger Bernard Carroll said...


I don’t give much credence to this explanation for the cherry picking because the authors presented no data on the durability of remission. They just got the patients over the finish line at 12 weeks and that was that… no follow up. It sounds like ex post facto talk.

 
At June 16, 2013 9:32 AM, Anonymous Anonymous said...

A couple of points:
1. I agree, p <.001 with a 100 voxel cluster constraint may well result in a family-wise alpha of .05, depending on image smoothness.
2. I would be surprised if these findings generalized. Even if the N is small, they could have presented some jackknife analyses to try to assess generalization. Probably they did, but they are not reported because they did not work out.

 
At June 16, 2013 11:32 AM, Blogger The Neurocritic said...

Anonymous of June 16, 2013 9:32 AM - The results might have been significant at .05 corrected for FWE, but the paper didn't specifically say. The smoothing was 4 mm FWHM for the 34 participants with structural MRIs and 8 mm FWHM for the 4 participants without MRIs (who were normalized to a template).

There were no jackknife or permutation analyses presented, as you noted. Instead of resampling, they reported correlations (which did not assess generalizability beyond the current cohort):

"To further assess the generalizability of findings identified in this restricted analysis to the full sample of study completers, metabolic activity was correlated with percentage of change in HDRS score within each treatment group to determine if the putative biomarkers identified in the ANOVA showed the predicted general pattern in the full cohort of phase 1 treatment completers."

These results (which included remitters, nonresponders, and partial responders) were not all that great:

There was a significant correlation between baseline insula activity and percentage of change in HDRS scores in both the CBT and escitalopram groups. A positive correlation was shown for the CBT group (r = 0.55; df = 31; P = .001) (Figure 3). In contrast, the escitalopram-treated patients showed an opposite but less significant [i.e., non-significant] correlation (r = −0.31; df = 28; P = .09).

This goes back to Dr. Carroll's comment about partial responders: their data were included in Figure 3B (not shown in this post) but not as a separate group (as in Figure 3A). I missed this initially.
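As a rough check on how much weight those two correlations can bear, here is a short Python sketch that turns each quoted r and df into an approximate 95% confidence interval via the Fisher z-transform. It works with the magnitudes of the coefficients and assumes n = df + 2, the usual relationship for a simple Pearson correlation; the paper itself does not report these intervals, so treat this as my own illustration.

```python
import math

def fisher_ci(r, df, z_crit=1.96):
    """Approximate 95% CI for a Pearson r via the Fisher z-transform.

    Assumes n = df + 2, the usual relationship for a simple correlation.
    """
    n = df + 2
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Magnitudes and degrees of freedom as quoted from the paper.
for label, r, df in [("CBT", 0.55, 31), ("escitalopram", 0.31, 28)]:
    low, high = fisher_ci(r, df)
    print(f"{label}: |r| = {r}, 95% CI {low:.2f} to {high:.2f}")
```

Under these assumptions the CBT interval stays clear of zero while the escitalopram interval straddles it, which matches the reported P values and is why I called the second result non-significant.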

 
At June 16, 2013 2:37 PM, Blogger Bernard Carroll said...


To Neurocritic’s last comment: It is true that the partial responders are shown in Figure 3B, but they are not displayed usefully – the abscissa in this Figure is percentage change in HDRS, whereas the criterion of remission was different: it was HDRS less than 8. Thus, one cannot positively identify the 3 groups – remitters, partial responders, and nonresponders. The authors had the option to use distinctive symbols for clarity but they didn’t do that. As a psychometric note, percentage change scores are not the safest measures of response. So we still don’t really know where the partial responders fell in relation to the remitters and the nonresponders. I agree with Neurocritic that these cases of partial response should have been included in Figure 3A.

 
At June 18, 2013 1:55 PM, Blogger wiley said...

When I found out, seven years after my first "atypical" depression (that didn't respond to any of the numerous cocktails I had been given), that it was a severe iron deficiency that was made right as rain by high doses of iron and vitamin C, you can bet that I was rightfully angry at this profession and what its "medical expertise" had put me through for seven years.

I wasn't "depressed". My body and brain were starving for oxygen. That has a tendency to slow down brain function and depress mood.

If they don't first rule out physical causes of what they describe as "depression" like B-12 deficiencies, iron deficiencies, diabetes, MS, thyroid problems, being in an abusive relationship with no apparent way out, etc.; don't find out if the person is taking a drug or drugs that can have a negative effect on mood, like the pill, illicit drugs and/or alcohol (a depressant); and don't talk to these subjects about what's going on in their lives, just assuming that they suffer from a biological malady that can only be treated with drugs; then they reify their constructions. How convenient.

If they want to be taken seriously as medical doctors, then I suggest they start practicing like one.

 
At June 19, 2013 12:36 AM, Anonymous Anonymous said...

If you believe Irving Kirsch (he has pretty damn convincing evidence), antidepressant drugs are no better than placebo, with very few exceptions. If you also take into account the side effects of these drugs, then placebo comes out way ahead.

 
At June 19, 2013 8:48 PM, Blogger Sci Grumbler said...

One thing I find odd about this study is that the minimum HAMD score for inclusion is either 17 or 15. None of the groups has a mean HAMD of greater than 20. If I remember correctly, in patients with a HAMD score under 26 or thereabouts, antidepressants are no better than a placebo, and the study may just be identifying patients who'll respond to a placebo in drug form. (Which may be a good thing too.)
It also seemed odd to me that the ratio of CBT responders to non-responders in the melancholic group was similar to the overall CBT group. The literature seems to be pretty clear that melancholic depression responds to drugs and ECT, but not CBT.

 
