Very Severely Stupid About Depression

An unassuming little paper in the latest Journal of Affective Disorders may change everything in the debate over antidepressants: “Not as golden as standards should be: Interpretation of the Hamilton Rating Scale for Depression”.

Bear with me and I’ll explain. It’s less boring than it looks, trust me.

The Hamilton Scale (HAMD) is the most common system for rating the severity of depression. If you’re only a bit down you get a low score; if you’re extremely ill you get a high one. The maximum score is 52, but in practice it’s extremely rare for someone to score more than 30.

First published in 1960, the HAMD is used in most depression research including almost all clinical trials of antidepressants. It’s come under much criticism recently, but that’s not the point here. The authors of the new paper, Kriston & von Wolff, simply asked: what does a given HAMD score mean in terms of severity?

It turns out that people have proposed no less than 5 different systems for interpreting HAMD scores. Do they all agree? Ha. Guess.
The pretty colors are mine. Just a glance shows a lot of variability, but the obvious outlier is the second one. That’s the American Psychiatric Association (APA)’s official 2000 recommendations. Their interpretations of a given point on the scale tend to be worse than everyone else’s.

This is most apparent at the top end. The APA use the terminology “Very Severe”, which doesn’t even appear on other scales. Much of what they class as “Very Severe” (23-26), two other scales class as “Moderate” depression! Amusingly, British authorities NICE seem to have been so unimpressed with this that they simply copied the APA’s scale and toned everything down a notch for their 2009 criteria.

Why does this purely terminological debate matter? Well. A number of recent studies, most notoriously Kirsch et al (2008), have shown that antidepressants work better in more severe cases. See also my post here. The cut-off for antidepressants being substantially better than placebo generally comes out as about 26 on the HAMD in these studies.

Under the APA’s 2000 terminology, this is well into the “Very Severe” band. Hence why Kirsch et al wrote – in a phrase that launched a thousand “Prozac Doesn’t Work” headlines –

antidepressants reach… conventional criteria for clinical significance only for patients at the upper end of the very severely depressed category.

But for Bech, 26 is simply middle-of-the-road “major depression”. For Furukawa, it’s borderline “moderate” or “severe”. Hmm. So if they’d gone with those criteria, Kirsch et al would have written instead

antidepressants reach… conventional criteria for clinical significance only for patients with major depression, of moderate-to-severe severity.

All of these terminological criteria are arbitrary, so this isn’t necessarily more accurate, but it’s no less so. The irony of the fact that Kirsch et al used the American Psychiatric Association’s own criteria to skewer modern psychiatry isn’t lost on me and probably wasn’t lost on them either.

But where did the APA get their system from? This is the most extraordinary thing. Here’s the paper they based their approach on. It’s a 1982 British study by Kearns et al. The authors wanted to see how the HAMD compared to other depression scales. So they used lots of scales on the same bunch of depressed patients and compared them to each other, and to their own judgments of severity. Here’s what they found:

You’ll recognize the APA’s categories, kind of, but they’re all shifted. Why? We can only guess. Here’s my guess. The scores in that Kearns et al graph were the average HAMD scores of people who fell into each severity band. The APA must have decided that they could use these to create cutoffs for severity.

How? It’s not at all clear. The mean score for “Moderate” was 18, but that’s the top end of Moderate in the APA’s book; ditto for “Mild”. The average “Very Severe” was 30 and the average “Severe” was 21, so the cut-off should have been 25 or 26 if you just went for the midpoint. In fact the APA went with 23. And so on.
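To make the midpoint arithmetic concrete, here is a minimal Python sketch. It uses only the three mean scores quoted above; the function name `midpoint_cutoffs` and the whole idea of deriving cutoffs this way are illustrative assumptions, not a method described by Kearns et al or the APA.

```python
# Mean HAMD scores per clinician-rated severity band, as quoted in this post.
# (The mean for "Mild" is not given here, so it is left out.)
kearns_means = {"Moderate": 18, "Severe": 21, "Very Severe": 30}


def midpoint_cutoffs(means):
    """Hypothetical helper: put a cutoff halfway between adjacent band means."""
    ordered = sorted(means.items(), key=lambda item: item[1])
    cutoffs = {}
    for (lower, lower_mean), (upper, upper_mean) in zip(ordered, ordered[1:]):
        cutoffs[f"{lower}/{upper}"] = (lower_mean + upper_mean) / 2
    return cutoffs


cutoffs = midpoint_cutoffs(kearns_means)
for boundary, value in cutoffs.items():
    print(boundary, value)
```

On these numbers the Severe/Very Severe midpoint is 25.5, against the 23 the APA actually chose, and the Moderate/Severe midpoint is 19.5.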

That’s before we get into the question of whether you should be using these results to make cutoffs at all (you shouldn’t). And the APA seem to have ignored the fact that the HAMD did not statistically significantly distinguish between “Severe” and “Moderate” depression anyway (p=0.1). Kearns et al’s graph shows that other scales, like the Melancholia Subscale (“MS”), would be better. But everyone’s been using the HAMD for the past 50 years regardless.

In Summary: Interpreting the Hamilton Scale is a minefield of controversy and the HAMD is far from a perfect scale of depression. Yet almost everything we know about depression and its treatment relies on the HAMD. Don’t believe everything you read.

Kriston, L., & von Wolff, A. (2010). Not as golden as standards should be: Interpretation of the Hamilton Rating Scale for Depression. Journal of Affective Disorders. DOI: 10.1016/j.jad.2010.07.011

Kearns, N., Cruickshank, C., McGuigan, K., Riley, S., Shaw, S., & Snaith, R. (1982). A comparison of depression rating scales. The British Journal of Psychiatry, 141 (1), 45-49. DOI: 10.1192/bjp.141.1.45


12 August 2010 14:20

Anonymous said…

Interesting. I have to say I’ve always wondered about trials that rely solely on HAMD scores, as I’m not at all convinced about using only one measure of depression (and I’m sure I’ve read papers about dubious concurrence between self and other ratings of things like depression). Saying that though, I’m not sure there are any great mood rating scales. All of them are problematic in some way, especially when you get into trying to assign severity labels.

12 August 2010 17:14

Marcos Hardy said…

Thank you for your postings (this and all).
DSM-IV is driven by PhRMA. It is so over-inclusive, and DSM-V will be more so, that everybody (and I mean everybody) is or will be seen as mentally ill at some point in their lives, and if they are unfortunate enough to be within earshot of a physician (that prescribe antidepressants more often than psychiatrists), they will be medicated. By now the majority of the adult population in the USA is. Of need, APA’s HAMD scale is tilted towards magnifying the severity of depressions, loudly claiming for a pill to treat from sadness and grief all the way to melancholias. PhRMA requires the amplification of the patient base.
As you well point out, not all depressions respond equally to medications. That is because not all depressions are the same. It is not only the severity that determines the response to a pharmacological intervention, but also the roots of the depression: Is there childhood trauma? How old is the depressed patient? Man, child or woman? And so on. Based on the flimsiest of evidence, and contradicting every single known tenet of the neurobiology of depression, it is common to prescribe 2 antidepressants in the so-called “non-responders,” particularly in the “severe” ones. Severity is not only in the eyes of the scale employed, but also in the subjectivity of the rater.

12 August 2010 21:55

Bernard Carroll said…

Max Hamilton would surely turn in his grave if he knew how his scale is now being used. Max was professor of psychiatry in Leeds, U.K., and he was firmly grounded in clinical science. For starters, his original scale has been stretched from 17 rated items to 28 at my last count, but the tinkerers haven’t done the hard work of validating the add-ons. Hamilton emphasized that his scale is not a diagnostic tool, just a severity measure once the diagnosis is made by a clinician.

As for scoring ranges, my impression is that the relationship between global severity and Hamilton scale score is sigmoid rather than linear. That is why scores above 35 are rare. That is also why a score of just 6 or 7 can be an ominous signal in a person who earlier achieved remission with a score of 0 or 1. Context is everything.

13 August 2010 02:15

pj said…

“if they are unfortunate enough to be within earshot of a physician (that prescribe antidepressants more often than psychiatrists), they will be medicated.”

I’ve recently gone into a medical hospital and audited their antidepressant use – suffice to say that, as the literature generally shows – physicians are very bad at even recognising depression and unlikely to treat it even when they do. This is due to a belief that medical patients are justifiably depressed.

13 August 2010 21:02

Neuroskeptic said…

Personally, I’m impressed with the Bech Melancholia Sub-scale of the HAMD.

“Depressed mood”, “Guilt”, “Impaired Work and Activities”, “Psychomotor Retardation”, “Psychic Anxiety” and “Somatic Symptoms, General” (which is tiredness, loss of energy etc.)

It distinguishes between severity levels better, and antidepressants work better vs. placebo using the sub-scale than they do on the full scale HAMD (which is evidence against the idea that antidepressants only work by reducing anxiety & insomnia…).

14 August 2010 18:34

saltedlithium said…

I’ll have to ask my doctor which version is used in Canada. If we don’t have our own ‘Thing’ we usually mix and match with the UK and American models. In this case it seems like it could get a little weird.

And the Furukawa model, what’s “asymptomatic” depression? I have all the criteria to be depressed, but it hasn’t manifested? Doesn’t that just diagnose me as being born?

15 August 2010 06:56

Anonymous said…

This is very interesting. I’ve read quite a few articles that state the average patient diagnosed with depression has severe depression. It always seemed a bit implausible to me, as the term severe would seem to imply a major impairment in functioning, and provided that the prevalence estimates are correct, there should be a greater disease burden.
This sort of “severity inflation” can’t be good for convincing society that depression is a serious disease. When I hear very severe depression, I think of depression so severe that the person no longer functions and is either hospitalized or close to being hospitalized. Classifying someone who is still functional (though may find that functioning is more difficult than usual) as very severely depressed tends to reinforce the idea that depression is not a serious disease.

18 August 2010 06:33

Neuroskeptic said…

Right, that’s a very good point.

The thing about the HAMD is that even a quite low score could indicate very severe impairment, depending on which items were contributing.

Conversely, you can get a “very high” score of say 26, simply by scoring 1/2 of the maximum on each question. So that would mean mild insomnia, some anxiety, reduced appetite but “food intake about normal”, loss of interest in work and hobbies (but not an actual decrease in productivity), etc.