Exercise therapy for chronic fatigue syndrome
Lillebeth Larun, Kjetil G Brurberg, Jan Odgaard-Jensen, and Jonathan R Price
Editorial group: Cochrane Common Mental Disorders Group
What has gone on before:
The original version of the Cochrane review claimed that graded exercise therapy was effective and safe for people with ME. It overwhelmingly relied on subjective measures from a small handful of researchers who typically work together and whose studies consistently replicate the same methodological flaws. The issues with the original review were as follows:
- Five of eight studies used the Oxford Criteria.
The Agency for Healthcare Research and Quality in the US (AHRQ), in their own reanalysis of GET and CBT for ME, stated that the Oxford definition was so broad that it would include patients with other diseases and should be retired. The NIH Pathways to Prevention Report urged that the Oxford criteria be retired, stating that use of the criteria “could impair progress and cause harm”.
- The review rated the PACE trial as ‘low bias’.
Even setting aside the numerous and bizarre stories surrounding PACE, its outcome-switching, cherry-picking, and sole use of subjective measures (along with the omission of objective data that had been gathered) make this rating insupportable.
- All other studies also solely relied on subjective measures and were open-label (unblinded).
The review ignored objective outcomes from exercise interventions, which have generally failed to confirm subjective reports of benefits. Read more about why open-label trials are subject to bias from both subjects and clinicians/researchers here.
Cochrane editors asked for the review’s authors, including lead author Lillebeth Larun, to respond to the concerns outlined above (submitted as a complaint by Robert Courtney) before it published a re-submission. Read more about this here.
In response, Reuters released an article claiming that the review was withdrawn due to patient pressure — a dramatic narrative with which the ME community is all too familiar. The message Cochrane released explicitly stated, however, that the reconsideration was due to the quality of the cited research.
Over 40 scientists and clinicians signed a letter thanking Cochrane for doing the right thing.
The Cochrane review had been cited as part of the evidence base for recommending CBT and GET by the Danish Parliament. However, in March of this year they reversed their decision, unanimously agreeing to recognise myalgic encephalomyelitis (ME) as a distinct disease, removing it from the “functional somatic syndromes” category, and promoting the World Health Organization (WHO) diagnostic codes for ME.
This shows that the concerns stakeholders raised about Cochrane were enough to cause some to look more critically at the evidence.
Cochrane’s review re-issued
Exercise therapy for chronic fatigue syndrome, by Lillebeth Larun, Kjetil G Brurberg, Jan Odgaard-Jensen, and Jonathan R Price under the Cochrane Common Mental Disorders editorial group, was released on October 2.
A lack of newer studies and evidence (and omission of unfavorable evidence) presents an obstacle to clarity and completeness.
The original report included studies from 2014 or earlier, notably omitting the evidence from the revised AHRQ report on CBT and GET efficacy, the IOM Report and its conclusions, and the Pathways to Prevention document from NIH which, while released in December 2014, has a 2015 publication date. The new version of the report cites the IOM Report and the Pathways to Prevention report, but these do not appear to inform their conclusions.
Moreover, some studies were left out of the mix despite being well within the timeframe: the Núñez study (2011) found that a course of GET and CBT left people with ME with worse physical function on SF-36 and in greater pain than the control group after one year. It was not included in the analysis. Stordeur et al. (2008) analysed the effectiveness of CBT and GET using data from the Belgian CFS Knowledge Centers. This was a large study that found no objective improvements after CBT and GET using exercise testing (VO2 max). They also found that fewer people were able to work and more people were receiving illness benefits after undertaking these therapies, implying a loss of physical functioning. This study was not included in the analysis.
Who was included? Who was missing?
The Cochrane report notably mentions that they used solely Fukuda and Oxford diagnostic criteria and emphasizes that the studies they considered de facto only included patients who were “able” to participate in exercise therapy. Oxford solely relies on the symptom of chronic fatigue in the absence of other diagnoses that may explain it. These criteria were used in approximately ⅔ of the included studies.
What symptoms were included? What symptoms were missing?
Oxford neither requires nor acknowledges post-exertional malaise as a symptom. Fukuda does list PEM as one of the potential eight symptoms that might help diagnose CFS, but it does not require PEM for diagnosis. It perhaps goes without saying that no researcher who uses CCC, ICC, or IOM definitions of the disease — all of which require PEM for diagnosis — would propose employing exercise as part of a therapeutic regimen.
Thus, it’s impossible to urge reviewers to include the consensus criteria or the IOM criteria here. No studies on the therapeutic effect of exercise exist that use these criteria, and for good reason. The ‘decision’ to only include Oxford- and Fukuda-based studies and predominantly to use Oxford was not a decision at all.
All the same, even Fukuda has other symptoms listed as potentially meaningful in ME besides fatigue, pain, physical functioning, quality of life, depression, and sleep; in fact, major depressive disorder is exclusionary in Fukuda. Why did these studies that used Fukuda not measure the symptoms associated with these very simplistic criteria, such as PEM, issues with memory and concentration, headache, and sore throat or tender lymph nodes? These are part of the presentation of the illness, after all. The simplification of already-simplistic criteria within the included studies certainly reads as a misunderstanding of the nature of the disease or an exclusive focus on specific aspects of patient experience.
Cochrane reports that selection bias was low.
Selection bias is the bias introduced by how researchers include or do not include patients in their studies. For example, if a study solely reached out to healthcare advocacy organizations, they might find that they had a more severely-affected patient population than if they searched solely through general medical practitioners’ offices. Cochrane reported that selection bias in the studies they included is low.
It perhaps goes without saying that judgment of selection bias is dependent on how the reviewers believe a population ought to be defined. Selecting solely patients who can exercise to some degree may result in an extreme selection bias to less severely-affected individuals, or possibly to those who have been misdiagnosed. We strongly disagree with Cochrane’s statement that there is low selection bias in the included studies.
Cochrane reports that performance bias is high.
Subjects may report their improvement or lack thereof differently, depending on whether they are in the treatment group or the placebo group. This is especially important given that, in the PACE trial, patients were sent promotional materials over the course of the study, and new patients were recruited through materials that may have biased them about the effectiveness of the therapies in question. We agree that performance bias is high in the selected studies.
Cochrane reports that detection bias is high.
Researchers who know which group subjects are part of may monitor their well-being differently, group to group. Cochrane reported that the studies they considered were likely to be biased in this manner. We agree that detection bias is high in the selected studies.
The nature of control and comparison groups introduces detection bias and performance bias.
In particular, one of the unstated/unexplored sources of detection bias and performance bias is the treatment of CBT, exercise, and pacing groups in the included studies. In some studies, pacing referred to “adaptive pacing”, a technique that still encouraged subjects to increase activity gradually over time, making it definitively an “exercise therapy” rather than pacing as patients, clinicians, and researchers typically define it.
In other studies, “standard care” was used as a control group. However, standard care in this sense meant no treatment or clinical attention beyond answering questionnaires. Such a control group would strongly favor the placebo effect in the experimental group, particularly in studies that solely use subjective measures as indicators of success.
Finally, in at least one study, the control group was made up of patients on the waiting list. It should be clear that this is not an appropriate comparison group and that once again, the placebo effect would heavily bias the experimental group.
The most appropriate control group is likely the “flexibility/relaxation” group, which still received clinical attention and regular check-ins but without ramping up activity over time, which is, after all, the intervention being tested.
The “allegiance effect” points to significant researcher and clinician bias.
Psychotherapeutic treatments and treatments based on presumed psychiatric etiology are especially susceptible to the allegiance effect, in which the treatment favored by the investigator “tends to produce the superior outcome” (Westen, Novotny, & Thompson-Brenner, 2005). While it stands to reason that investigators will pursue the topics in which they are most invested, the consequences of the allegiance effect cannot be discounted, especially in the cases where the final conclusions laid out by the authors do not appear to be congruent with their data.
You can read an amusing paper that discusses the allegiance effect and its consequences for empirically-supported therapies here, or read a meta-analysis on the allegiance effect in psychotherapy by Luborsky et al. (2006). Part of the issue of allegiance bias in psychological therapies in particular is that measures are often subjective, and the clinician may unconsciously prod the subject to respond to their favored therapy or not respond to a therapy they consider ineffective. For example, a therapist’s confidence in their therapy of choice is almost certainly perceptible to the patient, even when this is not overtly advertised.
A lack of follow-up further weights the placebo effect and introduces additional issues.
The majority of the included studies examined short-term effects and did not inquire after harms. For example, only one study looked for adverse reactions to the therapies. Only one study with 43 subjects looked at pain, using the Brief Pain Inventory in comparison with a control group and a CBT group. Physical functioning scores all came from the SF-36 rather than from objective measures such as actimetry; we know that in at least one study, the objective actimeter data was thrown out, presumably for its lack of congruency with favored outcomes. And quality of life was only measured in one study, in which those who exercised had lower quality of life than both control and CBT groups.
The longest follow-up was for approximately 1.5 years. True, long-term follow-up is important in all therapies, but particularly in CBT, in which transitory improvement is not the exception but the rule. Westen et al. (2005) said of a meta-analysis of CBT in depression,
“Numerous studies have shown that CBT and IPT (and a number of lesser known brands) produce initial outcomes comparable with those obtained with medications. Over the course of 3 years, however, patients who receive these 16-session psychotherapies relapse at unacceptably high rates…” (Westen et al., 2005).
Chalder’s own long-term follow-up, also omitted, found that the subjective reports of improvement had disappeared. This is also in accordance with the Belgian long-term follow-up discussed above.
Symptomatic effects and overall well-being
Cochrane also judged whether or not GET and CBT were useful in mitigating symptoms associated with fatiguing illness. In order to do so, they separately examined studies that compared exercise groups to control groups; CBT and exercise groups; studies that compared exercise groups to adaptive pacing therapy; and studies that examined exercise and antidepressant use. Conclusions rated of moderate certainty or higher are boldfaced.
Let’s take a look at our pieces of evidence rated ‘moderate’:
- Exercise therapy probably reduces fatigue (short-term, in comparison to control group).
- Exercise therapy probably makes little to no difference in fatigue (long-term, in comparison to CBT).
- Exercise therapy probably has little to no effect on depression (long-term, in comparison to CBT).
Note that the negative findings are for longer-term studies, which again jibes with studies on long-term follow-up for CBT and GET. The positive finding is for a study of just a few months. Recall that harms, a major concern in people with ME following exercise, are rarely measured, such that the Cochrane review shows either ‘very low’ or no evidence for adverse reactions, pain, or pain intensity.
We are left mystified, therefore, that the conclusion Cochrane states is that “Exercise therapy probably has a positive effect on fatigue in adults with CFS compared to usual care or passive therapies”, although they do at least add that they are ‘uncertain’ if the effect persists in the long-term.
Garbage in, garbage out
There is, unfortunately, a lot more in the review that is inadequate, incorrect, or so heavily biased that it’s indistinguishable from inadequate or incorrect. However, in many ways the committee has been constrained by the therapy they have chosen to examine.
When confronted with increasing weakness, it’s logical to decide to ramp up activity over time. Despite post-exertional malaise, people with ME may make the attempt several times before realizing that increased activity does not pave the way to increased capacity as it did when they were well. Rather, it leads to further impairment.
To believe that increased exercise is an effective therapy worth testing in a clinical trial, researchers and clinicians must believe that patients’ symptoms are either incorrect, imagined, or immaterial. This de facto leads to theories that the patient doesn’t know what’s best for him or her, even when it comes to the most basic self-interrogation and self-care. Deconditioning, “fear of movement”, and central sensitization as explanations for exercise’s potential success are all built on the foundation of this disbelief and dismissal.
It is no wonder that an analysis of studies built on this foundation will showcase not only a very narrow range of the world’s ME research, but highlight some of the most dismissive, belief-based, and biased work in the field.
Why perform a Cochrane review on exercise therapies in ME at all? Perhaps because it’s the only treatment with significant research to support it, no matter how poor. Our government institutions’ lack of funding and oversight has left a vacuum in which therapies solely supported by subjective measures of wellness have the greatest number of publications and therefore seem ripe for review. Our government institutions must step up their investment in medical research and clinical trials for ME that are not based on a denial of the disease process.
Because unfortunately, Cochrane can’t do a meta-analysis on research that doesn’t yet exist.