Cort Johnson Interviews Tom Kindlon (Part Three)
In early 2012, Cort Johnson began an interview with the Irish ME Association Officer and peer-review published author, Tom Kindlon. This interview project was revived and edited by Stukindawski in early 2013. The interview is in three parts, with an introduction by Stukindawski, and the other parts can be accessed by clicking the following links:-
In this third and final part of the interview, Cort and Tom discuss the effect of the UK-based PACE Trial on government views about CBT. Tom describes how the positive media coverage of the trial seems at odds with the substance of the paper itself. Finally, Tom’s area of expertise and subject of his paper: “Reporting of Harms Associated with Graded Exercise Therapy and Cognitive Behavioural Therapy in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome”, is up for discussion. In particular, Cort and Tom look at the importance and relevance of patient surveys when considering the safety of CBT/GET for ME/CFS patients.
CORT: Do you have any sense of how differently the government views CBT/GET after the PACE Trial? The trial was a huge deal – the UK government spent an enormous amount of money on it – and they clearly hoped it would validate their focus on these therapies, yet the results suggested the therapies were not very effective; certainly not worth putting a lot of money into.
TOM: (Note: this answer was written before the latest PACE Trial paper (McCrone et al.) came out but my views haven’t changed – I’ve added a bit on the new information at the end of the answer.)
I’m not sure if the PACE Trial has changed views of CBT/GET much.
The media coverage gave the impression proponents of such therapies would be happy with the results. For example, “Talking And Exercise Could Cure ME – Study” (Sky News) and “Study finds therapy and exercise best for ME” (Guardian). The latter newspaper included a quote from Trudie Chalder saying “twice as many people on graded exercise therapy and cognitive behaviour therapy got back to normal” compared with those in the other two treatment groups. This sort of coverage wasn’t restricted to the lay media: in the accompanying editorial in the Lancet, Drs. Bleijenberg & Knoop said a “strict criterion” for recovery was used and “in accordance with this criterion, the recovery rate of cognitive behaviour therapy and graded exercise therapy was about 30%”. The BMJ in its “All you need to read in the other general journals” section said: “Less than a third of patients were cured by either treatment – (30% (44/148) after CBT and 28% (43/154) after graded exercise therapy)”.
One problem, for anyone who would not agree with these impressions, was that a lot of the media interviews had been arranged before most people had received a copy of the paper. Eventually, quite a lot of people (44) did submit formal letters to the Lancet in response to the trial (most of these responses were released and virtually all of them were critical). In the end, the Lancet published eight of the letters (including one from me), which made a variety of points about the trial. However, letters to the Lancet can use only a maximum of 250 words and 5 references so what could be said was somewhat limited. Also, even those who read the published letters in the Lancet (only a small percentage of those who would have seen the original coverage) might be swayed by the authors, who got the final word, along with an editorial in the same issue which defended the trial and criticised critics of the trial.
A lot of the problems with the trial possibly only become apparent when one knows the field well and also, importantly, has read the published protocol and seen the changes that were made. If one doesn’t know the ME/CFS literature well, one may not be aware of the problems that the lack of objective outcome measures, for example, can cause – that research has shown improvements on questionnaires don’t necessarily mean a higher level of activity has been achieved. If one hasn’t read the PACE Trial’s protocol, which most people won’t have done, one won’t know that the authors had a definition for recovery that was much more stringent than the post hoc definitions of normal fatigue and functioning reported on in the Lancet: [“‘Recovery’ will be defined by meeting all four of the following criteria: (i) a Chalder Fatigue Questionnaire score of 3 or less, (ii) SF-36 physical Function score of 85 or above [47,48], (iii) a CGI score of 1, and (iv) the participant no longer meets Oxford criteria for CFS, CDC criteria for CFS or the London criteria for ME”]. In comparison, normal fatigue and functioning was defined as having a Chalder (Likert scoring) of 18 or less and a SF-36 physical functioning score of 60 or more. And with all the information that people deal with in their lives, most people will probably never reflect that the Chalder and SF-36 thresholds are nothing like what a healthy person would expect to be if they were “back to normal”/“recovered”/“cured” – for example, the population norms used for the SF-36 included elderly adults as well as those of work age with disabilities or other disabling illnesses. Also, the new “normal” levels for fatigue and physical functioning were so broad that somebody could deteriorate from their initial levels (i.e. before starting a therapy) and still be counted as having normal levels! Tate Mitchell has illustrated this visually in Figure 1 at:
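To make the gap between the two definitions concrete, here is a minimal sketch in Python. The thresholds are the ones quoted above; the class, the function names and the example participant are invented for illustration and are not taken from the trial's own analysis. Note that the protocol criterion uses a Chalder score of 3 or less while the post hoc one uses Likert scoring of the same questionnaire, so the two scores are kept as separate fields.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    chalder_protocol: int      # Chalder score as used in the protocol criterion (3 or less)
    chalder_likert: int        # Chalder score with Likert scoring (18 or less)
    sf36_physical: int         # SF-36 physical function score
    cgi: int                   # clinical global impression score
    meets_case_criteria: bool  # still meets Oxford, CDC or London criteria


def protocol_recovery(o: Outcome) -> bool:
    """Recovery as quoted from the protocol: all four criteria must hold."""
    return (
        o.chalder_protocol <= 3
        and o.sf36_physical >= 85
        and o.cgi == 1
        and not o.meets_case_criteria
    )


def post_hoc_normal(o: Outcome) -> bool:
    """Post hoc 'normal' fatigue and functioning as reported in the Lancet."""
    return o.chalder_likert <= 18 and o.sf36_physical >= 60


# Hypothetical participant: SF-36 physical function falls from 65 to 60.
baseline_sf36 = 65
follow_up = Outcome(
    chalder_protocol=7,
    chalder_likert=18,
    sf36_physical=60,
    cgi=4,
    meets_case_criteria=True,
)

print("Deteriorated on SF-36:", follow_up.sf36_physical < baseline_sf36)
print("Counts as 'normal' (post hoc):", post_hoc_normal(follow_up))
print("Counts as recovered (protocol):", protocol_recovery(follow_up))
```

In this invented case the participant's physical function score has dropped, yet the post hoc thresholds still class them as within the normal range, while the protocol definition does not come close to being met.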
There is a possibility that attitudes might change to the PACE Trial results if there is more reflection on the 6 Minute Walking Test (6MWT), the only objective* measure that was published. There was no difference between the CBT group and the control group on this measure. The GET group did reach, on average, 35m further (after the authors made some adjustments based on some baseline factors which could affect the change). However, the final average figure was still only 379m. The interesting thing about this statistic is that we have data to compare it with other disease groups and the final results suggest that the CFS group remained very ill (see, for example, Figure 2 below prepared by biophile, a member of Phoenix Rising). Indeed, as I pointed out in the paper in the Bulletin of the IACFS/ME, using figures from population norms, one would expect a group of average height, with an average age of 39 and 77% female, to walk an average distance of 644m (vs 379m).
Figure 2: (Biophile, a member of Phoenix Rising, created this informative figure)
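As a rough check on the scale of these walking test figures, the short Python sketch below works through the arithmetic using only the three numbers quoted above (379m, 644m and 35m). The variable names are illustrative, and this is a back-of-envelope comparison rather than a statistical analysis.

```python
observed_m = 379   # final average distance for the GET group
expected_m = 644   # distance predicted from population norms
gain_m = 35        # adjusted extra distance for GET versus the control group

shortfall_m = expected_m - observed_m
pct_of_expected = 100 * observed_m / expected_m
gain_pct_of_expected = 100 * gain_m / expected_m

print(f"Shortfall versus norm: {shortfall_m} m")
print(f"Observed as share of norm: {pct_of_expected:.1f}%")
print(f"GET gain as share of norm: {gain_pct_of_expected:.1f}%")
```

On these figures the GET group finished at a little under 59% of the distance expected from population norms, and the 35m difference amounts to roughly 5% of that expected distance.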
The theoretical model used to justify this particular form of CBT and GET doesn’t suggest merely that such interventions should provide some degree of palliative relief, but rather predicts that they should be able to actually reverse the condition. The very modest therapeutic results from this model strongly suggest otherwise—that the underlying model is false, for most patients anyway [we are not given full information on the spread of the results; perhaps a small percentage of the (Oxford criteria) CFS patients did improve relatively dramatically with GET, with the results of the others comparable to the controls].
However, once established, health systems can be slow to change – there are dozens of services in the UK (mainly England) based on the CBT/GET-model for the illness. These involve hundreds of individuals who can influence people through further publications, education courses, media comments, etc. An interesting comparison is Belgium, which had an external audit of its five rehabilitation Chronic Fatigue Syndrome clinics that use CBT/GET. Unfortunately, the audit is only available in Dutch or French – but I did French for eight years in school (and did another three years of night classes afterwards) so was able to read the audit report. A subsequent (mainly English language) report summarised the results of the rehab clinics: “Psychological problems or psychiatric co-morbidities improved, but still fell outside the range of healthy adults. Physical capacity did not change; employment status decreased at the end of the therapy” and “no patient was considered to be totally cured.” One might think these results would lead the authorities to question whether such a rehabilitation approach should be the primary method of treating the condition (as opposed to being, say, an adjunct therapy offered to some). However, from what I hear, things haven’t changed in Belgium following the audit. There seems to be a similar situation occurring in the Netherlands, with CBT having a dominant role in the health system, but I don’t have a good handle on everything that happens there as I don’t speak Dutch. If I had more time, I would write a paper critiquing the definitions of recovery/full recovery that have been used by Dutch researchers as they seem very questionable. For example, one study claimed 37% of patients had recovered (as defined by scores on two subjective measurements) following CBT yet the total average hours worked only increased from 9.4 hours to 11.4 hours.
I had a letter published in the Lancet recently questioning the definition of recovery they used in the FITNET Trial. It would be interesting if external audits of the services in the UK were performed, ideally using more objective measurements like those used in Belgium. The clinical trial environment can be an artificial one – such empirical evidence might have more of an effect on attitudes of those in charge of health budgets.
One thing to remember with all this data is that the more severely affected have invariably been excluded from trials of CBT and GET so we know little about this group. Most patient groups, I think it’s true to say, think the more severely affected should be extra careful with any programs that encourage the scheduling of increased activity. Some physicians have also commented on this issue. Dr. Darrel Ho-Yen, a microbiologist who published several papers in the past and wrote the popular book, “Better Recovery from viral illness”, says he does “not recommend an emphasis on greater activity until a patient feels 80% normal”. Another doctor with similar views is Dr. Martin Lerner who, along with treating many patients, had personal experience of the condition but is now better. He “prohibits” exercise until CFS patients reach 7 or more on his Energy Index Point Score. This is a high score – if you score more than 5 you are in recovery. A score of 7 means an individual who can work a sedentary 40-hour per week job, who does not need to nap during the day, is up from 7AM to 9PM, and does light housekeeping. He says, “if you exercise before that you’re going to go backwards”.
To me, this is a very important issue. Doctors and other healthcare workers will so often recommend exercise and activity programs to their patients. I think there needs to be a big emphasis on finding the patients for whom such programs could be dangerous. It is not good enough in my mind if a program “on average” helps patients, if some are left much worse by it. Much more care is taken with the prescription of drugs.
As I mentioned above, I wrote the above before the latest paper (McCrone et al, 2012) came out. It has not changed my views; whether it will change views of those in the medical establishment in the UK, I’m not sure. The media again did not communicate the poor results on many measures; the coverage was generally uncritical using the view presented from the press release that GET and CBT were cost-effective. The cost savings largely come from savings from informal care (i.e. given by family and friends): participants were asked how many weeks on average they utilised such care over the past six months – there were significant savings in the CBT and GET groups particularly when “adjusted” figures were used (the unadjusted figures were much smaller; it is unclear to me at this stage what they adjusted for). The authors used the national mean earnings to calculate the value of such informal care; other figures such as the minimum wage, or even the median wage (which would be influenced less by very high wages), would give lower values. On other measures, GET and CBT did poorly. For example, neither CBT nor GET led to an improved rate of days of lost employment [Means (standard deviations): APT: 148.6 (109.2); CBT: 151.0 (108.2); GET: 144.5 (109.4); SMC (alone): 141.7 (107.5)] (Table 2) (2). Neither CBT nor GET led to improvements in numbers receiving welfare benefits or other financial payments – indeed there were deteriorations across the board (Table 4). The Department for Work and Pensions (DWP) funded this trial – I believe this may be the first trial they ever funded. They did this presumably with the hope or expectation that the therapies would lead to improvements in employment levels and need for benefits, rather than other measures, so from their point of view the trialled therapies would not be successes.
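The lost-employment figures quoted above can be laid side by side with a few lines of Python. This is only a descriptive sketch: it uses the means and standard deviations given in the text, compares each therapy group with SMC alone, and does not attempt any significance test (group sizes are not given here). The dictionary layout and names are illustrative.

```python
# Days of lost employment: (mean, standard deviation) as quoted from Table 2.
lost_days = {
    "APT": (148.6, 109.2),
    "CBT": (151.0, 108.2),
    "GET": (144.5, 109.4),
    "SMC": (141.7, 107.5),
}

smc_mean, smc_sd = lost_days["SMC"]

# Difference in mean lost days versus SMC alone (positive = more days lost).
diff_vs_smc = {
    group: round(mean - smc_mean, 1)
    for group, (mean, _sd) in lost_days.items()
    if group != "SMC"
}

for group, diff in diff_vs_smc.items():
    print(f"{group}: {diff:+.1f} days versus SMC alone (SMC SD = {smc_sd})")
```

Every added therapy shows a slightly higher mean number of lost days than SMC alone, and each difference is small next to standard deviations of over 100 days.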
It’s also interesting to note that the cost effectiveness calculations that say CBT and GET are value for money do not involve the costs of welfare benefits. I could add a lot more but this response is already quite long; many people including myself have written comments, sometimes with multiple references, on this paper which can be read at this link
Since writing this, a team of members of Phoenix Rising has also produced an informative and educational site, “ME Analysis – evaluating the results of the PACE Trial”, along with some highlights on their Facebook page. My responses here have been quite long and detailed, while that site has different levels of depth depending on a reader’s preferences. Well worth checking out.
* Technically, there are some concerns about the 6 Minute Walking Test as an objective measure of change. For example, it is known that there is a training effect with the 6 Minute Walking Test – people do better on subsequent tests. The GET cohort were supposed to go for regular walks so should be better able to judge how fast they could walk; people in the control groups weren’t encouraged to walk and might go too slow or fast initially in the test and hence might “underperform” compared to the GET group. A related point is that no practice trials were used – in other populations, those being tested usually do the test once or twice before the recording starts (sample reference). Also, no measures of gaseous exchange were taken so one can’t be sure that everyone pushed themselves to the same extent. The GET group might be more willing to push themselves as they may have wanted to impress themselves and/or the tester, as their goals were related to exercise. They were also told to consider symptoms that might occur following exercise as a natural response to increased activity and hence might be less concerned about pushing themselves too hard. These all suggest that the distance of 35m could be (somewhat) artificial; the “real” difference between groups could be less.
CORT: You gathered together the results of patient surveys of CBT/GET effectiveness (see below, several of which are quite large). They show a therapy that helps some but is not as benign as CBT/GET researchers state. How different are they from study results? Do you believe they provide a better assessment of the harms possible in CBT/GET than the study evidence?
TOM: Firstly, we should not just assume that exercise is necessarily a safe process. This was highlighted in a general paper entitled “Dangerous exercise: lessons learned from dysregulated inflammatory responses to physical activity”: “exercise can cause harm and is associated with, for example, bronchoconstriction, chronic musculoskeletal injury, and on rare occasions, anaphylaxis and sudden death.” They recommend that “like pharmaceutical therapies, prescribing exercise as therapy, an activity that is gaining in acceptance throughout the medical community, must be predicated on understanding the risks and benefits of exercise as thoroughly as possible.”
Various studies have also shown that exertion can trigger symptoms; indeed exertion-related symptoms are seen as essential for all (?) definitions of M.E. and ME/CFS as well as generally being alluded to in CFS definitions. The interesting thing about this is that studies have found that symptoms correlate with objective blood measurements e.g. Richards et al., 2001 and Kennedy et al., 2005. We also, at this stage, have dozens of papers which report an abnormal physical response to exercise in the condition. A paper by Frank Twisk MBA BEd BEc and Michael Maes MD PhD is probably the most thorough in collating exercise abnormalities [“A review on cognitive behavorial therapy (CBT) and graded exercise therapy (GET) in myalgic encephalomyelitis (ME) / chronic fatigue syndrome (CFS): CBT/GET is not only ineffective and not evidence-based, but also potentially harmful for many patients with ME/CFS”].
Then there exists research by the Light team in Utah that specifically ties such findings together (i.e. abnormal symptoms and biological measurements following exercise): they looked at the effect of moderate exercise on patients with CFS and controls and found that, following the exercise, cytokine activity correlated with symptom flare in those with CFS (White et al., 2009); they also reported on some other correlations between symptoms and objective markers in another paper [“Moderate exercise increases expression for sensory, adrenergic, and immune genes in chronic fatigue syndrome patients but not in normal subjects” (Light et al., 2009)].
Another finding that would increase our suspicion that exercise programs might cause problems in ME/CFS is the changes seen on repeat exercise testing. The Pacific Fatigue Laboratory exercised CFS participants 24 hours apart and found there was a mean decline of 22% in VO2 max (VanNess et al., 2008). The research team said that this is unique and significantly different from what is found in other chronic diseases where VO2 max initially can be low but is reproducible to within a small number of percentage points on a repeat cardiopulmonary exercise test. This suggests it would be hard to find levels of exercise that could be done without problems. And indeed, when researchers have tried to find levels of activity that would not provoke symptoms, they have had difficulties e.g. Nijs et al., 2008 and Van Oosterwijck et al., 2010.
So such findings add to the credibility of the survey data. Also, it has previously been recognised by various reviews of published papers that the reporting of harms (which is the technical term for reporting possible adverse reactions and events) has been poor in ME/CFS trials of CBT and GET e.g. Cochrane Review by Price et al., 2008, Chambers et al., 2006 (which was the basis of the NICE 2007 guidelines in the UK) and a recent paper by Van Cauwenbergh et al., 2012. So it isn’t necessarily the case that the quality of such harms data (as opposed to data on efficacy measures) is vastly superior in trials. Also a Cochrane review of GET found there was a trend for a higher dropout rate in GET as compared to the control group which suggested participants may have had difficulties with the exercise programs [Edmonds et al., 2004].
The survey data I collated is from a variety of countries. I do think it is of use. While randomised controlled trials and the like are seen as the best evidence with regard to the efficacy of treatments, things are more complicated when one is looking at harms. With pharmacological agents, in the US there is post-marketing surveillance to look for adverse events and reactions that occur. In other countries there are schemes like the yellow card scheme where doctors, and sometimes patients, report adverse events that appear to be possibly associated with a drug. This information is collated centrally somewhere. This is in contrast to the situation with GET or graded activity-oriented CBT, where there is nobody collating such information. Patients who have had bad experiences may simply stop attending a particular practitioner. If they do report problems, there is a good chance the practitioner may not be particularly happy, particularly if they are a non-medical healthcare professional such as a physiotherapist/physical therapist who may have a few other interventions they can offer. If an apparent adverse reaction does manage to get recorded somewhere, it’s likely that it will be understood to mean that either the patients performed the intervention incorrectly and/or the individual practitioner did something wrong. It is unlikely to be considered that the therapy itself might be problematic given all the talk that GET and CBT had been proven to be “evidence-based therapies” in trials, even though, as I have pointed out, the reporting of harms has not been good in those trials.
I think it’s also important to explicitly point out that even if it were the case that the reporting of harms was fine in the published research, trials can take place in an artificial environment and not represent what happens in the “real world” in terms of clinical practice.
I do recognise that such survey data is not without its limitations. For example, as I mentioned, there can be different programs under the one heading of CBT and there may be different results for these variations and this isn’t made clear by the aggregate data. One would certainly ideally want more information about the adverse events experienced and how long they lasted – information from drugs would be much more detailed. However, the data is not that different from clinical global impression (CGI) scores that are often used as primary and secondary outcome measures in trials.
Prof. Peter White has previously pointed out that such survey data doesn’t give information on whether a suitable professional was involved in the therapy, specifically mentioning a 2003 survey by Action for M.E. showing this was often not the case. However, when one actually looks at the results of the 2003 survey, one sees that the sample size was very small and it’s far from clear that the survey shows the results were better for people who did do it under an appropriate professional (I can’t give a link to the results as I don’t believe the data is now online). Moreover, if one looks at the larger Action for ME/AYME survey from 2008 (additional information at this link), it asked patients under what circumstances they had done GET. Of those who had performed it under an NHS specialist, 31.27% (111 out of 355) reported being made worse by it; this was a similar percentage to those who reported the same outcome from GET in another scenario (33.02%, 70 out of 212).
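The two survey percentages above can be reproduced directly from the counts given. The Python sketch below does only that; the helper name `percent_worse` is made up for this example, and no claim is made about statistical significance.

```python
def percent_worse(worse: int, total: int) -> float:
    """Percentage of respondents who reported being made worse."""
    return round(100 * worse / total, 2)


# Counts quoted from the 2008 Action for ME/AYME survey.
nhs_specialist = percent_worse(111, 355)
other_setting = percent_worse(70, 212)
gap_points = round(other_setting - nhs_specialist, 2)

print(f"GET under an NHS specialist: {nhs_specialist}% made worse")
print(f"GET in another scenario:     {other_setting}% made worse")
print(f"Gap: {gap_points} percentage points")
```

The gap between the two settings comes out at under two percentage points, which is the sense in which the text calls the percentages similar.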
The survey data doesn’t show how compliant patients were with programs. However, even in trials, this information has generally not been collected e.g. actometer data. Sometimes exercise logs have been used but these generally have not been reported in trials.
Survey data doesn’t give information about the diagnosis of the patients. However it is far from clear that one could say that patients in trials are a more homogeneous group. A lot of the trials have used the very heterogeneous Oxford criteria which don’t require symptoms other than mental and physical fatigue. It also mentions excluding patients with neurological conditions, although it is unclear how many genuine M.E. patients were excluded by that criterion. One CBT trial used the Fukuda criteria except that the symptom requirements outside of fatigue were not used (Prins et al, 2001). It may be the case that graded activity-oriented CBT or GET are reasonably suitable for some sorts of “chronic fatigue”, or some individuals with chronic fatigue, but not others.
The survey data doesn’t include any biological measurements. It would be more convincing [that patients were made worse by a therapy] if some sort of objective measure had shown a deterioration. However, the same can be said for trials of CBT or GET, which have not employed biological measurements as outcome measures in terms of assessing what they show with regard to efficacy and harm. I think it is unclear which biological measurements would be the best to use at this stage – hopefully our understanding of the pathophysiology will improve in the coming years.
I would like to see more long-term data, both from trials and also surveys. I have come across people who are worse years after participating in an exercise program. It’s always hard to say that some sort of permanent damage has been done (as perhaps somebody might improve eventually), but, for whatever reason, there has been a long-term negative effect and this can ruin lives: if a drug had caused such an effect, questions might be raised as to whether it was appropriate to have it on the market (i.e. whether it should be withdrawn), at least until the effects could be studied further. Hopefully with a better understanding of the biology of the illness and the effects exercise can have, such long-term problems can be prevented in the future. Activity is a normal part of living; however, something abnormal happens with ME/CFS and we need to be able to spot when a particular level of activity is causing problems for patients, particularly if it could lead to long-term problems. Ideally, it would be great if some sort of equivalent to self-testing in diabetes could be developed for ME/CFS, which would show patients – and indeed professionals – that the patient involved needs to be careful or else they could develop long-term problems. In the meantime I believe patients and professionals need to be very careful about prescribing increasing activity and exercise, using as much caution as would be used with a powerful drug that can sometimes cause serious problems. Given the fact that there is some evidence that symptoms correlate with oxidative stress, they [symptoms] seem a useful way to decide whether a level of activity is causing problems. Basing activity levels on symptoms is one way to describe pacing and the survey data I collected showed the numbers/percentages reporting being made worse from such an approach were very low (i.e. pacing was reported as being “safe”).
The percentages reporting being made worse, particularly with GET, are striking. Something unusual clearly happens in exercise programs in ME/CFS. If one then adds in the dozens of papers that have found abnormal responses to exercise in the condition, I think one is building a strong case that exercise programs shouldn’t be prescribed as freely to ME/CFS as they would be to most other people in the world.
Acknowledgement: I would like to thank Lily Chu MD MSHS from whose comments on draft versions of my paper (that was published in the Bulletin of the IACFS/ME) I learned a lot. As I mentioned before elsewhere, all of the following also gave me useful input on the paper for which I’m grateful: Andrew Kewley, Clara Valverde, Deborah Waroff, Ellen Goudsmit, George Faulkner, Greg & Linda Crowhurst, Jane Colby, Janelle Wiley, Jennie Spotila, Joan Crawford, Karen M. Campbell, Karl Krysko, Kelly Latta, Pat Fero, Peter Kemp, Sean*, Simon McGrath & Susanna Agardy. I showed my draft response to these questions to a few of these so thanks again to those who helped with this, although I take responsibility for it.
* That’s all he wanted