A Guide to the 2006 Critical Assessment of Microarray Data Analysis (CAMDA) Conference (Integrating Laboratory, Gene Expression, Gene Mutation and Proteome Data)
The analyses of the 2003 CDC Wichita data did not stop with the publications in the journal Pharmacogenomics. While the CDC researchers and others took a stab at the data, the CDC apparently gave another group of independent researchers their shot at it. Unlike the Pharmacogenomics studies, the CAMDA effort was international in scope. Researchers from Finland, Canada, the U.K., Italy, Australia, Korea and the U.S. presented their findings in a series of papers at the CAMDA conference in Durham, North Carolina in June of 2006.
The CAMDA conference takes the form of a contest in which the presenters vie to produce the best paper using the same gene microarray data set, in this case one involving CFS. An enormous amount of very creative work resulted in 10 papers and three posters seeking to elucidate biological characteristics unique to CFS, provide a biomarker, and open new avenues for CFS research and treatment. They used much the same data set as did the Pharmacogenomics researchers: gene microarray data from about half the genome, gene mutation data from about 50 neuroendocrine genes, extensive laboratory data focusing on the endocrine system, and clinical data on fatigue, symptom severity and psychological parameters. Some researchers had data on protein levels as well. Proteome data – which provides information on current protein composition – is the analog to gene microarray data. These and the Pharmacogenomics studies make up the most complex efforts yet made to understand CFS.
These are conference reports, not published papers, and as such they are less finished than the Pharmacogenomics papers: the number of patients is not always noted, the gene data is not well annotated, etc. Some of the reports came from studies that were ongoing. Many will probably reappear as peer-reviewed papers in the future.
These researchers were not always as cognizant of the intricacies of CFS as the Pharmacogenomics authors – some collapsed the distinction between the CFS patients and the idiopathic fatigue patients, and one group plainly stated that their ignorance of CFS pathophysiology may have impeded their efforts. Unlike many of the researchers associated with the Pharmacogenomics studies, none of the CAMDA researchers had any expertise in CFS and none were associated with the CDC.
Not surprisingly, given the competitive nature of the conference, most of these studies were more complex and experimental than the Pharmacogenomics studies. Given the experimental nature of the effort, it was also not surprising to find that the results were mixed; some studies were real successes while others were only partially successful and a few were largely failures. One study purported to find a biomarker for CFS. A review of the conference presentations suggests that a great deal of data did not show up in the papers and we can expect more complete papers in the future.
Except for the winning paper, the summaries of these papers will be brief. (These papers and the presentations accompanying them can be found at http://www.camda.duke.edu/camda06/papers/. Microsoft PowerPoint will open the presentations.)
Many of these papers are very complex and difficult to understand. Clarifications are gratefully accepted.
Voted “BEST PAPER OF THE 2006 CAMDA CONFERENCE”
A BIOMARKER FOR CFS FOUND? Presson, A., Sobel, E., Papp, J., Lusis, A., and S. Horvath. 2006. Integration of genetic and genomic approaches for the analysis of Chronic Fatigue Syndrome.
This group’s goal was to find a potential gene expression biomarker in CFS. They took a much more rigorous approach to solving this problem than is usually done. Instead of simply evaluating which genes are more active in CFS, this group used a statistical process to create sets of genes called ‘network modules’ whose expression was correlated with each other. Within the modules (if I am reading this correctly), they identified the particular gene or genes that displayed high ‘connectivity’. These ‘hub genes’ were intimately involved in whatever unique type of gene activity was occurring in the CFS patients.
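To make the idea of 'connectivity' and 'hub genes' more concrete, here is a minimal sketch of how it could be computed. It is not the authors' code. The gene names and expression values are invented, and raising each correlation to a power of 6 is a common convention in weighted co-expression analysis that the conference summary does not specify.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity(expression, power=6):
    """For each gene, sum |correlation| ** power over all other genes.

    `power` is an assumed soft-threshold that suppresses weak correlations.
    """
    genes = list(expression)
    return {
        g: sum(abs(pearson(expression[g], expression[h])) ** power
               for h in genes if h != g)
        for g in genes
    }

# Toy data (invented): expression of four hypothetical genes in five samples.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 5.1],
    "geneC": [0.9, 2.2, 3.1, 3.9, 4.8],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],  # only loosely tracks the others
}

scores = connectivity(toy)
hub = max(scores, key=scores.get)  # the most connected gene is the 'hub'
print(hub, scores)
```

In this toy set geneA, geneB and geneC rise and fall together, so they behave like one module, while geneD sits on the edge of it with a much lower connectivity score.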
They found five modules of genes whose expression rose and fell in the CFS patients. In order to check if these modules made any difference in the disease they used a statistical program to determine if any symptoms associated with CFS were correlated with them. Four of the five modules were associated with at least one symptom commonly found in CFS.
These researchers focused on one module whose expression was associated with increased levels of abdominal pain and with overall symptom severity. When they examined this group of genes from a functional standpoint they found it contained many genes involved in nitrogen metabolism and muscle development.
They then examined the frequency of 40 mutations (single nucleotide polymorphisms, or SNPs) in several neuroendocrine genes to determine which, if any, contributed significantly to symptom severity. They found that a mutation in the tryptophan hydroxylase gene was associated with more severe symptoms in CFS. An examination of the gene modules found that the module associated with more severe symptoms was also most highly expressed in the patients carrying the tryptophan hydroxylase mutation. We have seen this gene before – researchers in the Pharmacogenomics studies also highlighted mutations in tryptophan hydroxylase. This enzyme carries out the rate-limiting step in the synthesis of serotonin, a neurotransmitter involved in pain perception, mood, libido, lung and gut functioning and more. An underactive enzyme would lead to decreased serotonin levels and vice versa.
At this point, then, this group has uncovered two factors associated with increased symptom severity in CFS: a set of genes and an inherited mutation in the gene encoding the enzyme that produces the neurotransmitter serotonin.
The next step was to find a potential gene biomarker. To ensure that the putative biomarker was valid they required that it pass three tests: it had to be associated with increased symptom severity, it had to be highly expressed in the subset of CFS patients carrying the tryptophan hydroxylase mutation, and it had to display high connectivity with other genes in the symptom severity module.
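The three tests can be read as a simple filter applied to every gene in the module. The sketch below illustrates that logic only. The gene names, numbers and cut-off values are hypothetical, since the summary does not say what thresholds the authors used.

```python
from dataclasses import dataclass

@dataclass
class GeneStats:
    name: str
    severity_corr: float      # correlation of expression with symptom severity
    expr_carriers: float      # mean expression in patients with the mutation
    expr_noncarriers: float   # mean expression in patients without it
    connectivity: float       # connectivity within the severity module (0 to 1)

def passes_screen(g, min_corr=0.3, min_connectivity=0.5):
    """Apply the three tests; both cut-offs are assumed values."""
    linked_to_severity = g.severity_corr >= min_corr
    higher_in_carriers = g.expr_carriers > g.expr_noncarriers
    well_connected = g.connectivity >= min_connectivity
    return linked_to_severity and higher_in_carriers and well_connected

# Invented candidates, for illustration only.
candidates = [
    GeneStats("gene1", 0.45, 8.2, 6.9, 0.80),  # passes all three tests
    GeneStats("gene2", 0.10, 8.0, 6.5, 0.90),  # fails: weak link to severity
    GeneStats("gene3", 0.50, 6.0, 7.1, 0.70),  # fails: lower in carriers
    GeneStats("gene4", 0.40, 7.5, 6.8, 0.20),  # fails: poorly connected
]

shortlist = [g.name for g in candidates if passes_screen(g)]
print(shortlist)
```

Requiring all three conditions at once is what makes the screen strict: a gene that is strongly tied to symptom severity but sits at the edge of the module is still dropped.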
Eight genes passed those tests and this group focused on one – the FOXN1 gene – apparently because of its biological plausibility. This gene, called the Forkhead Box gene, plays a role in T-cell development, and mutations in this gene have been shown to cause dysfunctional T-cells and an impaired immune response. This is an intriguing gene as we know that reduced levels of perforin, the main cytotoxic element in both T-cells and natural killer (NK) cells, occur in CFS. CFS patients also display increased T-cell activation – perhaps in response to impaired T-cell functioning. Could the low perforin levels in CFS be caused by a mutation in this gene? Or does this gene simply indicate increased T-cell activation because of a pathogen or a problem with regulating the immune system?
Despite the current emphasis on the HPA axis and neuroendocrine functioning, and all the conflicting immune studies, it seems that here we are again, back scrutinizing the immune system. As noted above, the FOXN1 gene is particularly interesting because it can be connected with one of the few consistently found immune abnormalities in CFS – impaired NK cell functioning. This gene’s association with a polymorphism that alters the rate of serotonin metabolism adds weight to the notion, now becoming fairly well established in the gene expression studies, that a disrupted neuro-immune interaction plays a role in CFS. This makes sense given all the multi-systemic symptoms in CFS. Serotonin is fascinating in its connections to brain induced fatigue, mood disorders and vascular problems, all of which may occur in CFS.
This gene may not be expressed in all CFS patients but it appears to be expressed in a subset of the most severely ill ones. It has not, to my knowledge, shown up in any of the gene expression studies. One very nice aspect of this gene is that, because genomic and antibody markers are available for it, it should be easy to study in CFS.
As noted above, this was only one of eight genes that this group thought might qualify as a biomarker, and several of the others were quite intriguing. One, peroxisomal biogenesis factor (PEX6), is involved in the neurological system and metabolism. Peroxisomes play important roles in detoxification and in fatty acid breakdown and there is increasing evidence of fatty acid problems in CFS. An upcoming issue of Phoenix Rising will focus on this subject. Several peroxisomal genes have been highlighted in the gene expression studies. Another, the peroxiredoxin 3 (PRDX3) gene, is involved in antioxidant activities and another, myelin expression factor 2 (MYEF2), is involved in nerve cell development. All could fit in, in one way or another, with various findings in CFS.
Dr. Vernon’s decision to focus on this paper in her presentation to the conference underscores its importance. There was some indication that she was going to bring additional evidence of the Forkhead Box gene’s validity as a biomarker to the conference. Her PowerPoint presentation suggested that she did but we can’t be sure as we don’t have the text to go with it. If she had confirming evidence, this study certainly would have been the highlight of both the Pharmacogenomics and CAMDA efforts.
Altered Genetic Networks in CFS
Kirova, R., Langston, M., Peng, X., Perkins, A. and E. Chessler. A systems genetic analysis of Chronic Fatigue Syndrome.
This group used the SNP, gene expression and proteome data to try to identify the ‘molecular networks’ present in CFS. Their aim was to determine if the genes in CFS patients, idiopathic fatigue patients and healthy controls interacted with each other differently. This group was not interested in single genes that were more highly expressed in CFS; they were looking at entire networks to see if the networks themselves were different.
For example, the increased expression of the gene for the receptor for cortisol is usually accompanied by the increased or decreased expression of other genes. This is partially because cortisol production causes the activation of a large number of immune and other genes. The ‘cortisol network’ plays an important role in the body’s response to stressful situations of all types. Each time it is invoked, for example, it should tell the immune system to shut down. The multi-systemic nature of CFS, with its immune, endocrine, neurological and other problems, could suggest that our bodies are not ‘networking’ properly. This study, at least as far as I understand it, examined the composition of some of the genetic networking going on in our bodies.
The initial examination of gene expression networks found that the CFS patients had different networks of strongly correlated genes than did the idiopathic fatigue patients or the healthy controls. This, of course, suggests different patterns of gene activation and that distinct disease processes are occurring in the CFS patients.
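One simple way to picture 'different networks of correlated genes' is to compute the correlation of every gene pair separately in each group and flag the pairs whose relationship changes. This is a rough sketch under that assumption, not the method the authors used, and all of the values are made up.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def differential_pairs(group_a, group_b, min_shift=1.0):
    """Gene pairs whose correlation differs between groups by >= min_shift.

    `min_shift` is an arbitrary illustrative cut-off.
    """
    shifted = []
    for g, h in combinations(sorted(group_a), 2):
        r_a = pearson(group_a[g], group_a[h])
        r_b = pearson(group_b[g], group_b[h])
        if abs(r_a - r_b) >= min_shift:
            shifted.append((g, h, round(r_a, 2), round(r_b, 2)))
    return shifted

# Toy data (invented): three hypothetical genes measured in four subjects.
patients = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.0],  # rises with gene1
    "gene3": [5.0, 3.0, 4.0, 6.0],
}
controls = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [8.0, 6.0, 4.0, 2.0],  # falls as gene1 rises
    "gene3": [5.0, 3.0, 4.0, 6.0],
}

rewired = differential_pairs(patients, controls)
print(rewired)
```

Note that none of the three genes needs to be more or less active on average for a pair to be flagged; only the relationship between the two genes has to change, which is the point of looking at networks rather than single genes.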
A correlation analysis of the protein, SNP and gene expression data allowed them to associate specific genes and proteins with gene mutations. Specifically, they were able to associate the increased production of three proteins with mutations in three neuroendocrine genes (COMT, CRHR1 and CRHR2) in CFS patients but not the others. This suggests that mutations in the genes for the corticotropin releasing hormone receptors and catechol-O-methyltransferase result in altered gene expression and protein production in CFS. We will see a similar scenario in the Lee paper below, which indicates that certain gene mutations result in different patterns of gene activation in CFS patients. Interestingly, Lee also found a unique gene expression network associated with a CRH polymorphism in CFS. We will see that both CRH and COMT show up again and again in the studies.
This group also found different gene expression networks in CFS patients with and without depression. When they associated neuroendocrine gene mutations with CFS phenotype (symptom) data they found that mutations in the NR3C1 and the TH genes were associated with depression. Mutations in four neuroendocrine genes (POMC, MAOA, MAOB, COMT) were associated with mostly physical symptoms while those in NR3C1 and CRHR1 were associated with mental symptoms. The authors believe CRH may be a central gene in CFS as it was associated with both physical and mental symptoms.
Corticotropin releasing hormone (factor) lies at the top of the HPA axis. During the stress response CRH production by the hypothalamus prompts the pituitary to produce ACTH, which in turn causes the adrenal glands to produce cortisol.
Lee, E., Seoae, C., and T. Park. Integration of expression data and genotype data: application of Chronic Fatigue Syndrome data.
The brevity and complexity of this paper and the authors’ poor command of the English language made understanding it difficult. This group tried to differentiate CFS patients from controls by integrating single nucleotide polymorphism (SNP) and gene expression data. First they identified the effects that gene mutations (SNPs) had on gene expression. They showed, for instance, that a mutation in the gene encoding the glucocorticoid receptor (NR3C1) resulted in the altered expression of 148 other genes. These mutations make a difference! They then examined the effects that the same gene mutation had on gene expression in the different groups.
They found three instances where a gene mutation in CFS was associated with an unusual pattern of gene expression. Two involved genes coding for the cortisol receptor and one a gene coding for the corticotropin releasing hormone (CRH) receptor. Another altered relationship concerning the cortisol receptor was found in CFS patients both with and without depression.
The evidence is starting to add up that cortisol and its associated network may play an important role in CFS – one that belies the only mild hypocortisolism found in this disease. One study found a trend toward increased rates of mutation in a cortisol transport gene. Now we see that the gene networks involving CRH and cortisol are altered in CFS. By showing that mutations in the cortisol and CRH genes have different effects on gene expression in CFS patients compared to controls, these studies suggest alterations in the entire neuroendocrine immune network may be present in CFS. This suggests that it is not just the gene mutations that are of concern but the networks in which they are embedded. This may mean that small reductions in, say, cortisol production could have larger effects than expected.
These findings of unique gene network patterns provide further evidence that a unique disease process is occurring in CFS. That much the same set of genes is highlighted in the Kirova, Lee and Goertzel studies is extremely encouraging and suggests that these researchers may have begun to home in on a few critical neuroendocrine genes in CFS. (One wonders, though, what these researchers would have found if they had examined a network of immune genes as well.)
A Key Biological Pathway Altered in CFS?
Emmert-Streib, F., Glynn, E., Seidel, C., Bausch, C. and A. Mushegian. Detecting pathological pathways of the Chronic Fatigue Syndrome by the comparison of networks.
This group hypothesized that CFS will be explicable not through the examination of the expression of single genes but, as with the other groups, by examining the expression of multiple genes involved in multiple pathways. This group focused on the behavior of entire systems.
This study looked at the gene activity occurring in twelve biological pathways, some of which have clear connections to CFS (lipoprotein metabolism, complement, bacterial defense, fatty acid biosynthesis) and some of which did not appear to (meiosis, regulation of cell shape). Only data from more severely ill CFS patients and healthy controls was compared.
The gene expression difference in two of the pathways (complement, notch signaling) was almost statistically significant, and that in the protein amino acid ADP-ribosylation pathway was clearly significant. Several studies suggest increased activation of the complement system occurs in CFS. The complement system is involved in clearing pathogens from the body.
The ADP ribosylation pathway regulates DNA repair and replication, transcription and cell death. This process was once believed limited to the nucleus of cells but is now believed to occur in the mitochondria as well. It is induced when mitochondrial or nuclear DNA is damaged, possibly because of reactive oxygen species (ROS, i.e. free radicals).
Since oxidants and free radicals attack DNA, and oxidative stress levels are raised in CFS, it appears likely that high free radical activity is inducing the expression of this pathway in CFS. Several physicians including Dr. Nicholson, Dr. Cheney and Dr. Myhill believe impaired mitochondrial activity (reduced energy output) is central in CFS. This study presents indirect evidence that this may be so.
Altered ADP ribosylation, then, could be associated with several aspects of CFS including increased free radical production and oxidative stress, impaired mitochondrial activity and metabolic problems and increased rates of cell suicide.
Goertzel, B., Coelho, L. and C. Pennachin. Identifying potential biomarkers for chronic fatigue syndrome via classification model ensemble mining.
Some of the data in this summary came from the presentation. Like the other studies, this study’s statistical approach was well beyond my grasp. Instead of using ‘clustering’ techniques they used categorization techniques to analyze gene expression and gene mutation (SNP) data. First, they asked which four gene mutations best differentiated CFS patients from controls and found that those in the tryptophan hydroxylase (TPH2), cortisol receptor (NR3C1), serotonin (5-HTT), and corticotropin releasing hormone receptor (CRHR2) genes did so with 75% accuracy. This is apparently not particularly impressive; what they really want is 90%+ accuracy, but they said they didn’t expect better results given the vagueness of the CFS definition.
The increased expression of a group of genes involved in histone deacetylation allowed them to build a model of CFS based on histone deacetylation and cell suicide (apoptosis). Several studies have indicated increased levels of cell suicide in CFS. High rates of cell suicide are found in infection, toxin exposure, cancer, etc. They noted that the interaction between histone deacetylation and apoptosis is well known. Histone deacetylation also plays an important role in methylation – a process some believe is disturbed in CFS. Histone deacetylation also plays an important role in gene transcription – most CFS gene expression studies have highlighted genes involved in mRNA transcription.
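For readers wondering what '75% accuracy' means in practice, here is a toy illustration. It is not the authors' method (their title refers to an ensemble of classification models); it is a deliberately crude rule that counts how many of four hypothetical SNPs carry the variant allele. The subjects are invented and were chosen so that the arithmetic comes out to a round number.

```python
def predict(genotype, threshold=2):
    """Call a subject 'CFS' when at least `threshold` SNPs carry the variant.

    The rule and the threshold are assumptions made for illustration.
    """
    return "CFS" if sum(genotype) >= threshold else "control"

def accuracy(subjects):
    """Fraction of subjects whose predicted label matches the true label."""
    correct = sum(1 for genotype, label in subjects if predict(genotype) == label)
    return correct / len(subjects)

# Invented subjects: 1 = variant present, 0 = absent, at four hypothetical SNPs.
subjects = [
    ((1, 1, 0, 1), "CFS"),      # 3 variants, called CFS: right
    ((1, 0, 1, 0), "CFS"),      # 2 variants, called CFS: right
    ((0, 1, 0, 0), "CFS"),      # 1 variant, called control: wrong
    ((1, 1, 1, 1), "CFS"),      # 4 variants, called CFS: right
    ((0, 0, 0, 0), "control"),  # 0 variants, called control: right
    ((0, 1, 0, 0), "control"),  # 1 variant, called control: right
    ((1, 0, 0, 1), "control"),  # 2 variants, called CFS: wrong
    ((0, 0, 1, 0), "control"),  # 1 variant, called control: right
]

print(accuracy(subjects))  # 6 of 8 calls are right
```

Six right out of eight is 75%, which means one subject in four is still put in the wrong group. That gives a feel for why the authors were hoping for 90% or better.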
Their gene expression study highlighted immune, ion channel, and other genes. An attempt to uncover groups of highly expressed interrelated genes highlighted a series of glucocorticoid, neuronal, metal ion and immune genes. Another look at the gene mutations in this group highlighted genes similar to those we have seen before (TPH2, COMT, NR3C1, TH, CRHR1). TPH2 and COMT appeared to be at least twice as important as the other genes.
The authors suggested their results indicate that interoception plays a central role in CFS. They concluded that “Taken together these results support the general concept that CFS may be a systemic disorder involving problems with both the brain and the endocrine system and complex feedback dynamics between these two organs”.
Neuro-endocrine-immune Genes Differentiate CFS patients from Idiopathic Fatigue Patients From Healthy Controls.
Earl F. Glynn, Frank Emmert-Streib and Arcady R. Mushegian. 2006. An attempt to categorize the severity of the chronic fatigue syndrome disease using affective disorder pathways.
Most of the information on this study came from the presentation. Contrary to the title of this paper this group tried to differentiate CFS from CFS-like patients and from controls using neuroendocrine genes associated with both affective (mood) disorders and the immune system. They found eight genes that differentiated the worst off (CFS patients) from the least worst off subjects (healthy controls), four genes that differentiated the worst off (CFS) from the middle group (idiopathic fatigue) and 50 genes that differentiated the worst/moderate worst off patients from the healthy controls. The authors seemed surprised, for some reason, at this last result even though prior studies have indicated that CFS and idiopathic fatigue patients have more in common with each other than either group does with healthy controls.
Worst Off vs. Controls – Nervous system - glutamate receptor (GRIK3), tyrosine kinase (EPHB2), pro-melanin-concentrating hormone-like protein (PMCHL1), Brain mn043 protein (RTN4); Endocrine - nuclear receptor 5 A2 (NR5A2), thyroid peroxidase isoform 2/3 (TPO); Immune - IL-23 receptor (IL-23R), Sema domain (SEMA 3C)
Worst Off vs. Intermediately Ill – Immune - complement (CARDIO), apoptosis/caspase (CISH), TGF-b signaling/Notch Signaling (Furin). Nervous system - Alzheimer’s (IDE)
Worst Off/Intermediately Ill vs. Controls – 50 genes several of which have been seen before.
Once again we see a combination of largely nervous system and immune genes.
THE PARTIAL SUCCESSES
Parkhomenko, E., Tritchler, D., Ho, Hi-Yip, Chan, Chi-Kin and J. Beyene. 2006. Analysis of microarray gene expression, gene expression and clinical data to identify biomarkers for chronic fatigue syndrome.
These researchers attempted to differentiate CFS patients with and without major depression from idiopathic fatigue and healthy controls using gene expression, laboratory and clinical data. They also tried to differentiate the gradual versus sudden onset CFS patients using the 117 genes that the earlier Whistler study found were different in the two groups.
It was intriguing, given the size of the gene database (about 10,000 genes) in this study, that it contained only 38 of the 117 genes assessed by Whistler in her study. This suggests that a good portion of the variability in the gene expression results seen thus far may be due to the use of different gene datasets; i.e. a lot of the gene expression studies are looking at different genes.
Of the 38 assessed by both studies only four had increased expression in the present study. Among these genes was one involved in nervous system and ion channel (sodium) functioning, another involved in neuroendocrine functioning, and interestingly, one involved in, among other things, methyl-transferase activity.
They found 9 genes differentially expressed in CFS patients with and without major depression, 3 genes differentially expressed in depressed idiopathic fatigue patients relative to controls and 8 genes differentially expressed in depressed vs non-depressed idiopathic fatigue patients. No set of genes differentiated the CFS patients from the controls.
These gene expression results, then, were far more effective at differentiating depressed patients than they were at differentiating CFS patients from the others.
THE MOSTLY FAILURES
Bassetti, M. Bernabe, M., Borile, M., Desilvestro, C. et. al. Validation of CFS classification with different data sources.
Instead of attempting to integrate the different data sets (clinical, laboratory, SNP, gene expression and proteomic), this group assessed how effective each kind of data set was at differentiating CFS patients from controls. They were able to create an algorithm using the clinical (symptom) data that successfully differentiated the CFS patients from the controls but were unable to do so using the lab or SNP data. While they were able to distinguish a unique protein signature in the CFS patients, they also found two other protein signatures of greater statistical significance spread across both groups. The sole bright spot in this study was the identification of a set of largely metabolic genes that were differentially expressed in CFS patients.
Kennedy, P., Simoff. S., Catchpoole, D., Ubaudi, F., Al-Oqaily, A., Yildiz, S., Du, Y. and D. Skillicorn. Does CFS have a biological basis? A constructionist approach.
The title of this paper, ‘Does CFS have a biological basis?’, is misleading. Because this group combined CFS and idiopathic fatigue patients into one study group, a more accurate title would have been something like ‘Do fatiguing illnesses have a biological basis?’ The authors may have been compelled to add the idiopathic fatigue patients to the sample because the number of CFS patients was too small for the type of genomic and protein studies they wished to do. Even with the inclusion of the idiopathic fatigue patients the researchers indicated the total number of patients was insufficient.
None of the biological data proved effective in differentiating the ‘CFS patients’ from the controls. A new test called Gene Feature Ranking that examined how many genes are needed to differentiate the different groups found, perhaps not surprisingly, that many types of genes are needed to differentiate the CFS patients from the controls. The authors felt this precluded the possibility of a clear genetic marker for CFS.
Another statistical analysis of the laboratory findings found that the fatigued patients could be differentiated from controls using the mean corpuscular volume of the red blood cells. The authors did not seem impressed by these findings; although the values differed between the groups, they were still in the normal range. The authors said the readings indicated a ‘slight inefficiency in oxygen distribution’ existed in CFS patients. Similarly, the CFS patients also demonstrated increased CO2 levels. The researchers stated that the difference in the blood chemistry between the CFS patients and others was ‘minimal to non-existent’. They are continuing their study but suggest that ‘any biological basis found in CFS will be subtle’.
Lim, S., Le, W, Hu, P., Xing, B., Greenwood, C. and J. Beyenne. Integration of clinical, SNP and microarray gene expression measurements in prediction of chronic fatigue syndrome.
These researchers tried to differentiate CFS patients from controls by integrating clinical, SNP and gene expression data. They found that the gene expression data contributed very, very little to their ability to do so. They were able to come up with a formula incorporating 2 clinical measures (poor sleep, tender lymph nodes), 10 SNPs and 50 gene expression results to accurately predict 73% of the CFS patients. That most of the differentiating ability came from the two clinical measures suggested that the others contributed little to our understanding of CFS. At the end of the paper we learned, however, that this group also collapsed the distinction between the CFS and idiopathic fatigue patients and merged them together.
None of the three study groups (two at the CAMDA conference and one of the Pharmacogenomics groups) that put CFS and idiopathic fatigue patients together into one group had positive results.
Summary – It is perhaps not surprising, given the many different approaches these groups took toward analyzing the Wichita data, that the results were so varied. An important question regarding those studies that were successful is whether their results were consistent with each other and with those in the Pharmacogenomics studies. It is one thing to replicate results using the same experimental techniques; it is quite another to come up with the same results using different approaches. Finding one set of genes showing up again and again despite the different approaches taken would suggest that they do indeed play a role in CFS.
This, in fact, did occur, at least with regard to the gene polymorphisms. The four studies that successfully examined the gene mutation data highlighted the following genes (since some studies consisted of several analyses there are more than four results).
Neuroendocrine Genes/ Number of Studies Highlighted In
Glucocorticoid receptor (cortisol) NR3C1 – 7
Tryptophan hydroxylase (TH) – 5
Corticotropin Releasing Hormone Receptor 2 (CRHR2) – 4
Catechol-O-Methyltransferase (COMT) – 4
Monoamine Oxidase B (MAOB) – 3
Proopiomelanocortin (POMC) – 3
Corticotropin Releasing Hormone Receptor 1 (CRHR1) – 2
Corticotropin Releasing Hormone (CRH) – 2
Given the fairly large data set (50 genes) the congruence of the results seems remarkable and suggests that mutations in the genes for a distinct suite of hormones (cortisol, corticotropin releasing hormone) and neurotransmitters (serotonin, norepinephrine/epinephrine) interact in ways that negatively affect CFS patients. A recent separate study by the CDC that found increased rates of mutations and linkage disequilibrium in the NR3C1 gene in CFS patients further buttressed the central role cortisol appears to play in these interactions. These results appear to validate the CDC’s decision to focus on neuroendocrine gene mutations in CFS. How important these gene mutations are is unclear; statistically the results were not always as significant as the researchers would have wished but the coherence of the results seems compelling.