Mapping Transcription Regulatory Circuits in the Nematode C. elegans
Overall goal
We use a variety of experimental and computational systems biology approaches to map and characterize gene regulatory networks and to understand how regulatory circuitry controls animal development, function, and homeostasis. Ultimately, we aim to understand how dysfunctional networks affect or cause diseases like diabetes, obesity and cancer.
Differential gene expression and gene regulatory networks
The human genome contains ~25,000 predicted protein-coding genes. Most of these genes are differentially expressed in space and/or time and in response to environmental or pathological cues. As a result, each cell/tissue/organ in the body expresses a different subset of the total gene collection. The first and one of the most important levels of gene regulation is transcriptional: transcription factors (TFs) bind to cis-regulatory DNA sequences and activate or repress gene expression. While the mechanics of transcription have been studied intensely for the past 20 years or so, little is known about where, when and how each of the 25,000 genes is regulated and by which of the ~1500 predicted human TF(s).
The presence of large numbers of TF-encoding genes in metazoan genomes, the multiple protein-DNA and protein-protein interactions TFs engage in, together with the concerted action of multiple TFs per gene, suggests that complex gene expression patterns are the result of intricate transcription regulatory networks in which many TFs are connected to their target genes and to each other. Such networks can be represented as graph models in which "nodes" correspond to proteins or genes, and "edges" (i.e. links between nodes) represent functional or physical interactions between those proteins/genes (see Figure 1). Our first goal is to identify transcription regulatory networks by identifying interactions between TFs and their target genes (protein-DNA). Longer term, we aim to integrate these networks with other types of interactions such as those between microRNAs and their targets (RNA-RNA interactions), between different TFs (protein-protein interactions), between TFs and cofactors (protein-protein interactions), and between RNA binding proteins and their targets (protein-RNA). We use various network properties, network motifs and other topological measures to understand how transcription regulatory networks behave and how they are similar to or different from other types of networks.
Figure 1. Integrated regulatory networks contain transcriptional interactions (protein-DNA, black lines); post-transcriptional microRNA interactions (RNA-RNA, red lines); post-transcriptional RNA binding protein interactions (protein-RNA, dotted black lines) and dimerizing interactions (protein-protein, blue lines). Adapted from Walhout, Genome Research 2006.
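The graph representation described above can be sketched in a few lines of Python. This is only an illustration: the node names and edges are hypothetical placeholders rather than real C. elegans interactions, and the helper function `degrees` is our own naming. Each edge carries one of the interaction types listed in Figure 1, and node degree is computed as one simple example of a topological measure.

```python
from collections import defaultdict

# Hypothetical toy interactions: (source, target, interaction type).
# The names are placeholders, not real C. elegans genes.
EDGES = [
    ("TF_A", "gene_1", "protein-DNA"),
    ("TF_A", "gene_2", "protein-DNA"),
    ("TF_B", "gene_2", "protein-DNA"),
    ("TF_B", "mir_X", "protein-DNA"),
    ("mir_X", "TF_B", "RNA-RNA"),
    ("TF_A", "TF_B", "protein-protein"),
]


def degrees(edges, edge_type):
    """Count outgoing and incoming edges of one interaction type per node."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for source, target, kind in edges:
        if kind == edge_type:
            out_deg[source] += 1
            in_deg[target] += 1
    return dict(out_deg), dict(in_deg)


# Out-degree: how many targets a TF binds. In-degree: how many TFs bind a target.
out_deg, in_deg = degrees(EDGES, "protein-DNA")
print(out_deg)
print(in_deg)
```

Keeping the interaction type on every edge lets the same edge list serve both a transcription-only analysis and an integrated one.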
Gene-centered, or gene-to-protein, methods for the identification of TF-target gene interactions
We have developed high-throughput, gene-centered (gene-to-protein) methods that can be used to map physical interactions between regulatory genomic regions and transcription factors (TFs). Specifically, we have adapted the yeast one-hybrid (Y1H) system for use in high-throughput settings and with single copy, complex DNA sequences as “bait” (Deplancke et al., Genome Research 2004; Vermeirssen et al., Nature Methods 2007). This provides a complementary alternative to more popular TF-centered (protein-to-gene) methods such as chromatin immunoprecipitation (ChIP). Although powerful, ChIP assays suffer from conceptual and technical limitations. For instance, they are only suitable for broadly and/or highly expressed TFs for which high-quality antibodies are available. In contrast, Y1H assays can retrieve rare TFs in an unbiased, condition-independent manner. Importantly, gene-centered methods such as the Y1H system can be used to generate what we refer to as “TF binding profiles” for loci of interest – something that cannot be done using TF-centered methods, unless they are performed for all TFs of an organism and under all relevant developmental and environmental conditions (Figure 2).
Figure 2. There are two conceptually different approaches to identify physical interactions between transcription factors (TFs) and their target genes.
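The difference between the two approaches can be illustrated as two ways of indexing the same set of protein-DNA interactions. The sketch below uses hypothetical TF and promoter names, and the function `index_by` is an illustrative helper, not part of any published pipeline. Grouping by promoter gives a "TF binding profile" for each locus (the gene-centered view), whereas grouping by TF gives a target list for each TF (the TF-centered view).

```python
from collections import defaultdict

# Hypothetical protein-DNA interactions as (TF, promoter) pairs.
INTERACTIONS = [
    ("TF_A", "promoter_1"),
    ("TF_B", "promoter_1"),
    ("TF_C", "promoter_1"),
    ("TF_A", "promoter_2"),
]


def index_by(pairs, key_position):
    """Group pairs by the element at key_position (0 = TF, 1 = promoter)."""
    grouped = defaultdict(set)
    for pair in pairs:
        grouped[pair[key_position]].add(pair[1 - key_position])
    return {key: sorted(values) for key, values in grouped.items()}


tf_binding_profiles = index_by(INTERACTIONS, 1)  # gene-centered view
tf_target_lists = index_by(INTERACTIONS, 0)      # TF-centered view

print(tf_binding_profiles["promoter_1"])
print(tf_target_lists["TF_A"])
```

A complete binding profile for one promoter requires every TF to have been tested against it, which is why a single gene-centered screen can deliver it while TF-centered experiments would have to be repeated for all TFs.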
C. elegans as a model system
We predominantly use C. elegans as a model system to study the networks that control differential gene expression at a systems level because:
The complete C. elegans genome sequence is available and is predicted to contain ~20,000 protein-encoding genes, which is approximately the same number as in humans! We have identified 940 predicted TFs among these protein-coding genes (Reece-Hoyes et al., Genome Biology 2005; Vermeirssen et al., Nature Methods 2007).
The C. elegans genome is only 100 Mb, 30 times smaller than the human genome. Since exons are approximately equal in size and number, this means that the regulatory genomic space is much smaller in worms. Thus, we have less potential regulatory sequence to interrogate.
C. elegans is a relatively simple animal. Its development occurs in a stringently programmed manner and the entire lineage of the 959 somatic cells in hermaphrodites has been described, which allows the unambiguous identification of temporal and spatial gene expression patterns.
The animal is transparent, which allows us to follow development, phenotypic aberrations and gene expression patterns in real time using light microscopy (see Figure 3).
C. elegans is a genetically tractable organism and many convenient genetic techniques have been developed that allow the molecular dissection of biological processes. These include the generation of transgenic animals for gene expression studies, and RNA mediated interference (RNAi) for the examination of loss-of-function phenotypes (see Figure).
C. elegans has proven to be instrumental in understanding human biology because many genes, pathways and biochemical processes are highly conserved. For example, studies of oncogenic Ras and apoptotic pathways have been pioneered in C. elegans.
What have we learned?
By using genes expressed in the digestive tract (Deplancke et al., Cell 2006) or neurons (Vermeirssen et al., Genome Research 2007), we have mapped initial tissue-relevant transcription regulatory networks that are enriched for TFs that are themselves expressed in the tissue of interest.
We identified “TF hubs”, or TFs that bind a disproportionately large number of promoters. These TFs are frequently essential for the survival of the animal, indicating that their highly connected network phenotype is relevant in vivo (Deplancke et al., Cell 2006).
We have identified a set of novel putative TFs that do not possess a recognizable DNA binding domain, but that robustly interact with promoters (Deplancke et al., Cell 2006).
Figure 3. C. elegans: superworm!!
(Image by Christian Grove)
We have identified “TF modules”, TFs that share many of their target genes. This has helped us to connect network architecture to network functionality (Vermeirssen et al., Genome Research 2007).
We have mapped an integrated transcriptional and post-transcriptional microRNA network and found that this network contains a feedback network motif in which TFs that bind a microRNA promoter are themselves regulated by that same microRNA. In addition, we introduce a novel network parameter that we name “flux capacity” that captures the high information flow capacity that TFs and microRNAs that participate in these feedback motifs often possess (Martinez et al., Genes & Development 2008).
In collaboration with the Ambros lab, we have generated a resource of transgenic C. elegans that express the green fluorescent protein (GFP) under the control of a microRNA promoter (Martinez et al., Genome Research 2008). This resource can be used to annotate microRNA function and to follow up on hypotheses generated by (integrated) network studies. Using this resource, we found that microRNAs that belong to the same family are more likely co-expressed than microRNAs that belong to different families. In addition, we found that several microRNAs are subject to post-transcriptional regulatory mechanisms.
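Three of the network concepts mentioned in the findings above (TF hubs, TF modules and TF-microRNA feedback motifs) can be sketched on toy data. All names below are hypothetical placeholders, the hub threshold is arbitrary, and the Jaccard overlap is used only as a simple stand-in for "sharing many target genes"; it is not necessarily the measure used in the cited papers.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data; names are placeholders, not real genes.
TF_TO_PROMOTER = [
    ("TF_A", "p1"), ("TF_A", "p2"), ("TF_A", "p3"), ("TF_A", "p_mirX"),
    ("TF_B", "p1"), ("TF_B", "p2"),
    ("TF_C", "p4"),
]
MIRNA_TO_TF = [("mirX", "TF_A")]     # microRNA -> TF it targets
MIRNA_PROMOTER = {"mirX": "p_mirX"}  # microRNA -> its own promoter


def targets_by_tf(pairs):
    """Collect the set of promoters bound by each TF."""
    targets = defaultdict(set)
    for tf, promoter in pairs:
        targets[tf].add(promoter)
    return targets


def hubs(targets, min_targets):
    """TFs bound to at least min_targets promoters (threshold is arbitrary)."""
    return sorted(tf for tf, bound in targets.items() if len(bound) >= min_targets)


def shared_target_scores(targets):
    """Jaccard overlap of the target sets for every pair of TFs."""
    scores = {}
    for a, b in combinations(sorted(targets), 2):
        union = targets[a] | targets[b]
        scores[(a, b)] = len(targets[a] & targets[b]) / len(union)
    return scores


def feedback_pairs(targets, mirna_to_tf, mirna_promoter):
    """(TF, microRNA) pairs in which the TF binds the microRNA promoter
    and the microRNA in turn targets that TF."""
    return sorted(
        (tf, mirna)
        for mirna, tf in mirna_to_tf
        if mirna_promoter.get(mirna) in targets.get(tf, set())
    )


targets = targets_by_tf(TF_TO_PROMOTER)
print(hubs(targets, min_targets=3))
print(shared_target_scores(targets))
print(feedback_pairs(targets, MIRNA_TO_TF, MIRNA_PROMOTER))
```

In this toy network TF_A is the only hub, TF_A and TF_B share half of their combined targets, and TF_A and mirX form a feedback pair.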
Representative Publications
2009
Grove, CA, F deMasi, MI Barrasa, DE Newburger, MJ Alkema, ML Bulyk and AJM Walhout (2009). A multi-parameter network reveals extensive divergence between C. elegans bHLH transcription factors. Cell, in press
Reece-Hoyes, JS, B Deplancke, MI Barrasa, J Hatzold, RB Smit, HE Arda, PA Pope, J Gaudet, B Conradt and AJM Walhout (2009). The C. elegans snail homolog CES-1 can activate gene expression in vivo and share targets with bHLH transcription factors. Nuc. Acids Res., Epub ahead of print April 16.
Martinez, NJ and AJM Walhout (2009). The interplay between transcription factors and microRNAs in genome-scale regulatory networks. BioEssays, 31: 435-445.
2008
Martinez, NJ, MC Ow, JS Reece-Hoyes, MI Barrasa, VR Ambros and AJM Walhout (2008). Genome-scale spatiotemporal analysis of Caenorhabditis elegans microRNA promoter activity. Genome Res., 18: 2005-2015.
Martinez, NJ, MC Ow, MI Barrasa, M Hammell, R Sequerra, L Doucette-Stamm, FP Roth, V Ambros and AJM Walhout (2008). A C. elegans genome-scale microRNA network contains composite feedback loops with high flux capacity. Genes Dev., 22: 2535-2549.
Ow, MC, NJ Martinez, P Olsen, S Silverman, MI Barrasa, B Conradt, AJM Walhout and V Ambros (2008). The FLYWCH transcription factors FLH-1, FLH-2 and FLH-3 repress embryonic expression of microRNA genes in C. elegans. Genes Dev., 22: 2520-2534.
Grove, CA and AJM Walhout (2008). Transcription factor functionality and transcription regulatory networks. Mol. Biosyst., 4: 309-314.
Mukhopadhyay, A, B Deplancke, AJ Walhout and HA Tissenbaum (2008). Chromatin immunoprecipitation (ChIP) coupled to detection by quantitative real-time PCR to study transcription factor binding to DNA in Caenorhabditis elegans. Nat. Protocols, 3: 698-709.
2007
Vermeirssen, V, B Deplancke, MI Barrasa, JS Reece-Hoyes, HE Arda, CA Grove, NJ Martinez, R Sequerra, L Doucette-Stamm, MR Brent and AJ Walhout (2007). Matrix and Steiner-triple-system smart pooling assays for high-performance transcription regulatory network mapping. Nat. Methods, 4: 659-664.
Vermeirssen, V, MI Barrasa, CA Hidalgo, JA Babon, R Sequerra, L Doucette-Stamm, AL Barabasi and AJ Walhout (2007). Transcription factor modularity in a gene-centered C. elegans core neuronal protein-DNA interaction network. Genome Res., 17: 1061-1071.
Barrasa, MI, P Vaglio, F Cavasino, L Jacotot and AJ Walhout (2007). EDGEdb: a transcription factor-DNA interaction database for the analysis of C. elegans differential gene expression. BMC Genomics, 8: 21.
Reece-Hoyes, JS, J Shingles, D Dupuy, CA Grove, AJ Walhout, M Vidal and IA Hope (2007). Insight into transcription factor gene duplication from Caenorhabditis elegans Promoterome-driven expression patterns. BMC Genomics, 8: 27.
2006
Deplancke, B, V Vermeirssen, HE Arda, NJ Martinez and AJ Walhout (2006). Gateway-compatible yeast one-hybrid screens. CSH Protocols (online journal).
Walhout, AJ (2006) Unraveling transcription regulatory networks by protein-DNA and protein-protein interaction mapping. Genome Res., 16: 1445-1454.
Walhout, AJ (2006) Networking at the second Interactome Meeting. Expert Rev Proteomics, 3: 477-479.
Wang, Y, SW Oh, B Deplancke, J Luo, AJ Walhout and HA Tissenbaum (2006). C. elegans 14-3-3 proteins regulate life span and interact with SIR-2.1 and DAF-16/FOXO. Mech Ageing and Dev., 127: 741-747.
Deplancke, B, A Mukhopadhyay, W Ao, AE Elewa, CA Grove, NJ Martinez, R Sequerra, L Doucette-Stamm, JS Reece-Hoyes, IA Hope, HA Tissenbaum, SE Mango and AJ Walhout (2006). A gene-centered C. elegans protein-DNA interaction network. Cell, 125: 1192-1205.
2005
Mukhopadhyay, A, B Deplancke, AJ Walhout and HA Tissenbaum. 2005. C. elegans tubby regulates life span and fat storage by two independent mechanisms. Cell Metabolism, 2: 35-42.
Davison, EM, MM Harrison, AJ Walhout, M Vidal and B Horvitz. 2005. lin-8, which antagonizes C. elegans Ras-mediated vulval induction, encodes a novel nuclear protein that interacts with the LIN-35 Rb protein. Genetics, 171: 1017-1031.
Walhout, AJ and SJ Boulton. 2005. Biochemistry and Molecular Biology. Wormbook, eds. The C. elegans Research Community, WormBook, http://www.wormbook.org
Reece-Hoyes*, JS, B Deplancke*, J Shingles, CA Grove, IA Hope and AJ Walhout. 2005. A compendium of C. elegans regulatory transcription factors: a resource for mapping transcription regulatory networks. Genome Biology, 6: R110. (* indicates co-first author)
2004
Fisk Green, R, M Lorson, AJ Walhout, M Vidal and S van den Heuvel. 2004. Identification of critical domains and putative partners for the Caenorhabditis elegans spindle component LIN-5. Molecular Genetics and Genomics, 271: 532-544.
Han, JD, N Bertin, T Hao, DS Goldberg, GF Berriz, LV Zhang, D Dupuy, AJ Walhout, ME Cusick, FP Roth and M Vidal. 2004. Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature, 430: 88-93.
Dupuy, D, QR Li, B Deplancke, M Boxem, T Hao, P Lamesch, R Sequerra, S Bosak, L Doucette-Stamm, IA Hope, DE Hill, AJ Walhout and M Vidal. 2004. A first version of the Caenorhabditis elegans promoterome. Genome Research, 14: 2169-2175.
Deplancke, B, D Dupuy, M Vidal and AJ Walhout. 2004. A Gateway-compatible yeast one-hybrid system. Genome Research, 14: 2093-2101.
2003
Ge, H, AJ Walhout and M Vidal. 2003. Integrating "omic" information: a bridge between genomics and systems biology. Trends in Genetics, 19: 551-560.
2002
Walhout, AJ, J Reboul, O Shtanko, N Bertin, P Vaglio, H Ge, H Lee, L Doucette-Stamm, KC Gunsalus, AJ Schetter, DG Morton, KJ Kemphues, V Reinke, SK Kim, F Piano and M Vidal. 2002. Integrating interactome, phenome, and transcriptome mapping data for the C. elegans germline. Current Biology, 12: 1952-1958.
2001
Walhout, AJ and M Vidal. 2001. Protein interaction maps for model organisms. Nature Reviews Molecular Cell Biology 2: 55-62.
Walhout, AJ and M Vidal. 2001. High-throughput yeast two-hybrid assays for large-scale protein interaction mapping. Methods 24: 297-306.
Davy, A, P Bello, N Thierry-Mieg, P Vaglio, J Hitti, L Doucette-Stamm, D Thierry-Mieg, J Reboul, S Boulton, AJ Walhout, O Coux and M Vidal. 2001. A protein-protein interaction map of the Caenorhabditis elegans 26S proteasome. EMBO Reports, 2: 821-828.
2000
Walhout, AJ, R Sordella, MA Brasch, G Temple, JL Hartley, N Thierry-Mieg and M Vidal. 2000. Protein interaction mapping in C. elegans using proteins involved in vulval development. Science 287: 116-122.
Walhout, AJ, G Temple, MA Brasch, JL Hartley, MA Lorson, S van den Heuvel and M Vidal. 2000. GATEWAY recombinational cloning: application to the cloning of large numbers of open reading frames or ORFeomes. Methods in Enzymology, “Chimeric genes and proteins” 328: 575-592.
Endoh, H, AJ Walhout and M Vidal. 2000. A green fluorescent protein-based reverse two-hybrid system: application to the characterization of large numbers of potential protein-protein interactions. Methods in Enzymology, "Chimeric genes and proteins" 328: 74-88.
Walhout, AJ, SJ Boulton and M Vidal. 2000. Yeast two-hybrid systems and protein interaction mapping projects for yeast and worm. Yeast 17: 88-94.
1999
Walhout, AJ and M Vidal. 1999. A Genetic Strategy to Eliminate Self-Activator Baits Prior to High-Throughput Yeast Two-Hybrid Screens. Genome Res. 9: 1128-1134.
1998
Walhout, M, H Endoh, W Wong, N Thierry-Mieg and M Vidal. 1998. A model of elegance. Am. J. Hum. Gen., 63: 955-961.
Rotations
Several full and half rotation projects are available in the laboratory. Each rotation project is designed around the ongoing projects available at the time and the specific interests of the student. Techniques commonly used in the lab include molecular cloning using the Gateway recombinational cloning system, yeast one- and two-hybrid assays for the detection of protein-DNA and protein-protein interactions, the creation of transgenic C. elegans by ballistic transformation, chromatin immunoprecipitation, RNA mediated interference, microscopy, quantitative RT-PCR, computation and network modeling. Students who are interested should contact Dr. Walhout for more information.
Lab Members
From left to right: Efsun Arda, John Reece-Hoyes, Inma Barrasa, Chris Grove, Katie Brown, Marian Walhout, Yuan Shen, Ashley Carraher, Sankar Jerayaj and Natalie Martinez.
Vanessa Vermeirssen (Postdoctoral Fellow) - now at the Flemish Institute for Biotechnology
Bart Deplancke (Postdoctoral Fellow) - now at the EPFL, Lausanne, Switzerland - http://deplanckelab.epfl.ch/
Academic Background
Marian Walhout obtained her B.S. (1992) and Ph.D. (1997) degrees from Utrecht University, The Netherlands. She did her post-doctoral work at Harvard Medical School in the lab of Dr. Marc Vidal. She joined the Program of Gene Function and Expression at the University of Massachusetts Medical School in 2003.
Phone: 508-856-4364
E-mail: Marian.Walhout@umassmed.edu
Keywords: Organisms - C. elegans, Systems Biology, Gene Expression, Protein-DNA recognition
This is an official Page/Publication of the University of Massachusetts Worcester Campus Interdisciplinary Graduate Program 55 Lake Avenue North Worcester, MA 01605
Questions or Comments?
Email: igp@umassmed.edu
Phone: 508-856-4900