We found previously that neither a 6-kbp promoter fragment nor even a 120-kbp yeast artificial chromosome (YAC) containing the whole GATA-3 gene was sufficient to recapitulate its full transcription pattern during embryonic development in transgenic mice. In an attempt to further identify tissue-specific regulatory elements modulating the dynamic embryonic pattern of the GATA-3 gene, we have examined the expression of two much larger (540- and 625-kbp) GATA-3 YACs in transgenic animals. A lacZ reporter gene was first inserted into both large GATA-3 YACs. The transgenic YAC patterns were then compared to those of embryos bearing the identical lacZ insertion in the chromosomal GATA-3 locus (creating GATA-3/lacZ “knock-ins”). We found that most of the YAC expression sites and tissues are directly reflective of the endogenous pattern, and detailed examination of the integrated YAC transgenes allowed the general localization of a number of very distant transcriptional regulatory elements (putative central nervous system-, endocardium-, and urogenital system-specific enhancers). Remarkably, even the 625-kbp GATA-3 YAC, containing approximately 450 kbp and 150 kbp of 5′ and 3′ flanking sequences, respectively, does not contain the full transcriptional regulatory potential of the endogenous locus and is clearly missing regulatory elements that confer tissue-specific expression to GATA-3 in a subset of neural crest-derived cell lineages.
Infection of human cells with oncogenic adenovirus type 12 (Ad12) induces four specific chromosome fragile sites. Remarkably, three of these sites appear to colocalize with tandem arrays of genes encoding small, abundant, ubiquitously expressed structural RNAs--the RNU1 locus encoding U1 small nuclear RNA (snRNA), the RNU2 locus encoding U2 snRNA, and the RN5S locus encoding 5S rRNA. Recently, an artificial tandem array of the natural 5.8-kb U2 repeat unit has been shown to generate a new Ad12-inducible fragile site (Y.-P. Li, R. Tomanin, J. R. Smiley, and S. Bacchetti, Mol. Cell. Biol. 13:6064-6070, 1993), demonstrating that the U2 repeat unit alone is sufficient for virally induced fragility. To identify elements within the U2 repeat unit that are required for virally induced fragility, we generated cell lines containing artificial tandem arrays of the entire 5.8-kb repeat unit, an 834-bp fragment spanning the U2 gene alone, or the same 834-bp fragment from which key U2 transcriptional regulatory elements had been deleted. The U2 snRNA coding regions within each artificial array were marked by an innocuous single base change (U to C at position 87) so that the relative expression of supernumerary and endogenous U2 genes could be monitored by a primer extension assay. We find that artificial arrays of both the 5.8- and the 0.8-kb U2 repeat units are fragile but that arrays lacking either the distal sequence element or both the distal and the proximal sequence elements of the promoter are not. Surprisingly...
As in other DNA-containing viruses, the three origins of herpes simplex virus type 1 (HSV-1) DNA replication are flanked by sequences containing transcriptional regulatory elements. In a transient plasmid replication assay, deletion of sequences comprising the transcriptional regulatory elements of ICP4 and ICP22/47, which flank oriS, resulted in a greater than 80-fold decrease in origin function compared with a plasmid, pOS-822, which retains these sequences. In an effort to identify specific cis-acting elements responsible for this effect, we conducted systematic deletion analysis of the flanking region with plasmid pOS-822 and tested the resulting mutant plasmids for origin function. Stimulation by cis-acting elements was shown to be both distance and orientation dependent, as changes in either parameter resulted in a decrease in oriS function. Additional evidence for the stimulatory effect of flanking sequences on origin function was provided by replacement of these sequences with the cytomegalovirus immediate-early promoter, which resulted in nearly wild-type levels of oriS function. In competition experiments, cotransfection of cells with the test plasmid, pOS-822, and increasing molar concentrations of a competitor plasmid which contained the ICP4 and ICP22/47 transcriptional regulatory regions but lacked core origin sequences resulted in a significant reduction in the replication efficiency of pOS-822...
Human mitochondrial DNA contains two major promoters, one for transcription of each strand of the helix. Previous mapping and mutagenesis data have localized these regulatory elements and have suggested regions important to their function. In order to define, at high resolution, the sequences critical for accurate and efficient transcriptional initiation, a linker substitution analysis of the entire promoter region was performed. Each promoter was shown to consist of approximately 50 base pairs comprising two functionally distinct elements. These and previous data strongly support a mode of transcription initiation requiring minimal sequences surrounding the initiation sites that are likely interactive with core polymerase and upstream regulatory domains capable of binding a transcription factor that modulates the efficiency of transcription initiation. Furthermore, in at least one case, this upstream regulatory domain is capable of operating bidirectionally.
In order to understand gene regulation, accurate and comprehensive knowledge of transcriptional regulatory elements is essential. Here, we report our efforts in building a mammalian Transcriptional Regulatory Element Database (TRED) with associated data analysis functions. It collects cis- and trans-regulatory elements and is dedicated to easy data access and analysis for both single-gene-based and genome-scale studies. Distinguishing features of TRED include: (i) relatively complete genome-wide promoter annotation for human, mouse and rat; (ii) availability of gene transcriptional regulation information including transcription factor binding sites and experimental evidence; (iii) data accuracy ensured by hand curation; (iv) an efficient user interface for easy and flexible data retrieval; and (v) implementation of on-the-fly sequence analysis tools. TRED can provide good training datasets for further genome-wide cis-regulatory element prediction and annotation, assist detailed functional studies and facilitate the deciphering of gene regulatory networks (http://rulai.cshl.edu/TRED).
Expression of housekeeping genes involves regulation at comparable levels in a wide spectrum of cells. To define the cis-regulatory elements in the human S6 ribosomal protein (rpS6) gene, we made a series of deletions of the upstream non-transcribed region, including or excluding exon 1 or intron 1 sequences. The mutated rpS6 gene regulatory regions were fused to the chloramphenicol acetyltransferase reporter gene and transfected into HeLa and COS-1 cells. The results have identified three parts of the rpS6 gene that are required for efficient and specific transcription. The core promoter includes only a 40 bp region upstream of the transcription start site and initiation region. Both upstream and intronic elements enhance transcription from the core promoter. Furthermore, mutation of the splice donor site of intron 1 almost completely abolished the enhancing activity of the intronic transcriptional modulator. We used gel retardation assays to identify sequence-specific binding sites in the upstream region and in the proximal half of intron 1. Both common and different nuclear factors that bind the rpS6 gene promoter were identified in extracts from HeLa and COS-1 cells, suggesting that different transcription factors may bind specifically to the same binding region and might be interchangeable in their function to ensure high-level expression of housekeeping genes independently of the cell type.
Identification and annotation of all the functional elements in the genome, including genes and the regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequence from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, have contributed to the development of methods that utilize these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in the metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on the recent developments, and current challenges.
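One widely used way to model short, variable regulatory sites of the kind this review discusses is a position weight matrix (PWM) slid along a sequence. The sketch below is illustrative only: the count matrix is made up, the uniform background of 0.25 and the pseudocount are assumptions, and the function names are hypothetical rather than taken from any tool covered by the review.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds scores."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        matrix.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return matrix

def scan(sequence, matrix):
    """Return (offset, score) for every full-width window of the sequence.

    Assumes the sequence contains only uppercase A, C, G and T.
    """
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = sum(matrix[j][base] for j, base in enumerate(window))
        hits.append((i, score))
    return hits

# Made-up counts for a 4-bp site resembling "GATA"; not from any database.
TOY_COUNTS = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]

def best_hit(sequence, counts=TOY_COUNTS):
    """Return the (offset, score) pair with the highest log-odds score."""
    matrix = log_odds_matrix(counts)
    return max(scan(sequence, matrix), key=lambda hit: hit[1])
```

Because such sites are short, a scan like this produces many chance matches in a large genome, which is one reason the auxiliary data mentioned above (cross-species conservation, expression profiles, binding-site maps) are used to filter predictions.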
Identification of transcriptional regulatory elements represents a critical step in our ability to reconstruct transcriptional regulatory networks from gene expression profiling datasets. To facilitate computational identification of candidate gene regulatory elements from whole genome sequences, we have developed the TFBScluster web server that integrates several tools for the genome-wide identification and subsequent characterization of transcription factor binding site clusters that are conserved in multiple mammalian species. Either the human or mouse genomes can be used as the reference sequence with direct links from the search results to the ENSEMBL and UCSC genome browsers. Moreover, TFBScluster provides seamless integration of transcription factor binding site searches with genome annotation and gene expression profiling data, to allow prioritising computational predictions for subsequent experimental validation. TFBScluster is publicly available at .
The comprehensive inventory of functional elements in 44 human genomic regions carried out by the ENCODE Project Consortium enables for the first time a global analysis of the genomic distribution of transcriptional regulatory elements. In this study we developed an intuitive and yet powerful approach to analyze the distribution of regulatory elements found in many different ChIP–chip experiments on a 10∼100-kb scale. First, we focus on the overall chromosomal distribution of regulatory elements in the ENCODE regions and show that it is highly nonuniform. We demonstrate, in fact, that regulatory elements are associated with the location of known genes. Further examination on a local, single-gene scale shows an enrichment of regulatory elements near both transcription start and end sites. Our results indicate that overall these elements are clustered into regulatory rich “islands” and poor “deserts.” Next, we examine how consistent the nonuniform distribution is between different transcription factors. We perform on all the factors a multivariate analysis in the framework of a biplot, which enhances biological signals in the experiments. This groups transcription factors into sequence-specific and sequence-nonspecific clusters. Moreover...
Transcriptional factors (TFs) and many of their target genes are involved in gene regulation at the level of transcription. To decipher gene regulatory networks (GRNs) we require a comprehensive and accurate knowledge of transcriptional regulatory elements. TRED () was designed as a resource for gene regulation and function studies. It collects mammalian cis- and trans-regulatory elements together with experimental evidence. All the regulatory elements were mapped on to the assembled genomes. In this new release, we included a total of 36 TF families involved in cancer. Accordingly, the number of target promoters and genes for TF families has increased dramatically. There are 11 660 target genes (7479 in human, 2691 in mouse and 1490 in rat) and 14 908 target promoters (10 225 in human, 2985 in mouse and 1698 in rat). Additionally, we constructed GRNs for each TF family by connecting the TF–target gene pairs. Such interaction data between TFs and their target genes will assist detailed functional studies and help to obtain a panoramic view of the GRNs for cancer research.
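As a quick consistency check on the figures reported in this abstract, the per-species counts can be summed and compared with the stated totals. The snippet is a minimal sketch; the variable and function names are illustrative and not part of TRED.

```python
# Per-species counts as reported in the abstract above.
target_genes = {"human": 7479, "mouse": 2691, "rat": 1490}
target_promoters = {"human": 10225, "mouse": 2985, "rat": 1698}

def total_and_shares(counts):
    """Return the overall total and each species' fraction of it."""
    total = sum(counts.values())
    shares = {species: n / total for species, n in counts.items()}
    return total, shares

genes_total, gene_shares = total_and_shares(target_genes)
promoters_total, promoter_shares = total_and_shares(target_promoters)

print(genes_total, promoters_total)  # 11660 14908
```

Both sums agree with the totals quoted in the text (11 660 target genes and 14 908 target promoters).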
The identification of transcriptional regulatory modules within mammalian genomes is a prerequisite to understanding the mechanisms controlling regulated gene expression. While high-throughput microarray- and sequencing-based approaches have been used to map the genomic locations of sites of nuclease hypersensitivity or target DNA sequences bound by specific protein factors, the identification of regulatory elements using functional assays, which would provide important complementary data, has been relatively rare. Here we present a method that permits the functional identification of active transcriptional regulatory modules using a simple procedure for the isolation and analysis of DNA derived from nucleosome-free regions (NFRs), the 2% of the cellular genome that contains these elements. The more than 100 new active regulatory DNAs identified in this manner from F9 cells correspond to both promoter-proximal and distal elements, and display several features predicted for endogenous transcriptional regulators, including localization within DNase-accessible chromatin and CpG islands, and proximity to expressed genes. Furthermore, comparison with published ChIP-seq data of ES-cell chromatin shows that the functional elements we identified correspond with genomic regions enriched for H3K4me3...
The REDfly database of Drosophila transcriptional cis-regulatory elements provides the broadest and most comprehensive available resource for experimentally validated cis-regulatory modules and transcription factor binding sites among the metazoa. The third major release of the database extends the utility of REDfly as a powerful tool for both computational and experimental studies of transcription regulation. REDfly v3.0 includes the introduction of new data classes to expand the types of regulatory elements annotated in the database along with a roughly 40% increase in the number of records. A completely redesigned interface improves access for casual and power users alike; among other features it now automatically provides graphical views of the genome, displays images of reporter gene expression and implements improved capabilities for database searching and results filtering. REDfly is freely accessible at http://redfly.ccr.buffalo.edu.
Filamentous fungi produce a variety of secondary metabolites with diverse beneficial and detrimental effects on humankind. The genes encoding the enzymatic machinery required to make these metabolites are typically clustered in fungal genomes. There is considerable evidence that secondary metabolite gene clusters are regulated, in part, at the transcriptional level through a hierarchy of transcriptional regulatory elements. Identification of secondary metabolism regulatory elements could potentially provide a means of increasing production of beneficial metabolites, decreasing production of detrimental metabolites, and aiding the identification of ‘silent’ natural products, and could also contribute to a broader understanding of the molecular mechanisms by which secondary metabolites are produced. This review gives a broad overview of the regulation of secondary metabolism by transcriptional regulatory elements and of the considerable advances in the discovery of cryptic or novel secondary metabolites by genome mining on the basis of this knowledge.
Transcription factors control cell-specific gene expression programs by binding regulatory elements and recruiting cofactors and the transcription apparatus to the initiation sites of active genes. One of these cofactors is cohesin, a structural maintenance of chromosomes (SMC) complex that is necessary for proper gene expression. We report that a second SMC complex, condensin II, is also present at transcriptional regulatory elements of active genes during interphase and is necessary for normal gene activity. Both cohesin and condensin II are associated with genes in euchromatin and not heterochromatin. The two SMC complexes and the SMC loading factor NIPBL are particularly enriched at super-enhancers, and the genes associated with these regulatory elements are especially sensitive to reduced levels of these complexes. Thus, in addition to their well-established functions in chromosome maintenance during mitosis, both cohesin and condensin II make important contributions to the functions of the key transcriptional regulatory elements during interphase.
The mosquito Aedes aegypti is the principal vector for the yellow fever and dengue viruses, and is also responsible for recent outbreaks of the alphavirus chikungunya. Vector control strategies utilizing engineered gene drive systems are being developed as a means of replacing wild, pathogen-transmitting mosquitoes with individuals refractory to disease transmission, or bringing about population suppression. Several of these systems, including Medea, UD^MEL, and site-specific nucleases, which can be used to drive genes into populations or bring about population suppression, utilize transcriptional regulatory elements that drive germline-specific expression. Here we report the identification of multiple regulatory elements able to drive gene expression specifically in the female germline, or in the male and female germline, in the mosquito Aedes aegypti. These elements can also be used as tools with which to probe the roles of specific genes in germline function and in the early embryo, through overexpression or RNA interference.
Post-transcriptional regulatory programs governing diverse aspects of RNA biology remain largely uncharacterized. Understanding the functional roles of RNA cis-regulatory elements is essential for decoding complex programs that underlie the dynamic regulation of transcript stability, splicing, localization, and translation. Here, we describe a combined experimental/computational technology to reveal a catalogue of functional regulatory elements embedded in 3’-untranslated regions (3’UTRs) of human transcripts. We used a bidirectional reporter system coupled with flow cytometry and high-throughput sequencing to measure the effect of short, non-coding vertebrate-conserved RNA sequences on transcript stability and translation. Information-theoretic motif analysis of the resulting sequence-to-gene-expression mapping revealed linear and structural RNA cis-regulatory elements that positively and negatively modulate the post-transcriptional fates of human transcripts. This combined experimental/computational strategy can be used to systematically characterize the vast landscape of post-transcriptional regulatory elements controlling physiological and pathological cellular state transitions.
We have analyzed a series of 5' deletions of the RAS2 gene to investigate its complex transcriptional regulation in the yeast Saccharomyces cerevisiae. Two positive transcriptional regulatory elements were identified. Element A regulates two of the three clusters of RAS2 transcripts. This element is capable of activating a heterologous promoter and contains two copies of the sequence CCTCGCCCC. Although one copy is sufficient for partial transcriptional activation, both copies are required for maximal RAS2 induction. Deletion of one copy resulted in a reduced level of RAS2 mRNA, selective loss of cluster II transcripts and reduced ability to activate the heterologous CYC1 promoter. Each of the 9-bp C-rich repeats of element A is part of a sequence with extensive homology to a transcriptional regulatory element upstream of the human epidermal growth factor receptor (EGFR) gene. Element B contains a tandem duplication of a 21 nucleotide sequence TACATATATATATATCTTAG and activates cluster I RAS2 transcripts in the absence of Element A. The physiological role of these deletions was determined by assaying their ability to support growth on a nonfermentable carbon source. RAS2 promoter deletions containing either element A or B were able to overcome the growth defect characteristic of ras2 mutant cells. Deletion of both elements resulted in an insufficient amount of RAS2 protein for growth on a nonfermentable carbon source.
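Locating copies of a short repeat such as the 9-bp element A sequence is a plain substring search. The sketch below assumes a made-up promoter fragment (it is not the RAS2 sequence) and a hypothetical helper, `find_all`; only the motif itself comes from the abstract.

```python
def find_all(sequence, motif):
    """Return 0-based start offsets of every (possibly overlapping) match."""
    offsets = []
    start = sequence.find(motif)
    while start != -1:
        offsets.append(start)
        start = sequence.find(motif, start + 1)
    return offsets

ELEMENT_A_REPEAT = "CCTCGCCCC"  # 9-bp repeat quoted in the abstract

# Hypothetical fragment built for illustration; NOT the real RAS2 promoter.
toy_promoter = "AT" + ELEMENT_A_REPEAT + "GGTA" + ELEMENT_A_REPEAT + "TT"

# Mimic a deletion construct that removes the first copy of the repeat.
one_copy_deleted = toy_promoter.replace(ELEMENT_A_REPEAT, "", 1)

print(find_all(toy_promoter, ELEMENT_A_REPEAT))      # [2, 15]
print(find_all(one_copy_deleted, ELEMENT_A_REPEAT))  # [6]
```

The same search, run on a real sequence, would show how many copies of the repeat a given deletion construct retains.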
We have identified three elements in the noncoding region of human papillomavirus type 6 (HPV-6) that regulate transcription when assayed in recombinant plasmids containing the bacterial gene for chloramphenicol acetyltransferase. One was a silencer that reduced expression in both a species- and tissue-dependent manner. The second was an enhancer element that was tissue specific. The third was a weak promoter that showed some tissue specificity. These elements have been localized within the noncoding region by analysis of 5'-to-3' and 3'-to-5' deletions with two HPV-6 subtypes, HPV-6e and HPV-6g. HPV-6g differs from HPV-6e by the presence of an additional copy in tandem of a 136-base-pair (bp) sequence and by an 8-bp sequence containing a 3-bp deletion. Silencer activity, assayed in plasmids with the simian virus 40 minimum promoter which were transfected into NIH 3T3 cells, could not be overcome by the enhancer activity of the simian virus 40 72-bp repeats. The 413-bp fragment A of HPV-6g showed silencer activity, while the corresponding HPV-6e fragment containing the 8-bp change did not. Enhancer activity of HPV-6g was localized to the 326-bp fragment C, which contains the 136-bp repeat. Dot blot hybridizations reflected relative chloramphenicol acetyltransferase activities and demonstrated enhancer and silencer activities at the RNA level. Analysis of the interaction of these activities in naturally occurring variants should provide information on tissue specificity and regulation of gene expression of HPVs and may provide information on the mechanism of action of transcriptional regulatory elements in eucaryotic cells.
We sought exonic transcriptional regulatory elements by shotgun cloning human cDNA fragments into luciferase reporter vectors and measuring the resulting expression levels in liver cells. We uncovered seven regulatory elements within coding regions and three within 3' untranslated regions (UTRs). Two of the putative regulatory elements were enhancers and eight were silencers. The regulatory elements were generally but not consistently evolutionarily conserved and also showed a trend toward decreased population diversity. Furthermore, the exonic regulatory elements were enriched in known transcription factor binding sites (TFBSs) and were associated with several histone modifications and transcriptionally relevant chromatin. Evidence was obtained for bidirectional cis-regulation of a coding region element within a tubulin gene, TUBA1B, by the transcription factors PPARA and RORA. We estimate that hundreds of exonic transcriptional regulatory elements exist, an unexpected finding that highlights a surprising multi-functionality of sequences in the human genome.