Appreciable advances in our understanding of transcript elongation by RNA polymerase II (RNAP II) have identified this stage as a dynamic and highly regulated step of the transcription cycle. Here, we discuss the many factors that regulate the elongation stage of transcription. Our discussion includes the classical elongation factors that modulate the activity of RNAP II, and the more recently identified factors that facilitate elongation on chromatin templates. Additionally, we discuss the factors that associate with RNAP II, but do not modulate its catalytic activity. Elongation is highlighted as a central process that coordinates multiple stages in mRNA biogenesis and maturation.
The regulation of gene expression is one of the most intensely studied areas in all of science. Differential gene expression in multicellular organisms forms the foundation of cell-type specificity. Deregulation of the appropriate pattern of gene expression has profound effects on cellular function and underlies many diseases. Although there are many cellular processes that control gene expression, the most direct regulation occurs during transcription. The transcription of protein-coding genes in eukaryotes is performed by RNA polymerase II (RNAP II). Up until recently, the vast majority of studies aimed at elucidating the molecular mechanisms of transcription regulation have focused on early stages, such as the formation of a transcription initiation complex (preinitiation) and initiation (see below). For many years, transcript elongation has been thought of as the trivial addition of ribonucleoside triphosphates to the growing mRNA chain. It is now apparent that this process is a dynamic and highly regulated stage of the transcription cycle capable of coordinating downstream events. Numerous factors have been identified that specifically target the elongation stage of transcription. Importantly, multiple steps in mRNA maturation, including pre-mRNA capping, splicing, 3′-end processing, surveillance, and export, are modulated through interactions with the RNAP II transcript elongation complex (TEC). It also appears that distinct elongation factors function in specific transcriptional contexts; the requirements for these specialized factors are largely unknown and highlight the need to better understand elongation in vivo. Many molecular and biochemical approaches have been used to quantify the different aspects of elongation, many of which are discussed below. It remains important to assimilate both in vitro and in vivo experimental systems when discussing what constitutes an elongation factor. 
An elongation factor can be defined as any molecule that affects the activities of or is associated with the TEC. We suggest that there are at least two major types of transcript elongation factors: one family, referred to as active elongation factors, and a second family, denoted as passive elongation factors. The difference resides in whether the factor affects the catalytic activity of RNAP II, regardless of whether the factor remains associated with the TEC. Passive factors refer to those that are associated with the moving polymerase, but do not affect its catalytic activity. Thus, an active elongation factor might transitorily associate with the TEC and, after executing its function, dissociate from the moving polymerase.
In this review, we focus mainly on the regulation of transcript elongation and how this stage of the transcription cycle is central to the integration and coordination of multiple events during transcription. Our discussion includes the “classical” elongation factors that modulate the activity of RNAP II, and the more recently identified factors that facilitate transcript elongation on chromatin templates. We end our review discussing the central role of elongation in the coordination of events that result in efficient mRNA production and export to the cytoplasm.
The generation of a mature mRNA molecule by RNAP II involves multiple processes, some of which occur sequentially and others in parallel. The primary phases of transcript generation form a so-called transcription cycle and include preinitiation, initiation, promoter clearance, elongation, and termination. The transcription cycle starts with preinitiation complex (PIC) assembly at the promoter. The PIC includes the general transcription factors (GTFs) IID, IIB, IIE, IIF, and IIH and RNAP II, as well as several additional cofactors (Orphanides et al. 1996). Formation of an open complex between RNAP II and the DNA template is a prerequisite for transcription initiation. Melting of the double-stranded DNA into a single-stranded bubble is an ATP-dependent process and requires the action of two GTFs, IIE and IIH (Goodrich and Tjian 1994; Holstege et al. 1996; Kim et al. 2000). Once the open complex is established, transcription initiation occurs upon addition of the two initiating nucleoside triphosphates (NTPs) dictated by the DNA sequence and formation of the first phosphodiester bond. The presence of all NTPs allows RNAP II to clear the promoter, although ATP appears to be particularly important for the transition from initiation to elongation (Dvir et al. 1996). Before RNAP II becomes engaged into productive transcript elongation, it must pass through a stage known as promoter clearance (see below). During this stage, the PIC is partially disassembled: a subset of GTFs remains at the promoter, serving as a scaffold for the formation of the next transcription initiation complex (Zawel et al. 1995; Yudkovsky et al. 2000). The GTF TFIIH plays an integral role in facilitating promoter clearance, in part by preventing premature arrest (Goodrich and Tjian 1994; Dvir et al. 1997; Kumar et al. 1998). However, of all the GTFs, only TFIIF can be found in the RNAP II TEC (Zawel et al. 1995). Once the promoter is cleared, the next round of transcription can be reinitiated. 
Reinitiation of transcription has the potential to be a much faster process relative to the initial round, and is responsible for the bulk of transcription in the cell (Jiang and Gralla 1993; Ranish et al. 1999; Orphanides and Reinberg 2000). The final step in the cycle is transcript termination. At this stage, the mRNA is cleaved, polyadenylated, and transported to the cytoplasm, where it will be translated (Proudfoot et al. 2002). Interestingly, the transcript cleavage-polyadenylation specificity factor (CPSF) can be recruited by TFIID apparently during PIC formation (Dantonel et al. 1997). Thus, transcription initiation and termination are interconnected and might influence each other's efficiency. In addition, factors that affect posttranscriptional events such as RNA transport and surveillance are associated with the TEC, although these factors appear not to affect the catalytic activity of RNAP II. While cleavage coupled to polyadenylation releases the nascent transcript from the TEC, RNAP II continues transcription for several kilobases and the actual process of termination is more complex (release of RNAP II), although dependent on the cleavage activity (see below).
The transition between transcription initiation and elongation, promoter clearance, has long been considered as a significant phase in transcription regulation. It is important to note that although an arbitrary boundary has been placed between promoter clearance and early elongation, it is difficult to distinguish these as separable stages in many experimental systems (Uptain et al. 1997). The earliest stages of transcription are marked by instability of the transcription complex and a measurable tendency to release the RNA. Appearance of the stalled RNAP II–DNA complex and short RNA products during promoter clearance indicates that immediately after transcription starts, a stable TEC is not yet formed, and RNAP II is viewed to be in a mode of abortive initiation. Newly initiated RNAP II has a tendency to slip laterally (Pal and Luse 2002). A recent study has demonstrated that a decrease in slippage is a two-step process: first, slippage is greatly reduced after the length of the RNA–DNA hybrid reaches 8–9 nt, and then it becomes undetectable after synthesis of a 23-nt-long RNA (Pal and Luse 2003). The first step in the increase of RNAP II stability can be explained by previous work. Biochemical studies have determined that RNAP II TECs are unstable before the RNA–DNA hybrid reaches 8 nt in length (Kireeva et al. 2000). In agreement with this result, the high-resolution crystal structure of the yeast RNAP II TEC containing a 14-nt RNA shows that the RNA–DNA hybrid within the transcription bubble is 9 nt long (Gnatt et al. 2001). At present, it is not understood how a 23-nt-long transcript stabilizes the elongating RNAP II (Pal and Luse 2003). One possibility is that the interaction of the extended transcript with the RNA exit channel promotes additional changes in the conformation of the enzyme, which results in further stabilization. 
Structural studies of the RNAP II TEC containing an RNA transcript longer than 14 nt would provide a more definitive answer to this question. The recent cocrystal structure of RNAP II–TFIIB indicated that TFIIB might impede the exit path of the newly formed RNA (Bushnell et al. 2004). The authors suggest that the continued presence of TFIIB could result in abortive initiation, and subsequent removal of TFIIB would allow for promoter escape. Many of the details surrounding promoter clearance await characterization and, although the process is complex, understanding it remains an important task to better comprehend the transition between initiation and elongation.
Promoter clearance coincides with the beginning of another cycling event within the transcription cycle: phosphorylation of the C-terminal domain (CTD) of Rpb1, the largest subunit of RNAP II (Fig. 1). The RNAP II CTD contains multiple repeats of the heptapeptide sequence YSPTSPS. The number of these repeats increases with genomic complexity: 26 in yeast, 32 in Caenorhabditis elegans, 45 in Drosophila, and 52 in mammals. The existence of the hypo- and hyperphosphorylated forms of RNAP II (IIA and IIO, respectively) was first described in the early 1980s (Dahmus 1981). Studies using functional assays together with antibodies specific to one or the other form of RNAP II demonstrated that RNAP II in the PIC was unphosphorylated (Laybourn and Dahmus 1989; Lu et al. 1991), whereas transcription-competent RNAP II was heavily phosphorylated on its CTD (Christmann and Dahmus 1981); subsequently, Ser 2 and Ser 5 residues were identified as the major modification sites. More recently, chromatin immunoprecipitation (ChIP) analyses in yeast and Drosophila were performed using antibodies that apparently can specifically recognize RNAP II phosphorylated at Ser 2 or Ser 5 (O'Brien et al. 1994; Komarnitsky et al. 2000). These studies revealed that RNAP II phosphorylated at Ser 5 associates with promoter-proximal regions of transcribed genes and is apparently not detected in the 3′-end region. At the same time, the amount of the enzyme phosphorylated at Ser 2 increases toward the 3′-end of genes. Thus, it appears that CTD phosphorylation at Ser 5 correlates with transcription initiation and early elongation (promoter clearance), whereas Ser 2 phosphorylation is associated with RNAP II farther away from the promoter. It is important to note that a recent report characterizing the antibodies frequently used to distinguish between Ser 5 and Ser 2 CTD phosphorylation revealed that the Ser 2-specific antibody (H5) recognizes both the Ser 5 and Ser 2 phosphorylated forms (Jones et al. 2004).
It may be necessary to re-examine some previously published information, although the general trends are likely to remain valid. It is now established that the RNAP II CTD plays a central role in recruiting protein factors involved in elongation, as well as in mRNA maturation, surveillance, and export (Hirose and Manley 2000; Orphanides and Reinberg 2002; Proudfoot et al. 2002). Moreover, these processes are now known to occur cotranscriptionally, and many events during the synthesis of a mature mRNA are co-regulated (Maniatis and Reed 2002; Orphanides and Reinberg 2002). This fascinating discovery has altered our understanding of the scope of the reactions that are connected to transcript elongation and is discussed in more detail below. We begin our review of transcript elongation by discussing factors that directly influence the activity of the RNAP II machinery.
Figure 1. The phosphorylation cycle of the CTD of the large subunit of RNAP II. Initially, the unphosphorylated RNAP II CTD is targeted on Ser 5 by the kinase activity of the Kin28 subunit of TFIIH. Early after transcription initiation, the TEC is arrested at a “checkpoint” (see Fig. 3), to ensure proper pre-mRNA capping. Release from this checkpoint involves the kinase action of P-TEFb (Bur1/2 or Ctk1 in yeast), which targets Ser 2 on the CTD. Subsequently during elongation, protein phosphatases dephosphorylate Ser 5 residues (presumably, see text) within the CTD. The precise details surrounding the identity of the phosphatases and the specific residues that are targeted are not clear. However, the Ser 5-specific phosphatase Ssu72 is involved in allowing the correct transcript cleavage necessary for efficient termination. Both Ssu72 and the FCP1 phosphatase, which also targets Ser 5, are involved in recycling the RNAP II for reinitiation and subsequent rounds of transcription, although the specifics of recycling are currently unknown. See text for details.
Efficient transcript elongation must overcome several blocks that are intrinsic to RNAP II catalytic activity and its chromatinized DNA template. There are three principal impediments to transcript elongation: transcriptional pause, arrest, and termination (Uptain et al. 1997; Shilatifard et al. 2003). Many of the identified elongation factors that have been ascribed a mechanistic function serve to counteract or alleviate these blocks. Transcriptional pausing occurs when the RNA polymerase halts the addition of NTPs to the nascent RNA transcript for a time before resuming productive elongation on its own (Fig. 2). This pausing has been demonstrated for all three eukaryotic DNA-dependent RNA polymerases, as well as viral and prokaryotic RNA polymerases (Uptain et al. 1997). Transcriptional arrest can be defined as an irreversible halt to RNA synthesis, whereby the RNA polymerase cannot resume productive elongation without accessory factors. Although pausing and arrest are functionally defined, they are not absolute concepts, nor are they mutually exclusive events. For the purposes of our discussion, however, we consider them as distinct. During termination, the RNA polymerase and RNA transcript are released from the DNA, effectively ending the elongation stage of transcription. Transcriptional pause and arrest in vivo are most likely caused by a combination of identifiable DNA sequences, protein factors, and the nascent transcript. The factors that specifically aid RNAP II to circumvent these obstacles are discussed in more detail below.
Figure 2. Transcriptional pause and arrest. Transcriptional pause by RNAP II is a natural mode of regulation caused by a slight misalignment of the RNA 3′-OH and the active site of the enzyme. Transcriptional pause is self-reversible and is regulated by numerous cellular factors that either alleviate the pause (TFIIF, Elongins, ELL, CSB, FCP1, and DSIF) or exploit it to ensure proper pre-mRNA modification (NELF). Transcriptionally arrested RNAP II resumes elongation in a process that requires RNAP II-mediated cleavage of the nascent transcript stimulated by TFIIS. See text for details.
Transcriptional pausing by RNAP II occurs in the absence of ancillary factors and was previously believed to be a result of the polymerase sliding backward a few nucleotides, or “backtracking,” causing a misalignment of the catalytic site of the enzyme with the 3′-OH end of the RNA transcript. A recent study has, however, demonstrated that pausing of a bacterial RNA polymerase can be independent of backtracking and is probably caused by a structural rearrangement within the enzyme and DNA sequences, which results in the formation of an “unactivated” intermediate (Erie 2002; Neuman et al. 2003). Transcription by bacterial RNAP and by RNAP II is mechanistically analogous; therefore, it is likely that similar unactivated intermediates can be formed by RNAP II. In contrast to transcriptional arrest, pausing is self-reversible, and is thought to be a natural mode of transcriptional regulation. This point is highlighted by the existence of many factors that modulate transcriptional pause, and thus, the rate of transcriptional elongation. These factors include TFIIF, the ELL family, Elongins, FCP1, CSB, and DSIF (see below; Table 1). Additionally, a factor that promotes transcriptional pausing has also been identified and is termed NELF (negative elongation factor) (see below). In particular, experiments in vitro suggest redundant roles for TFIIF, ELL, FCP1, and Elongins; however, in vivo evidence indicates otherwise, and perhaps each factor stimulates elongation within specific transcriptional contexts.
Table 1. Identified elongation factors of RNA polymerase II
The human GTF IIF, a heterodimer composed of RNAP II-associating protein 30 (RAP30) and RAP74, plays an integral role in recruiting RNAP II during PIC formation (Conaway et al. 1991; Flores et al. 1991; Orphanides et al. 1996). Aside from its function in initiation, TFIIF diminishes the time RNAP II is paused and stimulates the rate of RNAP II transcriptional elongation (Flores et al. 1989; Price et al. 1989; Bengal et al. 1991; Izban and Luse 1992a; Tan et al. 1994). TFIIF was initially purified as a factor that directly binds immobilized RNAP II (Sopta et al. 1985) and for its requirement in transcription (Flores et al. 1989). Both subunits of TFIIF contribute to transcription initiation and elongation events (Tan et al. 1994; Lei et al. 1998). Mutational analyses have provided evidence that TFIIF is important for proficient promoter escape by suppressing abortive initiation, and acts in concert with TFIIH on early elongation intermediates to prevent arrest (Yan et al. 1999). TFIIF itself is regulated by phosphorylation, primarily on the RAP74 subunit, and this modification alters its initiation and elongation activities (Kitajima et al. 1994). TAF250 (TAF1), a subunit of the GTF TFIID, and TFIIH have each been shown to selectively phosphorylate RAP74 (Ohkuma and Roeder 1994; Dikstein et al. 1996; Yankulov and Bentley 1997; Yonaha et al. 1997), and RAP74 itself appears to contain autophosphorylation activity (Rossignol et al. 1999).
Following its release from the transcription complex, TFIIF reassociates with the TEC, in particular when the RNAP II complex has stalled (Zawel et al. 1995). Consistent with this finding, TFIIF has the ability to associate with both the hypophosphorylated form of RNAP II (IIA) present in initiation complexes, and the hyperphosphorylated form of RNAP II (IIO) found in the TEC (Zawel et al. 1995). Elongation events near the promoter may require different factors from those that are far away from the promoter to facilitate productive elongation, particularly on very large genes. ChIP assays demonstrated that in contrast to other elongation factors, such as TFIIS and Spt5, TFIIF localized predominantly near the promoter region and not uniformly throughout the coding region (Krogan et al. 2002b; Pokholok et al. 2002). However, TFIIF did appear in the coding and 3′-untranslated region in a similar manner as some PAF subunits (see below; Table 1) (Krogan et al. 2002b). These findings are in full agreement with mechanistic studies demonstrating that TFIIF does not travel with the RNAP II in the TEC, but reassociates with RNAP II molecules that encounter a block to elongation. It was postulated that the transient association of TFIIF with paused RNAP II induces a conformational change in the polymerase necessary for optimal elongation and, once this is accomplished, TFIIF is released from the TEC (Zawel et al. 1995). Thus, TFIIF appears to represent an active elongation factor, but does not remain associated with the actively moving RNAP II.
In addition, the RAP74 subunit of TFIIF directly interacts with and stimulates the enzymatic activity of FCP1, a phosphatase that targets the CTD of RNAP II (Chambers et al. 1995; Cho et al. 1999; Kamada et al. 2003; Nguyen et al. 2003). FCP1 stimulates transcript elongation in vitro, surprisingly independent of its catalytic activity (Cho et al. 1999; Mandal et al. 2002), and the stimulatory activity of TFIIF and FCP1 on elongation is additive (Mandal et al. 2002). FCP1 remains associated with the elongation-competent RNAP II in yeast in vivo (Cho et al. 2001). A description of the role of FCP1 in the CTD phosphorylation cycle is provided below. Further experimentation identified that TFIIF associates with multiple elongation factors, including the Spt5 subunit of DSIF (Lindstrom et al. 2003), and components of the PAF complex (Shi et al. 1997). TFIIF also appears to influence the TFIIS cleavage factor (see below) (Elmendorf et al. 2001; Zhang et al. 2003a). Collectively, these results establish the authenticity of TFIIF among the identified elongation factors as well as its important role in regulating the RNAP II TEC.
The Elongin complex (SIII) was initially isolated as an RNAP II stimulatory factor that exerts its effects on elongation by stimulating the rate of RNAP II-mediated mRNA synthesis, although its precise mechanism of action remains to be determined (Bradsher et al. 1993a,b). The three-subunit Elongin complex consists of the transcriptionally active elongin A, in addition to the regulatory elongin B and C proteins (Aso et al. 1995, 1996; Garrett et al. 1995; Takagi et al. 1996). Similar to TFIIF, elongin A stimulates the rate of RNAP II transcription by suppressing transient transcriptional pausing; however, unlike TFIIF, elongin A is not required for PIC formation or initiation (Bradsher et al. 1993b). But elongin A was observed to facilitate promoter-independent initiation in a biochemical system lacking initiation factors comparable to TFIIF (Flores et al. 1989; Takagi et al. 1995); supplementary work suggested that Elongin functions to properly align the 3′-OH end of the transcript in the active site of RNAP II (Takagi et al. 1995). Elongin was observed to enhance elongation by RNAP II only after the TEC converts to an “elongin-activatable” state, which required the absence of TFIIF (Moreland et al. 1998). These data support the idea that TFIIF must be lost from TECs to be susceptible to Elongin. Therefore, it is possible that Elongin could serve as a complementary factor to suppress transcriptional pausing in the instance where TFIIF may fail to reassociate with paused TECs. Alternatively, TFIIF and Elongin may function independently in vivo, and the observed effects of TFIIF on Elongin are merely a result of the biochemical assay. Importantly, however, an Elongin-like factor has not been observed in yeast; therefore, it is possible that Elongin represents a higher eukaryotic gene-specific factor.
Elongin constitutes a family of factors, as evidenced by the discovery of the elongin A-related proteins, elongin A2 and elongin A3. Elongin A2 and elongin A3 stimulate the rate of transcriptional elongation, although neither is regulated by elongin B and elongin C (Aso et al. 2000; Yamazaki et al. 2002). Interestingly, the gene product of the von Hippel-Lindau tumor suppressor (pVHL) can form a stable interaction with the elongin BC proteins (Duan et al. 1995; Kibel et al. 1995; Conaway et al. 1998). This complex, which also contains the Cul2 and Rbx1 proteins, has been shown to facilitate the ubiquitination and subsequent degradation of HIF (hypoxia-inducible factor) by the 26S proteasome (Ivan and Kaelin 2001; Conaway and Conaway 2002). Although the pVHL complex most likely does not compete with elongin A for the elongin BC complex (Kamura et al. 1998), the role of elongin BC in the pVHL complex may parallel its elongin A-regulatory function. Elongin B itself is a member of the ubiquitin (Ub) homology gene family (Garrett et al. 1995; Brower et al. 1999), and elongin C contains homology to the Skp1 yeast protein, a subunit of the SCF complex that targets proteins for polyubiquitination, and thus degradation (Ivan and Kaelin 2001). It was previously speculated that Elongin may function, in part, to facilitate the ubiquitination of RNA polymerase itself or other transcriptional cofactors (Shilatifard et al. 2003). The large subunit of RNAP II becomes ubiquitinated following treatment with UV radiation or other DNA damaging agents in a manner dependent on CTD phosphorylation (Bregman et al. 1996; Huibregtse et al. 1997; Ratner et al. 1998; Mitsui and Sharp 1999). It remains possible that the function of Elongin in vivo is related to the disassembly of the RNAP II TEC that encounters blocks to elongation, such as lesions on the DNA. 
Elongin may exclusively associate with a lesion-stalled RNAP II to promote its degradation, allowing the repair machinery to gain access and subsequently repair the DNA. However, a link between the Elongin complex and specific ubiquitination events remains to be elucidated, and thus the precise functions of Elongins in RNAP II-mediated elongation are not yet clear.
ELL is functionally analogous to TFIIF and Elongin through its suppression of transcriptional pause, in effect stimulating the transcriptional rate of RNAP II (Shilatifard et al. 1996). The ELL gene (eleven-nineteen lysine-rich leukemia gene) was initially identified as a chromosomal translocation partner to the MLL gene in acute myeloid leukemia (Thirman et al. 1994). The major determinant for acute myeloid leukemia is likely to be dysfunctional MLL, rather than loss of ELL, because of the high number of MLL translocations to other genes (Ayton and Cleary 2001; Ernst et al. 2002). Nevertheless, the C-terminal domain of ELL was necessary and sufficient for ELL–MLL-mediated transformation (DiMartino et al. 2000; Luo et al. 2001). Human ELL belongs to a protein family that includes ELL2 and ELL3, both of which stimulate the rate of RNAP II in vitro (Shilatifard et al. 1997a; Miller et al. 2000). The single Drosophila ELL homolog (dELL) was identified as an essential factor during development, indicating that ELL has nonredundant functions distinct from TFIIF and Elongin in vivo (Gerber et al. 2001; Eissenberg et al. 2002). It was also revealed that dELL associates with active sites of transcription following heat shock and colocalizes with the elongation-competent form of RNAP II on polytene chromosomes (Gerber et al. 2001). Intriguingly, mutations within dELL result in expression defects preferentially in larger genes (Eissenberg et al. 2002; Shilatifard 2004). These observations hint at a specific role for dELL in elongation events far away from the promoter region, which apparently cannot be contributed by TFIIF and Elongins. Alternatively, ELL may function as an antitermination factor, as opposed to a factor that increases bond formation rate. Thus, ELL may act throughout the gene, but its effects would be seen only in the largest transcription units.
Notwithstanding the positive effects of ELL in transcriptional elongation, ELL can negatively affect RNAP II activity in promoter-specific initiation by an uncharacterized mechanism (Shilatifard et al. 1997b). In addition to its ability to associate with RNAP II (Gerber et al. 2001), ELL copurifies with three proteins in a stable complex (Shilatifard 1998). The ELL complex, which also contains EAP20, EAP30, and EAP45 (ELL-associated protein), has the same elongation activity as ELL, but does not negatively regulate RNAP II (Shilatifard 1998). The functional role for these ELL-associated proteins in transcriptional elongation remains to be clarified.
Treatment of mammalian cells with the nucleotide analog DRB (5,6-dichloro-1-β-D-ribofuranosylbenzimidazole) had been known for many years to result in truncated transcript formation (Chodosh et al. 1989; Yamaguchi et al. 1998). With crude extracts, transcription reactions performed in vitro were found to be DRB-sensitive. On the other hand, transcription reactions performed in vitro with purified factors had lost DRB sensitivity. The DRB-insensitive transcription reactions were then used to score for the purification of several factors that function in elongation based on their ability to restore DRB sensitivity to the assay (Wada et al. 1998). These factors were independently identified as P-TEFb, DSIF, and NELF. Extensive analyses have shown that these factors act in concert to regulate elongation, although their individual roles are disparate. As discussed below, DRB inhibits the kinase activity of the CDK9 component of P-TEFb. P-TEFb phosphorylates the RNAP II CTD and apparently one of the DSIF subunits. DSIF binds to RNAP II during or shortly after initiation. NELF interacts with DSIF to induce polymerase pausing, and such pausing is enhanced by DRB and relieved by P-TEFb.
P-TEFb (positive transcription elongation factor b) was originally identified based on its ability to stimulate DRB-sensitive transcription of long transcripts in vitro (Marshall and Price 1992; Marshall et al. 1996). The heterodimeric P-TEFb consists of the Cdk9 kinase that associates either with cyclin T1, cyclin T2a, cyclin T2b, or cyclin K (Peng et al. 1998a,b). The elongation activity of P-TEFb is dependent on its kinase activity, and both are DRB-sensitive (Price 2000). P-TEFb was originally shown to phosphorylate the CTD of RNAP II when RNA polymerase was in promoter clearance in vitro (Marshall et al. 1996). Importantly, evidence that P-TEFb targets Ser 2 of the CTD, but not Ser 5, was provided by RNA interference studies in C. elegans (Shim et al. 2002). More recently, studies using the highly specific P-TEFb inhibitor flavopiridol demonstrated that P-TEFb primarily targets Ser 2 at actively transcribed genes in cells (Ni et al. 2004). Upon heat shock, P-TEFb is rapidly recruited to active loci on Drosophila polytene chromosomes and is frequently colocalized with the promoter-paused hypophosphorylated form of RNAP II (Lis et al. 2000). These results suggest that P-TEFb is recruited to facilitate productive elongation upon pausing. Additional experiments revealed that P-TEFb displays a similar pattern of localization at active loci as that of the Spt6 elongation factor (see below) and appears to track along with RNAP II during elongation with similar kinetics (Andrulis et al. 2000; Boehm et al. 2003). Besides the CTD, P-TEFb also phosphorylates one of the subunits of DSIF. The addition of P-TEFb alone to highly purified early TECs showed no positive effect on transcript elongation. However, the addition of P-TEFb to transcription complexes paused in the presence of DSIF/NELF reversed this inhibition and full-length transcripts were obtained; this reversal was dependent on P-TEFb kinase activity.
DSIF (DRB-sensitivity-inducing factor) was identified based on its ability to confer DRB sensitivity to partially purified transcription reactions (Wada et al. 1998). DSIF was found to be a heterodimeric complex composed of the human homologs of Saccharomyces cerevisiae Spt4 and Spt5. The family of SPT genes, of which Spt4 and Spt5 are members, was originally identified in assays for mutational suppressors that offset transcriptional defects caused by retrotransposon Ty insertions into promoter regions (Winston et al. 1984). Biochemical and genetic evidence support interactions between Spt4, Spt5, and RNAP II (Wada et al. 1998).
DSIF was originally reported to inhibit transcript elongation when added to partially purified transcription reactions in the presence of DRB. More recent findings suggest the opposite, and implicate SPT4 and SPT5 in the promotion of elongation. It now appears that the partially purified factors used in the DRB sensitivity assay to identify DSIF contained the third factor, NELF (see below) (Yamaguchi et al. 1999), which acts in concert with DSIF to inhibit elongation.
Genetic studies in yeast and in vitro transcription assays implicate Spt4 as a positive elongation factor (Rondon et al. 2004). Spt4 antagonizes the negative effects of RNAP II pausing imposed by the chromatin-remodeling yeast factor Isw1p (Morillon et al. 2003). To study the effects of Spt5 in conjunction with Tat and P-TEFb on HIV transcription, a three-stage transcription assay was used (Bourgeois et al. 2002). This analysis showed early recruitment of Spt5 soon after transcription initiation. The later recruitment of Tat through its interaction with the TAR region of the transcribed RNA resulted in P-TEFb activation and hyperphosphorylation of both Spt5 and the RNAP II CTD (Bourgeois et al. 2002). Although not required for early elongation in this system, Spt5 was shown to prevent premature termination and pausing during late elongation.
Interaction studies showed that DSIF genetically and physically associates with TFIIF, TFIIS, and CSB (Rad26) as well as the chromatin-related factors Spt6, FACT, Chd1, and the PAF complex (Hartzog et al. 1998; Orphanides et al. 1999; Costa and Arndt 2000; Jansen et al. 2000; Krogan et al. 2002b; Mueller and Jaehning 2002; Squazzo et al. 2002; Lindstrom et al. 2003; Simic et al. 2003; Endoh et al. 2004). In addition, Spt5 interacts with factors implicated in mRNA maturation and surveillance (see below) (Wen and Shatkin 1999; Andrulis et al. 2002; Pei and Shuman 2002; Lindstrom et al. 2003). Spt5 can be methylated at arginine residues by PRMT1 and PRMT5 in vitro (Kwak et al. 2003). This modification directly affects Spt5 interaction with RNAP II and appears to be important for its regulation of transcript elongation (Kwak et al. 2003). Methylation of Spt5 together with P-TEFb-mediated phosphorylation of Spt5 and the CTD of RNAP II seem to constitute modifications requisite for productive elongation.
The NELF complex promotes RNAP II pausing (Yamaguchi et al. 1999). This pausing is elicited by NELF only in the presence of DSIF. In fact, whereas DSIF binds to RNAP II directly, NELF preferentially binds to the assembled DSIF/RNAP II complex, but not to DSIF or RNAP II in isolation (Yamaguchi et al. 2002). The functional competition between TFIIF and DSIF/NELF is consistent with the negative transcriptional effects observed for DSIF/NELF (Renner et al. 2001). In vitro studies using a partially purified transcription system suggested that DRB inhibits a kinase whose activity hinders NELF. This kinase was subsequently identified as P-TEFb (Yamaguchi et al. 1999). The resumption of productive elongation occurs when NELF dissociates from the paused TEC, which arises from P-TEFb-mediated phosphorylation of the RNAP II CTD and the Spt5 subunit of DSIF (Ivanov et al. 2000; Kim and Sharp 2001). Additionally, the FACT heterodimer, originally identified for its role in allowing elongation through chromatin, functions to alleviate DSIF/NELF-mediated inhibition of transcript elongation in cooperation with P-TEFb (Wada et al. 2000).
NELF consists of five subunits, NELF-A, NELF-B, NELF-C, NELF-D, and NELF-E. NELF-A is encoded by a gene candidate for Wolf-Hirschhorn syndrome and contains weak homology to the hepatitis delta antigen (Yamaguchi et al. 2001). Interestingly, hepatitis delta antigen facilitates transcript elongation by blocking NELF association with RNAP II and DSIF (Yamaguchi et al. 2001). NELF-B, previously called COBRA1 (cofactor of BRCA1), interacts with BRCA1, a tumor suppressor implicated in breast cancer (Ye et al. 2001). NELF-C and NELF-D appear to arise from the same mRNA transcript through alternate usage of translation initiation codons (Narita et al. 2003). Finally, NELF-E contains an RNA recognition motif that is essential for NELF-mediated activities (Yamaguchi et al. 2002).
In vivo studies in Drosophila demonstrated that DSIF/NELF is positioned at the heat-shock promoter before induction (Andrulis et al. 2000; Wu et al. 2003). Following heat shock, DSIF and RNAP II, but not NELF, localize to active sites of transcription (Wu et al. 2003). In yeast, mutations in the DSIF components have both positive and negative effects on transcription (Swanson and Winston 1992; Hartzog et al. 1998). The dual role of DSIF in transcription could be explained if DSIF were an adaptor that connects other modulators to the RNAP II TEC.
The regulatory interactions that pause early transcription through DSIF/NELF appear to have evolved to allow the assembly of mRNA maturation factors on the RNAP II CTD under conditions in which the polymerase is engaged in a productive transcription complex. Based on the experimental evidence, the following sequence of events of the regulatory phase of elongation involving P-TEFb, DSIF, and NELF has been postulated (see also below; Fig. 3). DSIF/NELF-mediated pausing provides a time window for recruitment of the capping enzyme and addition of a cap to the 5′-end of the nascent RNA. P-TEFb is then recruited and phosphorylates the Spt5 subunit of DSIF along with Ser 2 of the RNAP II CTD, presumably resulting in the dissociation of NELF and the resumption of elongation.
DSIF/NELF-mediated checkpoint to ensure pre-mRNA capping. (A) The Kin28 subunit of TFIIH phosphorylates the RNAP II CTD on Ser 5. (B) DSIF interacts with RNAP II shortly after initiation. Whether DSIF recognizes the unphosphorylated or Ser 5-phosphorylated CTD is unknown. (C) NELF recognizes the RNAP II–DSIF complex and halts elongation. This pause allows the recruitment of the capping enzyme by the CTD and DSIF (Spt5 subunit), which adds a 5′-cap to the nascent transcript. The activity of the capping enzyme is stimulated by both the CTD and the Spt5 subunit of DSIF. (D,E) NELF is released by the concerted action of P-TEFb phosphorylation of Spt5 and the CTD on Ser 2, PRMT1/5 methylation of Spt5, and the capping enzyme itself. FCP1 may also participate, as FCP1 is required to release the capping enzyme. The precise mechanism causing NELF release is unknown. See text for details.
CSB is a DNA-dependent ATPase that has been linked to transcriptional elongation. Mutations in the CSA and CSB/ERCC6 genes result in Cockayne Syndrome (CS), a premature-aging syndrome that is characterized by developmental abnormalities. Most patients with CS have mutations in CSB, which has been shown to stimulate the rate of elongation in vitro, to directly bind RNAP II, and to affect the activity of TFIIS (Selby and Sancar 1997; Tantin et al. 1997). Aside from transcriptional elongation, CSB has a role in transcription-coupled nucleotide repair (TCR) and base excision repair (BER) (Licht et al. 2003). Patients with CS exhibit defects in nucleotide excision repair (NER) (Licht et al. 2003). NER operates through distinct processes, one acting on a global scale and one acting on actively transcribed genes (Hanawalt et al. 2000; Wood et al. 2000). The stalled RNAP II complex at DNA lesions signals the factors necessary for TCR. The yeast homolog of CSB, RAD26, links transcript elongation to TCR (Gregory and Sweder 2001). Experiments in yeast suggest that RAD26, in concert with the DEF1 protein, functions to rescue stalled RNAP II at DNA lesions (Woudstra et al. 2002). Although no direct connection has been made, it is possible that CSB and the Elongins cooperate to alleviate stalled RNAP II complexes at DNA lesions. Interestingly, loss of CSB results in chromosomal fragility at specific loci, perhaps in part by stalling transcription complexes on distinct genes, thereby preventing chromatin condensation (Yu et al. 2000). It has been speculated that the general role of CSB in transcript elongation may explain the broad clinical symptoms associated with CS (Yu et al. 2000).
In most instances, a transcriptionally arrested RNA polymerase cannot resume productive elongation without accessory factors, although the enzyme remains catalytically active. Arrest is believed to result from the RNA polymerase “backtracking” relative to the DNA template, which results in misalignment of the catalytic active site and 3′-OH of the nascent RNA transcript. Transcriptional pause decays into arrest in a time-dependent fashion and is contingent on the “dwell time” the RNAP II spends at a pause/arrest site (Gu and Reines 1995). Arrested RNAP II complexes resume productive elongation via an evolutionarily conserved mechanism that requires cleavage of the RNA transcript in a 3′-to-5′ direction. Recent structural data support the idea that cleavage of the transcript allows the proper realignment of the polymerase active site and 3′-OH, promoting readthrough of the arrest site (see below).
The elongation factor TFIIS (SII) is the eukaryotic factor that promotes RNAP II readthrough at transcriptional arrest sites (Fish and Kane 2002; Conaway et al. 2003). TFIIS was initially identified by its ability to stimulate transcription in vitro (Natori et al. 1973), and was subsequently shown to function after transcription initiation, as well as to stimulate elongation by reducing RNAP II pausing (Rappaport et al. 1987; Reinberg and Roeder 1987; Sluder et al. 1989; Bengal et al. 1991). These studies further demonstrated that TFIIS functions to increase the efficiency of elongating RNAP II, in contrast to increasing the rate of synthesis through decreased pausing time as demonstrated for TFIIF, Elongin, and ELL. A recent report suggests that TFIIS and TFIIF work together in a manner distinct from TFIIS and HDAg (Zhang et al. 2003a).
GreA and GreB, the functional homologs of TFIIS in bacteria, provided important mechanistic clues regarding the function of TFIIS in elongation and revealed that RNA polymerase-mediated transcript cleavage is an evolutionarily conserved process. The cleavage reaction is intrinsic to the RNA polymerase itself, but is enhanced in the presence of accessory factors, such as TFIIS (Rudd et al. 1994; Orlova et al. 1995). The first example of transcript cleavage was provided for the Escherichia coli RNA polymerase under conditions that halted the RNA polymerase (Surratt et al. 1991), and notably, the relief of backtracked RNA polymerase mediated by GreA and GreB was observed in vivo (Toulme et al. 2000). A similar cleavage activity in arrested RNAP II occurs within active ternary complexes, and TFIIS is important for this activity (Izban and Luse 1992b; Reines 1992; Wang and Hawley 1993). TFIIS promotes readthrough of arrested TECs caused by intrinsic DNA sequences, DNA-binding proteins, and drugs that bind DNA (Fish and Kane 2002). Studies using an artificial RNA–DNA bubble were the first to demonstrate that RNAP II can incorporate a mismatched nucleotide and that TFIIS enhances the preferential removal of the misincorporation at the 3′-end (Jeon and Agarwal 1996). These observations pointed to an RNAP II-mediated proofreading capability and implicated a direct role for TFIIS in this process. Indeed, kinetic studies suggested that RNAP II proofreading activity is dramatically enhanced by TFIIS (Thomas et al. 1998). In contrast, in vivo studies of yeast lacking TFIIS did not reveal any alterations in RNAP II proofreading, challenging the extent to which TFIIS contributes to this activity (Shaw et al. 2002). Future studies will likely clarify the role of TFIIS in transcription fidelity in a more physiological setting.
Aside from the in vitro studies that initially characterized TFIIS, in vivo cross-linking, genetic interactions, and physical associations have further defined that TFIIS functions as an elongation factor in cells. TFIIS localizes to the coding regions of genes as shown by ChIP and physically interacts with the Spt5 subunit of DSIF (Pokholok et al. 2002; Lindstrom et al. 2003). Yeast TFIIS, termed PPR2, was initially cloned as a gene that regulated the pyrimidine biosynthesis pathway and, although it is not an essential gene, ppr2 mutants display sensitivity to 6-azauracil (Hubert et al. 1983; Nakanishi et al. 1995). Numerous biochemical studies use stalled RNAP II complexes formed during low NTP concentrations. In yeast, the UTP analog 6-azauracil creates a similar effect by lowering the cellular concentration of GTP and UTP (Exinger and Lacroute 1992). Therefore, mutant yeast strains that display sensitivity to 6-azauracil are considered to have deficiencies in transcript elongation. Genetic studies have further demonstrated that TFIIS enhances or suppresses the mutant phenotypes of many factors that regulate transcript elongation. These factors include DSIF, RAD26 (the yeast homolog of CSB), the PAF complex, and the chromatin modifiers SWI/SNF, FACT, and Spt6 (Orphanides et al. 1999; Costa and Arndt 2000; Davie and Kane 2000; Lee et al. 2001; Lindstrom and Hartzog 2001; Hartzog et al. 2002). An elegant approach using artificial arrest sites in vivo yielded key evidence supporting the hypothesis that TFIIS functions to alleviate arrested elongating complexes in living cells (Kulish and Struhl 2001). These studies showed that transcription arrest distally located from the promoter is alleviated by TFIIS selectively under conditions of high transcriptional activity, substantiating previous in vitro findings that TFIIS aids in alleviating the arrested state.
The recently solved structure of RNAP II in complex with TFIIS has provided important details regarding TFIIS function, and corroborates many of its previously observed features (Fig. 4) (Kettenberger et al. 2003). Significantly, Cramer and colleagues (Kettenberger et al. 2003) determined that the active site of RNAP II undergoes extensive structural changes during TFIIS binding, and these structural changes are consistent with a realignment of the RNA in the active center. TFIIS was observed to extend along the surface of RNAP II, insert into a pore, and reach the active site from the bottom face of RNAP II. Two invariant acidic residues from TFIIS align in the active site in a manner consistent with their proposed role in RNA cleavage. Roughly one-third of the RNAP II mass is structurally altered upon TFIIS binding, and the authors speculate that this repositioning of RNAP II could mimic an “elongating” state of the polymerase. Although the definitive mechanism of cleavage could not be determined in these studies, the authors speculate that TFIIS-induced RNA cleavage may be similar to that observed for the Klenow fragment of DNA polymerase I. In contrast to the DNA polymerase, these studies identified a single “tunable” active site for polymerization and cleavage in the RNAP II. Structural studies of the analogous GreB–RNAP complex from bacteria have revealed a remarkable mechanistic conservation of transcript cleavage, even though TFIIS and GreB are structurally unrelated (Opalka et al. 2003).
RNAP II–TFIIS–RNA complex. (A) A side view of the modeled RNAP II–TFIIS–RNA complex: The 12-subunit RNAP II is shown in silver, and TFIIS is shown in orange (domain III) and green (domain II). The structural zinc ions of RNAP II and TFIIS are depicted in cyan. The DNA template (blue) and RNA transcript (red) are situated according to their positions in the structure of the TEC (Gnatt et al. 2001). The metal ion in the active site is depicted as a purple sphere (Cramer et al. 2000, 2001). The movement of the RNAP II is denoted by arrows. (B) A cut-away model of A as viewed from the front. Ribbon models depict TFIIS and the DNA template and RNA transcript. The dashed red ribbon denotes the putative path of the backtracked RNA, which is cleaved at the active site. This figure was adapted from Kettenberger et al. (2003) with permission from Elsevier © 2003. See Kettenberger et al. (2003) for details.
The factors described in the previous sections affect transcript elongation mostly independent of chromatin, although transcription in vivo occurs on nucleoprotein templates. The packaging of DNA into chromatin has profound effects on all processes that require DNA access, including transcription. Although chromatin serves an essential role to compact and protect our hereditary information, the cell has evolved elaborate mechanisms to counteract, and use, the effects of restricted DNA access. The nucleosome is the basic unit of chromatin and consists of ∼147 bp of DNA wrapped around a single histone octamer. The histone octamer contains two copies each of the histone proteins H2A, H2B, H3, and H4. In addition to transcriptional pause and arrest, the requirement of RNAP II to traverse a nucleosome represents a major block to transcriptional elongation. It is now apparent that histones themselves possess information pertinent to transcriptional regulation in the form of posttranslational, covalent modifications (Strahl and Allis 2000; Jenuwein and Allis 2001). A great deal of progress has been made regarding how RNAP II mechanistically elongates in a chromatin environment. The models propose that both nucleosome mobilization and histone depletion occur as RNAP II progresses along the chromatin template (Studitsky et al. 2004). Challenges remain to better understand how the identified chromatin-remodeling factors work together to allow productive elongation, as well as how histone modifications translate to distinct functional outcomes. Although many factors are known to affect transcript elongation on chromatin templates, the mechanism by which these factors function in coordination to facilitate elongation is unknown. Moreover, although some factors can allow RNAP II to traverse nucleosomes, the rate of elongation in vitro is far less than the rates observed in vivo.
Early studies in vitro showed that transcriptional pausing is strongly enhanced on chromatin-containing templates (Izban and Luse 1991). Studies of heat-shock genes in Drosophila demonstrated that RNAP II complexes initiate transcription, but pause early in elongation in close proximity to the promoter in the uninduced state (Rasmussen and Lis 1993). Subsequent studies revealed that the specific transactivator heat-shock factor 1 (HSF1) alleviates the negative effects of chromatin structure on transcript elongation through the recruitment of SWI/SNF activity to the human Hsp70 gene (Brown et al. 1996). Both the activator and the SWI/SNF activity were required for transcription on nucleosomal templates, and this effect was not observed on DNA templates without nucleosomes. These observations were consistent with chromatin itself serving as an important regulatory block to productive elongation and, more specifically in this instance, by facilitating pausing near the promoter.
SWI/SNF (homothallic switching deficient/sucrose nonfermenting) is an ATP-dependent chromatin-remodeling complex previously recognized to have a role in transcription (Narlikar et al. 2002; Martens and Winston 2003). Later studies showed that HSF1 stimulates elongation in part through the recruitment of the ATPase BRG1 (Sullivan et al. 2001), a subunit of the SWI/SNF complex. Experiments conducted with transcript elongation mutants of HSF1, which carefully distinguished initiation and elongation events, revealed that BRG1 is locally recruited to chromatin regions upon heat shock, and is required for the formation of full-length hsp70 transcripts (Corey et al. 2003). Moreover, genetic interactions between SWI/SNF and TFIIS in yeast further substantiated a role for SWI/SNF in transcript elongation (Davie and Kane 2000). In contrast, studies in Drosophila have demonstrated that the BRG1-related SWI/SNF protein BRM is not required for expression of heat-shock genes (Armstrong et al. 2002), although significant differences exist between humans and flies, and multiple SWI/SNF complexes exist in both organisms.
As mentioned above, SWI/SNF can be recruited to a gene via interaction with a transcriptional activator. However, this would not easily explain how the SWI/SNF complex might work in transcript elongation, as activators normally bind to the promoter regions of genes. Another possibility is a direct association between SWI/SNF and RNAP II. Indeed, a large protein complex containing, among other factors, RNAP II and SWI/SNF was isolated from both yeast and human cells (Wilson et al. 1996; Neish et al. 1998). The questions remain of how SWI/SNF-mediated chromatin remodeling mechanistically facilitates elongation, and how SWI/SNF activities are coordinated with other known chromatin-remodeling elongation factors, such as FACT and Spt6.
The ATPase CHD1 (chromo-ATPase/helicase-DNA-binding domain) remodels nucleosomes in vitro and appears to function in both elongation and termination (Tran et al. 2000). Studies performed on polytene chromosomes in flies revealed that Chd1 associates with highly active sites of transcription (Stokes et al. 1996). Chd1 mutant alleles in yeast are sensitive to 6-azauracil (Woodage et al. 1997), and genetically interact with the Set2 histone methyltransferase and the ISWI family, both implicated in elongation (Tsukiyama et al. 1999; Krogan et al. 2003c). Genetic interactions between Chd1 and SWI/SNF have also been reported (Tran et al. 2000). Chd1 physically associates with the Paf elongation complex, DSIF, and FACT (Kelley et al. 1999; Krogan et al. 2002b; Simic et al. 2003). Moreover, these studies provided evidence that Chd1 suppresses an Spt5 mutant phenotype comparable to that seen for the Paf complex and RNAP II (Simic et al. 2003). These collective results strongly imply that Chd1 functions as an elongation factor in vivo, although very little mechanistic data have been provided. Aside from its role in elongation, Chd1 participates in transcription termination and the remodeling of the 3′-end of genes in yeast (Alen et al. 2002). Furthermore, Chd1 binds relatively long A-T tracts in double-stranded DNA (Stokes and Perry 1995). As Chd1 contains a chromodomain, which can recognize methylated histone tails, it is tempting to speculate that Chd1 specifically recognizes methylated histones, perhaps spatially regulating its localization.
It has been known for many years that the chromatin structure within active genes is altered to increase DNA accessibility (Weintraub and Groudine 1976; Wu et al. 1979a,b; Karpov et al. 1984; Nacheva et al. 1989; for review, see Orphanides and Reinberg 2000). Independent studies showed that transcription on chromatinized DNA could be reconstituted in vitro in an activator-dependent manner, and in some contexts required ATP-dependent chromatin-remodeling activities (see above) (Kamakaka et al. 1993; Brown et al. 1996). Experimental evidence alluded to the possibility that RNA polymerases themselves contain the intrinsic ability to traverse nucleosomes, although the type of polymerase and the context of the chromatin template were major determinants of these observed effects (Studitsky et al. 1994; Sathyanarayana et al. 1999; Walter and Studitsky 2001). An experimental assay specifically designed to identify factors that support RNAP II transcription on chromatin templates led to the discovery of FACT (Orphanides et al. 1998). The FACT (facilitates chromatin transcription) complex is highly conserved among eukaryotes, functions after transcription initiation to allow RNAP II transcription on nucleosome templates, and acts independently of TFIIF, TFIIS, and ATP (Belotserkovskaya et al. 2004). FACT is a heterodimer composed of hSpt16 and the HMG-box protein SSRP1 (structure specific recognition protein-1) (Orphanides et al. 1999). The yeast homolog of hSpt16 belongs to the histone group of the SPT genes, which together with some other findings (see below) suggested a role for Spt16 in chromatin modulation (Winston and Sudarsanam 1998). Genetic studies in yeast had identified the later recognized subunits of FACT as having a role in productive elongation through chromatin.
The yeast FACT components, Spt16/Cdc68 and Pob3, are encoded by essential genes and are implicated in the regulation of transcription and chromatin structure, as well as the proper progression through the cell cycle (Malone et al. 1991; Rowley et al. 1991; Xu et al. 1993; Lycan et al. 1994). Interestingly, studies to identify factors involved in DNA replication in yeast isolated the small subunit of FACT, Pob3, as a polypeptide that bound the catalytic subunit of DNA polymerase alpha (Wittmeyer and Formosa 1997). These and other studies have implicated FACT in chromatin-related functions outside of transcription (Okuhara et al. 1999; Schlesinger and Formosa 2000; Seo et al. 2003), suggesting that FACT is a general regulator of chromatin structure. Additional work in yeast has revealed that mutants of Spt16 are sensitive to 6-azauracil and demonstrated that Spt16 genetically interacts with TFIIS, DSIF, Spt6, Chd1, and the Paf complex (Orphanides et al. 1999; Costa and Arndt 2000; Formosa et al. 2002; Squazzo et al. 2002). FACT also physically associates with DSIF, Spt6, Paf, Chd1, and histones themselves (Krogan et al. 2002b; Squazzo et al. 2002; Belotserkovskaya et al. 2003; Lindstrom et al. 2003; Simic et al. 2003). As mentioned above, FACT can alleviate NELF-mediated inhibition of elongation in cooperation with P-TEFb in the absence of nucleosomes, suggesting a mechanistic role for FACT in transcription on naked DNA (Wada et al. 2000). Immunofluorescence studies on Drosophila polytene chromosomes revealed FACT recruitment to active sites of transcription upon heat shock in a manner comparable to RNAP II, Spt5, and Spt6 (Andrulis et al. 2000; Kaplan et al. 2000; Saunders et al. 2003). Furthermore, ChIP analyses showed that FACT is localized immediately downstream of the promoter region at active genes upon induction in yeast and flies (Mason and Struhl 2003; Saunders et al. 2003).
Spt6 and FACT were independently shown to prevent initiation from within the coding region of genes in yeast, demonstrating a role for Spt6 and FACT in maintaining chromatin structure during elongation (Kaplan et al. 2003; Mason and Struhl 2003). Taken together, FACT has a clear role in modulating productive elongation in a chromatin context in vitro and in vivo.
Examination of chromatin structure during active transcription and mechanistic studies of FACT have led to a convincing model in which FACT functions to destabilize the nucleosome by selectively removing one H2A/H2B dimer, thereby allowing RNAP II to traverse a nucleosome. This model is supported by several independent studies. Alterations in chromatin structure during active transcription are accompanied by a specific loss of the H2A–H2B dimer (Kireeva et al. 2002). FACT activity is perturbed when nucleosomes are covalently cross-linked, and FACT physically interacts with both nucleosomes and H2A/H2B dimers (Orphanides et al. 1999; Belotserkovskaya et al. 2003). Moreover, in yeast, mutations in histone H4 that disrupt associations with H2A/H2B dimers exhibit phenotypes similar to those of Spt16 mutant strains (Malone et al. 1991; Rowley et al. 1991; Santisteban et al. 1997). Notably, FACT was shown to facilitate the loss of H2A/H2B dimers in assays using immobilized nucleosomal templates (Belotserkovskaya et al. 2003). Collectively, these observations are consistent with the model discussed above. As histones contain important epigenetic information that must be retained, the maintenance of nucleosome structure is required. Several lines of evidence support the idea that FACT, along with other associated proteins, helps to maintain accurate chromatin architecture after the TEC passes a nucleosome (see below).
Spt6 (suppressor of Ty 6) was identified in the same genetic screen that uncovered Spt4, Spt5, and Spt16 (Neigeborn and Carlson 1984; Winston et al. 1984; Clark-Adams and Winston 1987). Early work had characterized a genetic interaction between Spt6 and DSIF (Spt4 and Spt5), foreshadowing Spt6's participation in elongation (Swanson and Winston 1992). Mechanistic properties of Spt6 were revealed when Spt6 was found to be involved in the maintenance of chromatin structure. Spt6 promotes nucleosome assembly in vitro and interacts with histones, preferentially histone H3 (Bortvin and Winston 1996). Consistent with these findings, Spt6 mutant strains display alterations in chromatin structure in vivo (Bortvin and Winston 1996). Spt6 specifically colocalizes to actively transcribed regions together with the elongating form of RNAP II (Hartzog et al. 1998; Andrulis et al. 2000; Kaplan et al. 2000). Spt6 also displayed recruitment kinetics comparable to those of RNAP II, FACT, and Spt5 on the Hsp70 gene after heat shock (Wu et al. 2003). Mutations in the SPT6 gene lead to transcription initiation from cryptic start sites within the coding sequences (Kaplan et al. 2003). This finding supports the idea that Spt6 aids in maintaining chromatin structure during elongation. Studies using classical elongation assays on naked DNA in vitro demonstrated that Spt6 functions as an elongation factor that stimulates the transcription rate of RNAP II both autonomously and together with DSIF (Endoh et al. 2004). Together, these results implicate Spt6 as a modulator of RNAP II activity during elongation, as well as a maintenance factor for chromatin structure. Experiments in zebrafish have established a critical role for Spt6 in early development. Mutants of pandora, the zebrafish Spt6 homolog, exhibit multiple developmental abnormalities, including specific cardiac ventricular defects, malformation of the ear, tail, and eye, and reduced pigmentation (Malicki et al. 1996; Stainier et al. 1996; Yelon et al. 1999; Keegan et al. 2002). Whether Spt6 is directly or indirectly mediating these developmental processes remains to be determined. The diverse developmental phenotypes observed in Spt6 mutants likely underscore its general importance as a transcript elongation factor.
Spt6 may also have a role in the coordination of mRNA surveillance through its association with the exosome, a complex with exoribonuclease activity implicated in the “quality control” of pre-mRNAs (Andrulis et al. 2002). Genetic studies have identified interactions between Spt6 and DSIF, the PAF complex, and TFIIS (Costa and Arndt 2000; Hartzog et al. 2002). Additionally, Spt6 physically interacts with RNAP II, FACT, and the Spt5 subunit of DSIF in vitro and in vivo (Swanson and Winston 1992; Krogan et al. 2002b; Endoh et al. 2004). Experiments designed to identify Spt6-interacting proteins via affinity purification identified a novel yeast protein called Iws1/Spn1, not to be confused with Isw1 (Krogan et al. 2002b; Lindstrom et al. 2003). Mutations in the gene encoding the Iws1 protein display an Spt phenotype. Similar to Spt6, Iws1 copurifies with RNAP II, and associates with the coding regions of genes, as do Spt6 and RNAP II (Krogan et al. 2002b); thus, it has a putative role in elongation. The mechanistic function of this protein remains unknown, although future work should clarify its role in Spt6-mediated elongation.
The integrity of chromatin structure is an essential requirement for many cellular processes. DNA compaction and protection are two essential properties of chromatin, and disruption of this structure by DNA or RNA polymerases must be counteracted. Histones themselves contain pertinent information relevant to gene expression in the form of posttranslational, covalent modifications. Together with DNA methylation, histone lysine methylation forms the foundation of cellular identity and cell-type specificity. Disruption of this “information system” embedded in chromatin would yield unfavorable consequences. Thus, a system must be used that maintains chromatin structure in the wake of a TEC.
It is probable that both FACT and Spt6 are central to regulating chromatin structure. As discussed above, FACT and Spt6 have been shown experimentally to be important in maintaining chromatin structure disrupted by active transcription (Kaplan et al. 2003; Mason and Struhl 2003). Similar to Spt6, FACT contains histone chaperone activity, and facilitates the deposition of histones onto DNA in vitro (Belotserkovskaya et al. 2003). FACT and Spt6 selectively interact with histone H2A/H2B dimers and histone H3, respectively (Bortvin and Winston 1996; Belotserkovskaya et al. 2003). Given these observations, FACT and Spt6 may cooperate functionally to maintain chromatin structure faithfully after transcription (Fig. 5).
The putative role of FACT and Spt6 in reassembling the nucleosome. During elongation, the removal of one H2A/H2B dimer by FACT allows the RNAP II TEC to transcribe through nucleosomes. The histone chaperone activities of FACT and Spt6 may facilitate the reassembly of nucleosomes in the wake of the TEC. Aside from the importance of maintaining chromatin structure for its compaction and repressive functions, the positioning and integrity of nucleosomes must be faithfully sustained to correctly transmit vital epigenetic information. See text for details.
Besides Spt6 and FACT, other histone chaperones as well as ATP-dependent chromatin-remodeling complexes can facilitate replication-independent exchange of histones, indicating that perhaps other factors are involved (Ray-Gallet et al. 2002; Bruno et al. 2003). This maintenance system requires further examination to better understand how chromatin structure relates to elongation and successive rounds of transcription. In addition, it has recently been demonstrated that the histone variants H2AZ and H3.3 are incorporated into chromatin by specific ATP-dependent chromatin-remodeling complexes and histone chaperone complexes (Krogan et al. 2003b; Mizuguchi et al. 2004; Tagami et al. 2004). It is possible that histone variants themselves function to destabilize nucleosome structure, thus facilitating transcription. Biochemically, the functions of these new histone exchangers have not been examined in detail, and given the histone chaperone activity of FACT and Spt6, it is tempting to speculate that FACT and Spt6 contribute to H2AZ and H3.3 deposition in vivo, respectively.
The intricacies of transcriptional regulation are being fully appreciated as rapid advances are made in understanding the basis of histone modifications. Histone proteins are posttranslationally, covalently modified by a wide variety of enzymes. The highly accessible N-terminal histone tails undergo several covalent modifications including acetylation, methylation, ubiquitination, phosphorylation, and ADP-ribosylation (Vaquero et al. 2003; Zhang 2003). Transcriptionally active genes are typically enriched for histones containing acetylated residues in comparison to genes that are silent. Histone acetylation destabilizes chromatin structure by disrupting internucleosome associations as well as histone tail interactions with linker DNA. Histone acetylation is important for productive transcription, and even though histones must be acetylated to maintain “opened” nucleosomes, detailed evidence for a specific role of histone acetylation in elongation is limited (Walia et al. 1998). One of the links between elongation and specific histone acetylation is suggested by the interaction between the Spt16 subunit of FACT and a component of the NuA3 histone H3 acetyltransferase complex both in vitro and in vivo (John et al. 2000). Nevertheless, biochemical studies are needed to elucidate important details regarding the significance of histone acetylation and specific elongation events.
Elongator was initially isolated as a factor that associates with the elongation-competent form of RNAP II, and displays genetic defects consistent with its role in elongation (Otero et al. 1999; Wittschieben et al. 1999, 2000; Fellows et al. 2000; Formosa et al. 2002). Elongator is an acetyltransferase complex whose function in transcript elongation is controversial. It has been suggested that the yeast Elongator allows elongation through chromatin templates (Wittschieben et al. 1999, 2000), and in vitro studies demonstrated that Elongator-mediated stimulation of transcription initiation on chromatin templates required acetyl-CoA, specifically implicating Elongator in histone modification activities (Kim et al. 2002). However, Elongator was reported to be primarily cytoplasmic, and a proteomics approach failed to detect an interaction between Elongator and RNAP II in yeast, but did so for DSIF, FACT, Spt6, TFIIF, TFIIS, and the PAF complex (Krogan et al. 2002b). A recent report revealed that Elongator interacts with RNA in vitro and in vivo, including RNA transcripts located at actively transcribed genes (Gilbert et al. 2004). These observations may suggest a role for Elongator in elongation, although additional work is needed to clarify its mechanistic contribution.
Histone methylation has been recognized for many years, but recent interest has provided key observations regarding its function in transcription. In particular, histone lysine methylation appears to play a role in establishing both short- and long-term transcriptional regulation. Silent or repressed regions co-map with methylation at the histone lysine residues H3-K9, H3-K27, and H4-K20, whereas transcriptionally active domains are typically associated with methylation at H3-K4, H3-K36, and H3-K79 (Lachner et al. 2003; Sims et al. 2003). Experiments in yeast suggest that histone methylation and ubiquitination are interconnected, and appear to play a role in elongation. The yeast Set2 protein functions as an H3-K36-specific histone methyltransferase (HKMTase). Set2 preferentially associates with the hyperphosphorylated form of RNAP II, and deletions of Set2 in yeast confer sensitivity to 6-azauracil (Li et al. 2002, 2003; Krogan et al. 2003c; Schaft et al. 2003; Xiao et al. 2003). Set2 was further demonstrated to directly interact with the CTD of the large subunit of RNAP II, and partial deletion of the CTD, or the CTD-kinase Ctk1 (the yeast homolog of P-TEFb), resulted in a loss of H3-K36 methylation (Li et al. 2002, 2003; Krogan et al. 2003c; Xiao et al. 2003). Consistently, Set2 and H3-K36 methylation associate with the coding region of genes as revealed by ChIP analyses (Krogan et al. 2003c; Schaft et al. 2003; Xiao et al. 2003). It was subsequently discovered that the association of Set2 and H3-K36 methylation within coding regions was dependent on the PAF elongation complex (Krogan et al. 2003c). Aside from this observation, these studies identified genetic interactions between Set2 and the histone H2B ubiquitination complex, the Chd1 elongation factor, and the Set1 and Set3 complexes (Krogan et al. 2003c). Although histone H3-K36 methylation exists in higher organisms, its role in transcript elongation is ambiguous.
Set1 functions as a specific histone H3-K4 methyltransferase in yeast and copurifies with a large complex (Briggs et al. 2001; Miller et al. 2001; Roguev et al. 2001; Krogan et al. 2002a; Nagy et al. 2002; Noma and Grewal 2002). Similar to Set2, Set1 associates with the RNAP II CTD and the PAF elongation complex (Krogan et al. 2003a; Ng et al. 2003b). However, Set1 preferentially interacts with the Ser 5 phosphorylated form of RNAP II, the form associated with early transcriptional events (Ng et al. 2003b). Set1-mediated histone H3-K4 methylation occurs at promoters and within the coding region of active genes in yeast. Histone lysine methylation exists in the mono-, di-, or trimethylated state. In yeast, H3-K4 dimethylation occurs on a genome-wide scale, whereas trimethylation of H3-K4 strictly corresponds to actively transcribed genes (Ng et al. 2003b). The precise function of dimethyl H3-K4 in yeast has not been definitively determined. In mammals, both di- and trimethylated H3-K4 are associated with euchromatic regions, but whereas high levels of trimethyl H3-K4 are exclusively located on active genes, dimethyl H3-K4 has been associated with silenced regions as well (Schneider et al. 2004). Experiments designed to understand how methylation at H3-K4 positively affects transcription revealed that the histone deacetylase NuRD complex binds to the H3 tail in the absence of H3-K4 methylation (Nishioka et al. 2002; Zegerman et al. 2002), thus maintaining a deacetylated state of the tail. Moreover, recent studies identified proteins that selectively recognize the H3-K4 methyl mark, such as the ATP-dependent chromatin-remodeling protein Isw1p, a protein shown to regulate distinct stages of transcription, including elongation (see above) (Morillon et al. 2003; Santos-Rosa et al. 2003). Isw1p and Set1 were required for the proper distribution of RNAP II on certain genes and the recruitment of a cleavage and polyadenylation factor (Santos-Rosa et al. 2003). 
Functional homologs of Set1 exist in higher organisms, such as the human SET1 and the MLL family; however, many of the details relating to the precise function of H3-K4 methylation remain to be described. MLL (mixed lineage leukemia) has been intensely studied for many years because of its common chromosomal translocations in leukemia (Ernst et al. 2002). Only recently was it determined that MLL, and its related proteins, function as histone methyltransferases (Milne et al. 2002; Nakamura et al. 2002). The TAC1 complex, which contains both histone acetylation and H3-K4 methylation activity and is composed of Trithorax (Trx; fly homolog of MLL) and CREB-binding protein (CBP) among others, is recruited to the hsp70 gene upon heat shock in a manner comparable to Spt5 and localizes to coding regions within actively transcribed genes (Smith et al. 2004). Importantly, these results implicate both TAC1 and histone H3-K4 methyltransferase activity in transcriptional elongation in higher organisms.
Paf1 (RNA polymerase-associated factor 1) was initially identified by its ability to associate with RNAP II in yeast independent of the CTD. Paf1 is found in a complex with four additional subunits, Ctr9, Cdc73, Rtf1, and Leo1 (Shi et al. 1996, 1997; Krogan et al. 2002b; Mueller and Jaehning 2002; Squazzo et al. 2002). Genetic studies of PAF subunits revealed a wide range of phenotypes, including transcript elongation phenotypes (Costa and Arndt 2000; Mueller and Jaehning 2002; Squazzo et al. 2002; Mueller et al. 2004). The PAF complex has also been demonstrated to cross-link throughout the entire length of genes, consistent with its functioning as an elongation factor (Krogan et al. 2002b; Pokholok et al. 2002). As mentioned previously, PAF is required for the recruitment of Set2 to coding regions, with resultant histone H3-K36 methylation, and associates with the Set1 histone methyltransferase. Additionally, histone methylation at H3-K4 and H3-K79 depends on the PAF complex (Krogan et al. 2003a). Components of PAF are also required for Rad6-mediated histone H2B monoubiquitination, a histone modification shown to be a prerequisite for H3-K4 and H3-K79 methylation (Briggs et al. 2002; Dover et al. 2002; Sun and Allis 2002; Ng et al. 2003a; Wood et al. 2003). The important discovery of an H2B deubiquitination factor has clarified how H2B monoubiquitination regulates specific histone methylation in vivo (Henry et al. 2003; Daniel et al. 2004). The H2B-deubiquitination factor Ubp8 is a component of the SAGA acetyltransferase complex (Grant et al. 1997). Experiments using mutant alleles suggested that H2B monoubiquitination promotes Set1-mediated methylation of H3-K4, subsequently followed by H2B deubiquitination and Set2-mediated H3-K36 methylation (Henry et al. 2003). 
From the evidence provided above, it is probable that the PAF complex regulates this trans-histone modification and may serve as a “switch” that controls the change in the histone H3 methylation pattern from H3-K4 to H3-K36 (Fig. 6). In a new development, loss of PAF subunits was observed to result in a reduction in Ser 2 CTD phosphorylation and poly(A) tail length (Mueller et al. 2004). The distribution of RNAP II, TREX components (Table 1; see below), Spt5, and FACT was unaffected by the loss of the PAF complex. The authors concluded that the major function of PAF is independent of transcript elongation (Mueller et al. 2004). In the context of our discussion, the PAF complex can be more accurately referred to as a passive elongation factor, perhaps with functions similar to those of the Mediator during initiation of transcription, but regulating and coordinating the many aspects of transcript elongation and downstream events of mRNA biogenesis. Undoubtedly, more work is needed to clarify these exciting observations, but elucidating the role of PAF will likely be critical in determining the molecular mechanism through which covalent histone modifications facilitate transcript elongation.
The switch between histone H3-K4 and H3-K36 methylation. The PAF complex stimulates Rad6/Bre1-mediated monoubiquitination of H2B, followed by the recruitment of the proteasomal ATPases Sug1 (Rpt6) and Rpt4, which is dependent on H2B monoubiquitination. Methylation of H3-K4 by Set1, which is recruited by the Ser 5-phosphorylated RNAP II CTD and PAF, requires the presence of Sug1 and Rpt4. H2B is then deubiquitinated by the Ubp8 subunit of the SAGA acetyltransferase complex, which allows the Ser 2-phosphorylated CTD and PAF recruitment of Set2 that methylates H3-K36. H2B monoubiquitination prevents the premature methylation of H3-K36 and serves to regulate these distinct methylation patterns. Along with the CTD, the central role of the PAF complex suggests that it is an important mediator of this process, although the mechanism behind Set1 versus Set2 recruitment is unknown. See text for details.
A recent study revealed that mutations in the proteasomal ATPases Rpt4 and Sug1 (Rpt6) disrupt histone H3-K4 and H3-K79 methylation (Ezhkova and Tansey 2004). Moreover, H2B monoubiquitination was required for the recruitment of proteasome components to chromatin, but ubiquitination of H2B was not disrupted in Rpt4 and Sug1 mutants. Rpt4 and Sug1 are components of the 26S proteasome, a large cellular machine that degrades polyubiquitin-conjugated proteins (Pickart and Cohen 2004). Previous studies observed a genetic interaction between Sug1 and FACT in yeast, implicating Sug1 in transcriptional regulation (Xu et al. 1995). Surprisingly, efficient elongation by RNAP II has been shown to be dependent on the 19S regulatory particle of the 26S proteasome (Ferdous et al. 2001). Sug1 mutants displayed sensitivity to 6-azauracil, and the 19S particle stimulated transcript elongation in vitro. Inhibition of the 20S proteasome core had no effect on elongation, indicating that degradation does not play a role in this process (Ferdous et al. 2001). Cross-linking studies showed that specific components of the 19S particle, but not the 20S proteolytic core, are recruited to transcriptionally active genes upon induction (Gonzalez et al. 2002). Several independent studies linked nuclear hormone receptor regulation to the 19S regulatory particle, although in a manner that required proteasome-mediated degradation (Gianni et al. 2002; Reid et al. 2003; Perissi et al. 2004). These observations collectively provide a model of how the 19S regulatory particle of the 26S proteasome can coordinate transcription and degradation in some contexts, although the connection of Rpt4 and Sug1 to histone methylation suggests a broader role for the 19S regulatory particle in transcription (Fig. 6). More work is needed to delineate how the proteasome, the PAF complex, histone monoubiquitination, and methylation are coordinated to allow productive elongation.
The CTD of the large subunit of RNAP II is extensively phosphorylated and dephosphorylated during different stages of transcription. The CTD physically interacts with a large number of proteins and can be viewed as a “docking site” for factors required for different mRNA maturation events that occur concomitantly with transcript elongation. The three major kinases that target the RNAP II CTD are the cyclin-dependent kinases Cdk7, Cdk8, and Cdk9 (Prelich 2002). These enzymes are evolutionarily conserved from yeast to mammals, and all three are components of different protein complexes. Cdk7/Kin28 is a subunit of TFIIH (Orphanides et al. 1996). It is responsible for Ser 5 phosphorylation subsequent to the formation of the first phosphodiester bond (Akoulitchev et al. 1995). Kin28 functions to hyperphosphorylate the RNAP II CTD, which is essential for the recruitment of capping enzyme and Set1, as well as to regulate promoter clearance (Rodriguez et al. 2000; Orphanides and Reinberg 2002). Cdk8 associates with the Srb/Mediator complex and functions in transcriptional events prior to elongation (Maldonado et al. 1996; Cho et al. 1998). The third kinase, Cdk9, is part of P-TEFb (see above) (Zhu et al. 1997). The cyclin T subunit of P-TEFb recognizes the CTD via a histidine-rich sequence. The C. elegans PIE-1 transcriptional repressor was recently shown to inhibit transcriptional elongation by blocking cyclin T binding to the RNAP II CTD through an alanine-heptapeptide repeat (Zhang et al. 2003b). These studies not only identified a mechanism for P-TEFb regulation, but also uncovered a novel mode of transcriptional repression. In addition, two separate studies discovered that P-TEFb associates with the small nuclear RNA (snRNA) 7SK, a 330-nt-long snRNA that is evolutionarily conserved (Reddy et al. 1984; Nguyen et al. 2001; Yang et al. 2001). P-TEFb containing 7SK has lower kinase activity and is not recruited to the HIV-1 promoter. 
Treatment of cells with ultraviolet irradiation or chemicals resulted in the dissociation of 7SK and P-TEFb, revealing yet another mode of P-TEFb regulation (Nguyen et al. 2001; Yang et al. 2001; Sano et al. 2002). Further studies showed that inhibition of P-TEFb by 7SK snRNA was dependent on the association of a protein called HEXIM1 (MAQ1) (Michels et al. 2003; Yik et al. 2003). An additional link between snRNAs and transcription was provided by the demonstration that TFIIH contains the U1 snRNA and that a promoter-proximal 5′-splice site specifically stimulated reinitiation mediated by TFIIH (Kwek et al. 2002).
Cdk9 has two homologs in yeast, Ctk1 and Bur1 (Yao and Prelich 2002). The two yeast kinases likely contain distinct functional activities and may have different targets in vivo. Whereas Bur1 is essential for cell viability, Ctk1 is not (Lee and Greenleaf 1991; Prelich and Winston 1993; Yao et al. 2000). Ctk1 is responsible for elongation-associated Ser 2 phosphorylation of the CTD and is localized to the coding regions of genes (Patturajan et al. 1999; Cho et al. 2001). Even though Bur1 can phosphorylate the RNAP II CTD, genetically interacts with CTD truncations, and colocalizes with elongating RNAP II, mutations in the BUR1 gene do not appear to affect either Ser 2 or Ser 5 phosphorylation (Murray et al. 2001; Keogh et al. 2003). Bur1 and Bur2 were initially identified in a genetic screen as factors that increase transcription (Prelich and Winston 1993). Bur1 is a protein kinase and interacts with the cyclin Bur2 (Yao et al. 2000). Deletion of Ctk1 results in loss of histone H3-K36 methylation and Set2 recruitment (Krogan et al. 2003c; Xiao et al. 2003), further highlighting the role of Ctk1 in the regulation of transcript elongation. However, the fact that Ctk1 is not essential suggests that other kinases, likely Bur1, can compensate for this deficiency. Because Set1 and Set2 are not essential genes and CTD-Ser 2 phosphorylation appears to function in part by recruiting RNA processing factors involved in splicing, and because most yeast genes lack introns, it is possible that yeast can survive without a dedicated CTD-Ser 2 kinase. Perhaps yeast is not the ideal system to study the function of Ser 2, and future studies performed in worms and/or Drosophila will uncover the essential functions of P-TEFb in more genetically complex systems.
As discussed above, the FCP1 phosphatase targets the CTD of RNAP II (Chambers and Dahmus 1994; Archambault et al. 1997, 1998), and interacts directly with TFIIF (Chambers et al. 1995; Kamada et al. 2003; Nguyen et al. 2003). FCP1 purified from HeLa cells exhibits elongation stimulatory activity in vitro and participates in RNAP II recycling (Cho et al. 1999; Mandal et al. 2002). Studies in yeast revealed that Fcp1 dephosphorylates the CTD of RNAP II in vivo and genetically interacts with RNAP II (Kobor et al. 1999). In addition, Fcp1 displays a genetic interaction with DSIF and the PAF complex (Costa and Arndt 2000; Lindstrom and Hartzog 2001). Whereas P-TEFb is required for Tat-mediated stimulation of HIV-1 transcription, FCP1 is effectively inhibited by direct interaction with Tat (Marshall et al. 1998). The peptidylprolyl isomerase Pin1 was recently shown to block transcription and pre-mRNA splicing in part by specifically inhibiting FCP1, as well as indirectly stimulating the phosphorylation of the RNAP II CTD (Xu et al. 2003). FCP1 activity is also inhibited in the presence of capping enzyme (Palancade et al. 2004). FCP1 localizes to promoter and coding regions of active genes, and mutation of FCP1 results in higher levels of Ser 2 phosphorylation (Cho et al. 2001). Similarly, FCP1 from Schizosaccharomyces pombe was shown to preferentially dephosphorylate Ser 2 over Ser 5 on the CTD (Hausmann and Shuman 2002). In vitro experiments imply otherwise, as FCP1 dephosphorylates Ser 2 and Ser 5 to an equal degree (Cho et al. 1999; Lin et al. 2002). Interestingly, structural studies indicated that Fcp1 catalytic specificity may arise from interactions with both the CTD and the Rpb4/7 subunits of RNAP II (Kamenski et al. 2004). In contrast to FCP1, a family of small CTD phosphatases (SCPs) mediates the selective dephosphorylation of Ser 5 on the RNAP II CTD (Yeo et al. 2003).
SCP1 has the ability to affect activated transcription, and similar to FCP1, its phosphatase activity is stimulated by TFIIF (Yeo et al. 2003). However, the mechanism through which SCP1 functionally regulates transcription is currently unknown. In addition, the Ssu72 phosphatase targets Ser 5 of the CTD and is involved in recycling RNAP II (see below). It was recently discovered that an insertion into the FCP1 gene is associated with an autosomal recessive developmental disorder termed congenital cataracts facial dysmorphism neuropathy syndrome (CCFDN) (Varon et al. 2003). Precisely how disruption of FCP1 contributes to this disorder has not been determined.
Transcription is a highly integrated process that is tightly coupled to mRNA maturation, surveillance, and export (Maniatis and Reed 2002; Orphanides and Reinberg 2002). Messenger RNA processing takes place most efficiently cotranscriptionally and involves the addition of a 5′-cap, the excision of intronic sequences by splicing factors, and the addition of a 3′-poly(A) tail. The transcriptional events that are coupled to multiple nuclear processes are reviewed elsewhere (Hirose and Manley 2000; Shatkin and Manley 2000; Proudfoot et al. 2002). An integral component in the coordination of these events is the CTD of RNAP II, which appears to serve as a platform for many of the factors required for mRNA maturation. Comparison of the crystal structures of the capping enzyme–CTD and the Pin1–CTD interactions demonstrated that the CTD could adopt very different conformations depending on its phosphorylation status and binding partner (Fig. 7) (Fabrega et al. 2003). Additionally, further diversity in CTD-mediated binding is evident from the structure of the 3′-RNA processing factor Pcf11 CTD-interacting domain (CID) bound to a CTD peptide phosphorylated at Ser 2, and more importantly, these studies have provided insight into the potential overall structure of the CTD (Meinhart and Cramer 2004). Modeling of all 26 heptad repeats of the yeast CTD revealed a compact left-handed β-spiral structure that was stabilized by Ser 2 phosphorylation (Fig. 8; Meinhart and Cramer 2004). Interestingly, this structure was not compatible with phosphorylation at Ser 5, suggesting that the CTD adopts different structures depending on its phosphorylation status. The authors speculate that this β-spiral structure is pertinent to the mRNA processing cycle, which is intimately connected to the phosphorylation state of the CTD (Meinhart and Cramer 2004). These collective observations explain how the CTD recruits a wide variety of proteins and likely explain its functional diversity.
Moreover, specific regions of the CTD were shown to independently stimulate capping (N terminus), splicing, and 3′ processing (heptads 27–52) (Fong and Bentley 2001). The C-terminal region containing 10 amino acids at the end of the heptad repeats is also required for high transcriptional activity, splicing, and cleavage (Fong et al. 2003). The CTD can be thought of as a nucleation center for the many factors that regulate mRNA processing events concomitant with elongation. The CTD could mediate factor recruitment in a general manner or specifically, by selective binding of distinct heptads (Fig. 9). Thus, the CTD may have a “code” comparable to the recently proposed histone code (Strahl and Allis 2000; Jenuwein and Allis 2001; Buratowski 2003). It is important to note that the heptad repeats are not identical, and subtle differences among the repeats may be required for specific factor binding. Precedent for this idea comes from the observation that histone variants that differ by a few amino acids, such as H3 and H3.3 or H2A and H2AZ, appear to have disparate functions. Perhaps the subtle differences among heptads contribute to distinct functional outcomes as well. Below we highlight the importance of the CTD in coordinating events that take place cotranscriptionally. The central role of the CTD in these processes underscores the complexity of transcript elongation, belying the view that it is merely a mechanism to extend a nascent mRNA chain.
CTD secondary structures in complex with Cgt1 and Pin1. (A) Ribbon, transparent surface, and solid bond representation for the CTD backbone and amino acids T4a–P6c, respectively, bound to CDS1 and CDS2 of Cgt1 (PDB code 1p16) (Fabrega et al. 2003). Residues in direct contact with the protein surface include S5a, P6a, T4a, Y1b, Y1c, P3c, and S5c. (B) Ribbon, transparent surface, and solid bond representation for the CTD backbone and amino acids Y1–S7 bound to the WW domain of Pin1 (PDB code 1f8a) (Verdecia et al. 2000). The protein surfaces involved in CTD binding would be located below each CTD element in panels A and B.
The β-spiral model of the CTD. (Left) The complete 12-subunit RNAP II structure is represented in silver (Armache et al. 2003); the active center is denoted by the magenta sphere. The arrow marks the direction of the putative RNA exit path. Iterative superposition of the CTD peptide structure creates a CTD β-spiral model shown as a coil, with alternating β-turns (cyan) and extended regions (pink). A mobile linker of ∼90 amino acids connects the CTD to the structured core of RNAP II. The yeast CTD model (26 heptad repeats) has a length of 100 Å. A detailed view of four CTD repeats is shown above the spiral model. Asterisks denote three superimposed Pro 6–Ser 7 regions. Ser 5 side chains inside the spiral are labeled with “×.” This figure is reprinted with permission from Meinhart and Cramer (2004; © 2004 Nature Publishing Group, http://www.nature.com).
The CTD is a docking site. (A) The specific residue and location of phosphorylation on the heptad repeats within the CTD of the large subunit of RNAP II creates a “docking” site for numerous factors involved in pre-mRNA processing during elongation. This pattern of phosphorylation ensures the timeliness of proper recruitment. (B) The CTD may serve as a general docking site allowing the nucleation of many factors in an indiscriminate manner. (C) Alternatively, processing factors may selectively recognize sequence variations among the heptad repeats as a CTD “code” that predetermines binding partners in a manner comparable to the histone code. (1) As suggested by structural analyses (Fabrega et al. 2003), a CTD-binding factor (SF, splicing factor) may bind the CTD and scan for its proper heptad recognition sequence. (2) The looping out of specific heptads may then be recognized by additional cofactors, perhaps in a manner that requires looping, thereby creating additional specificity.
Capping of the 5′-ends of nascent pre-mRNAs is carried out by the sequential activity of the 5′-triphosphatase, guanylyltransferase, and methyltransferase. The first nucleotide of the nascent RNA is converted to a diphosphate by the 5′-triphosphatase, followed by the fusion of a GMP moiety to this same nucleotide by guanylyltransferase activity. The triphosphatase and guanylyltransferase activities are encoded in a single polypeptide in mammals, whereas yeast encodes two distinct proteins. Finally, the methyltransferase adds a methyl group to the N7 position of the GMP moiety, completing the cap structure (Proudfoot et al. 2002). The capping enzyme is recruited in part through direct binding to the phosphorylated form of the RNAP II CTD (Yue et al. 1997). Capping activity is highly integrated with the RNAP II TEC, and is specifically stimulated by phosphorylation of the CTD on Ser 5 (Ho and Shuman 1999; Moteki and Price 2002). Moreover, an RNAP II with a truncated CTD displayed capping defects (McCracken et al. 1997a). A “checkpoint” was proposed to exist early in elongation that allows for the faithful addition of a 5′ pre-mRNA cap (Fig. 3) (Orphanides and Reinberg 2002; Pei and Shuman 2002). As discussed above, a DSIF/RNAP II complex is specifically recognized by NELF, which functions to pause the RNAP II TEC. Capping enzyme is recruited to this stalled TEC through direct interactions with the Spt5 subunit of DSIF, which has been shown to stimulate capping activity along with the Ser 5 phosphorylated CTD (Cho et al. 1997; Wen and Shatkin 1999). Importantly, the capping enzyme itself was demonstrated to relieve NELF-mediated repression (Mandal et al. 2004). Following capping, NELF is released, perhaps as a result of the combinatorial action of the capping enzyme, P-TEFb-mediated phosphorylation of Ser 2 CTD and Spt5, Spt5 methylation, and FCP1 phosphatase action. Interestingly, FCP1 is necessary for the removal of the capping enzyme from the TEC (Schroeder et al. 2000). Evidence consistent with this model was recently described in the fission yeast S. pombe (Pei et al. 2003). Furthermore, the yeast capping machinery was found to affect early stages of promoter clearance/early elongation, indicating a similar mechanism in yeast that ensures the addition of a 5′-cap (Schroeder et al. 2004).
The large spliceosome complex functions to splice pre-mRNAs, and consists of small nuclear ribonucleoprotein particles (snRNPs) and additional proteins such as the serine/arginine-rich (SR) protein family (Jurica and Moore 2003). Splicing reactions are biochemically distinct from transcription in vitro; however, a great deal of evidence suggests that transcription and splicing are coupled in the cell. As mentioned for capping, CTD truncations inhibit splicing, termination, and processing of the 3′-end, and direct associations between the CTD and the cleavage-polyadenylation factors CPSF and CstF have been observed (McCracken et al. 1997b). Consistent with this, recruitment of the splicing machinery by the full-length, but not a truncated form of the CTD, was visualized in living cells (Misteli et al. 1997). The hyperphosphorylated form of the RNAP II CTD preferentially associates with splicing factors (Chabot et al. 1995; Mortillaro et al. 1996; Yuryev et al. 1996; Kim et al. 1997), and splicing as well as 3′-processing are enhanced by the CTD (Fong and Bentley 2001). In vitro, splicing can be inhibited by blocking the CTD with specific peptides or antibodies (Chabot et al. 1995), and the phosphorylation status of the CTD of RNAP II specifically regulates pre-mRNA splicing activity (Hirose et al. 1999). As mentioned above, Pin1 blocks pre-mRNA splicing by inhibiting FCP1 and indirectly stimulates the phosphorylation of the RNAP II CTD (Xu et al. 2003). Recently, the p54 (nrb) splicing factor was shown to associate with RNAP IIO, as well as with snRNPs, TFIIF, and P-TEFb (Kameoka et al. 2004). Fascinatingly, an RNA polymerase with a slow elongation rate gave rise to different outcomes with respect to alternative splicing, implying that elongation events control splice-site selection (de la Mata et al. 2003). Moreover, selection of a particular splice site was, indeed, related to the speed of the TEC, indicating an important role for pausing in this process.
Similar to capping and splicing, transcription termination and 3′-end processing require factors that associate with the RNAP II CTD. Several components of the 3′-end processing machinery interact with the phosphorylated CTD-Ser 2 in vitro (Barilla et al. 2001; Licatalosi et al. 2002), and RNAP II has been shown to stimulate 3′-end processing (Hirose and Manley 1998; Proudfoot et al. 2002). Surprisingly, two independent studies determined that 3′-end processing defects were the major alterations resulting from a loss of Ser 2 phosphorylation, as the recruitment of elongation factors was unaffected (Ahn et al. 2004; Ni et al. 2004). These pre-mRNA defects were identified using Ctk1-deficient yeast cells and, in Drosophila, a specific P-TEFb inhibitor (Skaar and Greenleaf 2002). These observations underscore the importance of the CTD in cotranscriptional RNA processing. Ssu72 is a component of the cleavage/polyadenylation factor complex, and the growth defects of Ssu72 mutants are suppressed by 6-azauracil (Dichtl et al. 2002; He et al. 2003). Ssu72 was recently shown to specifically dephosphorylate Ser 5 on the RNAP II CTD, leading to the hypothesis that Ssu72 functions to recycle RNAP II, as well as mediate proper transcript cleavage for efficient termination (Krishnamurthy et al. 2004). Additionally, the poly(A) RNA-binding proteins Gbp2 and Hrb1 were recently shown to physically associate with the TREX complex (see below; Table 1) and Ctk1, and localize to the coding regions of actively transcribed genes (Hurt et al. 2004). Gbp2 and Hrb1, which are related to the SR family of splicing factors, also cross-linked to RNA derived from activated genes, identifying their potential role during elongation.
Many of the mechanistic details surrounding transcript termination in higher organisms have remained unclear. Cleavage and polyadenylation of the nascent transcript are distinct from termination, which in principle releases the RNA. RNAP II can continue transcribing for up to 2 kb after RNA cleavage, although cleavage is necessary for termination. Sequences located upstream of a poly(A) signal that activate polyadenylation were also discovered to pause purified RNAP II (Yonaha and Proudfoot 1999). These results provide evidence that RNAP II pausing contributes directly to termination and 3′-processing. More recent studies have provided further insight into this process. The RNAP I- and RNAP II-specific termination factor TTF2 was demonstrated to terminate transcription independently of the state of CTD phosphorylation (Jiang et al. 2004). TTF2 can terminate transcription at most positions along a DNA template and is so far the only termination factor found with early TECs. Moreover, TTF2 appears to suppress mitotic transcript elongation, perhaps by terminating the bulk of transcription during mitosis (Jiang et al. 2004). The question that remains is how TTF2 activity is regulated; that is, what signals TTF2 to terminate transcription?
Transcript elongation and RNA surveillance were linked through the discovery that the elongation factors Spt5 and Spt6 physically associate with the exosome (Andrulis et al. 2002). The exosome complex removes incorrectly processed pre-mRNAs through 3′-to-5′ exoribonuclease activity and serves to monitor mRNA fidelity. Studies in vivo revealed that the exosome localizes to active genes upon heat shock, and is recruited to active regions of polytene chromosomes during development in a manner comparable to Spt6 (Andrulis et al. 2002). Importantly, this work substantiates the interconnection between the synthesis of mRNA and its faithful processing, again highlighting the integration of events related to mRNA maturation.
The process of transcript elongation can affect other events involved in mRNA metabolism, such as transcription-dependent recombination and mRNA export. Mutations in the HPR1 gene result in genomic instability through transcription-coupled recombination (Prado et al. 1997). Consistent with its role in transcript elongation, Hpr1 is required for transcription of specific genes, especially for the generation of long transcripts or those originating from GC-rich DNA sequences (Chavez et al. 2001). Hpr1 is a component of the THO complex, which consists of Tho2, Mft1, and Thp2, in addition to Hpr1 (Chavez et al. 2000). Interestingly, the splicing and export factor Sub2 (UAP56 in humans) genetically suppressed Hpr1 and Cdc73 (PAF subunit) phenotypes, potentially linking these complexes to splicing and export activities (Fan et al. 2001). Independent work discovered genetic interactions between THO and additional mRNA export factors including Sub2 and Yra1/REF (ALY in humans) (Jimeno et al. 2002; Merker and Klein 2002; Strasser et al. 2002). Yra1 and Sub2 are associated stoichiometrically with THO and comprise a larger complex termed TREX (transcription/export) (Strasser et al. 2002). The TREX complex can be cross-linked to the DNA throughout the length of actively transcribed genes in a manner similar to that of RNAP II, and deletions of individual THO subunits resulted in impaired mRNA export (Strasser et al. 2002). Additionally, TREX subunits were shown to be important for efficient transcript elongation in vitro (Rondon et al. 2003). Some disparate reports suggested that TREX might not be essential for mRNA export (for review, see Vinciguerra and Stutz 2004). A recent genome-wide study in Drosophila revealed that the THO complex regulates only a subset of genes (Rehwinkel et al. 2004). Accordingly, only a subset of transcripts are exported by THO, including heat-shock mRNAs after induction, whereas the vast majority of genes were unaffected by THO mutations, perhaps explaining discrepancies in the literature (Rehwinkel et al. 2004). A link between TREX and RNA surveillance comes from the discovery of genetic interactions between the 3′–5′ exonuclease Rrp6p, and both Sub2 and the THO complex (Libri et al. 2002). TREX appears to regulate diverse processes related to transcript elongation, and future work should clarify why a subset of genes requires this interesting complex.
The ISWI (imitation switch) family is a highly conserved group of ATP-dependent chromatin remodeling factors that are mechanistically distinct from the SWI/SNF family (Corona and Tamkun 2004; Mellor and Morillon 2004). Significantly, a recent study revealed that the yeast ISWI homolog Isw1p, which associates into two separate complexes (Vary et al. 2003), controls distinct stages of transcription dependent on the composition of the complex-specific subunits (Morillon et al. 2003). The Isw1a complex consists of Isw1p and Ioc3p (Ioc, Isw one complex), and functions to negatively influence initiation by nucleosome positioning (Morillon et al. 2003). The Isw1b complex, composed of Isw1p, Ioc2p, and Ioc4p, regulates and couples elongation to mRNA maturation and termination (see below) (Morillon et al. 2003). Transcription run-on, in vivo cross-linking, and genetic analyses indicated that Isw1p and the Spt4 subunit of DSIF exert opposing effects on the amount and localization of transcriptionally competent RNAP II (Morillon et al. 2003). These studies further demonstrated that Isw1p mutants genetically suppress TFIIS and Spt4 mutants, as gauged by 6-azauracil sensitivity.
Deletions and catalytic mutants of Isw1p were demonstrated to affect Ser 5 and Ser 2 phosphorylation of the CTD, as well as the recruitment of the Kin28p subunit of TFIIH. Different subunits of the Isw1b complex appeared to specifically regulate either Ser 5 (Ioc2p) or Ser 2 (Ioc4p) phosphorylation (Morillon et al. 2003). Moreover, histone H3-K4 and H3-K36 methylation, along with the recruitment of the cleavage and polyadenylation factor Rna15p, were disrupted in yeast strains lacking Ioc4p. The authors concluded that the Ioc2p and Ioc4p components of Isw1b (Vary et al. 2003) have mechanistically distinct functions, although both appear to be in the same complex (Fig. 10) (Morillon et al. 2003). Although the conclusions from these studies are consistent with the data presented, important questions must be resolved before definitive mechanistic roles for the Isw1a and Isw1b complexes are established. For instance, what is the nature of the Isw1p complexes in the absence of Isw1p, Ioc2p, Ioc3p, or Ioc4p? Are partial complexes that provide a partial function still present at the gene? Additionally, what are the differences between the Isw1a and Isw1b complexes containing the catalytic point mutant versus those lacking Isw1p altogether? The answers to such questions may provide a better picture of the contribution of the distinct Isw1p subunits.
Figure 10. Isw1p-mediated regulation of transcription. (A) The Isw1p–Ioc3p complex (Isw1a complex) negatively regulates transcription through nucleosome positioning. (B) Ser 5 phosphorylation by TFIIH requires the Ioc2p subunit of the Isw1b complex (Isw1p–Ioc2p–Ioc4p). (C) The DSIF subunit Spt4p prevents untimely Ser 2 phosphorylation and premature release from the early checkpoint required for pre-mRNA 5′-capping. (D,E) The Ioc4p subunit of Isw1b is necessary for H3-K4 methylation and Ser 2 phosphorylation of the CTD (D), as well as H3-K36 methylation and the recruitment of 3′-end processing factors (E). Many interesting questions remain as to how these events are coordinated. See text for details.
Interestingly, Isw1p preferentially recognizes H3-K4 di- and trimethylated histone tails, although apparently indirectly, and its recruitment to chromatin in vivo was shown to be dependent on this methylation (Santos-Rosa et al. 2003). Furthermore, the activities of Isw1p and Set1, the enzyme responsible for H3-K4 di- and trimethylation, are necessary for normal RNAP II distribution over the coding regions of selected genes (Santos-Rosa et al. 2003). Future work should identify any correlation between histone methylation and the Isw1p regulatory cycle described above, and may provide more evidence for a role of the ISWI family in transcriptional elongation. As Isw1p homologs are present in higher organisms, it is exciting to speculate that a similar process occurs in humans. Notably, Isw1p is an important regulator of multiple stages of transcription, and future experimentation should more clearly elucidate the mechanistic contributions of these two interesting Isw1p complexes.
Remarkable progress has been made regarding the regulatory significance of transcript elongation; however, after each new discovery, more questions arise. Why are there so many elongation factors that appear to function redundantly in vitro? The answer most likely has to do with promoter context, either in relation to the location of the TEC on a given gene, or the nature of the gene itself, such as the size of the RNA to be transcribed. Consistent with this idea, some of the elongation factors mentioned above have been shown to regulate only a subset of genes. What are the specific characteristics of these genes that require this attention? In addition, in vivo cross-linking and genetic studies have identified numerous elongation factors, although mechanistic studies for many of these proteins are absent. Establishing an efficient in vitro elongation assay that uses chromatin templates is necessary for elucidating the mechanistic functions of many of these factors. It remains important to understand how “classical” and chromatin-related elongation factors work together to facilitate productive elongation. The recent emergence of histone biology in transcription sets the stage to uncover the role of histone modifications during elongation. The functional aspects of the “histone code” with regard to elongation remain largely unknown. Structural studies have significantly advanced our understanding of the mechanisms behind elongation. These studies have indicated that RNAP II present in the TEC undergoes many structural changes, corroborating the need to capture the elongating complex in as many different conformations as possible. It is important to integrate the biochemical, molecular, genetic, and structural data, while taking into account the strengths and weaknesses of each approach, to arrive at the most accurate conclusions.
The complexity of elongation is staggering, as it integrates numerous nuclear activities associated with mRNA transcription. How are these processes logistically coordinated? The central role of the CTD as a docking site for cotranscriptional capping, splicing, 3′-end processing, surveillance, and export alludes to the existence of a “CTD code” that determines specific factor recruitment at the appropriate time. This putative CTD code may entail the seemingly subtle differences in sequences within the heptad repeats that exist within a species-specific CTD. New reagents are required to answer some of these more challenging questions. Moreover, how defects in specific elongation factors cause distinct diseases remains a mystery. These and other questions will propel future experimentation on transcript elongation, and should provide details related not just to gene expression, but also to how the cell faithfully integrates so many processes concomitantly.
We are extremely grateful for the generous contributions of the crystal structures presented in Figures 4, 7, and 8 from Patrick Cramer and Christopher Lima; and for the helpful comments and critical reading of the manuscript by Patrick Cramer, Don Luse, Vasily Studitsky, and Lynne Vales. We also thank Patrick Cramer for communicating results prior to publication. We apologize for the omission of pertinent references due to space limitations. This work was supported by a fellowship from NIH (GM-71166) to R.J.S. and by grants from the NIH (GM-37120) and the Howard Hughes Medical Institute to D.R.