ABSTRACT

Most protein-coding genes in eukaryotes are discontinuous: they consist of coding sequences called exons interrupted by noncoding sequences called introns. This is in marked contrast to prokaryotic genes, where proteins are encoded by a continuous sequence of triplet codons. The number and size of introns vary from gene to gene. Some eukaryotic protein-coding genes lack a TATA box in their promoter and instead have an initiator element centered on the transcription initiation site. To initiate transcription, RNA polymerase II requires the assistance of several other proteins or protein complexes, called general transcription factors, which must assemble into a complex on the promoter before RNA polymerase II can bind and start transcription. The RNA molecule made from a protein-coding gene by RNA polymerase II is called the primary transcript; it undergoes processing reactions to yield mature mRNA.