The word intron is derived from the terms intragenic region [1] and intracistron, that is, a segment of DNA that lies between two exons of a gene. The term "intron" refers both to the DNA sequence within a gene and to the corresponding sequence in the unprocessed RNA transcript. As part of the RNA processing pathway, introns are removed by RNA splicing, either shortly after or concurrently with transcription. [3] Introns are found in the genes of most organisms and many viruses. They occur in a wide range of genes, including those that encode proteins, ribosomal RNA (rRNA), and transfer RNA (tRNA).
Within introns, a donor site (5' end of the intron), a branch site (near the 3' end of the intron), and an acceptor site (3' end of the intron) are required for splicing. The splice donor site includes a nearly invariant GU sequence at the 5' end of the intron, within a larger, less highly conserved region. The splice acceptor site at the 3' end of the intron terminates the intron with a nearly invariant AG sequence. Upstream (5') of the AG is a region rich in pyrimidines (C and U), known as the polypyrimidine tract.
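As a rough illustration of these boundary elements, the following Python sketch checks whether an intron sequence starts with GU, ends with AG, and is pyrimidine-rich just upstream of the AG. The function name `check_intron_boundaries` and the 15-nucleotide window (`tract_window`) are assumptions made for this example only; real splice-site recognition depends on far more context than these simple tests.

```python
def check_intron_boundaries(intron: str, tract_window: int = 15) -> dict:
    """Report the terminal dinucleotides and the pyrimidine content near the 3' end.

    `intron` is read 5' to 3', from the donor end to the acceptor end.
    `tract_window` is an assumed window size, not a biological constant.
    """
    rna = intron.upper().replace("T", "U")  # accept DNA-style input as well

    # Window immediately upstream (5') of the terminal AG dinucleotide.
    upstream = rna[:-2][-tract_window:]
    pyrimidines = sum(base in "CU" for base in upstream)

    return {
        "donor_gu": rna.startswith("GU"),        # nearly invariant GU at the 5' end
        "acceptor_ag": rna.endswith("AG"),       # nearly invariant AG at the 3' end
        "pyrimidine_fraction": pyrimidines / len(upstream) if upstream else 0.0,
    }
```

For example, `check_intron_boundaries("GUAAGUACGAUACUAACUUUCUUUUCCUUUUCAG")` reports both dinucleotides as present; the sequence itself is a made-up toy example, not a real intron.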
Upstream of the polypyrimidine tract is the branch point, which includes an adenine nucleotide involved in lariat formation. [5] [6] The consensus sequence for an intron (in IUPAC nucleic acid notation) is: GG-[cut]-GURAGU (donor site) ... intron sequence ... YURAC (branch sequence, 20-50 nucleotides upstream of the acceptor site) ... Y-rich-NCAG-[cut]-G (acceptor site). [7] However, the specific sequence of the intronic splicing elements and the number of nucleotides between the branch point and the nearest 3' acceptor site also affect splice site selection. [8] [9] In addition, point mutations in the underlying DNA or errors during transcription can activate a cryptic splice site in a part of the transcript that is not normally spliced. This results in a mature messenger RNA that is missing a section of an exon. In this way, a point mutation, which might otherwise affect only a single amino acid, can manifest as a deletion or truncation in the final protein.
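The IUPAC consensus above can be turned into regular expressions to look for candidate splice elements in a sequence. The Python sketch below is only an illustration: the names `IUPAC_RNA`, `iupac_to_regex`, and `scan_intron` are hypothetical, and the distance convention (the number of nucleotides downstream of the branch adenine, counted to the 3' end of the intron) is an assumption chosen for this example. A consensus match does not by itself show that a site is used in vivo.

```python
import re

# Subset of IUPAC nucleotide codes used in the consensus, written for RNA.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",     # purine
    "Y": "[CU]",     # pyrimidine
    "N": "[ACGU]",   # any nucleotide
}

def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif such as 'YURAC' into a regex fragment."""
    return "".join(IUPAC_RNA[code] for code in motif)

# Intronic part of the donor consensus (after the cut), anchored by match().
DONOR = re.compile(iupac_to_regex("GURAGU"))
# Intronic part of the acceptor consensus (before the cut), anchored at the end.
ACCEPTOR = re.compile(iupac_to_regex("NCAG") + "$")
# Lookahead so that overlapping branch motifs are all reported.
BRANCH = re.compile("(?=(" + iupac_to_regex("YURAC") + "))")

def scan_intron(intron: str, min_dist: int = 20, max_dist: int = 50) -> dict:
    """Check an intron (read 5' to 3') against the consensus elements."""
    rna = intron.upper().replace("T", "U")  # accept DNA-style input as well
    branch_sites = []
    for match in BRANCH.finditer(rna):
        a_index = match.start() + 3          # the A in Y-U-R-A-C
        downstream = len(rna) - 1 - a_index  # assumed distance convention
        if min_dist <= downstream <= max_dist:
            branch_sites.append((a_index, downstream))
    return {
        "donor_match": DONOR.match(rna) is not None,
        "acceptor_match": ACCEPTOR.search(rna) is not None,
        "branch_sites": branch_sites,        # (0-based index of A, distance)
    }
```

Calling `scan_intron` on a sequence returns whether the donor and acceptor ends fit the consensus, together with any branch motifs that fall in the 20-50 nucleotide window. Changing a single base in the input is a simple way to see how a point mutation can create or destroy a consensus match, which is the mechanism behind cryptic splice site activation described above.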