ABSTRACT

By the 1970s, it was evident that most sequences in the genomes of complex organisms are not protein-coding. The amount of cellular DNA was found to broadly increase with developmental complexity, but there were incongruities, termed the C-value paradox. Theoretical considerations of the lethality of protein-coding mutations, the presence of large amounts of repetitive sequences and seemingly defective ‘pseudogenes’ all suggested that some, and perhaps most, multicellular organisms carry variable amounts of superfluous ‘non-informational’ DNA. The corollary of neutral evolution of non-functional sequences was widely accepted, although there was debate between the ‘near-neutralists’ and ‘adaptionists’ concerning the signatures of protein-coding genes and regulatory sequences underpinning quantitative trait variation. Later analyses showed many different rate classes of sequence evolution in plant and animal genomes. Nonetheless, there was growing consensus that much if not most of the DNA in plant and animal genomes must be junk and that the many repetitive sequences, by then known to be derived from transposons, are ‘selfish’ genetic hobos and freeloading passengers. The discovery in 1977 that eukaryotic genes are composed of short fragments of protein-coding sequences (‘exons’) interspersed with non-coding sequences (‘introns’), which are removed by post-transcriptional splicing, explained heterogeneous nuclear RNA and was proffered as further evidence of junk. Introns were rationalized as extant remnants of the prebiotic assembly of genes, which had been deleted from microbial genomes under selective pressure for rapid replication, but persisted and were fattened by transposons in complex organisms, despite the fact that complex organisms have microbial ancestors. Introns were found to increase in number and size with developmental complexity, which suggested that they may have evolved developmentally important functions transacted by RNA.