In parts 1 and 2 I argued that protein science has consistently shown that functional proteins are rare in sequence space. In part 3 I consider the major objections to the argument and find that they fail to provide an adequate response to the challenge that protein science poses for evolution.

A brief summary of the argument follows below:

1) Functional, folding proteins are rare in protein sequence space.

2) A blind search starting from a random point in protein sequence space is unlikely to find a folding, functional protein.

3) If protein evolution is true, then novel proteins must come from modifying preexisting functional, folding proteins.

4) De novo genes that code for functional proteins have been discovered in the majority of organisms.

5) It is postulated that these de novo proteins evolved from RNA-coding sections of DNA and not from preexisting proteins.

6) If so, then random mutations would have to do a blind search in protein sequence space, starting from a random sequence, to find a folding, functional protein.

7) But because of (1) and (2), (6) cannot be true.

8) Therefore de novo protein evolution is not possible.
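To get a feel for the size of the space that a blind search would have to cover, here is a minimal sketch of the arithmetic. It assumes the 20 standard amino acids, so a chain of length L has 20^L possible sequences. The function name and the chain lengths are my own illustrative choices, not figures taken from the papers cited here, and the sketch says nothing about what fraction of those sequences fold or function.

```python
# Assumption: only the 20 standard amino acids are counted.
AMINO_ACIDS = 20


def sequence_space_size(length: int) -> int:
    """Number of distinct amino acid sequences of the given length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return AMINO_ACIDS ** length


# Illustrative chain lengths only (hypothetical, not from the cited sources).
for length in (2, 10, 100):
    size = sequence_space_size(length)
    print(f"length {length}: {len(str(size))} decimal digits")
```

Python integers are arbitrary precision, so the counts are exact rather than floating-point approximations.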

Possible solutions for evolution

  1. A possible solution for evolution is that proteins are not distributed randomly in sequence space. It means all proteins used by all different species are concentrated within the area that organisms have searched. In other words – if evolution is true then proteins should not be distributed randomly throughout sequence space.
  2. Evolution does not have to build long, complex proteins from scratch, but rather begins with smaller protein chains and increases in complexity. In other words, random mutations do not have to search through sequence space, but rather build on existing short proteins. Think of how the word “undertaker” is made up of the words “under” and “taker”. Evolution finds the smaller words and combines them to make more complex ones. However, there are complex words, like “rhythm” and “synonym”, which are not made up of smaller, simpler words. Similarly, there are complex functional protein structures and protein domains that are not made up of simpler, shorter protein structures and domains.
  3. “Suppose that one draws letters randomly from a box. There is little chance that the sequence will form an English word. But if one begins with a small word, and asks, instead, how much the word can be changed by a modification of its letters: the answer is quite a lot. Single letter switches are sufficient:” (T. Hampton) WORD → WORE → GORE → GONE → GENE. Small changes to existing proteins lead to proteins with new functions, and therefore there is no need to begin from a random sequence in protein space. Protein evolution must begin with an existing protein.
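The word ladder in point 3 can be checked mechanically. The sketch below is only an illustration of the analogy: it confirms that each step in Hampton's chain changes exactly one letter, and counts how many positions differ between the first and last word. The helper names are my own, and the check does not test whether the intermediate strings are real words, which is the part of the analogy that corresponds to every intermediate protein being functional.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_single_step_ladder(words: list[str]) -> bool:
    """True if every consecutive pair of words differs by exactly one letter."""
    return all(hamming(a, b) == 1 for a, b in zip(words, words[1:]))


# The chain quoted from Hampton in point 3.
ladder = ["WORD", "WORE", "GORE", "GONE", "GENE"]

print(is_single_step_ladder(ladder))    # every step is one substitution
print(hamming(ladder[0], ladder[-1]))   # positions changed from start to end
```

Four single-letter steps replace all four letters of the starting word, which is why the analogy depends on having an existing word to start from.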

The discovery of de novo proteins shows that they could not have come from previous proteins and must therefore have come from random sequences of DNA. Random mutations would have to search through a large protein sequence space to find de novo genes.



Figure 1 – Average protein length in various organisms. Source: Tiessen et al.





Axel Tiessen, Paulino Pérez-Rodríguez and Luis José Delaye-Arredondo. Mathematical modeling and comparison of protein size distribution in different plant, animal, fungal and microbial species reveals a negative correlation between protein size and protein number, thus providing insight into the evolution of proteomes. BMC Research Notes 2012 5:85. doi:10.1186/1756-0500-5-85

Christine A. Orengo and Janet M. Thornton. PROTEIN FAMILIES AND THEIR EVOLUTION—A STRUCTURAL PERSPECTIVE. Annu. Rev. Biochem. 2005. 74:867–900

McLysaght A, Guerzoni D. 2015 New genes from non-coding sequence: the role of de novo protein-coding genes in eukaryotic evolutionary innovation. Phil. Trans. R. Soc. B 370: 20140332.

Ruiz-Orera J, Hernandez-Rodriguez J, Chiva C, Sabidó E, Kondova I, Bontrop R, et al. (2015) Origins of De Novo Genes in Human and Chimpanzee. PLoS Genet 11(12): e1005721. doi:10.1371/journal.pgen.1005721

Rachel Kolodny, Leonid Pereyaslavets, Abraham O. Samson, and Michael Levitt. On the Universe of Protein Folds. Annu. Rev. Biophys. 2013. 42:559–82

Russell F. Doolittle. THE MULTIPLICITY OF DOMAINS IN PROTEINS. Annu. Rev. Biochem. 1995. 64:287–314

Tyler Hampton. The New View of Proteins.