Ancient protein


Ancient proteins are the ancestors of modern proteins that survive as molecular fossils. Certain structural features of functional importance, particularly those relating to metabolism and reproduction, are often conserved through geologic time. Early proteins consisted of simple amino acids, with more complicated amino acids being formed at a later stage through biosynthesis. Such late-arising amino acids include histidine, phenylalanine, cysteine, methionine, tryptophan, and tyrosine. Ancient enzymatic proteins performed basic metabolic functions and required the presence of specific co-factors. The characteristics and ages of these proteins can be traced through comparisons of multiple genomes, the distribution of specific architectures, amino acid sequences, and the signatures of specific products caused by particular enzymatic activities. Alpha and beta proteins are considered the oldest class of proteins.
Mass spectrometry is one analytical method used to determine the mass and chemical makeup of peptides. Ancestral sequence reconstruction begins with the collection and alignment of homologous amino acid sequences. These sequences must be diverse enough to contain phylogenetic signals that resolve evolutionary relationships and allow the targeted ancient phenotype to be deduced. From the alignment, a phylogenetic tree can be constructed to illustrate the genetic resemblance between the various amino acid sequences and their common ancestors. The ancestral sequence is then inferred by maximum likelihood at the phylogenetic node of interest. From there, the encoding genes are synthesized, incorporated into the genome of an extant host organism, and expressed, and the resulting proteins are purified. Their functionality and properties are then observed and experimentally characterized. Using a greater degree of variance in representative monomeric proteins will increase the overall precision of the results.
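The inference step can be illustrated with a minimal Python sketch. It assumes a fixed tree with known branch lengths, an ungapped alignment, and a simple equal-rates (Poisson) substitution model rather than the empirical matrices and rate variation that real studies typically use. The species names, sequences, branch lengths, and function names are all hypothetical. For each alignment column, it computes the likelihood of every possible residue at the root by Felsenstein's pruning algorithm and reports the most probable one.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def transition_prob(same, t):
    """Poisson (equal-rates) model: probability of keeping or changing a
    residue along a branch of length t (expected substitutions per site)."""
    n = len(AMINO_ACIDS)
    decay = math.exp(-n * t / (n - 1))
    return 1 / n + (n - 1) / n * decay if same else (1 - decay) / n


def partial_likelihoods(node, column):
    """Felsenstein pruning: P(tips below node | node state) for every state."""
    if isinstance(node, str):  # leaf: only the observed residue is possible
        return {a: 1.0 if a == column[node] else 0.0 for a in AMINO_ACIDS}
    result = {a: 1.0 for a in AMINO_ACIDS}
    for child, t in node:  # internal node: list of (child, branch_length)
        below = partial_likelihoods(child, column)
        for a in AMINO_ACIDS:
            result[a] *= sum(
                transition_prob(a == b, t) * below[b] for b in AMINO_ACIDS
            )
    return result


def reconstruct_root(tree, alignment):
    """Most probable root residue and its posterior probability, per column."""
    n_sites = len(next(iter(alignment.values())))
    ancestor = []
    for i in range(n_sites):
        column = {name: seq[i] for name, seq in alignment.items()}
        like = partial_likelihoods(tree, column)
        total = sum(like.values())  # a uniform root prior cancels out
        best = max(like, key=like.get)
        ancestor.append((best, like[best] / total))
    return ancestor


# Hypothetical toy data: tree ((sp1, sp2), sp3) with made-up branch lengths.
tree = [([("sp1", 0.1), ("sp2", 0.1)], 0.05), ("sp3", 0.2)]
alignment = {"sp1": "MKV", "sp2": "MKI", "sp3": "MRI"}
root = reconstruct_root(tree, alignment)
print("".join(residue for residue, _ in root))
print([round(p, 3) for _, p in root])
```

The per-site posterior probabilities indicate how confidently each ancestral residue is inferred; sites with low values are the ones where the reconstructed sequence is most likely to differ from the true ancestor.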

History

In 1955, Philip Abelson published a short paper that laid out what has become, through several cycles of technical advances, the field of palaeoproteomics or ancient protein research. He was the first to propose that amino acids, and therefore proteins, were present in fossil bone millions of years old, which gave clues about the evolution of very early life forms on our planet. Only a few years later, Hare and Abelson conducted another pioneering analysis on shells and found that amino acids degrade or change their internal configuration from L to D progressively over time, and that this could be used as a dating tool, in what is called amino acid dating or amino acid racemization. This dating approach was later shown to be a very capable tool for dating periods extending further back than the limit of radiocarbon dating at ca. 50,000 years.
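The principle behind racemization dating can be sketched in Python under simplifying assumptions: an amino acid with a single chiral centre, reversible first-order kinetics with equal forward and reverse rate constants, and a constant temperature history. Under those assumptions the measured D/L ratio satisfies ln[(1 + D/L) / (1 - D/L)] = 2kt plus a constant fixed by the initial ratio. The function name and the rate constant below are hypothetical; real rate constants depend strongly on temperature, the amino acid, and the material, and must be calibrated independently.

```python
import math


def racemization_age(dl_ratio, k_per_year, dl_initial=0.0):
    """Estimate age in years from a measured D/L ratio.

    Assumes reversible first-order kinetics with equal forward and reverse
    rate constants k, so that ln((1 + D/L) / (1 - D/L)) grows as 2 * k * t.
    """
    if not 0.0 <= dl_initial <= dl_ratio < 1.0:
        raise ValueError("expected 0 <= dl_initial <= dl_ratio < 1")

    def transform(ratio):
        return math.log((1.0 + ratio) / (1.0 - ratio))

    return (transform(dl_ratio) - transform(dl_initial)) / (2.0 * k_per_year)


# Hypothetical rate constant, for illustration only.
k = 5.0e-6  # per year
print(round(racemization_age(0.30, k)))
```

Because the D/L ratio approaches 1 asymptotically, the estimate becomes increasingly sensitive to measurement error as a sample nears equilibrium.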

Structure and evolution

Ecological and geological events that changed the conditions of Earth's global environment affected the evolution of protein structure. The Great Oxidation Event, triggered by the development of phototrophic organisms like cyanobacteria, resulted in a worldwide increase in oxygen. This put pressure on various groups of anaerobic prokaryotes, changing microbial diversity and the global metabolome, as well as altering enzyme substrates and kinetics.
Certain areas of proteins are prone to rapid evolutionary change, while others are unusually tolerant. Essential genes, or sequences of genetic material responsible for protein architecture, structure, catalytic metal co-factor binding centers, or interaction, experience little change compared to the rest of the genetic material. Other portions of this material are subject to genetic mutations that affect the amino acid sequence. These mutations laid the ground for further mutations and interactions that had major consequences for protein structure and function, resulting in proteins with similar sequences serving entirely different purposes.
Joseph Thornton, an evolutionary biologist, researched steroid hormones and their binding receptors to map their evolutionary relationship. He inserted DNA molecules encoding reconstructed amino acid sequences of ancient proteins into cells cultured in vitro to make them synthesize the ancestral proteins. The team discovered that the reconstructed ancestral proteins were capable of reconfiguration in response to multiple hormones. Additional studies conducted by other research teams indicate that greater protein specificity developed over evolutionary time. Ancestral organisms required proteins, mainly enzymes, capable of catalyzing a broad range of biochemical reactions in order to survive with a limited proteome. Gene duplication and subfunctionalization in multifunctional and promiscuous proteins led to the development of simpler molecules able to perform more specific tasks. Not all studies concur, however. Some results suggest evolutionary trends through less-specific intermediates, molecules bearing two high-specificity states, or decreased specificity altogether.
A second apparent evolutionary trend is the global transition away from thermostability in mesophilic protein lineages. The melting temperature of various ancient proteins has been correlated with the optimum growth temperature of extinct or extant organisms. The higher temperatures of the Precambrian affected optimum growth temperatures. Higher thermostability in proteinaceous structures facilitated their survival under more critical conditions. Heterogeneous environments, neutral drift, random adaptations, mutations, and evolution are some of the factors that influenced this non-linear transition and caused fluctuations in thermostability. This led to the development of alternative mechanisms for surviving fluctuating environmental conditions.
Certain ancestral proteins followed alternative evolutionary routes to obtain the same functional outcomes. Organisms that evolved along different pathways developed proteins that performed similar functions. In some cases, changing a single amino acid was enough to provide an entirely new function. Other ancestral sequences became over-stabilized and were incapable of conformational changes in response to shifting environmental stimuli.

Associated fields

Palaeoproteomics

Palaeoproteomics is a neologism used to describe the application of mass spectrometry-based approaches to the study of ancient proteomes. As with palaeogenomics, it intersects evolutionary biology, archaeology and anthropology, with applications ranging from the phylogenetic reconstruction of extinct species to the investigation of past human diets and ancient diseases.

Other fields

Related fields use mass spectrometry and protein analyses to determine the evolutionary relationships between different animal species from differences in the mass of proteins such as collagen. Techniques such as shotgun proteomics allow researchers to identify proteomes and the exact sequences of amino acids within different kinds of proteins. These sequences can be compared with those of organisms in different clades to determine their evolutionary relationships within the phylogenetic tree. Proteins are also better preserved in fossils than DNA, allowing researchers to recover proteins from the enamel of 1.8-million-year-old animal teeth and from the mineral crystals of 3.8-million-year-old eggshells.
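How a sequence difference becomes a measurable mass difference can be shown with a short Python sketch. It sums standard monoisotopic residue masses and adds one water molecule to obtain the neutral mass of an unmodified peptide. The two peptide sequences and the function name are hypothetical, and the sketch ignores post-translational modifications such as the proline hydroxylation that is common in collagen.

```python
# Standard monoisotopic residue masses in daltons (unmodified residues).
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide, in daltons."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence) + WATER


# Hypothetical homologous peptides from two species, differing by A -> S.
species_a = "GPAGPK"
species_b = "GPSGPK"
shift = peptide_mass(species_b) - peptide_mass(species_a)
print(round(peptide_mass(species_a), 4), round(shift, 4))
```

A single alanine-to-serine substitution shifts the peptide mass by about 16 Da, which is the kind of difference that lets peptide masses discriminate between taxa. Leucine and isoleucine have identical masses, so substitutions between them cannot be detected this way.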

Applications and products

Combined genome and protein sequencing research has allowed scientists to piece together narratives of archaic environmental conditions and past evolutionary relationships. Research into the thermostability of protein structures permits predictions of past global temperatures. Ancestral sequence reconstruction further reveals the origins of human ethanol metabolism and the evolution of various species. An example of this is the identification and differentiation of Denisovan hominids from modern Homo sapiens sapiens through amino acid variants in collagen obtained from the former's teeth.
The study of ancient proteins has not only helped to determine the evolutionary history of viral proteins but has also facilitated the development of new drugs.

Benefits and limitations

Understanding protein function and evolution provides new methods of engineering and controlling evolutionary pathways to produce useful templates and byproducts, specifically proteins with high thermostability and broad substrate specificity.
Multiple limitations and possible sources of error must be taken into consideration, and possible solutions or alternatives put into place. The statistical reconstruction of ancient proteins cannot be verified directly, and the reconstructed sequences will not be identical to those of the true ancestral proteins. Reconstruction can also be affected by multiple factors, including mutations; turnover rates, as prokaryotic species are more prone to genetic change than their eukaryotic counterparts, making it harder to determine their proteomic past; amino acid distribution; and the limited availability of fully sequenced genomes and amino acid sequences of extant species. Ancestral protein reconstruction also assumes that certain homologous phenotypes actually existed within ancient proteinaceous populations, when in fact the data recovered are only an estimated consensus of the total pre-existing diversity. Inadequate taxonomic sampling can lead to inaccurate phylogenetic trees due to long branch attraction. Proteins can also degrade over time into small fragments and become contaminated with modern proteins, making identification difficult or inaccurate. Finally, fossilized remnants contain only minuscule amounts of protein that can be used for further study and identification, and these provide less information regarding evolutionary patterns than genome sequences do.
A further concern with ancestral sequence reconstruction is an underlying bias toward thermostability that arises from the use of maximum likelihood to infer sequences. This makes ancient proteins appear more stable than they actually were. Using alternative reconstruction methods, for instance a Bayesian method that incorporates and averages over the level of uncertainty, could provide a comparable reference regarding ancestral stability. However, this method produces poor reconstructions and may not accurately reflect actual conditions.
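The difference between the two approaches can be sketched in Python, assuming that per-site posterior probabilities for the ancestral residues are already available. The posterior values and function names below are hypothetical. The maximum-likelihood sequence takes the single most probable residue at every site, whereas a Bayesian-style approach draws whole sequences in proportion to the posteriors, so that less probable residues appear at ambiguous sites in some of the sampled ancestors.

```python
import random

# Hypothetical per-site posterior probabilities at one ancestral node.
site_posteriors = [
    {"M": 0.99, "L": 0.01},
    {"K": 0.60, "R": 0.40},
    {"I": 0.55, "V": 0.45},
]


def ml_sequence(posteriors):
    """Maximum-likelihood ancestor: the most probable residue at each site."""
    return "".join(max(site, key=site.get) for site in posteriors)


def sample_sequence(posteriors, rng):
    """One sampled ancestor: each residue drawn in proportion to its posterior."""
    return "".join(
        rng.choices(list(site), weights=list(site.values()))[0]
        for site in posteriors
    )


rng = random.Random(0)  # fixed seed so the illustration is repeatable
samples = [sample_sequence(site_posteriors, rng) for _ in range(5)]
print(ml_sequence(site_posteriors))
print(samples)
```

Characterizing several sampled ancestors alongside the maximum-likelihood one gives a rough indication of whether a measured property such as stability is robust to uncertainty in the reconstruction.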