Life Before Earth

Alexei A. Sharov, Ph.D.
Staff Scientist, Laboratory of Genetics
National Institute on Aging (NIA/NIH)
333 Cassell Drive, Baltimore, MD 21224, USA
sharoval@mail.nih.gov

Richard Gordon, Ph.D.
Theoretical Biologist, Embryogenesis Center
Gulf Specimen Marine Laboratory
P.O. Box 237, 222 Clark Drive, Panacea, FL 32346, USA
DickGordonCan@gmail.com

Abstract

An extrapolation of the genetic complexity of organisms to earlier times suggests that life began before the Earth was formed. Life may have started from systems with single heritable elements that are functionally equivalent to a nucleotide. The genetic complexity, roughly measured by the number of non-redundant functional nucleotides, is expected to have grown exponentially due to several positive feedback factors: (1) gene cooperation, (2) duplication of genes with their subsequent specialization (e.g., via expanding differentiation trees in multicellular organisms), and (3) emergence of novel functional niches associated with existing genes. Linear regression of genetic complexity (on a log scale) extrapolated back to just one base pair places the origin of life at 9.7 ± 2.5 billion years ago. Adjustments for potential hyperexponential effects would push the projected origin of life even further back in time, close to the origin of our galaxy and the universe itself, 13.75 billion years ago. This cosmic time scale for the evolution of life has important consequences: (1) life took a long time (ca. 5 billion years) to reach the complexity of bacteria; (2) the environments in which life originated and evolved to the prokaryote stage may have been quite different from those envisaged on Earth; (3) there was no intelligent life in our universe prior to the origin of Earth, thus Earth could not have been deliberately seeded with life by intelligent aliens; (4) Earth was seeded by panspermia; (5) experimental replication of the origin of life from scratch may have to emulate many cumulative rare events; and (6) the Drake equation for guesstimating the number of civilizations in the universe is likely wrong, as intelligent life has just begun appearing in our universe. Evolution of advanced organisms has accelerated via development of additional information-processing systems: epigenetic memory, primitive mind, multicellular brain, language, books, computers, and Internet. As a result, the doubling time of human functional complexity has reached ca. 20 years. Finally, we discuss the issue of the predicted "technological singularity" and give a biosemiotics perspective on the increase of life's complexity.

1. The increase of genetic complexity follows Moore's law

Biological evolution is traditionally studied in two aspects. First, paleontological records show astonishing changes in the composition of major taxonomic groups of animals and plants
deposited in sedimentary rocks of various ages (Valentine, 2004; Cowen, 2009). Aquatic life forms give rise to the first terrestrial plants and animals, amphibians lead to reptiles including dinosaurs, ferns lead to gymnosperms and then to flowering plants. Extinction of dinosaurs is followed by the spread of mammals and flying descendants of dinosaurs called birds. Second, Darwin's theory augmented with statistical genetics demonstrated that heritable changes may accumulate in populations and result in replacement of gene variants (Mayr, 2002). This process drives microevolution, which helps species to improve their functions and adjust to changing environments. But despite the importance of these two aspects of evolution, they do not capture the core of the macroevolutionary process, which is the increase of functional complexity of organisms.

Function can be defined as a reproducible sequence of actions of organisms that satisfies specific needs or helps to achieve vital goals (e.g., capturing a resource or reproduction) (Sharov, 2010). To be passed on from one generation to the next, functions have to be encoded within the genome or other information carriers. The genome plays the role of intergeneration memory that ensures the preservation of various functions. Other components of the cell (e.g., stable chromatin modifications, gene imprinting, and assembly of the outer membrane (Frankel, 1989; Grimes and Aufderheide, 1991)) may also contribute to the intergeneration memory; however, their informational role is minor compared to the genome for most organisms. Considering that the increase of functional complexity is the major trend in macroevolution, which seems applicable to all kinds of organisms from bacteria to mammals, it can be used as a generic scale to measure the level of organization. Because functions are transferred to new generations in the form of genetic memory, it makes sense to consider genetic complexity as a reasonable representation of the functional complexity of organisms (Sharov, 2006; Luo, 2009). The mechanism by which the genome becomes more complex probably relies heavily on duplication of portions of DNA ranging from parts of genes to gene cascades to polyploidy (Ohno, 1970; Gordon, 1999), followed by divergence of function of the copies. Developmental plasticity and subsequent genetic assimilation also play a role (West-Eberhard, 2002).

We then have to ask: what might be a suitable parameter, measurable from a genome, that reflects its functional complexity? Early studies of the genomes of various organisms showed little correlation between genome length and the level of organization. For example, the total amount of DNA in some single-cell organisms is several orders of magnitude greater than in human cells, a phenomenon known as the C-value paradox (Patrushev and Minkevich, 2008). Sequencing of full genomes of eukaryotic organisms showed that the total amount of DNA per cell is not a good measure of information encoded by the genome. The genome includes numerous repetitive elements (e.g., LINE, LTR, and SINE transposons), which have no direct cellular functions; also, some portions of the genome may be represented by multiple copies. Large single-cell organisms (e.g., amoeba) need multiple copies of the same chromosome to produce the necessary amount of mRNA. In eukaryotes, DNA has additional functions besides carrying genes and regulating their expression. These non-informational functions include structural support of the nuclear matrix and nuclear lamina, chromosome condensation, regulation of cell division and homologous recombination, and maintenance and regulation of telomeres and centromeres (Cavalier-Smith, 2005; Rollins, et al., 2006; Patrushev and Minkevich, 2008). Segments of DNA with non-genetic functions are mostly not conserved and include various transposable elements as well as tandem repeats. While the ENCODE project has uncovered many functions for 80% of the
noncoding DNA in humans (Pennisi, 2012), it has not yet addressed the C-value paradox. Thus we stick to the suggestion to measure genetic complexity by the length of functional and non-redundant DNA sequence rather than by total DNA length (Adami, et al., 2000; Sharov, 2006). A correction for the informational value of noncoding DNA will have to wait for future ENCODE work on a spectrum of species.

If we plot genome complexity of major phylogenetic lineages on a logarithmic scale against the time of origin, the points appear to fit well to a straight line (Sharov, 2006) (Fig. 1). This indicates that genome complexity increased exponentially and doubled about every 376 million years. Such a relationship reminds us of the exponential increase of computer complexity known as Moore's law (Moore, 1965; Lundstrom, 2003). But the doubling time in the evolution of computers (18 months) is much shorter than that in the evolution of life. What is most interesting in this relationship is that it can be extrapolated back to the origin of life. Genome complexity reaches zero, which corresponds to just one base pair, ca. 9.7 billion years ago (Fig. 1). A sensitivity analysis gives a range for the extrapolation of ±2.5 billion years (Sharov, 2006). Because the age of Earth is only 4.5 billion years, life could not have originated on Earth even in the most favorable scenario (Fig. 2). Another complexity measure placed the origin of life about 5 to 6 billion years ago, which is similarly incompatible with the origin of life on Earth (Jürgensen, 2007).

Can we take these estimates as an approximate age of life in the universe? Answering this question is not easy because several other problems have to be addressed. First, why does the increase of genome complexity follow an exponential law instead of fluctuating erratically? Second, is it reasonable to expect that biological evolution started from something equivalent in complexity to one nucleotide? And third, if life is older than the Earth and the Solar System, then how can organisms survive interstellar or even intergalactic transfer? These problems, as well as the consequences of the exponential increase of genome complexity, are discussed below.
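For concreteness, the back-extrapolation underlying Fig. 1 can be reproduced with a short script. The sketch below is only illustrative: the lineage dates and functional genome sizes are placeholder values, not the data set used in (Sharov, 2006), and would have to be replaced with the published points to recover the 376-million-year doubling time and the 9.7-billion-year intercept.

```python
# A minimal sketch of the back-extrapolation described above. The lineage
# dates and functional genome sizes below are illustrative placeholders,
# NOT the values used in Sharov (2006).
import numpy as np

# time of origin (billions of years before present; negative = past)
t = np.array([-3.5, -2.0, -1.0, -0.5, -0.1])      # placeholder dates
bp = np.array([5e5, 3e6, 3e7, 3e8, 3e9])          # placeholder genome sizes (bp)

y = np.log2(bp)                                   # complexity on a log2 scale
slope, intercept = np.polyfit(t, y, 1)            # linear regression

doubling_time_gy = 1.0 / slope                    # Gy per doubling of complexity
origin = -intercept / slope                       # where log2(bp) = 0, i.e. 1 bp

print(f"doubling time ~ {doubling_time_gy * 1000:.0f} million years")
print(f"extrapolated origin of life ~ {abs(origin):.1f} billion years ago")
```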



Figure 1. On this semilog plot, the complexity of organisms, as measured by the length of functional non-redundant DNA per genome counted by nucleotide base pairs (bp), increases linearly with time (Sharov, 2012). Time is counted backwards in billions of years before the present (time 0). Modified from Figure 1 in (Sharov, 2006).

Figure 2. A schematic view of the development of the universe since the Big Bang, courtesy of the Hubble Space Telescope Science Institute, on which we have superimposed our estimate for the origin of life, 9.7 ± 2.5 billion years ago. Note that the "Dark Ages" may have ended at -13.55 billion years (Zheng, et al., 2012) (with the Big Bang at -13.75 billion years (Jarosik, et al., 2011)), rather than at -11.5 billion years, as depicted.



2. How variable are the rates of evolution?

To extrapolate the rate of biological evolution into the past, we need to provide arguments that this rate is sufficiently stable. There is no consensus among biologists on how variable the rates of evolution are. Darwin thought that in general evolutionary changes accumulate gradually through a series of small steps, rather than by sudden leaps (Darwin, 1866). However, he also pointed out that the rate of evolution is not uniform: "But I must here remark that I do not suppose that the process ever goes on so regularly as is represented in the diagram, though in itself made somewhat irregular, nor that it goes on continuously; it is far more probable that each form remains for long periods unaltered, and then again undergoes modification" (Darwin, 1872). This concept, now called punctuated equilibrium, assumes a high variation in the rates of evolution (Gould and Eldredge, 1977). Paleontological records indicate that major evolutionary changes occurred during very short intervals that separated long epochs of relative stability. If the concept of punctuated equilibrium is applied to the global trend of the increase of functional complexity of organisms (Fig. 1), then it may be argued that rates of evolution are so unstable that any extrapolation of them into the past is meaningless. In particular, it was suggested that the rates of primordial evolution were much higher than normal simply because of the absence of competition (Koonin and Galperin, 2003). The notion of unusually rapid primordial evolution was suggested also by other scientists (Davies, 2003; Lineweaver and Davis, 2003). These attempts to explain the presumed origin of life on Earth are strikingly similar to the stretching and shrinking of time scales in Biblical Genesis to fit preconceptions (Schroeder, 1990).

Although we fully agree that evolutionary rates fluctuate in time and that catastrophic changes of the environment followed by mass extinction provide a boost of novel adaptations to surviving lineages, we strongly disagree that the concept of punctuated equilibrium is applicable to the general trend of the increase of functional complexity of organisms (Fig. 1). First, adaptive radiation of lineages observed during periods of rapid evolutionary change has nothing to do with the increase of functional complexity. Multicellular organisms have enough functional plasticity to produce a large variety of morphologies based on already existing molecular and cellular mechanisms. Second, many rapid changes in the composition of animal and plant communities resulted from migration and propagation of already existing species (Dawkins, 1986), a mechanism that does not require an increase in functional complexity. Third, there is no reason to expect that the functional complexity of organisms did not increase during long "equilibrium" periods with no dramatic change in the morphology of organisms. Morphology is the tip of the evolutionary iceberg, as the greatest changes occur at the molecular level. The common idea that stabilizing selection simply preserves the status quo in evolution is based on a misunderstanding of the original theory of stabilizing selection (Schmalhausen, 1949). Stabilizing selection leads to increased plasticity of organisms (West-Eberhard, 2002), which is achieved via novel signaling pathways that replace less reliable old pathways. All these changes may have no immediate effect on morphology, but they are nevertheless real changes that lead to the increase of functional complexity. Schmalhausen developed his theory without knowledge of molecular biology, which was not available at that time, but he nevertheless captured how phenotypic plasticity reshapes evolution. Finally, there is a difference in time scales: punctuated equilibrium refers to relatively short periods of evolutionary change (millions of
years), whereas the global growth of functional complexity becomes apparent at the time scale of billions of years.

The reason why living organisms cannot increase their functional complexity instantly may be that it takes a long time to develop each new function via trial and error. Thus, simultaneous and fast emergence of numerous new functions is very unlikely. In particular, the origin of life was then not a single lucky event but a gradual increase of functional complexity in evolving primordial systems. Similarly, the transition from prokaryotes to eukaryotes was not the result of one successful symbiosis, but may have involved as many as 100 discrete innovative steps (Cavalier-Smith, 2010). This view is consistent with Darwin's insight that early evolution was slow and gradual: "During early periods of the earth's history, when the forms of life were probably fewer and simpler, the rate of change was probably slower; and at the first dawn of life, when very few forms of the simplest structure existed, the rate of change may have been slow in an extreme degree. The whole history of the world, as at present known, although of a length quite incomprehensible by us, will hereafter be recognised as a mere fragment of time, compared with the ages which have elapsed since the first creature, the progenitor of innumerable extinct and living descendants, was created" (Darwin, 1866).

Another factor that may have reduced the rates of primordial evolution was the absence of the well-tuned molecular mechanisms that are now present in every cell. In particular, there was no basic metabolism to produce a large set of simple organic molecules (e.g., sugars, amino acids, nucleic bases), and no template-based replication of polymers (see section 4). These two obstacles substantially reduced the frequency of successful "mutations" and, as a result, the initial rate of complexity increase was likely even slower than shown in Fig. 1. Thus, there is no basis for the hypothesis that the evolution of organisms as complex as bacteria, with a genome size of ca. 5×10^5 bp, could have been squeezed into <500 million years after Earth's cooling.
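As a rough plausibility check (not part of the original argument), the arithmetic behind this statement can be made explicit: reaching ~5×10^5 bp from a single base pair requires about log2(5×10^5) ≈ 19 doublings, which at the 376-million-year doubling time of Fig. 1 takes on the order of 7 billion years.

```python
# Back-of-the-envelope check of the claim above, assuming the ~376 My
# doubling time from Fig. 1 and a starting complexity of one base pair.
import math

doublings = math.log2(5e5)                   # ~18.9 doublings from 1 bp to 5x10^5 bp
time_at_observed_rate = doublings * 376e6    # years, at 376 My per doubling
required_doubling_time = 500e6 / doublings   # to fit into 500 My after Earth's cooling

print(f"{doublings:.1f} doublings; ~{time_at_observed_rate / 1e9:.1f} Gy at the Fig. 1 rate")
print(f"fitting into 500 My would require a ~{required_doubling_time / 1e6:.0f} My doubling time, "
      f"~{376e6 / required_doubling_time:.0f}x faster than observed")
```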

3. Why did genome complexity increase exponentially?

The increase of functional complexity in evolution can be modeled on the basis of known mechanisms, which appear to act as positive feedbacks (Sharov, 2006). First, the model of a hypercycle considers a genome as a community of mutually beneficial (i.e., cross-catalytic) self-replicating elements (Eigen and Schuster, 1979). For example, a mutated gene that improves proofreading of the DNA increases the replication accuracy not only of itself but also of all other genes. Moreover, these benefits apply to genes that may appear in the future. Thus, already existing genes can help new genes to become established, and as a result, bigger genomes grow faster than small ones. Second, new genes usually originate via duplication and recombination of already existing genes in the genome (Ohno, 1970; Patthy, 1999; Massingham, et al., 2001). Thus, larger genomes provide more diverse initial material for the emergence of new genes. Third, large genomes support more diverse metabolic networks and morphological elements (at various scales from cell components to tissues and organs) than small genomes, which, in turn, may provide new functional niches for novel genes. For example, genes in multicellular organisms operate in highly diverse environments represented by various types of cells and
tissues. Progressive differentiation of cells supports the emergence of gene variants that either perform the same function in specific cell types or modify the original function for the specific needs of some cells. Replication followed by divergence of differentiation trees allows duplication of whole cell types, followed by them assuming different functions within an organism (Gordon, 1999; Gordon and Gordon, 2013). These mechanisms of positive feedback may be sufficient to cause an exponential growth in the size of the functional non-redundant genome.

Existing data also indicate that genetic complexity may have increased a little faster than exponentially (i.e., hyperexponentially), which may be explained by phase transitions to higher levels of functional organization (Sharov, 2006; Markov, et al., 2010). For example, the genome doubling time in Archaea and Eubacteria was 1080 and 756 million years, respectively (these estimates are based on the largest known archaeal genome, 5 Mb, in Methanogenium frigidum, and the largest known bacterial genome, 13 Mb, in Sorangium cellulosum) (Bernal, et al., 2001). These estimates are 2.9- and 2.0-fold longer than the doubling time in Eukaryota. The difference between the rates of increase of genome complexity in the most successful and lagging lineages can be explained by evolutionary constraints of the latter (e.g., inefficient DNA proofreading and absence of mitosis). Thus, the rate of the "complexity clock" may have increased with the emergence of eukaryotes, and therefore life may have originated even earlier than expected from the regression in Fig. 1. That would push the projected origin of life close to the origin of our galaxy and the universe itself. Thus, life may have originated shortly after parts of the universe cooled down from the Big Bang (Gordon and Hoover, 2007). For the sake of this chapter, we are assuming that the Big Bang model for the universe and its age of 13.75 ± 0.11 billion years is correct (Jarosik, et al., 2011), although some evidence suggests that our universe is substantially older (Kazan, 2010).

The exponential increase of functional complexity is consistent with Reid's view of evolution as cascading emergences: "As evolution progresses, the freedom of choice increases exponentially.... intrinsic complexification of differentiated cell types, is overall an exponential function of reproduction and time-quite a simple equation.... the historical curve of some lineages, especially that of hominids, fits the simple exponential equation, its logarithmic slope theoretically determined by the fact that the acceleration of complexification is virtually equivalent to increasing adaptability and freedom to explore unexploited environments" (Reid, 2007).
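To illustrate the direction of the "complexity clock" effect described above, the sketch below walks backwards from a present-day complexity through eras with different doubling times. The present-day complexity (3×10^8 bp), the 3-billion-year switch point, and the 0.55-billion-year early-era doubling time are arbitrary placeholders chosen only to show how a slower early clock pushes the extrapolated origin further back; they are not estimates from this chapter.

```python
# Illustrative sketch: a slower "complexity clock" before an assumed transition
# pushes the back-extrapolated origin of life earlier. All numbers are placeholders.
import math

def origin_time(bp_now, eras):
    """Walk backwards from present-day complexity bp_now through eras listed
    from the present into the past as (doubling_time_gy, duration_gy) pairs;
    return how many Gy ago complexity was down to one base pair."""
    remaining = math.log2(bp_now)        # doublings still to "undo"
    t = 0.0                              # Gy before present
    for doubling_time, duration in eras:
        doublings_in_era = duration / doubling_time
        if doublings_in_era >= remaining:            # 1 bp reached inside this era
            return t + remaining * doubling_time
        remaining -= doublings_in_era
        t += duration
    return t + remaining * doubling_time             # extend the oldest era backwards

print("uniform 0.376 Gy doubling time: %.1f Gya"
      % origin_time(3e8, [(0.376, float("inf"))]))
print("0.376 Gy for the last 3 Gy, 0.55 Gy before that: %.1f Gya"
      % origin_time(3e8, [(0.376, 3.0), (0.55, float("inf"))]))
```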

4. Could life have started from the equivalent of one nucleotide?

Autocatalytic synthesis (in contrast to decay) is a rare property among organic molecules. Thus, it has been suggested that autocatalysis can arise more easily in multi-component mixtures of molecules with random cross-catalysis (Kauffman, 1986; Kauffman, et al., 1986). In particular, Kauffman suggested that a mixture of peptides could form a closed autocatalytic set, where the synthesis of each component is catalyzed by some members of the same set. Members of autocatalytic sets are not necessarily peptides; they can be RNA oligonucleotides (Lincoln and Joyce, 2009) or any other kind of organic molecules.
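The defining closure property of such a set (every member's synthesis is catalyzed by some member of the set) can be checked with a simple fixed-point computation. The sketch below is a deliberately stripped-down illustration of that closure condition only; it ignores substrates, food molecules, and reaction kinetics, which Kauffman's full model includes, and the random catalysis graph is invented for the example.

```python
# A minimal sketch of catalytic closure: repeatedly discard any species whose
# formation is not catalyzed by a remaining member of the set. Whatever is left
# (possibly nothing) is a closed autocatalytic subset. Purely illustrative.
import random

def closed_autocatalytic_subset(species, catalyzes):
    """catalyzes[x] = set of species whose formation x can catalyze."""
    current = set(species)
    while True:
        supported = {s for s in current
                     if any(s in catalyzes.get(c, set()) for c in current)}
        if supported == current:
            return current            # every member is made by some member
        current = supported

random.seed(0)
species = [f"P{i}" for i in range(20)]
# each species catalyzes the formation of two random others (an invented graph)
catalyzes = {s: set(random.sample(species, k=2)) for s in species}
print(sorted(closed_autocatalytic_subset(species, catalyzes)))
```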



But despite the attractiveness of the idea that life originated from autocatalytic sets, and despite elaborate mathematical support (Mossel and Steel, 2007), there are serious problems with this hypothesis. First, we focus on the most common version of this hypothesis, in which the elements of autocatalytic sets are heteropolymers (e.g., peptides or oligonucleotides). The "RNA world" hypothesis assumes that the first living systems had self-replicating nucleic acids (Gilbert, 1986) or other kinds of similar heteropolymers (TNA, PNA) (Nelson, et al., 2000; Orgel, 2000). Although some RNA molecules can catalyze the polymerization of other RNA (Johnston, et al., 2001), this reaction requires abundant free nucleotides. Nucleotides can be synthesized abiogenically (Powner, et al., 2009), but they are unlikely to become concentrated in quantities sufficient to support RNA polymerization in a population of proto-organisms. Even if several molecules appeared in close proximity to each other due to a once-in-a-universe lucky coincidence and produced a complementary RNA strand, there would be no nucleotides left to make the next generation of replicons. Nucleotides can be synthesized from bases and sugars by RNA-mediated catalysis (Unrau and Bartel, 1998), but both bases and sugars are rare molecules that are unlikely to be supplied in sufficient quantities. Polymers like nucleic acids and peptides may persist only given an unlimited supply of monomers, and this requires a heritable mechanism for their synthesis from simple and abundant organic and non-organic resources (Copley, et al., 2007; Sharov, 2009). Thus, the emergence of polymers was the second chapter in the history of life, whereas the first chapter was the origin of simple molecules that supported both metabolic and hereditary functions (Jablonka and Szathmáry, 1995; Sharov, 2009).

The next question is whether self-replicating autocatalytic sets can become assembled from simple molecules. Alas, the most abundant organic molecules in the non-living world (e.g., saturated hydrocarbons) are inert. Catalytically active organic molecules are rare and often unstable, and thus it is unlikely that they would become concentrated together in a tiny space and integrated into an autocatalytic set. High concentrations and enclosures are needed to increase the rates of mutual catalysis so that they compensate for the loss of molecules due to degradation and diffusion.

A more recent "GARD" (Graded Autocatalysis Replication Domain) model of the origin of life is based on "compositional assemblies" of simple lipid-like molecules (Segrè and Lancet, 1999; Segrè, et al., 2001; Bar-Even, et al., 2004). In contrast to autocatalytic sets, which exist in a homogeneous space, the GARD model assumes a phase separation between aggregates of molecules (e.g., lipid microspheres) and a homogeneous environment. The GARD model is substantially more realistic than autocatalytic sets because the requirement of catalytic closure is replaced by the assumption of selective attraction or repulsion of molecules to or from the aggregates (i.e., partitioning), which is a more widespread behavior among simple molecules. Models of growth of compositional assemblies usually also assume that large assemblies can break down into two (or more) daughter assemblies. As a result, these assemblies can reproduce and form discrete quasispecies. A disproportional split of components between daughter systems can be viewed as "mutations" that may occasionally give rise to new quasispecies. However, it appears that compositional assemblies lack evolvability because the number of potential quasispecies is limited (Vasas, et al., 2010). From the systems point of view, the growth of compositional assemblies is similar to the growth of crystals. The same chemical may produce different crystals
(quasispecies), and local defects in a crystal lattice may give rise to a new crystal type (Cairns-Smith, 1982). Like compositional assemblies, crystals lack evolutionary potential.

The evolution of primordial living systems requires heredity, but neither nucleic acids nor other complex polymers were initially available to support it. Thus, hereditary functions must have been carried out by simpler molecules. Heredity requires autonomous self-production, which is a generalization of autocatalytic synthesis. In chemical terms, compositional assemblies in the GARD model have no catalysis, but in systems terms there is an autocatalysis of assemblies, as they produce assemblies with matching composition. The general notion of self-reproduction has been defined using the formalism of Petri nets (Sharov, 1991). In short, a system is self-reproducing if there is a finite sequence of transitions (i.e., reactions) that results in the increase of the numbers of all components within the system. For example, the formose reaction is autocatalytic and makes sugars from formaldehyde (Huskey and Epstein, 1989). Such reactions can propagate in space, which is similar to the growth and expansion of populations of living organisms (Gray and Scott, 1994; Tilman and Kareiva, 1997). Autocatalytic reactions have two alternative steady states: "on" and "off" (the "on" state is stabilized via a limited supply of resources). Thus, they represent the simplest hereditary system or memory unit (Jablonka and Szathmáry, 1995; Lisman and Fallon, 1999). For example, a reverse citric acid cycle, which captures carbon dioxide and converts it into sugars, may become self-sustainable, at least theoretically (Morowitz, et al., 2000). Prions are examples of autocatalytic reproduction (Griffith, 1967; Laurent, 1997; Watzky, et al., 2008), and indeed have been invoked in various ways in speculations on the origin of life (Steele and Baross, 2006; Maury, 2009; Hu, et al., 2010). However, they cannot support the synthesis of the primary (i.e., unfolded) polypeptide.

Autocatalysis is necessary for the origin of life, but not sufficient. The specific feature of autocatalysis in living systems is that it is linked functionally with a local environment (e.g., a cell), and this linkage can be viewed as a coding relation (Sharov, 2009). In particular, the autocatalytic system modifies (encodes) its local environment, and this modification increases the rate of autocatalysis. This functional linkage is a necessary condition for cooperation between multiple autocatalytic components if they happen to share their local environment. In economic terms, the system invests in the modification of its environment, and therefore cannot leave its investment. This can also be viewed as a "property" relation at the molecular level. The autocatalytic system is the owner of its local environment, which plays the role of "home" or "body". Because the system is attached to its home, it is forced to cooperate with other autocatalytic systems that may appear in the same local environment. The local environment can be represented by either an enclosure or attachment to a surface. Although all known free-living organisms have enclosures (cell membranes), life may have started from surface metabolism, because autocatalysis has a much higher rate on a two-dimensional surface than in three-dimensional space (Wächtershäuser, 1988), an example of dimension reduction (Adam and Delbrück, 1968).
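The self-reproduction criterion of (Sharov, 1991) quoted above can be illustrated with a toy computation: search for a combination of reaction firings whose net stoichiometric effect increases every internal component. The sketch below checks only net stoichiometry and ignores whether intermediate states remain feasible, which the full Petri net formalism handles; the two-reaction "chemistry" is invented for the example.

```python
# Toy check of the self-reproduction criterion described above: is there a
# combination of reaction firings whose net effect increases every internal
# component? (Net stoichiometry only; intermediate feasibility is ignored.)
from itertools import product

def net_effect(reactions, firings):
    """Net change in species counts after firing each reaction n times."""
    total = {}
    for (consumed, produced), n in zip(reactions, firings):
        for species, k in consumed.items():
            total[species] = total.get(species, 0) - n * k
        for species, k in produced.items():
            total[species] = total.get(species, 0) + n * k
    return total

def self_reproducing(reactions, components, max_firings=5):
    """Brute-force search for a firing vector that increases all components."""
    for firings in product(range(max_firings + 1), repeat=len(reactions)):
        effect = net_effect(reactions, firings)
        if any(firings) and all(effect.get(c, 0) > 0 for c in components):
            return firings
    return None

# Invented two-reaction system: X + food -> 2 X + waste, and X -> X + Y.
reactions = [
    ({"X": 1, "food": 1}, {"X": 2, "waste": 1}),
    ({"X": 1},            {"X": 1, "Y": 1}),
]
# Firing each reaction once increases both internal components X and Y.
print(self_reproducing(reactions, components=["X", "Y"]))   # -> (1, 1)
```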
We have proposed a "coenzyme world" scenario where hereditary functions are carried out by autocatalytic molecules, named "coenzyme-like molecules" (CLMs) since they are catalytically active and may resemble existing coenzymes (Sharov, 2009). Because many coenzymes (e.g.,
ATP, NADH, and CoA) are similar to nucleotides, CLMs can be viewed as predecessors of nucleotides. The most likely environments for CLMs were oil (hydrocarbon) microspheres in water, because (1) hydrocarbons are the most abundant organic molecules in the universe (Deamer, 2011) and are expected to exist on early terrestrial planets (Marcano, et al., 2003), (2) oil microspheres self-assemble in water, and (3) it is logical to project the evolutionary transformation of an oil microsphere into a lipid membrane.

CLMs can colonize the surface of oil microspheres in water as follows. Assume that rare, water-soluble CLMs cannot anchor directly to the hydrophobic oil surface. However, some microspheres may include a few fatty acids whose hydrophilic ends allow the attachment of CLMs (Fig. 3). Once attached, a CLM can catalyze the oxidation of the outer ends of hydrocarbons in the oil microsphere, thus providing the substrate for binding of additional CLMs. Accumulation of fatty acids increases the chance that a microsphere will split into smaller ones, and small microspheres can infect other oil microspheres, i.e., capture new oil resources. This process of autocatalytic adhesion creates a two-level hierarchical system, where CLMs play the role of coding elements. Alternatively, CLMs can be synthesized from precursors (e.g., from two simpler molecules A and B) on the surface of microspheres. For example, when molecule A becomes attached to the surface of a microsphere, it changes conformation so that it can interact with another water-soluble molecule B. As a result, the synthesis A + B => AB is catalyzed by the oil microsphere. If the product AB is capable of oxidizing hydrocarbons into fatty acids, then the whole system becomes autocatalytic.
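The autocatalytic adhesion loop just described (anchors recruit CLMs, attached CLMs make more anchors, anchor-rich microspheres split and spread) can be caricatured as a small stochastic simulation. Everything quantitative in the sketch below (rate constants, split threshold, initial counts) is an arbitrary placeholder, not a parameter of the coenzyme world model.

```python
# A minimal stochastic sketch of the "autocatalytic adhesion" loop described
# above. All rate constants and thresholds are arbitrary placeholders.
import random

def simulate_microsphere(steps=200, attach_rate=0.05, oxidize_rate=0.3,
                         split_threshold=50, seed=1):
    random.seed(seed)
    anchors, clms, splits = 2, 0, 0           # start with a few fatty-acid anchors
    for _ in range(steps):
        # a free CLM may attach to each unoccupied fatty-acid anchor
        free_anchors = max(anchors - clms, 0)
        clms += sum(random.random() < attach_rate for _ in range(free_anchors))
        # each attached CLM may oxidize a surface hydrocarbon into a new anchor
        anchors += sum(random.random() < oxidize_rate for _ in range(clms))
        # anchor-rich microspheres are likely to split into two daughters
        if anchors >= split_threshold:
            anchors //= 2
            clms //= 2
            splits += 1                        # daughters can "infect" fresh oil
    return splits

print("splits in 200 steps:", simulate_microsphere())
```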

Figure 3. Coenzyme world: coenzyme-like molecules (CLMs) on the oil microsphere. (A) A CLM can anchor to the oil microsphere via rare fatty acid molecules. (B) The function of a CLM is to oxidize hydrocarbons to fatty acids, which provides additional anchoring sites for new CLMs. (C) Accumulation of fatty acids increases the chance that a microsphere will split into smaller ones, and (D) small microspheres can infect other oil microspheres (i.e., capture new oil resources).

Several kinds of autocatalytic coding elements may coexist on the same oil microsphere, creating a system with combinatorial heredity (Sharov, 2009). Each kind of coding element performs a specific function (e.g., capturing a resource, storing energy, or catalyzing a reaction) and ensures the persistence of this function. However, coding elements are not connected, and hence are transferred to offspring systems in different combinations. Despite random transfer, combinatorial heredity can be stable because (1) coding elements are present in multiple copies and therefore each offspring has a high probability of getting the full set, and (2) natural selection preferentially preserves systems with a full set of coding elements. The efficiency of the latter mechanism was shown in a "stochastic corrector model" (Szathmáry, 1999). New types of coding elements can be added by (1) acquisition of entirely new CLMs from the environment, (2) modification of existing CLMs, and (3) polymerization of CLMs. Combinatorial heredity can
eventually lead to the emergence of synthetic polymers (Sharov, 2009). For example, if a new CLM, C, can catalyze the polymerization of another CLM, A, then together they encode long polymers AAAAA...,