Genomes: Transferability - Microbes and Evolution

Which genes are more susceptible to horizontal gene transfer and which are less so, that is, which are better able and which less able to succeed in being transferred? It is generally assumed that some genes are particularly difficult to swap between organisms while others are almost trivially lost and gained. This and the following section consider how genes can be distinguished in terms of this potential for transferability.

Likelihoods of Gene Horizontal Transfer

Once, the salient question for microorganisms vis-à-vis horizontal gene transfer was whether it occurs to a sufficient degree to even matter evolutionarily. That question has since been turned on its head, with one estimate suggesting that rates of acquisition by recombination could approach rates of mutation (Levin and Bergstrom, 2000). Today one asks, first, why certain microorganisms in fact do not engage in horizontal gene transfer (and what, then, are the consequences of its absence) and, second, for the rest and seeming majority of microorganisms, what in fact limits genetic migration to the point where specific species, such as among bacteria, can be recognized. It is that second question that I address in this section.

The latter question can itself be broken into two questions. The first is, why do some genes transfer readily whereas others, such as ssu rRNA genes, perhaps transfer with less likelihood? The second "sub"-question, by contrast, refers to gene clustering, especially within virus and prokaryotic genomes. That is, why are some genes linked into gene complexes within these genomes? The answers to these two seemingly disparate questions, including what limits genetic migration, are pretty much the same. They have to do with the potential for movement (outcrossing), the potential for segregation to progeny once acquired via outcrossing (e.g., as achieved given molecular recombination), and the potential for immediate utility (i.e., an increase in fitness) following acquisition. That is, fitness impacts (= phenotype) follow integration into a genome (= genotype), which in turn follows transfer into a cytoplasm (e.g., via transformation, transduction, or conjugation, generating what I've described as pre-genetic recombination states). The closer the likelihood of any one of these steps is to zero, the lower the likelihood that a horizontal gene transfer event will either occur or, having occurred, survive in the face of natural selection.
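The multiplicative logic of these three steps can be sketched in a few lines of Python. This is only an illustration under the assumption that the steps are independent of one another; the function name and the example probabilities are hypothetical and are not taken from any dataset.

```python
def hgt_success_probability(p_transfer, p_integration, p_benefit):
    """Chance that a gene is moved, integrated, and then favored by selection.

    Assumes, for illustration only, that the three steps are independent,
    so that the overall likelihood is simply their product.
    """
    for p in (p_transfer, p_integration, p_benefit):
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie between 0 and 1")
    return p_transfer * p_integration * p_benefit


# Hypothetical values: a readily moved gene that is rarely useful on arrival,
# versus a useful gene that cannot integrate at all.
print(hgt_success_probability(0.5, 0.5, 0.001))
print(hgt_success_probability(0.5, 0.0, 0.9))
```

Because the steps multiply, a single step with a likelihood near zero is enough to keep the overall likelihood near zero, however favorable the other two steps may be.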

One can view horizontal gene transfer events as equivalent to mutations in terms of their post-acquisition population dynamics. The majority of mutations in fact go extinct. This is in part because the majority have deleterious or, at best, only neutral impacts on organism fitness. Even beneficial alleles, however, tend to go extinct soon after their occurrence. The reason, especially for the latter, is that a new mutation is rare when it first comes into existence, i.e., a single copy of the allele within a population of alleles. With small numbers, random events (i.e., genetic drift) are extremely important. As a consequence, frequency tends to bump up and down, with down, given very low numbers, tending towards prompt or immediate extinction. Novel gene swaps due to homologous recombination, or novel gene insertions due to illegitimate recombination, even those associated with provirus insertions into genomes or plasmid acquisition, all face an identical challenge: increasing in number early on, before drift randomly drives them to extinction. Such increases occur, with any consistency, only given a selective advantage associated with the genetic change. Thus, among the genetic changes due to horizontal gene transfer that one observes within a population, there likely is a bias towards those that supply or, at least, supplied a selective advantage at their point of acquisition.
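How often a single new copy is lost to drift can be illustrated with a simple branching-process calculation. The sketch below rests on assumptions that are not part of the text above: each copy leaves a Poisson-distributed number of descendant copies with mean 1 + s, where s is the selective advantage, and the population is large enough that lineages do not interfere with one another. The function name is hypothetical.

```python
import math


def survival_probability(s, tol=1e-12, max_iter=1_000_000):
    """Probability that a single new copy escapes loss to drift.

    Toy model: Poisson offspring numbers with mean 1 + s in a very
    large population. The extinction probability q is the smallest
    solution of q = exp((1 + s) * (q - 1)), found by iterating from 0.
    """
    if s <= 0:
        return 0.0  # neutral or deleterious copies are eventually lost in this model
    mean_offspring = 1.0 + s
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(mean_offspring * (q - 1.0))
        converged = abs(q_next - q) < tol
        q = q_next
        if converged:
            break
    return 1.0 - q


# Hypothetical selective advantages of 1%, 5%, and 10%.
for s in (0.01, 0.05, 0.1):
    print(s, round(survival_probability(s), 4))
```

For small advantages the survival chance comes out close to 2s, in line with Haldane's classic approximation, so even a copy with a 1% advantage is lost roughly 98 times out of 100 under these assumptions.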

How does a horizontal gene transfer event supply a selective advantage? The answer is two-fold. First, the change can serve to replace an existing function with a more effective mechanism (more effective particularly as defined in terms of organism fitness) or, second, the change can provide an additional function to the organism. These effects can be envisaged (perhaps only roughly) as genetic swaps and insertion events, respectively. The latter especially can consist of single or multiple genes. With multiple genes, fitness advantages associated with individual components can be dependent upon epistatic interactions (i.e., multiple genes working together to create their beneficial effect) or, instead, can result from individual genes displaying distinct (no epistasis) advantageous effects. (Note that hitchhiking also can result in the transfer of both more beneficial and less beneficial genes.) Epistatic interaction likely explains, along with co-regulatory considerations, the grouping of many prokaryotic genes into clusters, i.e., operons: If a gene can provide a beneficial effect, but only if another gene is also present, then the first gene had better be genetically linked to the second gene for successful horizontal gene transfer to co-occur, that is, both genes must be transferred simultaneously with reasonable likelihood if natural selection is going to have any potential to favor their transfer.
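The advantage of linkage for co-transfer can be made concrete with a toy geometric model. The assumptions here are mine rather than the text's: genes are treated as points on a circular genome, each transfer event moves one contiguous fragment whose start position is uniformly random, and the fragment is no longer than half the genome. All names and numbers are hypothetical.

```python
def co_transfer_probability(genome_length, fragment_length, gene_distance):
    """Chance that one randomly placed fragment carries both genes.

    gene_distance is the shorter distance between the two genes along
    the circular genome. A distance of 0 gives the single-gene case.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    if not 0 < fragment_length <= genome_length / 2:
        raise ValueError("fragment_length must be positive and at most half the genome")
    if not 0 <= gene_distance <= genome_length / 2:
        raise ValueError("gene_distance must be between 0 and half the genome")
    # The fragment must start within a window of (fragment - distance) bases.
    covering_window = max(0, fragment_length - gene_distance)
    return covering_window / genome_length


# Hypothetical 4 Mb genome and 40 kb fragment.
print(co_transfer_probability(4_000_000, 40_000, 1_000))    # genes in one operon
print(co_transfer_probability(4_000_000, 40_000, 100_000))  # genes far apart
```

In this sketch, two genes lying 1 kb apart are co-transferred nearly as often as either gene alone, whereas genes separated by more than the fragment length are never moved together in a single event.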

Complexity Hypothesis

The complexity hypothesis is the idea that alleles that have more "complex" requirements in order to be functional within a new host are less likely to achieve fixation. These requirements can include, in part, protein-protein interactions (PPIs). Thus, alleles that encode proteins that require more PPIs to be functional should be less likely to be successfully retained within recipient genomes. In a sense this is a "minority-rules" situation in which insufficient success in any one of many required interactions could result in a lack of sufficient functionality to withstand subsequent tests of natural selection. (That is, if even one crucial interaction is not sufficiently functional, then the acquired allele is unlikely to be beneficial.) This argument makes intuitive sense and is probably true. The predictions that it makes, however, may be less easily appreciated.
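The "minority-rules" character of this argument can be sketched numerically. The example below assumes, purely for illustration, that each required interaction succeeds in the new host independently of the others; the function name and the probabilities are hypothetical.

```python
import math


def probability_all_functional(interaction_probabilities):
    """Chance that every required interaction works in the new host.

    Takes one success probability per required interaction and assumes
    the interactions succeed or fail independently. An empty list stands
    for a protein that requires no interactions at all.
    """
    return math.prod(interaction_probabilities)


# Hypothetical proteins, each interaction 90% likely to work on arrival.
for n_required in (1, 5, 20):
    print(n_required, round(probability_all_functional([0.9] * n_required), 3))

# One poorly matched partner is enough to sink an otherwise good fit.
print(probability_all_functional([0.95] * 9 + [0.05]))
```

Under these assumptions the chance of full functionality falls off geometrically with the number of required interactions, which is the core of the hypothesis.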

In particular, it may be not just the number of interactions that dictates whether a new allele will succeed within its new surroundings but also the potential of newly acquired proteins to make such interactions. As a result, to the extent that post-transfer interactions are crucial to allele survival, it may be those proteins that have a greater potential for interaction which survive, rather than, more generally, proteins that have low requirements for interactions. In other words, some level of PPI may be required for protein functioning, with that level differing on a case-by-case basis. Those proteins that meet specific criteria may be more likely to be retained, but this doesn't mean that only inherently low-interacting proteins will tend to be retained. As considered by Gophna and Ofran (2011), the inherent "friendliness" of a protein, particularly one that 'needs' friends (in terms of PPIs), may be an important determinant of the likelihood of retention following horizontal gene transfer.

Altogether, Gophna and Ofran (2011) provide a number of interesting insights into issues of horizontal gene transfer, which I list as well as discuss:

The complexity hypothesis probably applies less to co-transferred genes, i.e., those found within "selfish operons" or those associated with various genomic islands. In this case, PPIs may be retained among the products of genes that are co-transferred. Many products of horizontal gene transfer, however, may be individually acquired rather than a part of a "prepackaged" group. That is, though co-transferred genes may display overall higher likelihoods of retention given transfer, these groupings may make up only a fraction of potential horizontal gene transfer events.

Given transfer as individual genes, allele fixation following horizontal gene transfer may be more likely for alleles that display a greater potential for PPI, per amino acid, at the point of transfer. Note, though, that this point does not apply to alleles that in fact don't require PPIs to function properly. As with the grouped genes considered in the previous paragraph, however, genes with this property of not requiring PPIs likely make up only a minority of genes at the point of transfer. Thus, low-PPI proteins and grouped proteins may very well be special cases that can be horizontally transferred without a requirement for post-transfer PPIs, and therefore may be more readily useful and more readily retained within recipient genomes (i.e., as per the complexity hypothesis), but the majority of transfers may involve individual alleles whose products require some degree of PPI.

For transferred genes, as compared with non-transferred genes within the recipient's genome, the number of experimentally determined PPIs is lower than the number predicted. This suggests that newly acquired proteins have a greater potential for interactions, presumably as realized in their former hosts, than can be readily realized upon acquisition within a new genome.

Though alleles that produce proteins with a greater potential for interaction with other proteins may be preferentially retained within genomes following horizontal transfer, that potential may decline mutationally over time following transfer. The inference that can be made is that PPIs are important to the functioning of perhaps a majority of proteins, that a greater potential to participate in PPIs may allow for at least some PPIs upon acquisition by a new genome, that proteins nonetheless tend to have fewer PPIs post-transfer than pre-transfer, and that PPI sites that are not utilized post-transfer may not be retained but instead may be lost to drift or even natural selection.

Particularly important PPIs, perhaps especially for genes that are acquired individually via horizontal gene transfer, may be those with core proteins, which tend to be more conserved from genome to genome. Thus, a potential to display PPIs may matter because it provides some ability to interact with core proteins. The utility of these interactions may stem, however, from core proteins being more likely to be conserved, and interactions with them therefore being more achievable than interactions with non-core proteins, rather than from interactions with core proteins being inherently preferable. In other words, core proteins might be important interaction points more because of their structure than because of their function.

A larger point, not considered by Gophna and Ofran (2011), is that genes may possess a declining potential to be transferred from genome to genome serially over time. That is, if the potential for PPIs is an important determinant of likelihood of retention within genomes following transfer, and potential for PPIs degrades following acquisition, then likelihood of further transfer too may degrade. Note again, though, that such points may apply less to alleles that are transferred within groups (i.e., "selfish operons") as well as to alleles that don't require PPIs for proper functioning, e.g., β-lactamase. Thus, there are certain types of genes that may tend to be more readily transferred to new genomes and which are better able to retain that property of transferability following their transfer. We can hypothesize that such genes make up a reasonable fraction of so-called vapour genes.
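The idea of declining transferability over serial transfers can be illustrated with a deliberately simple model. Everything in it is an assumption made for the sake of the sketch: the chance of retention at each transfer is set equal to the gene's current PPI potential (a number between 0 and 1), and a fixed fraction of that potential is lost during each residence in a host. The function name and the values are hypothetical.

```python
def retention_by_transfer(initial_potential, loss_per_residence, n_transfers):
    """Per-transfer retention chances across a series of hosts.

    Toy model: retention chance equals current PPI potential, and the
    potential shrinks by a fixed fraction after each residence.
    """
    if not 0.0 <= initial_potential <= 1.0:
        raise ValueError("initial_potential must lie between 0 and 1")
    if not 0.0 <= loss_per_residence <= 1.0:
        raise ValueError("loss_per_residence must lie between 0 and 1")
    chances = []
    potential = initial_potential
    for _ in range(n_transfers):
        chances.append(potential)
        potential *= 1.0 - loss_per_residence
    return chances


# A PPI-dependent gene that loses 20% of its potential per residence.
print(retention_by_transfer(0.5, 0.2, 5))
# A gene that needs no PPIs, so nothing relevant degrades between transfers.
print(retention_by_transfer(0.5, 0.0, 5))
```

In this sketch the first gene becomes steadily harder to pass on with each successive host, while the second retains its transferability indefinitely, which is the contrast drawn above.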

Core Genes

Lineages that do not have access to gene transfer should be in linkage disequilibrium, with the phylogeny for any given gene more or less equivalent to the organismal phylogeny. With sex, by contrast, comes some greater approximation of linkage equilibrium, a possibility even among some bacteria. The potential result of linkage equilibrium is either a lack of correlation or at least less correlation between gene-based phylogenies and organismal phylogenies. The very idea of an organismal phylogeny implies vertical descent plus the existence of genes that are less likely to be replaced via horizontal gene transfer. (Note as an aside that obligately sexual organisms experience sex particularly within the context of vertical descent, and thus are able to experience linkage equilibrium while at the same time possessing correlated gene and organismal phylogenies; it is horizontal gene transfer, that is, gene transfer between rather than within populations, that gives rise to conflicts between gene and organismal phylogenies.) One can describe more conservative or less promiscuously swapped genes as an organism's or lineage's core genes, whose phylogeny may be described as being, or as approximating, the organism's true phylogeny (Lawrence and Hendrickson, 2003), i.e., an organismal phylogeny.
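Linkage disequilibrium between two loci is commonly summarized by the coefficient D, the difference between the observed frequency of a haplotype and the frequency expected if alleles at the two loci were associated at random. A minimal sketch follows, in which the function name and the haplotype counts are hypothetical.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Coefficient D = p_AB - p_A * p_B from counts of the four haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype must be observed")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second locus
    return p_AB - p_A * p_B


# Two clonal lineages with no exchange: only AB and ab haplotypes exist.
print(linkage_disequilibrium(50, 0, 0, 50))
# Free exchange: all four combinations are equally common.
print(linkage_disequilibrium(25, 25, 25, 25))
```

With only the two parental haplotypes present, D takes its largest possible value for these allele frequencies, whereas D is zero when alleles are combined at random.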

Again, the question is not whether swapping of these genes between organisms is plausible, that is, plausible in the sense of not reducing the recipient's fitness to zero, but instead whether any overall benefit from swapping is likely. With core genes, potential benefits may be counteracted by the costs of disrupting multiple interactions with other gene products. In addition, the rRNA genes in particular are often found in multiple copies within cells. This likely makes their evolution additionally conservative, since further selection to accommodate a single swapped copy presumably would not be neutral with regard to interactions with the unswapped copies of the same gene. Instead, one can more easily envisage ongoing, incremental improvement in, or neutral modification of, ribosome structure via mutation or very short horizontal gene transfer events, resulting in gradual divergence of organismal phylogenies.