
Do Shared ERVs Support Common Ancestry?

Jonathan McLatchie

In my previous article, I discussed the background of one of the most commonly made arguments for primate common ancestry. In this article, I want to examine the first of the three layers of evidence offered by a popular-level article written about this subject.

I encourage readers to read the "popular-level article" itself, so that you can judge whether McLatchie does it justice or misrepresents it.

The author of the article under discussion tells us,

When we examine the collective genome of Homo sapiens, we find that a portion of it consists of ERVs (IHGS Consortium, 2001). We also find that humans share most of them with Chimpanzees, as well as the other members of Hominidae (great apes), the members of Hylobatidae (gibbons), and even the members of Cercopitheciodae (old world monkeys) (Kurdyukov et al., 2001; Lebedev et al., 2000; Medstrand and Mager, 1998; Anderssen et al., 1997; Steinhuber et al., 1995). Since humans don’t and/or can’t regularly procreate and have fertile offspring with members of these species, and thus don’t make sizable contributions to their gene pools, and vice versa, their inheritance cannot have resulted from unions of modern species. As previously mentioned, parallel integration is ruled out by the highly random target selection of integrase. And even if it was far more target-specific than observed, it would require so many simultaneous insertion and endogenizations that the evolutionary model would still be tremendously more parsimonious. This leaves only one way an ERV could have been inherited: via sexual reproduction of organisms of a species that later diverged into the one the organisms that share the ERV belong to, i.e. an ancestral species–simply put, humans and the other primates must share common ancestry.

Just how target-specific are these ERV integrations? In the portion of the article headed “common creationist responses,” we are told that,

…while proviral insertion is not purely random, it is also not locus specific; due to the way it directly attacks the 5′ and 3′ phosphodiester bonds, with no need to ligate (Skinner et al., 2001). So relative to pure randomness, insertion is non-random, but relative to locus specificity, insertion is highly random.

Really?

Let’s take a few moments to do what any good student of biology would do — and briefly survey some of the literature.

In one relevant study, Barbulescu et al. (2001) report that,

We identified a human endogenous retrovirus K (HERV-K) provirus that is present at the orthologous position in the gorilla and chimpanzee genomes, but not in the human genome. Humans contain an intact preintegration site at this locus. [emphasis added]

It seems that the most plausible explanation for this is an independent insert in the gorilla and chimpanzee lineages. Notice that the intact preintegration site at the pertinent locus in humans precludes the possibility of the HERV-K provirus having been inserted into the genome of the common ancestor of humans, chimpanzees and gorillas, and subsequently lost from the human genome by processes of genetic recombination. Though there are other possible candidate hypotheses for this observation (such as incomplete lineage sorting), in the context of other indications of locus-specific site preference, this data is, at the very least, suggestive that these inserts may in fact be independent events.

Road networks have accident blackspots, where accidents are more common due to various factors. But each accident is a unique event. They do not occur in precisely the same locations, nor do they all involve exactly the same vehicles and occupants. If you were to find multiple reports of accidents, all with exactly the same details (vehicles and occupants involved, right down to the text of the report), you would not assume they were reports of different accidents and point to the fact that they all occurred in an area notorious for poor road safety as an explanation. You would conclude that they were all copies of a single accident report. In a similar way, a general statistical preference for certain retroviruses to integrate in certain types of regions is insufficient to explain the same contents and a locus correspondence down to single base-pair precision.

Indeed, site preference has been studied in the hope of providing clues about how to combat retroviral invasions. The studies have found no repetition of integration sites within 500 or so samples of the same retrovirus integrating with the DNA of the same host cell type. So two will integrate independently at the same locus by coincidence with a probability of 1/500 or less. The probability of this happening independently at two loci is therefore 1/250000 (1/500^2). For n loci, the probability is 1/500^n (or less!). Here, we are talking about a value of n around 200,000.
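To get a feel for the size of that number, here is a minimal sketch of the arithmetic. It assumes the 1/500 figure above is an upper bound for each locus and that the coincidences are independent; both are simplifications, and the function and constant names are my own. Because 1/500^200,000 is far too small for ordinary floating-point numbers, the sketch works with base-10 logarithms.

```python
import math

# Assumption taken from the text above: at most a 1/500 chance per locus.
PER_LOCUS_BOUND = 1 / 500

def log10_coincidence_bound(n_loci, per_locus=PER_LOCUS_BOUND):
    """Return log10 of per_locus ** n_loci, treating the n matches as independent."""
    return n_loci * math.log10(per_locus)

# Two loci: small enough to convert back to an ordinary probability.
two_loci = 10 ** log10_coincidence_bound(2)

# Roughly 200,000 shared loci: keep the answer as a power of ten.
all_shared = log10_coincidence_bound(200_000)

print(f"two loci: {two_loci:.1e}")
print(f"200,000 loci: about 10^{all_shared:.0f}")
```

Under these assumptions the two-locus figure reproduces the 1/250000 above, and the 200,000-locus figure comes out at roughly 10^-539,794.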

 See this page on the subject of site preference. 


So McLatchie thinks that one case out of some 200,000 orthologous inserts argues that all ERVs are independent endogenizations! He mentions the possible reasons for this one provirus being absent in the human genome, but strangely, doesn't explain them and dismisses them without reason. He "suggests" that they are from independent endogenizations. It certainly is possible that it is an extremely rare coincidence, but what about the other 199,999 cases?

See my article about this case, with an explanation of incomplete lineage sorting, something that McLatchie, strangely, neglects to explain himself. ;)


But there’s more.

Another study, by Sverdlov (1998) reports,

But although this concept of retrovirus selectivity is currently prevailing, practically all genomic regions were reported to be used as primary integration targets, however, with different preferences. There were identified ‘hot spots’ containing integration sites used up to 280 times more frequently than predicted mathematically. [emphasis added]

Dealt with above. 'Hot spots' are not specific DNA loci, which is what would be required to argue for an alternative explanation to endogenization in common ancestors.

In addition, Yohn et al. (2005) report that,

Horizontal transmissions between species have been proposed, but little evidence exists for such events in the human/great ape lineage of evolution. Based on analysis of finished BAC chimpanzee genome sequence, we characterize a retroviral element (Pan troglodytes endogenous retrovirus 1 [PTERV1]) that has become integrated in the germline of African great ape and Old World monkey species but is absent from humans and Asian ape genomes.

I don't know why evidence of endogenizations post-speciation should be a surprise. Rather, they are to be expected.

I could continue in a similar vein for some time. Other classes of retroelement also show fairly specific target-site preferences. For example, Levy et al. (2009) report that Alu retroelements routinely preferentially insert into certain classes of already-present transposable elements, and do so with a specific orientation and at specific locations within the mobile element sequence. Moreover, a study published in Science by Li et al. (2009) found that, in the waterflea genome, introns routinely insert into the same loci, leading the internationally-acclaimed evolutionary biologist Michael Lynch to note,

Remarkably, we have found many cases of parallel intron gains at essentially the same sites in independent genotypes. This strongly argues against the common assumption that when two species share introns at the same site, it is always due to inheritance from a common ancestor.

Finally, Daniels and Deininger (1985) suggest that,

…a common mechanism exists for the insertion of many repetitive DNA families into new genomic sites. A modified mechanism for site-specific integration of primate repetitive DNA sequences is provided which requires insertion into dA-rich sequences in the genome. This model is consistent with the observed relationship between galago Type II subfamilies suggesting that they have arisen not by mere mutation but by independent integration events.

Such target-site preferences are also documented here, here, and here.
Why might these ERV site-preferences exist? Presumably because these sites are most conducive to their successful reproduction (e.g. the necessitude for expression of the ERV’s regulatory elements; the activity of the host’s DNA correction system, etc). Mitchell et al. (2004) suggest “that virus-specific binding of integration complexes to chromatin features likely guides site selection.”

Again, target site preference is not the precise, base-pair resolution 'targeting' that would be required to provide an alternative explanation for commonly located ERVs. It seems strange that McLatchie is talking about elements other than ERVs, which are the subject here. Perhaps it's because ERVs steadfastly refuse to support his presuppositions.

Out of tens of thousands of ERV elements in the human genome, roughly how many are known to occupy the same sites in humans and chimpanzees? According to this Talk-Origins article, at least seven. Let’s call it less than a dozen. Given the sheer number of these retroviruses in our genome (literally tens of thousands), and accounting for the evidence of integration preferences and site biases which I have documented above, what are the odds of finding a handful of ERV elements which have independently inserted themselves into the same locus?

The Talk Origins article mentioned is well out of date on ERVs, having been written before the sequencing of the human and chimp genomes. I don't know why McLatchie is referring to this, as he is supposed to be looking at the "popular level article" he links to at the top, which was produced after the sequences were published, and which shows some 200,000 common ERV elements in the two genomes! But I tell a lie. I do know why McLatchie refers to the Talk Origins article instead of the one he is supposed to be discussing. He is trying to mislead his readers and calculates that they will not bother to check to see if he is being honest. Standard M.O. for creationist/ID propaganda. See "How many ERVs are shared, in common locations, in the genomes of humans and chimps?"

A Nested Hierarchy?

What about this “nested hierarchy” of which we are told?
We are (incorrectly) told that “There is only one, solitary known deviation of the distributional nested hierarchy; a relatively recently endogenized/fixed ERV called HERV-K-GC1.”

This claim, however, is false.

In addition to the case mentioned, Yohn et al. (2005) report:

We performed two analyses to determine whether these 12 shared map intervals might indeed be orthologous. First, we examined the distribution of shared sites between species (Table S3). We found that the distribution is inconsistent with the generally accepted phylogeny of catarrhine primates. This is particularly relevant for the human/great ape lineage. For example, only one interval is shared by gorilla and chimpanzee; however, two intervals are shared by gorilla and baboon; while three intervals are apparently shared by macaque and chimpanzee. Our Southern analysis shows that human and orangutan completely lack PTERV1 sequence (see Figure 2A). If these sites were truly orthologous and, thus, ancestral in the human/ape ancestor, it would require that at least six of these sites were deleted in the human lineage. Moreover, the same exact six sites would also have had to have been deleted in the orangutan lineage if the generally accepted phylogeny is correct. Such a series of independent deletion events at the same precise locations in the genome is unlikely (Figure S3).

[…]

Several lines of evidence indicate that chimpanzee and gorilla PTERV1 copies arose from an exogenous source. First, there is virtually no overlap (less than 4%) between the location of insertions among chimpanzee, gorilla, macaque, and baboon, making it unlikely that endogenous copies existed in a common ancestor and then became subsequently deleted in the human lineage and orangutan lineage. Second, the PTERV1 phylogenetic tree is inconsistent with the generally accepted species tree for primates, suggesting a horizontal transmission as opposed to a vertical transmission from a common ape ancestor. An alternative explanation may be that the primate phylogeny is grossly incorrect, as has been proposed by a minority of anthropologists.

As irritating to the evolutionary model as it might be, there are, in fact, a significant number of deviations from the orthodox phylogeny.

This is a lie. Nobody is claiming that the PTERV1 sequences are inherited from common ancestors. McLatchie appears to have missed the point entirely. Common ERVs in precise common locations point to common ancestry. ERVs in different locations point to separate endogenization events, and say nothing whatsoever about phylogeny.

In the final part of this blog series, I will discuss the argument based on “shared mistakes” in these ERV elements, as well as the argument based on degrees of mutational divergence between the retroviral 5′ and 3′ long terminal repeats (LTRs).

Should be fun.