By Razib Khan | June 9, 2013 9:07 pm
A few years ago there was a minor controversy when some evolutionary genomicists reported that they had reconstructed the genome of the extinct Taino people of Puerto Rico by reassembling fragments preserved in contemporary populations that have long since been admixed. The controversy had to do with the fact that some individuals today claim to be Taino, which would mean the Taino are not an extinct population. Though that controversy eventually blew over, the methods lived on, and continue to be used. Now some of the same people who brought you that have come out with work which reconstructs the recent demographic history of the Caribbean, both maritime and mainland, using genomics. Even better, it’s totally open access because it’s up on arXiv: Reconstructing the Population Genetic History of the Caribbean (please see the comments at Haldane’s Sieve as well, kicked off by little old me). Though the authors pooled a variety of data sets (e.g., HapMap, POPRES, HGDP), the focus is on the populations highlighted in the map above.
Much of the novel insight in the results begins with their observation of a distinct “Latino” population genetic cluster with strong affinities with Europe within the Caribbean populations. This is clearly visible in their ADMIXTURE analysis. What they did was pool various populations, and run a method which decomposes the ancestry of each individual as a combination of K ancestral populations. In cases where the pooled populations are clear and distinct the results will be clear and distinct. For example, if you had 50 Finns and 50 Nigerians and pooled them, and ran ADMIXTURE at K = 2, then with a non-trivial number of SNPs (10,000 is more than sufficient) all the Finns and all the Nigerians will partition into two distinct ancestral populations under this sort of model-based clustering. But it always has to be remembered that though these methods map onto reality, and give us some sense of the variation within the data sets, the K’s themselves are artificial constructs. So, for example, the HGDP Maya population is known to have non-trivial European gene flow. If you use this sort of Maya population as your “Native American” reference, then you will underestimate Native ancestry in admixed groups because your reference Native population is already skewed toward Europeans (this is obviously a major problem when you don’t have the appropriate reference because it is extinct, such as with the Taino).
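To make the idea of decomposing an individual into ancestral components a bit more concrete, here is a minimal sketch of the likelihood that underlies this family of methods, at K = 2. It is not ADMIXTURE itself: ADMIXTURE estimates the population allele frequencies and every individual's ancestry fractions jointly, whereas this toy assumes the frequencies are already known and simply grid-searches one person's ancestry fraction. The allele frequencies, the genotypes, and the function names are all invented for illustration.

```python
import math

# Hypothetical allele frequencies at four SNPs in two reference populations.
# These numbers are invented for illustration; they are not from the paper.
FREQ_A = [0.9, 0.1, 0.8, 0.2]
FREQ_B = [0.1, 0.9, 0.2, 0.8]


def log_likelihood(q, genotypes, freq_a=FREQ_A, freq_b=FREQ_B):
    """Log-likelihood of one individual's genotypes given ancestry fraction q.

    q is the share of ancestry from population A (1 - q comes from B).
    Each genotype is the count (0, 1 or 2) of the reference allele at a SNP.
    The binomial coefficient is dropped because it does not depend on q.
    """
    total = 0.0
    for g, fa, fb in zip(genotypes, freq_a, freq_b):
        # Allele frequency expected in a person with ancestry mix q.
        p = q * fa + (1.0 - q) * fb
        total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total


def estimate_ancestry(genotypes, steps=100):
    """Grid-search maximum-likelihood estimate of q on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: log_likelihood(q, genotypes))


if __name__ == "__main__":
    print(estimate_ancestry([2, 0, 2, 0]))  # genotypes typical of A
    print(estimate_ancestry([0, 2, 0, 2]))  # genotypes typical of B
    print(estimate_ancestry([1, 1, 1, 1]))  # heterozygous everywhere
```

With these made-up inputs the three individuals come out at 1.0, 0.0 and 0.5 respectively. Four SNPs are enough here only because the invented frequencies are so different; real data needs thousands of markers, and the answer is only as good as the reference frequencies you feed in.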
With those cautionary preliminaries out of the way, what’s going on in these results? As you can see, many of the Caribbean populations are straightforward combinations of various continental ‘parent’ populations. This is clearly evident at K = 3, where green = African, red = European, and blue = Native (note that the Maya have a range of European ancestry, just as I said). By looking at individual variation within populations you can already gain some insights as to the nature of the admixture. In Mexico there is a wide range in the European vs. Native fraction, though in this data set there are no “pure” individuals. Additionally, there are low, but relatively even, amounts of African ancestry across the population. Though African consciousness is not a major element of modern Mexican national identity, people of African ancestry were a major part of the Spanish colonial enterprise (see Empire: How Spain Became a World Power, 1492-1763). In some areas, such as Veracruz, people of visibly African ancestry remain, but in much of Mexico these individuals intermarried and their physical characteristics were diluted to the point of not being visible.
The situation in the maritime Caribbean is somewhat more complex. In these contexts it was the Native, not African, ancestry which was subsumed and submerged. It is genomics which has ‘rediscovered’ this ancestry, to the extent that many scholars had previously been skeptical of the possibility that modern Puerto Ricans and Dominicans inherited a substantial share of Taino ancestry. In both Puerto Rico and the Dominican Republic the relevant issue is that there is a wide range in the proportions of African and European ancestry, with Cuba being the notable extreme case of this phenomenon. What’s going on with Cuba in particular is that there were late waves of migration from Spain, so some modern white Cubans are much less affected by admixture than other Caribbeans (remember that Cuba was part of Spain until 1898). In Haiti the situation is reversed: the revolutions of the late 18th and early 19th centuries had a racial tinge, and whites were expelled (leaving a small mulatto class).
But it is K = 8 where things really get interesting. The black component is a European Iberian-like element which is distinct to Latino populations (including Maya). As you can see on this PCA the Latino element is related to the Iberian populations, as they took the European segments from the Caribbean populations and used them to flesh out the distribution in ancestry. There are several ways to interpret this. Dienekes suggested this might simply be a function of the source Iberian populations hundreds of years ago being somewhat different from the contemporary ones. For example, contemporary Spaniards would obviously have been more subject to gene flow from other Europeans after 1600 than their New World cousins. Another possibility is that there was extreme sampling from a particular region of Spain, and that has now broken out as its own cluster. For example, I know that a disproportionate number of migrants were from Andalucia and Extremadura. But the pattern here doesn’t suggest that possibility to me (I would think the black dots should be more south-shifted if they were from those two regions).
Rather, the interpretation they seem to favor is that this element has drifted away from the ancestral populations due to a bottleneck. This is not ethnographically implausible; the early years of the Spanish colonial experiment were characterized by de facto polygyny. Many adventurers lived lives not unlike those of the white grandees of the East India Company in the late 18th century. Some have argued that this period of ubiquitous common law polygyny has influenced the fact that illegitimate births have traditionally been very common in Latin America. One reason the authors favor the bottleneck model is that the genetic distance between the Latino element and the Iberian one is rather high. This is common in situations where drift or a bottleneck has shifted allele frequencies particularly rapidly. Not only that, but the tendency is strongest in maritime Latin America, many of whose islands received relatively fewer subsequent migrants than the large and expansive mainland viceroyalties.
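To see why a bottleneck can produce a lot of genetic distance in a short time, a back-of-the-envelope sketch helps. It uses the textbook Wright-Fisher expectation that drift accumulates as F_t = 1 - (1 - 1/(2N))^t in an idealized diploid population of effective size N after t generations. The population sizes and the number of generations below are hypothetical values picked only to contrast the two regimes; they are not estimates from the paper.

```python
def expected_drift(n_e, generations):
    """Expected accumulated drift after `generations` at effective size n_e.

    Uses the standard Wright-Fisher result F_t = 1 - (1 - 1/(2N))^t for an
    idealized diploid population; real populations are messier.
    """
    return 1.0 - (1.0 - 1.0 / (2.0 * n_e)) ** generations


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to contrast the two regimes.
    founders = expected_drift(n_e=100, generations=20)
    mainland = expected_drift(n_e=100_000, generations=20)
    print(f"small founding group: {founders:.4f}")
    print(f"large population:     {mainland:.6f}")
```

Under these assumptions the small founding group drifts by roughly 0.1 over twenty generations, while the large population barely moves at all, which is the qualitative pattern a bottleneck model predicts.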
23andMe ancestry decomposition for friend who is 1/4 Asian
Another way the authors explored the demographic history was to look at the length distribution of the tracts of ancestry. How this works is simple. A first generation hybrid will have unbroken lengths of ancestry from each parent, but in subsequent generations fragmentation will occur as recombination breaks apart long blocks identical by descent. You can see this in the figure to the left, where my friend who has one Asian grandparent has blocks of alternating European and Asian ancestry because of meiotic recombination events. The more time that has passed since admixture, the smaller the blocks will become, as recombination slices apart long blocks and recombines ancestral components. By looking at the distribution and mix of lengths the authors can reconstruct demographic histories of the populations. In short, it looks like much of the European ancestry came in one short, quick pulse, rather early on in settlement. This is in keeping with the high reproductive output attested for European males thanks to polygyny during this period.
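The relationship between time and tract length can be sketched with a common rule of thumb: after a single pulse of admixture g generations ago, tract lengths are roughly exponentially distributed with a mean of about 100/g centimorgans. This ignores chromosome ends, the admixture proportion, and ongoing migration, and the authors' actual inference is surely more elaborate than this. The function names and the simulated numbers below are illustrative assumptions, not anything taken from the paper.

```python
import random


def mean_tract_cm(generations):
    """Rule-of-thumb mean ancestry tract length (cM) after a single pulse."""
    return 100.0 / generations


def simulate_tracts(generations, n_tracts, seed=0):
    """Draw tract lengths (cM) from an exponential with mean 100/generations."""
    rng = random.Random(seed)
    rate_per_cm = generations / 100.0
    return [rng.expovariate(rate_per_cm) for _ in range(n_tracts)]


def estimate_generations(tract_lengths_cm):
    """Invert the rule of thumb: generations ~ 100 / mean tract length."""
    mean_cm = sum(tract_lengths_cm) / len(tract_lengths_cm)
    return 100.0 / mean_cm


if __name__ == "__main__":
    # Hypothetical pulse 15 generations ago, observed through 20,000 tracts.
    tracts = simulate_tracts(generations=15, n_tracts=20_000, seed=1)
    print(f"mean tract length: {sum(tracts) / len(tracts):.2f} cM")
    print(f"estimated generations since admixture: {estimate_generations(tracts):.1f}")
```

Running the inversion on the simulated tracts recovers a value close to the 15 generations that went in. Long tracts point to recent admixture and short tracts to older admixture, so a mixture of lengths can in principle be read as a mixture of pulses at different times.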
The same method was performed for the African ancestry, and the authors discovered an intriguing result. It seems that in the early years most of the Caribbean black slaves were derived from the western tip of Sub-Saharan Africa, from the Senegal river down to modern Ghana. Later on the longer tracts show affinities with populations further east, from the Bight of Benin toward the Equator. I don’t know the history of slavery well enough to confirm or deny the reality of this finding, but it illustrates the power of genomics combined with wide sampling strategies. More relevantly I suspect genomics’ role will be to assign magnitudes to known dynamics.
Finally, the authors also inferred diverse relationships for the Native admixture in the Caribbean populations. They confirmed some evidence of south-to-north migration into Central America and the Caribbean, and also specific ethno-linguistic associations between now de facto extinct Caribbean populations and those of mainland South America. Some of these results have long been suggested, but lack of historical documentation makes inferences shadowy. Genomics cannot resolve these debates, but it can shed light upon them.
Overall this is an interesting study because I think it is a test run at the sort of historical-demographic questions that genomics will be used for. There has long been a ‘genetics as a tool’ school of thought among many ecologists and phylogeneticists, and now you shall have a ‘genomics as a tool’ school to sit right alongside that in many more diverse fields. Caribbean and Latin American populations are the low hanging fruit, because the Spanish and Portuguese colonial experiments are reasonably well attested, and the source populations are very distinct (so it is easy to pick signal out of the noise). But there are other historical questions of the same period which are also of interest. In Albion’s Seed David Hackett Fischer describes four Anglo-American folkways which contributed to the culture of this nation. Of these, ~20,000 Puritans arrived between 1620 and 1640 and became the ancestors of ~700,000 by 1790. Though 20,000 is not quite a bottleneck (in fact, they arrived from different sectors of England), I am curious if these individuals, a segment of “Old Americans,” can still be discerned in the genomic data. This is just one of many possible questions which will be within reach of an answer in the near future….