Genetic history of the Iberian Peninsula


The ancestry of modern Iberians is consistent with the geographical situation of the Iberian Peninsula in the south-west corner of Europe. The large predominance of Y-chromosome Haplogroup R1b, common throughout Western Europe, is attributed to the arrival of Central European invaders during the Bronze Age. Like Sardinia, and unlike the Balkans and Italy, Iberia was shielded by its western location from settlement from the Bosporus and Caucasus region, and its low level of Western Asian admixture probably arrived during the Roman period. Later historical Eastern Mediterranean and Middle Eastern genetic contributions to the Iberian gene pool, driven by Phoenicians, Greeks, Carthaginians, Jews and Levantine Arabs, were also significant compared to other Western European countries.
Like Sicily and Southern Italy, although to a lesser degree, Iberia carries some ancestry originating in both North Africa and Sub-Saharan Africa, which is largely ascribed to the Islamic presence in Southern Spain and Southern Portugal. The population of the Canary Islands shows greater African admixture than the Southern European average because of the islands' location as an African archipelago. Significant genetic differences are found among, and even within, Spain's different regions, which can be explained by the wide divergence in their historical trajectories and by Spain's internal geographic boundaries. The Basque region holds the least Eastern Mediterranean ancestry in Iberia. African influence is mainly concentrated in the southern and western regions of the peninsula, though it is a minor component of the overall mix.

Population Genetics: Methods and Limitations

One of the first scholars to perform genetic studies was Luigi Luca Cavalli-Sforza, although his conclusions are now questioned. He used classical genetic markers to analyse DNA by proxy. This method studies differences in the frequencies of particular allelic traits, namely polymorphisms in proteins found in human blood. His team then calculated the genetic distance between populations, based on the principle that two populations with similar frequencies of a trait are more closely related than populations with more divergent frequencies of that trait.
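The distance principle described above can be illustrated with a short calculation. The following is a minimal sketch only: the text does not say which distance measure was used, so the chord distance of Cavalli-Sforza and Edwards is assumed here as one well-known option, and the allele frequencies for the three populations are hypothetical values chosen for illustration.

```python
import math


def chord_distance(p, q):
    """Chord distance between two populations at a single locus.

    p and q list the frequencies of the same alleles in each population;
    each list is expected to sum to 1.
    """
    if len(p) != len(q):
        raise ValueError("frequency lists must cover the same alleles")
    # cos(theta) is the overlap between the square-root frequency vectors.
    cos_theta = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # max() guards against tiny negative values from floating-point rounding.
    return (2 / math.pi) * math.sqrt(max(0.0, 2 * (1 - cos_theta)))


# Hypothetical frequencies of three alleles at one blood-protein locus.
pop_a = [0.60, 0.30, 0.10]
pop_b = [0.55, 0.33, 0.12]
pop_c = [0.20, 0.30, 0.50]

d_ab = chord_distance(pop_a, pop_b)
d_ac = chord_distance(pop_a, pop_c)
print(f"A-B: {d_ab:.3f}  A-C: {d_ac:.3f}")
```

Populations A and B, whose frequencies are similar, come out closer to each other than A and C. In practice such distances would be averaged over many loci rather than computed from a single one.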
Since then, population genetics has progressed significantly, and studies using direct DNA analysis are now abundant. These may use mitochondrial DNA (mtDNA), the non-recombining portion of the Y chromosome (NRY) or autosomal DNA. MtDNA and NRY DNA share some features that have made them particularly useful in genetic anthropology. These include the direct, unaltered inheritance of mtDNA from mother to offspring and of NRY DNA from father to son, without the 'scrambling' effects of genetic recombination. It is also presumed that these genetic loci are not affected by natural selection and that the major process responsible for changes in base pairs has been mutation.
Whereas Y-DNA and mtDNA haplogroups represent only a small component of a person's DNA pool, autosomal DNA has the advantage of containing hundreds of thousands of examinable genetic loci, thus giving a more complete picture of genetic composition. Because autosomal DNA undergoes recombination, descent relationships can only be determined on a statistical basis. A single chromosome can record a history for each gene. Autosomal studies are much more reliable for showing the relationships between existing populations, but they do not offer the same possibilities for unraveling population histories that mtDNA and NRY DNA studies promise, despite the many complications of the latter.

Main genetic compositions

DNA analysis shows that Spanish and Portuguese populations are most closely related to other populations of western Europe.
There is an axis of significant genetic differentiation along the east–west direction, in contrast to remarkable genetic similarity in the north–south direction. North African admixture, associated with the Islamic conquest, can be dated to the period between c. AD 860-1120.

Y-Chromosome haplogroups

As in other Western European populations, Y-DNA Haplogroup R1b is the most frequent among Spaniards and Portuguese, occurring at over 70% throughout most of Spain. R1b is particularly dominant in the Basque Country and Catalonia, occurring at a rate of over 80%. In Iberia, most men with R1b belong to the subclade R-P312. The distribution of haplogroups other than R1b varies widely from one region to another.
In Portugal as a whole, R1b haplogroups amount to 70%, with some areas in the northwest reaching over 90%.
Although R1b prevails in much of Western Europe, a key difference is the prevalence in Iberia of R-DF27. This subclade is found in over 60% of the male population in the Basque Country and in 40-48% in Madrid, Alicante, Barcelona, Cantabria, Andalusia, Asturias and Galicia. R-DF27 constitutes well over half of the total R1b in the Iberian Peninsula. Subsequent in-migration by members of other haplogroups and of other R1b subclades did not affect its overall prevalence, although its share falls to only two thirds of the total R1b in Valencia and along the coast more generally. R-DF27 is also a significant subclade of R1b in parts of France and Britain. R-S28/R-U152 is the prevailing subclade of R1b in Northern Italy, Switzerland and parts of France, but it represents less than 5.0% of the male population in Iberia. Ancient samples from the central European Bell Beaker, Hallstatt and Tumulus cultures belonged to this subclade. R-S28/R-U152 is somewhat more common in Seville, Barcelona, Portugal and the Basque Country, at 10-20% of the total population, but it is found at frequencies of only 3.0% in Cantabria and Santander, 2.0% in Castile and León, 6% in Valencia, and under 1% in Andalusia.
Y-DNA haplogroup frequencies among Sephardic Jews: I1 0%, I2*/I2a 1%, I2 0%, R1a 5%, R1b 13%, G 15%, J2 25%, J*/J1 22%, E-M215 (E1b1b) 9%, T 6%, Q 2%.
Haplogroup J, mostly subclades of Haplogroup J-M172, is found at levels of over 20% in some regions, while Haplogroup E has a general frequency of about 10%, with peaks surpassing 30% in certain areas. Overall, E-M78 and E-M81 each constitute about 4.0%, with a further 1.0% from Haplogroup E-M123 and 1.0% from unknown subclades of E-M96.
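Frequencies of Y-DNA haplogroups in Spanish regions

Region | Sample size | C | E | G | I | J2 | JxJ2 | R1a | R1b
Aragon | 34 | | 6% | 0% | 18% | 12% | 0% | 3% | 56%
Andalusia East | 95 | | 4% | 3% | 6% | 9% | 3% | 1% | 72%
Andalusia West | 73 | | 15% | 4% | 5% | 14% | 1% | 4% | 54%
Asturias | 20 | | 15% | 5% | 10% | 15% | 0% | 0% | 50%
Basques | 116 | | 1% | 0% | 8% | 3% | 1% | 0% | 87%
Castilla La Mancha | 63 | | 4% | 10% | 2% | 6% | 2% | 2% | 72%
Castile North-East | 31 | | 9% | 3% | 3% | 3% | 0% | 0% | 77%
Castile North-West | 100 | | 19% | 5% | 3% | 8% | 1% | 2% | 60%
Catalonia | 80 | >0% | 3% | 6% | 3% | 6% | 0% | 0% | 81%
Extremadura | 52 | | 18% | 4% | 10% | 12% | 0% | 0% | 50%
Galicia | 88 | | 17% | 6% | 10% | 7% | 1% | 0% | 57%
Valencia | 73 | >0% | 10% | 1% | 10% | 5% | 3% | 3% | 64%
Majorca | 62 | | 9% | 6% | 8% | 8% | 2% | 0% | 66%
Menorca | 37 | | 19% | 0% | 3% | 3% | 0% | 3% | 73%
Ibiza | 54 | | 8% | 13% | 2% | 4% | 0% | 0% | 57%
Seville | 155 | | 7% | 4% | 12% | 8% | 3% | 1% | 60%
Huelva | 22 | | 14% | 0% | 9% | 14% | 0% | 0% | 59%
Cadiz | 28 | | 4% | 0% | 14% | 14% | 4% | 0% | 51%
Cordoba | 27 | | 11% | 0% | 15% | 15% | 0% | 0% | 56%
Málaga | 26 | | 31% | 4% | 0% | 15% | 0% | 8% | 43%
Leon | 60 | | 10% | 7% | 3% | 5% | 2% | 7% | 62%
Cantabria | 70 | | 13% | 9% | 6% | 3% | 3% | 4% | 58%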
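Because the regional samples differ greatly in size, a simple mean of the percentages above would give a sample of 20 men the same weight as one of 155. The sketch below shows one way to pool regional frequencies by sample size and to gauge the sampling uncertainty of a single region. It uses four rows of the table (Basques, Catalonia, Galicia, Seville) as read here. The choice of rows, the function names and the normal-approximation standard error are assumptions made for illustration, not a method taken from the cited studies.

```python
import math

# (region, sample size, R1b frequency) for four rows of the regional table.
rows = [
    ("Basques", 116, 0.87),
    ("Catalonia", 80, 0.81),
    ("Galicia", 88, 0.57),
    ("Seville", 155, 0.60),
]


def pooled_frequency(rows):
    """Sample-size-weighted mean frequency across regions."""
    total = sum(n for _, n, _ in rows)
    return sum(n * freq for _, n, freq in rows) / total


def standard_error(freq, n):
    """Binomial standard error of a frequency estimated from n individuals."""
    return math.sqrt(freq * (1 - freq) / n)


pooled = pooled_frequency(rows)
print(f"Pooled R1b frequency: {pooled:.1%}")

# A small sample such as Asturias (20 men, 50% R1b) carries wide uncertainty.
se_asturias = standard_error(0.50, 20)
print(f"Asturias standard error: {se_asturias:.1%}")
```

The standard error for the smallest samples is on the order of ten percentage points, so differences of a few points between small regional samples should be read with caution.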

Mitochondrial DNA

There have been a number of studies of mitochondrial DNA haplogroups in Europe. In contrast to Y-DNA haplogroups, mtDNA haplogroups do not show as much geographical patterning and are more evenly distributed. Apart from the outlying Sami, all Europeans are characterized by the predominance of haplogroups H, U and T. The lack of observable geographic structuring of mtDNA may be due to socio-cultural factors, namely patrilocality and a lack of polyandry.
The subhaplogroups H1 and H3 have been studied in more detail and are associated with the Magdalenian expansion from Iberia c. 13,000 years ago.
A 2007 European-wide study including Spanish Basques and Valencian Spaniards found Iberian populations to cluster the furthest from other continental groups, implying that Iberia holds the most ancient European ancestry. In this study, the most prominent genetic stratification in Europe was found to run from the north to the south-east, while another important axis of differentiation runs east–west across the continent. It also found, despite the differences, that all Europeans are closely related.

North African influence

A number of studies have focused on ascertaining the impact of historical North African population movements into Iberia on the genetic composition of modern Spanish and Portuguese populations. Initial studies pointed to the Straits of Gibraltar acting more as a genetic barrier than as a bridge during prehistoric times. Other studies point to a higher level of North African admixture among Iberians than among other European populations, although this is attributed to more recent migratory movements, particularly the Moorish invasion of Iberia in the 8th century.
In terms of autosomal DNA, the most recent study of African admixture in Iberian populations was conducted in April 2013 by Botigué et al., using genome-wide SNP data for over 2000 European, Maghrebi, Qatari and Sub-Saharan individuals, of whom 119 were Spaniards and 117 Portuguese. It concluded that Spain and Portugal hold significant levels of North African ancestry. Estimates of shared ancestry averaged from 4% in some places to 10% in the general population, whereas the same estimates did not exceed 2% in other western or southern European populations. The populations of the Canary Islands yielded from 0% to 96% of shared ancestry with North Africans, although the Canary Islands are a Spanish exclave off the African coast, so this result is not representative of the Iberian population. However, contrary to past autosomal studies and to what is inferred from Y-chromosome and mitochondrial haplotype frequencies, the study does not detect significant levels of Sub-Saharan ancestry in any European population outside the Canary Islands. Indeed, a prior 2011 autosomal study by Moorjani et al. found Sub-Saharan ancestry in many parts of southern Europe at ranges of 1-4%, stating that "the highest proportion of African ancestry in Europe is in Iberia, consistent with inferences based on mitochondrial DNA and Y chromosomes and the observation by Auton et al. that within Europe, the Southwestern Europeans have the highest haplotype-sharing with North Africans."
In terms of paternal Y-chromosome DNA, recent studies agree that Iberia has the greatest presence in Europe of the typically Northwest African Y-chromosome haplotype marker E-M81, with an average of 3%, as well as of Haplotype Va. Estimates of Y-chromosome ancestry vary. A 2008 study published in the American Journal of Human Genetics, using 1140 samples from throughout the Iberian Peninsula, gave a proportion of 10.6% North African ancestry in the paternal composition of Iberians. A similar 2009 Y-chromosome study with 659 samples from Southern Portugal, 680 from Northern Spain, 37 from Andalusia, 915 from mainland Italy and 93 from Sicily found significantly higher levels of North African male ancestry in Portugal, Spain and Sicily than in Italy.
Other studies of the Iberian gene pool have estimated significantly lower levels of North African ancestry. According to Bosch et al. 2000, "NW African populations may have contributed 7% of Iberian Y chromosomes". A wide-ranging study by Cruciani et al. 2007, using 6,501 unrelated Y-chromosome samples from 81 populations, found that: "Considering both these E-M78 sub-haplogroups and the E-M81 haplogroup, the contribution of northern African lineages to the entire male gene pool of Iberia, continental Italy and Sicily can be estimated as 5.6 percent, 4.6 percent and 6.6 percent, respectively". In general terms, one assessment holds that "...the origins of the Iberian Y-chromosome pool may be summarized as follows: 5% recent NW African, 78% Upper Paleolithic and later local derivatives, and 10% Neolithic".
Mitochondrial DNA studies from 2003 agree that the Iberian Peninsula holds higher levels of the typically North African Haplotype U6, as well as higher frequencies of Sub-Saharan African Haplogroup L in Portugal. High frequencies are largely concentrated in the south and southwest of the peninsula, so the overall frequency is higher in Portugal than in Spain, with a mean frequency for the entire peninsula of 3.8%. There is considerable geographic divergence across the peninsula, with high frequencies observed for Western Andalusia, Córdoba, Southern Portugal and South West Castile. Adams et al. and other previous publications propose that the Moorish occupation left a minor Jewish, Saqaliba and some Arab-Berber genetic influence, mainly in western and southern regions of Iberia.
However, the most comprehensive genomic study to date (Olalde et al., Science 363, 1230–1234) contradicted previous studies and established that North African genetic ancestry can be identified throughout the whole Iberian Peninsula.
Current debates revolve around whether U6 presence is due to Islamic expansion into the Iberian peninsula or prior population movements and whether Haplogroup L is linked to the slave trade or prior population movements linked to Islamic expansion. A majority of Haplogroup L lineages in Iberia being North African in origin points to the latter. In 2015, Hernández et al. concluded that "the estimated entrance of the North African U6 lineages into Iberia at 10 ky correlates well with other L African clades, indicating that some U6 and L lineages moved together from Africa to Iberia in the Early Holocene while a majority were introduced during historic times."

Portuguese populations

Portuguese mitochondrial DNA genetic diversity

A study by Sofia L. Marques, Ana Goios, Ana M. Rocha, Maria João Prata, António Amorim, Leonor Gusmão, Cíntia Alves and Luis Alvarez, "Portuguese mitochondrial DNA genetic diversity - an update and a phylogenetic revision" (FSI Genetics 15, pages 27–32), states in its abstract:
"In the case of Portugal, previous population genetics studies have already revealed the general portrait of HVS-I and HVS-II mitochondrial diversity, becoming now important to update and expand the mitochondrial region analysed.
Accordingly, a total of 292 complete control region sequences from continental Portugal were obtained, under a stringent experimental design to ensure the quality of data through double sequencing of each target region. Furthermore, H-specific coding region SNPs were examined to detail haplogroup classification and complete mitogenomes were obtained for all sequences belonging to haplogroups U4 and U5. In general, a typical Western European haplogroup or Atlantic modal haplotype composition was found in mainland Portugal, associated to high level of mitochondrial genetic diversity. Within the country, no signs of substructure were detected. The typing of extra coding region SNPs has provided the refinement or confirmation of the previous classification obtained with EMMA tool in 96% of the cases. Finally, it was also possible to enlarge haplogroup U phylogeny with 28 new U4 and U5 mitogenomes."
While reasonable estimates put the Germanic contribution at no more than 1% of the Iberian gene pool, maximums of 4% occur in Portugal and Galicia, with Catalonia also estimated to be above average.
The Atlantic modal haplotype (AMH), or haplotype 15, is a Y-chromosome haplotype of Y-STR microsatellite variations associated with Haplogroup R1b. It was discovered before many of the SNPs now used to identify subclades of R1b, and references to it can be found in some of the older literature. It corresponds most closely to subclade R1b1a2a1a.
The AMH is the most frequently occurring haplotype amongst human males in Atlantic Europe. It is characterized by the following marker alleles:
The Atlantic modal haplotype reaches its highest frequencies in Portugal, where it reaches 70% in the country as a whole and more than 90% in NW Portugal, and in Great Britain and Ireland.