Haplogroup I (mtDNA)


Haplogroup I is a human mitochondrial DNA haplogroup. It is believed to have originated about 21,000 years ago in West Asia, during the Last Glacial Maximum. The haplogroup is unusual in that it is now widely distributed geographically but common in only a few small areas of East Africa, West Asia and Europe. It is especially common among the El Molo and Rendille peoples of Kenya, in various regions of Iran, among the Lemko people of Slovakia, Poland and Ukraine, on the island of Krk in Croatia, in the department of Finistère in France and in some parts of Scotland.

Origin

Haplogroup I is a descendant of haplogroup N1a1b and a sibling of haplogroup N1a1b1. It is believed to have arisen somewhere in West Asia between 17,263 and 24,451 years before present, with a coalescence age of 20.1 thousand years. It has been suggested that its origin may be in Iran or, more generally, the Near East. It has diverged into at least seven distinct clades, branches I1 to I7, dated to between 16 and 6.8 thousand years ago. The hypothesis of a Near Eastern origin rests on the fact that all haplogroup I clades, especially those from the Late Glacial period, include mitogenomes from the Near East. The age estimates and dispersal of some subclades are similar to those of major subclades of mtDNA haplogroups J and T, indicating a possible dispersal of haplogroup I into Europe during the Late Glacial and postglacial periods, several millennia before the European Neolithic. Some subclades show signs of the Neolithic diffusion of agriculture and pastoralism within Europe.
A similar view places more emphasis on the Persian Gulf region of the Near East.

Distribution

Haplogroup I is found at moderate to low frequencies in East Africa, Europe, West Asia and South Asia. In addition to the seven confirmed clades, the rare basal/paraphyletic clade I* has been observed in three individuals: two from Somalia and one from Iran.

Africa

The highest frequencies of mitochondrial haplogroup I observed so far appear in the Cushitic-speaking El Molo and Rendille in northern Kenya. The clade is also found at comparable frequencies among the Soqotri.
Population | Location | Language family | Carriers/N | Frequency
Amhara | Ethiopia | Afro-Asiatic > Semitic | 1/120 | 0.83%
Egyptians | Egypt | Afro-Asiatic > Semitic | 2/34 | 5.9%
Beta Israel | Ethiopia | Afro-Asiatic > Cushitic | 0/29 | 0.00%
Dawro Konta | Ethiopia | Afro-Asiatic > Omotic | 0/137 | 0.00%
Ethiopia | Ethiopia | Undetermined | 0/77 | 0.00%
Ethiopian Jews | Ethiopia | Afro-Asiatic > Cushitic | 0/41 | 0.00%
Gurage | Ethiopia | Afro-Asiatic > Semitic | 1/21 | 4.76%
Hamer | Ethiopia | Afro-Asiatic > Omotic | 0/11 | 0.00%
Ongota | Ethiopia | Afro-Asiatic > Cushitic | 0/19 | 0.00%
Oromo | Ethiopia | Afro-Asiatic > Cushitic | 0/33 | 0.00%
Tigrai | Ethiopia | Afro-Asiatic > Semitic | 0/44 | 0.00%
Daasanach | Kenya | Afro-Asiatic > Cushitic | 0/49 | 0.00%
Elmolo | Kenya | Afro-Asiatic > Cushitic | 12/52 | 23.08%
Luo | Kenya | Nilo-Saharan | 0/49 | 0.00%
Maasai | Kenya | Nilo-Saharan | 0/81 | 0.00%
Nairobi | Kenya | Niger-Congo | 0/100 | 0.00%
Nyangatom | Kenya | Nilo-Saharan | 1/112 | 0.89%
Rendille | Kenya | Afro-Asiatic > Cushitic | 3/17 | 17.65%
Samburu | Kenya | Nilo-Saharan | 3/35 | 8.57%
Turkana | Kenya | Nilo-Saharan | 0/51 | 0.00%
Hutu | Rwanda | Niger-Congo | 0/42 | 0.00%
Dinka | Sudan | Nilo-Saharan | 0/46 | 0.00%
Sudan | Sudan | Undetermined | 0/102 | 0.00%
Burunge | Tanzania | Afro-Asiatic > Cushitic | 1/38 | 2.63%
Datoga | Tanzania | Nilo-Saharan | 0/57 | 0.00%
Iraqw | Tanzania | Afro-Asiatic > Cushitic | 0/12 | 0.00%
Sukuma | Tanzania | Niger-Congo | 0/32 | 0.00%
Turu | Tanzania | Niger-Congo | 0/29 | 0.00%
Yemeni | Yemen | Afro-Asiatic > Semitic | 0/114 | 0.00%
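
The frequencies in the table above are simple proportions (carriers divided by sample size), and several rest on very small samples. As a rough illustration of how much uncertainty that implies, the sketch below recomputes a few rows and attaches a 95% Wilson score interval. The Wilson interval is an assumption chosen here for illustration and is not taken from the cited studies; the function names are hypothetical.

```python
import math

def wilson_interval(carriers, sample_size, z=1.96):
    """Return (low, high) bounds, as fractions, of the Wilson score interval."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = carriers / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (p + z ** 2 / (2 * sample_size)) / denom
    half = z * math.sqrt(
        p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2)
    ) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)

# (carriers, sample size) copied from the table above.
samples = {
    "Elmolo": (12, 52),
    "Rendille": (3, 17),
    "Samburu": (3, 35),
    "Amhara": (1, 120),
}

def summarize(samples):
    """Return (name, frequency %, interval low %, interval high %) per population."""
    rows = []
    for name, (carriers, n) in samples.items():
        low, high = wilson_interval(carriers, n)
        rows.append((name, 100 * carriers / n, 100 * low, 100 * high))
    return rows

for name, pct, low, high in summarize(samples):
    print(f"{name}: {pct:.2f}% (95% interval {low:.1f}% to {high:.1f}%)")
```

With only 17 individuals, the Rendille estimate carries a noticeably wider interval than the Elmolo estimate, which is worth keeping in mind when comparing the two.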

Asia

Haplogroup I is present across West Asia and Central Asia, and is also found at trace frequencies in South Asia. Its highest frequency area is perhaps in northern Iran. Terreros 2011 notes that it also has high diversity there and reiterates past studies that have suggested that this may be its place of origin. Basal I* has been found at 4.2% in the Svan population of Georgia (Alfonso-Sánchez MA, Martínez-Bouzas C, Castro A, Peña JA, Fernández-Fernández I, Herrera RJ, de Pancorbo MM, "Sequence polymorphisms of the mtDNA control region in a human isolate: the Georgians from Swanetia"). The table below shows some of the populations where it has been detected.
Population | Language family | Carriers/N | Frequency
Baluch | Indo-European | 0/39 | 0.00%
Brahui | Dravidian | 0/38 | 0.00%
Caucasus * | Kartvelian | 1/58 | 1.80%
Druze | - | 11/311 | 3.54%
Gilaki | Indo-European | 0/37 | 0.00%
Gujarati | Indo-European | 0/34 | 0.00%
Hazara | Indo-European | 0/23 | 0.00%
Hunza Burusho | Isolate | 2/44 | 4.50%
India | - | 8/2544 | 0.30%
Iran | - | 3/31 | 9.70%
Iran | - | 2/117 | 1.70%
Kalash | Indo-European | 0/44 | 0.00%
Kurdish | Indo-European | 1/20 | 5.00%
Kurdish | Indo-European | 1/32 | 3.10%
Kurdish | Indo-European | 66/200 | 33.0%
Lur | Indo-European | 0/17 | 0.00%
Makrani | Indo-European | 0/33 | 0.00%
Mazandarian | Indo-European | 1/21 | 4.80%
Pakistani | Indo-European | 0/100 | 0.00%
Pakistan | - | 1/145 | 0.69%
Parsi | Indo-European | 0/44 | 0.00%
Pathan | Indo-European | 1/44 | 2.30%
Persian | Indo-European | 1/42 | 2.40%
Shugnan | Indo-European | 1/44 | 2.30%
Sindhi | Indo-European | 1/23 | 8.70%
Turkish | Turkic | 2/40 | 5.00%
Turkish * | Turkic | 1/50 | 2.00%
Turkmen | Turkic | 0/41 | 0.00%
Uzbek | Turkic | 0/42 | 0.00%

Europe

Western Europe

In Western Europe, haplogroup I is most common in Northwestern Europe, where its frequency is between 2 and 5 percent. Its highest frequency is in Brittany, France, where it exceeds 9 percent of the population in Finistère. It is uncommon and sometimes absent in other parts of Western Europe.
Population | Language | Carriers/N | Frequency
Austria/Switzerland | - | 4/187 | 2.14%
Basque | Basque/Labourdin côtier-haut navarrais | 0/56 | 0.00%
Basque | Basque/Occidental | 0/55 | 0.00%
Basque | Basque/Biscayen | 1/59 | 1.69%
Basque | Basque/Haut-navarrais méridional | 2/63 | 3.17%
Basque | Basque/Gipuzkoan | 0/57 | 0.00%
Basque | Basque/Bas-navarrais | 0/68 | 0.00%
Basque | Basque/Haut-navarrais septentrional | 0/51 | 0.00%
Basque | Basque/Roncalais-salazarais | 0/55 | 0.00%
Basque | Basque/Souletin | 0/62 | 0.00%
Basque | Basque/Biscayen | 0/64 | 0.00%
Béarn | French | 0/51 | 0.00%
Bigorre | French | 0/44 | 0.00%
Burgos | Spanish | 0/25 | 0.00%
Cantabria | Spanish | 0/18 | 0.00%
Chalosse | French | 0/58 | 0.00%
Denmark | - | 6/105 | 5.71%
England/Wales | - | 12/429 | 3.03%
Finland | - | 1/49 | 2.04%
Finland/Estonia | - | 5/202 | 2.48%
France | - | 2/22 | 9.10%
France | - | 0/40 | 0.00%
France | - | 0/39 | 0.00%
France | - | 2/72 | 2.80%
France | - | 2/37 | 5.40%
France/Italy | - | 2/248 | 0.81%
Germany | - | 12/527 | 2.28%
Iceland | - | 21/467 | 4.71%
Ireland | - | 3/128 | 2.34%
Italy | - | 2/48 | 4.20%
La Rioja | Spanish | 1/51 | 1.96%
North Aragon | Spanish | 0/26 | 0.00%
Orkney | - | 5/152 | 3.29%
Saami | - | 0/176 | 0.00%
Scandinavia | - | 12/645 | 1.86%
Scotland | - | 39/891 | 4.38%
Spain/Portugal | - | 2/352 | 0.57%
Sweden | - | 0/37 | 0.00%
Western Bizkaia | Spanish | 0/18 | 0.00%
Western Isles/Isle of Skye | - | 15/246 | 6.50%
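
Several populations in the table above, France for example, appear more than once with different sample sizes. When such rows are combined, a pooled frequency (total carriers over total individuals) weights each sample by its size, whereas a plain average of the percentages lets a small sample count as much as a large one. The sketch below contrasts the two for the five rows labelled France. Treating those rows as samples of one population is an assumption made purely for illustration, since they may come from different regions and studies, and the function names are hypothetical.

```python
# (carriers, sample size) for the five rows labelled "France" in the table above.
france_samples = [(2, 22), (0, 40), (0, 39), (2, 72), (2, 37)]

def pooled_frequency(samples):
    """Total carriers over total individuals, as a percentage."""
    carriers = sum(k for k, _ in samples)
    total = sum(n for _, n in samples)
    return 100 * carriers / total

def mean_of_frequencies(samples):
    """Unweighted mean of the per-sample percentages."""
    return sum(100 * k / n for k, n in samples) / len(samples)

print(f"pooled: {pooled_frequency(france_samples):.2f}%")
print(f"unweighted mean: {mean_of_frequencies(france_samples):.2f}%")
```

The unweighted mean comes out higher than the pooled value here because the smallest sample (22 individuals) also has the highest frequency.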

Eastern Europe

In Eastern Europe, the frequency of haplogroup I is generally lower than in Western Europe, but its frequency is more consistent between populations with fewer places of extreme highs or lows. There are two notable exceptions. Nikitin 2009 found that Lemkos in the Carpathian mountains have the "highest frequency of haplogroup I in Europe, identical to that of the population of Krk Island in the Adriatic Sea".
Population | Carriers/N | Frequency
Boyko | 0/20 | 0.00%
Hutsul | 0/38 | 0.00%
Lemko | 6/53 | 11.32%
Belorussians | 2/92 | 2.17%
Russia | 3/215 | 1.40%
Romanians | 59 | 0.00%
Romanians | 46 | 2.17%
Russia | 1/50 | 2.0%
Ukraine | 0/18 | 0.00%
Croatia | 4/277 | 1.44%
Croatia | 15/133 | 11.28%
Croatia | 1/105 | 0.95%
Croatia | 2/108 | 1.9%
Croatia | 1/98 | 1%
Herzegovinians | 1/130 | 0.8%
Bosnians | 6/247 | 2.4%
Serbians | 4/117 | 3.4%
Macedonians | 2/146 | 1.4%
Macedonian Romani | 7/153 | 4.6%
Slovenians | 2/104 | 1.92%
Bosnians | 4/144 | 2.78%
Poles | 8/436 | 1.83%
Caucasus * | 1/58 | 1.80%
Russians | 5/201 | 2.49%
Bulgaria/Turkey | 2/102 | 1.96%

Historic and Pre-Historic Samples

Haplogroup I was until recently absent from ancient European samples found in Paleolithic and Mesolithic grave sites. In 2017, a sample carrying subclade I3, dated to 9124-7851 BC, was found at a site on the Italian island of Sardinia. In the Near East, a sample from the Levant with an as yet undefined subclade was dated to 8,850-8,750 BC, and a younger sample from Iran carrying subclade I1c was dated to 3972-3800 BC. A sample with an as yet undefined subclade was also found in Neolithic Spain. Haplogroup I displays a strong connection with the Indo-European migrations, especially its I1, I1a1 and I3a subclades, which have been found in the Poltavka and Srubnaya cultures in Russia, among ancient Scythians, and in Corded Ware and Unetice culture burials in Saxony. Haplogroup I has also been noted at significant frequencies in more recent historic grave sites.
In 2013, Nature announced the publication of the first genetic study utilizing next-generation sequencing to ascertain the ancestral lineage of an Ancient Egyptian individual. The research was led by Carsten Pusch of the University of Tübingen in Germany and Rabab Khairat, who released their findings in the Journal of Applied Genetics. DNA was extracted from the heads of five Egyptian mummies that were housed at the institution. All the specimens were dated to between 806 BC and 124 AD, a time frame corresponding with the Late Dynastic and Ptolemaic periods. The researchers observed that one of the mummified individuals likely belonged to the I2 subclade. Haplogroup I has also been found among ancient Egyptian mummies excavated at the Abusir el-Meleq archaeological site in Middle Egypt, which date from the Pre-Ptolemaic/late New Kingdom, Ptolemaic, and Roman periods.
Haplogroup I5 has also been observed among specimens at the mainland cemetery in Kulubnarti, Sudan, which date from the Early Christian period.

The frequency of haplogroup I may have declined in Europe after the Middle Ages. An overall frequency of 13% was found in ancient samples from Denmark and Scandinavia dating from the Iron Age to the medieval period, compared with only 2.5% in modern samples. Because haplogroup I was not observed in any ancient Italian, Spanish, British or central European populations, early central European farmers or Neolithic samples, the authors suggested that "Haplogroup I could, therefore, have been an ancient Southern Scandinavian type "diluted" by later immigration events".

Subclades

Tree

This phylogenetic tree of haplogroup I subclades, with age estimates, is based on published research.
Hg | Age estimate (thousand years) | 95% confidence interval (thousand years)
N1a1b | 28.6 | 23.5 - 33.9
I | 20.1 | 18.4 - 21.9
I1 | 16.3 | 14.6 - 18.0
I1a | 11.6 | 9.9 - 13.3
I1a1 | 4.9 | 4.2 - 5.6
I1a1a | 3.8 | 3.3 - 4.4
I1a1b | 1.4 | 0.5 - 2.2
I1a1c | 2.5 | 1.3 - 3.7
I1a1d | 1.8 | 1.0 - 2.6
I1b | 13.4 | 11.3 - 15.5
I1c | 10.3 | 8.4 - 12.2
I1c1 | 7.2 | 5.4 - 9.0
I1c1a | 4.0 | 2.5 - 5.4
I2'3 | 12.6 | 10.4 - 14.7
I2 | 6.8 | 6.0 - 7.6
I2a | 4.7 | 3.8 - 5.7
I2a1 | 3.2 | 2.1 - 4.4
I2b | 1.7 | 0.5 - 2.9
I2c | 4.7 | 3.6 - 5.8
I2d | 3.0 | 1.1 - 4.8
I2e | 3.1 | 1.4 - 4.8
I3 | 10.6 | 8.8 - 12.4
I3a | 7.4 | 6.1 - 8.7
I3a1 | 6.1 | 4.7 - 7.5
I3b | 2.6 | 1.1 - 4.2
I3c | 9.4 | 7.6 - 11.2
I4 | 15.1 | 12.3 - 18.0
I4a | 6.4 | 5.4 - 7.4
I4a1 | 5.7 | 4.5 - 6.7
I4b | 8.4 | 5.8 - 10.9
I5 | 18.4 | 16.4 - 20.3
I5a | 16.0 | 14.0 - 17.9
I5a1 | 9.2 | 7.1 - 11.3
I5a2 | 12.3 | 10.2 - 14.4
I5a2a | 1.6 | 1.0 - 2.1
I5a3 | 4.8 | 2.8 - 6.8
I5a4 | 5.6 | 3.5 - 7.8
I5b | 8.8 | 6.3 - 11.2
I6 | 18.4 | 16.2 - 20.6
I6a | 5.3 | 3.5 - 7.0
I6b | 13.1 | 10.4 - 15.8
I7 | 9.1 | 6.3 - 11.9
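
Because the subclade names encode the nesting of the tree, the age estimates above can be given a simple sanity check: a subclade should not be estimated as older than the clade it descends from. The sketch below does this for a subset of the listed ages. Inferring the parent by dropping the last character of the name, with I2 and I3 placed under I2'3 as described in this article, is an illustrative assumption and not a general rule of mtDNA nomenclature; all function names are hypothetical.

```python
# Point estimates in thousands of years, copied from a subset of the rows above.
AGES_KYA = {
    "I": 20.1, "I1": 16.3, "I1a": 11.6, "I1a1": 4.9, "I1a1a": 3.8,
    "I2'3": 12.6, "I2": 6.8, "I2a": 4.7, "I2a1": 3.2,
    "I3": 10.6, "I3a": 7.4, "I3a1": 6.1,
    "I5": 18.4, "I5a": 16.0, "I5a2": 12.3, "I5a2a": 1.6,
}

def parent_of(clade):
    """Guess the parent clade from the name alone (illustrative assumption)."""
    if clade == "I":
        return None  # root of this sketch; its own parent N1a1b is left out
    if clade == "I2'3":
        return "I"
    if clade in ("I2", "I3"):
        return "I2'3"  # I2 and I3 share the root clade I2'3
    return clade[:-1]  # e.g. I1a1a -> I1a1 -> I1a -> I1 -> I

def lineage(clade):
    """Return the path from a clade up to the root I."""
    path = [clade]
    parent = parent_of(clade)
    while parent is not None:
        path.append(parent)
        parent = parent_of(parent)
    return path

def age_violations(ages):
    """List (child, parent) pairs where the child is estimated as older."""
    violations = []
    for clade, age in ages.items():
        parent = parent_of(clade)
        if parent in ages and age > ages[parent]:
            violations.append((clade, parent))
    return violations

print(" > ".join(reversed(lineage("I2a1"))))
print(age_violations(AGES_KYA))
```

The check compares point estimates only; overlapping confidence intervals would need a more careful treatment.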

Distribution

I1

It formed during the Last Glacial pre-warming period. It is found mainly in Europe and the Near East, and occasionally in North Africa and the Caucasus.
It is the most frequent clade of the haplogroup.
Genbank IDPopulationSource
JQ702472
JQ702567Germany
JQ704077Germany
JQ705190
JQ705840
I1a
The subclade frequency peaks are mostly located in North-Eastern Europe.
Genbank IDPopulationSource
-FamilyTreeDNA
Turkey FamilyTreeDNA
Chuvash
I1a1
Genbank IDPopulationSource
Portugal
-
-
-
-
-
-
-
Tunisia
-
Czech
Czech
Turkey
Morocco
I1a1a
Genbank IDPopulationSource
Finland
Finland
Finland
Finland
Finland
Finland
Finland
Finland
-
-
-
-
-
I1a1b
Genbank IDPopulationSource
-
-
-
I1a1c
Genbank IDPopulationSource
-
-
Mishar Tatars
I1a1d
Genbank IDPopulationSource
-
-
I1b
Genbank IDPopulationSource
Caucasian
India
Jewish Diaspora
ArmenianFamilyTreeDNA
-FamilyTreeDNA
-
-
SwedishFamilyTreeDNA
I1c
GenBank IDPopulationSource
-FamilyTreeDNA
-
-
-

I2'3

It is the common root clade of subclades I2 and I3. A sample from Tanzania shares with I2'3 a variant at position 152 relative to the root node of haplogroup I, and this "node 152" could lie upstream of the I2'3 clade. Both I2 and I3 might have formed during the Holocene; most of their subclades are from Europe, and only a few are from the Near East. Examples of this ancestral branch have not been documented.
I2
GenBank IDPopulationSource
-FamilyTreeDNA
Volga Tatars
-FamilyTreeDNA
-
-
-
-
-
-
-
-
-
-
-
-FamilyTreeDNA
Chechnya
Czech
Turkey
I2a
GenBank IDPopulationSource
-FamilyTreeDNA
ScotlandFamilyTreeDNA
-
-
-
-FamilyTreeDNA
I2a1
GenBank IDPopulationSource
Finland
IrelandFamilyTreeDNA
IrelandFamilyTreeDNA
I2b
GenBank IDPopulationSource
Finland
Finland
Finland
Finland
I2c
GenBank IDPopulationSource
-
-
-
-
-
I2d
GenBank IDPopulationSource
-
-
I2e
GenBank IDPopulationSource
-
-
I3
GenBank IDPopulationSource
-
-
-
-
Greece
I3a
GenBank IDPopulationSource
FranceFamilyTreeDNA
-FamilyTreeDNA
-
-
-
-
I3a1
GenBank IDPopulationSource
ItalyBandelt
FranceFamilyTreeDNA
-
I3b
GenBank IDPopulationSource
IrelandFamilyTreeDNA
-

I4

The clade splits into subclade I4a and the newly defined I4b, with samples found in Europe, the Near East and the Caucasus.
GenBank IDPopulationSource
-
Italy
I4a
GenBank IDPopulationSource
Siberia
-FamilyTreeDNA
-FamilyTreeDNA
ArmenianFamilyTreeDNA
-
-
-
-
-
-
-
-
-

I5

It is the second most frequent clade of the haplogroup. Its subclades are found in Europe (e.g. I5a1) and the Near East (e.g. I5a2a and I5b).
GenBank IDPopulationSource
German FamilyTreeDNA
North Ossetia
I5a
GenBank IDPopulationSource
Hutterite
-
-
Dubai
Turkey
Yemen
Yemen
Yemen
Yemen
Yemen
Yemen
Yemen
I5a1
GenBank IDPopulationSource
Leon
Bedouin
-
-
Italy
Bulgaria

I6

The subclade is very rare; as of July 2013 it had been found in only four samples from the Near East.
GenBank IDPopulationSource
Turkey
I6a
GenBank IDPopulationSource
-
-

I7

It is the rarest defined subclade; as of July 2013 it had been found in only two samples, from the Near East and the Caucasus.
GenBank IDPopulationSource
ArmenianFamilyTreeDNA
Kuwait
