Genetic studies on Croats


Population genetics is a scientific discipline which contributes to the examination of human evolutionary and historical migrations. Particularly useful information is provided by research on two uniparental markers within our genome, the Y chromosome and mitochondrial DNA. The studied data suggest that around three quarters of the contemporary Croatian male population are descendants of Old Europeans from the Paleolithic period. Contemporary Croatian women show a genetic diversity that fits within the broader European maternal genetic landscape.
There are many Paleolithic sites in the territory of Croatia, mostly ascribed to the Mousterian phase of the Middle Paleolithic. In the Neolithic, major cultures such as Vinča, Varna and Starčevo arose in Southeast Europe. In the Bronze Age, a symbiosis between the Proto-Indo-Europeans of the Kurgan culture and autochthonous populations led to the formation of, among others, the Proto-Illyrians. These gradually mixed with and were assimilated by the Romans, Celts, Avars, and finally, from the 6th century onward, the Slavs. An additional significant migration came from Bosnia and Herzegovina, driven by the Ottoman Empire's conquests from the 15th century onward, as well as by Croatian immigration before and after World Wars I and II and the Croatian War of Independence.

Y chromosome DNA

Prehistoric Y-DNA

In a 2014 study, of the three successfully generated SNP profiles of Neolithic Starčevo culture samples from Vinkovci, two belonged to Y-DNA haplogroup G2a-P15 and one to I2a1-P37.2. This could indicate that G2a carriers represent the spread of farming from the Near East to Europe, while I2a represents the Mesolithic substratum in Europe.
In a 2018 study, Y-DNA was successfully sequenced for 10 of the 17 samples from Croatia: two Croatia Cardial Neolithic samples from Zemunica Cave belonged to C1a2 and E1b1b1a1b1; an Early Neolithic Starčevo sample from Beli Manastir-Popova zemlja to C; an Early Neolithic Croatia Impressa sample from Kargadur to G2a2a1; two Middle Neolithic Sopot samples from Osijek to G2a2a1 and J2a1; a Late Neolithic Sopot sample from Beli Manastir-Popova zemlja to I; two Vučedol samples from Beli Manastir-Popova zemlja and Vučedol Tell to R1b1a1a2a2 and G2a2a1a2a; and an Early-Middle Bronze Age sample from Veliki Vanik to J2b2a.

Contemporary Y-DNA

Genetically, on the paternal Y-chromosome line, a majority of male Croats from Croatia belong to one of the three major European Y-DNA haplogroups, I, R1a and R1b, while a minority mostly belongs to haplogroup E, and others to haplogroups J, N, and G.
Haplogroup I among Croats from Croatia falls into two major subdivisions: subclade I2, typical of the populations of the eastern Adriatic and the Balkans, and I1, typical of the populations of Scandinavia. Within I2, the most prevalent is I2a1a, specifically its subclade I-M423 > I-Y3104 > I-L621 > I-CTS10936 > I-S19848 > I-CTS4002 > I-CTS10228 > I-Y3120 > I-S17250 > I-PH908, which is typical of the South Slavic populations of Southeastern Europe and is most frequent in Bosnia-Herzegovina. In Croatia the highest frequency is observed in Dalmatia, peaking in the cities of Dubrovnik and Zadar, as well as on the southern islands of Vis, Brač, Korčula, and Hvar. The frequency is lower in the town of Osijek on the banks of the river Drava, in the western mountainous Žumberak region, and on the northern islands of Cres and Krk. The highest frequency of the haplogroup is found in Croats from Herzegovina. Subclade I1 was not found in Osijek or among Bosnian Croats, but peaked at 8.9% in Dubrovnik. The very high frequency of subclade I-P37.2 in the Western Balkans diminishes in all directions. The population carrying haplogroup I migrated to Europe from the Middle East approximately 25,000-13,000 years ago, and it represents the Paleolithic and Mesolithic hunter-gatherer population of Europe. However, in contrast to older research which argued for a prehistoric autochthonous origin of haplogroup I2 in Croatia and the Balkans, Battaglia et al. had already observed the highest variance of the haplogroup in Ukraine, and Zupan et al. noted that this suggests it arrived with the Slavic migration from a homeland in present-day Ukraine. More recent research by O.M. Utevska concluded that the haplogroup's STR haplotypes have the highest diversity in Ukraine, with the ancestral STR marker result "DYS448=20" making up the "Dnieper-Carpathian" cluster, and the younger derived result "DYS448=19" making up the "Balkan cluster", which is predominant among the South Slavs.
This "Balkan cluster" also has its highest variance in Ukraine, which indicates that the very high frequency in the Western Balkans is the result of a founder effect. Utevska calculated that the divergence of the STR cluster and its secondary expansion from the middle reaches of the Dnieper river or from the Eastern Carpathians towards the Balkan peninsula happened approximately 2,860 ± 730 years ago, relating it to the time before the Slavs, but well after the decline of the Tripolye culture. More specifically, the cluster is represented by a single SNP, I-PH908, known as I2a1a2b1a1a1c in the ISOGG phylogenetic tree, and according to the YFull YTree it formed and had its TMRCA approximately 1,850-1,700 YBP. It has been suggested that I-L621 might have been present in the Cucuteni–Trypillia culture, but so far only G2a has been found there, while another subclade, I2a1a1-CTS595, was present in the Baden culture of the Chalcolithic Carpathian Basin. Although the haplogroup is dominant among the modern Slavic peoples on the territory of the former Balkan provinces of the Roman Empire, it has so far not been found among samples from the Roman period and is almost absent in the contemporary population of Italy. It was found in skeletal remains accompanied by artifacts indicating leaders of the Hungarian conquerors of the Carpathian Basin from the 9th century, as part of the Western Eurasian-Slavic component of the Hungarians. According to Fóthi et al., the distribution of ancestral subclades such as I-CTS10228 among contemporary carriers indicates a rapid expansion from Southeastern Poland, is mainly related to the Slavs, and the "largest demographic explosion occurred in the Balkans". The earliest archaeogenetic sample so far is Sungir 6, from near Vladimir, Russia, which belonged to the I-S17250 > I-Y5596 > I-Z16971 > I-Y5595 > I-A16681 subclade.
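The argument above rests on comparing STR diversity between regions: a population that has carried a lineage for longer tends to show more variation in repeat counts than one founded recently by a few carriers. The following is a minimal sketch of one common summary, the per-locus variance of repeat counts averaged over loci. The haplotypes are hypothetical toy values invented for illustration, the function name is an assumption, and this is not a reconstruction of the method used by Utevska or Battaglia et al.

```python
from statistics import pvariance


def mean_str_variance(haplotypes):
    """Average, over loci, of the population variance in repeat counts."""
    # Transpose so that each tuple holds the repeat counts observed at one locus.
    loci = list(zip(*haplotypes))
    return sum(pvariance(locus) for locus in loci) / len(loci)


# Hypothetical repeat counts at three Y-STR loci (toy values, not real data).
source_like = [(20, 13, 30), (19, 14, 31), (21, 13, 29), (20, 15, 30)]
founder_like = [(19, 13, 30), (19, 13, 30), (19, 14, 30), (19, 13, 30)]

print("source-like region: ", mean_str_variance(source_like))
print("founder-like region:", mean_str_variance(founder_like))
```

In this toy setup the "founder-like" sample is nearly uniform, so its averaged variance is far lower even though such a lineage could be very frequent locally. That is the pattern the text describes for the Western Balkans.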
R1a1a1-M17 and haplogroup R1b are the second and third most prevalent haplogroups according to an investigation done in 2003. According to 2008 research, their frequencies are slightly lower. Haplogroup R-M17 in Croatia is mostly divided into two subclades, the predominant R-M558 and R-M458, while R-Z282 is rare. The highest frequency of R1a1a1-M17 was found in Croats from Osijek, Žumberak, and the northern islands of Krk and Cres, with values similar to those of other Slavs such as Slovenes, Czechs and Slovaks. The frequency is lower in Zadar and Dubrovnik, as well as on the southern islands of Hvar, Vis, Korčula, and Brač. In Bosnian Croats, the frequency is similar to that of other South Slavs. The highest frequency of haplogroup R1b, which in Croatia is divided into several subclades, was found in Croats from the islands of Krk and Dugi Otok; in Žumberak it was 11.3%, while on the southern islands, in the city of Dubrovnik and among Bosnian Croats it is almost absent, and in Osijek it was not found at all. These two haplogroups are connected to the Proto-Indo-European migration from the Eurasian area some 5,000 years ago, with R1a connected particularly to the migration of the Slavic population. Their frequencies show a north-south gradient and a distribution opposite to that of haplogroup I-P37.2, with the highest frequencies observed in northern, western and eastern Croatia. The R-M558 subclade is more frequent among East Slavs in Eastern Europe and the Volga-Ural region, while R-M458 is more frequent among West Slavs in Central and Eastern Europe. Both are present in "informative frequencies in Balkan populations with known Slavonic heritage". The R-M558 subclade CTS1211 was also found among the Hungarian conquerors, which indicates mixing and assimilation of Slavs among the Hungarians.
Within haplogroup E, the most frequent subclade among Croats is E1b1b1a1b-V13, while E1b1b1a3-M149 and E1b1b1c-M123 were also found in small numbers. E-V13 is typical of the populations of south-eastern Europe, peaking among Kosovo Albanians, and is also frequent among Macedonians, Greeks, Romanians, Bulgarians and Serbs. The highest frequency on the Croatian mainland was found in Žumberak and Osijek, and among the islands on the central islands of Dugi Otok and Ugljan and the southern islands of Vis and Mljet. On the northern islands of Cres and Krk the frequency was similar to that on the remaining southern islands. In Bosnian Croats the frequency was the same as among Croats from Croatia. Subclades of J1 are rare in Croatia, while those of J2 are more frequent in Croats from Croatia than in Bosnian Croats, peaking in Croats from Osijek, on the central islands of Ugljan and Pašman, and on the northern islands of Krk and Cres. Subclade G2a-P15 is found in low numbers in both Croatian and Bosnian Croats, but peaks locally in the north-eastern town of Osijek, on the southern islands of Mljet, Korčula and Brač, and on the northern island of Cres. Haplogroups E and J are related to the Neolithic migration, after the Last Glacial Maximum (LGM), of a population from Anatolia that brought with it the domestication of wild animals and plants. Specifically, the E subclade probably arose locally in the Balkans not earlier than 8,000-10,000 years ago. These haplogroups show a south-north gradient. Haplogroup G could have been present in Europe during the LGM, or populations carrying some of its subclades arrived with the early farmers.
Subclades of haplogroup N are rare in Croatia. The haplogroup is very frequent in the Far East, for example in Siberia and China, while in Europe it is frequent among Finns and in the Baltic countries. Unusually for European populations, another Central Asian-Siberian haplogroup, P, was found at high frequencies on the islands of Hvar and Korčula owing to a founder effect.

Abstract and data

The region of modern-day Croatia was part of a wider Balkan region which may have served as one of several refugia during the LGM and as a source region for the recolonization of Europe during the post-glacial period and the Holocene. At that time the eastern Adriatic coast lay much further south: the northern and western parts of the sea were steppes and plains, while the modern Croatian islands were hills and mountains. The region had a specific role in the structuring of the European, and particularly the Slavic, paternal genetic heritage, characterized by the predominance of R1a and I and the scarcity of E lineages. The genetic diversity of the contemporary insular populations is characterized by strong isolation and endogamy.
The table below cites the most extensive study of the population of Croatia to date. It draws on a national reference DNA database built on a 17-locus Y-STR system, whose haplotypes were used to predict Y-SNP haplogroups. The population was divided into five regional sub-populations, which showed strong similarity and homogeneity of paternal genetic contribution, with the exception of the sub-population from southern Croatia, which showed a mild difference. In addition to the high degree of overall homogeneity, there are gradients of similarity to the central European cluster and to the southern European cluster, going from north to south.
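| Population | Samples | Source | I2a | R1a | E1b1b1-M35 | R1b | I1 | J2b | G2a | H | J2a1h | J1 | J2a1b | E1b1a1-M2 | G2c | I2a1 | I2b1 | I2b | J2a1-bh | L | N | Q | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Overall Croatia | 1,100 | Mršić et al. | 37.7% | 22.1% | 10.6% | 7.9% | 5.8% | 3.7% | 2.7% | 1.8% | 1.2% | 1.1% | 1% | <1 | <1 | <1 | <1 | <1 | <1 | <1 | <1 | <1 | <1 |
| Central Croatia | 220 | Mršić et al. | 31.8% | 23.6% | 11.8% | 10.4% | 5% | 5% | 3.6% | 1.3% | 0.4% | 2.2% | 0.9% | 0 | 0 | 0.4% | 0.9% | 0.9% | 0 | 0.4% | 0.9% | 0 | 0 |
| North Croatia | 220 | Mršić et al. | 25.4% | 29.1% | 10.9% | 10.4% | 4.1% | 5% | 3.1% | 5% | 0.4% | 0 | 0.4% | 0.4% | 0 | 0 | 2.2% | 0.9% | 0 | 0.4% | 0.4% | 0 | 1.3% |
| East Croatia | 220 | Mršić et al. | 40% | 18.6% | 11.3% | 8.2% | 5.9% | 2.7% | 1.8% | 0.9% | 2.2% | 2.7% | 1.3% | 0 | 0 | 0 | 0.4% | 0 | 0.4% | 0 | 0.9% | 1.8% | 0.4% |
| West Croatia | 220 | Mršić et al. | 36.8% | 20% | 12.7% | 5.9% | 8.6% | 3.2% | 3.2% | 1.8% | 1.8% | 0.4% | 1.8% | 0 | 0.4% | 0 | 0.4% | 0 | 0.4% | 0 | 0.9% | 0.4% | 0.9% |
| South Croatia | 220 | Mršić et al. | 54.5% | 19.1% | 6.3% | 4.5% | 5.4% | 2.7% | 1.8% | 0.4% | 0.9% | 0 | 0.4% | 0 | 0 | 0 | 0.4% | 0 | 0.9% | 0.4% | 0 | 1.3% | 0.4% |
| Zagreb & Croatia | 239 | Purps et al. | 36.1% | 23.8% | 6.4% | 13.9% | 3.9% | 2.1% | n/a | n/a | 1.9% | 0.9% | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Croatia | 720 | Šarac et al. | 32.5% | 25.6% | 9.8% | 9.1% | 4.1% | 5.0% | 4.4% | 0.3% | 2.7% | 0.5% | 1.0% | 0.4% | 0 | 0.3% | 0.8% | 0.5% | 1.0% | 0 | 0.6% | 0.9% | 1.2% |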
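The "Overall Croatia" row of Mršić et al. can be cross-checked against the five regional rows, each of which has 220 samples. The sketch below assumes that, because the regional samples are equal in size, the overall figure should be close to the unweighted mean of the regional percentages. This is an assumption made here for illustration, not a description of how the study computed its totals. Only the five most frequent haplogroups are included.

```python
# Regional frequencies in percent, copied from the Mršić et al. rows of the table.
regions = {
    "Central": {"I2a": 31.8, "R1a": 23.6, "E1b1b1-M35": 11.8, "R1b": 10.4, "I1": 5.0},
    "North":   {"I2a": 25.4, "R1a": 29.1, "E1b1b1-M35": 10.9, "R1b": 10.4, "I1": 4.1},
    "East":    {"I2a": 40.0, "R1a": 18.6, "E1b1b1-M35": 11.3, "R1b": 8.2,  "I1": 5.9},
    "West":    {"I2a": 36.8, "R1a": 20.0, "E1b1b1-M35": 12.7, "R1b": 5.9,  "I1": 8.6},
    "South":   {"I2a": 54.5, "R1a": 19.1, "E1b1b1-M35": 6.3,  "R1b": 4.5,  "I1": 5.4},
}

# Figures from the "Overall Croatia" row.
reported_overall = {"I2a": 37.7, "R1a": 22.1, "E1b1b1-M35": 10.6, "R1b": 7.9, "I1": 5.8}


def unweighted_mean(haplogroup):
    """Mean of the five regional percentages for one haplogroup."""
    values = [freqs[haplogroup] for freqs in regions.values()]
    return sum(values) / len(values)


for haplogroup, reported in reported_overall.items():
    mean = unweighted_mean(haplogroup)
    print(f"{haplogroup}: regional mean {mean:.2f}%, reported overall {reported}%")
```

The regional means agree with the reported overall values to within rounding. The same rows also make the north-to-south contrast described above visible: I2a ranges from 25.4% in the north to 54.5% in the south, while R1a is highest in the north.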

Mitochondrial DNA

Prehistoric mtDNA

In the 2014 Y-DNA and mtDNA study, one Mesolithic sample dated to 6080-6020 BCE from Vela Spila near Vela Luka on the island of Korčula belonged to mtDNA haplogroup U5b2a5, common in hunter-gatherer communities, while eleven Neolithic Starčevo culture samples dated to circa 6000–5400 BCE from Vinkovci were assigned to haplogroups J1c, K1a, T2b, HV0, K, V and V6. These reveal similar mtDNA diversity and shared ancestry between the early farming populations of the Pannonian Basin and the populations of the Central European LBK, accompanied by a reduction of the Mesolithic mtDNA substratum.
Preliminary results from a 2016 mtDNA study, which was planned to include approximately 30 Neolithic samples and 5 samples from the Early to Late Bronze Age, were based on 5 ancient Croatian petrous bones. They indicated mtDNA haplogroups K2, K1b1a, H1e/H41 and H1b for the Neolithic samples, similar to Early European Farmers and to modern Sardinians and Southern Europeans, and haplogroup HV or H4 for the Bronze Age sample, similar to the modern-day Croatian and Balkan population, but without clear evidence of a connection with the Indo-European migration.
The 2018 study included 17 samples from Croatia: a Mesolithic sample from Vela Spila belonged to U5b2b; three Croatia Cardial Neolithic samples from Zemunica Cave to H1, K1b1a and N1a1; an Early Neolithic Starčevo sample from Beli Manastir-Popova zemlja to U8b1b1; two Early Neolithic Croatia Impressa samples from Kargadur to H5a and H7c; two Middle Neolithic Sopot samples from Osijek to U5a1a2 and H10; two Late Neolithic Sopot samples from Beli Manastir-Popova zemlja to U5b2b and N1a1; an Eneolithic sample from Radovanci to J1c2; three Vučedol samples from Beli Manastir-Popova zemlja and Vučedol Tell to T2e, T2c2 and U4a; an Early-Middle Bronze Age sample from Veliki Vanik to I1a1; and a Late Bronze Age sample from Jazinka Cave to HV0e.

Medieval mtDNA

A 2011 mtDNA study of 27 early medieval skeletal remains from Naklice near Omiš in southern Dalmatia showed that 67% belonged to haplogroup H, 18% to J, 11% to U5, and 4% to HV. A 2015 mtDNA study of medieval skeletal remains from Šopot and Ostrovica in northern Dalmatia found that maternally inherited profiles differed neither between the Ostrovica and Šopot sites nor between medieval and modern populations, showing the same haplogroup prevalence in both. A 2014 study of a male skeleton from the Late Roman period found in Split showed that it belonged to haplogroup H.

Contemporary mtDNA

Genetically, on the maternal mitochondrial DNA line, a majority of female Croats from Croatia belong to three of the eleven major European mtDNA haplogroups, H, U and J, while a large minority belongs to many other smaller haplogroups.
In all the studies, haplogroup H is the most frequent maternal haplogroup on the Croatian mainland and coast, but in the most recent, 2020 study it appears at a lower frequency of 25.5%, owing to nomenclature differences primarily concerning R/R0 lineages. The highest frequency in Croatia is observed in the populations of the islands of Korčula and Mljet, and the lowest on the islands of Cres and Hvar. It is the dominant European haplogroup. The elevated frequency of subhaplogroup H1b on Mljet, otherwise rare in other studies, is a typical example of a founder effect: migration from the nearest coastal region followed by micro-evolutionary expansion on the island.
Haplogroup U is mostly represented by its subclade U5, which is the second most frequent haplogroup, at 11.6% on the mainland and 10.4% on the coast, with similar frequencies on the islands of Brač, Krk, and Hvar, and the lowest on Korčula. Overall, haplogroup U, including subclades such as U5, is most frequent in the city of Dubrovnik and on the islands of Lastovo and Cres. It is the oldest European haplogroup, and its subclade U5 accounts for the majority of the haplogroup's diversity in Europe. The high frequency of U4 on Lastovo indicates a founder effect.
Haplogroup J is the third most frequent haplogroup, at 11.9% on the mainland but only 3.1% on the coast. The islands of Korčula, Brač, Krk and Hvar, however, had higher frequencies than the coastal population; the haplogroup peaks in Žumberak and on Lastovo, while on Cres it is almost totally absent.
Haplogroup T is the third or fourth most frequent haplogroup. Its subclade T2 has a similar frequency of 3.1-5.8% in the coastal, mainland and insular populations, with an exceptional peak on the island of Hvar; the overall haplogroup T, however, has a lower frequency on Mljet, on Lastovo and in Dubrovnik.
Haplogroup K has an average frequency of 3.6% on the mainland and 6.3% on the coast. It is absent on Lastovo, has its lowest frequencies on the islands of Cres and Hvar, and its highest on the island of Brač.
Haplogroup V is a younger sister clade of haplogroup H, and has almost the same minimum and maximum frequency in both continental and insular populations, with the exception of Korčula, as well as a lower frequency on Mljet, on Lastovo and in Dubrovnik.
The frequency of haplogroup W is between 2.2% and 4.2% in the mainland and coastal populations, and between 1.9% and 3.1% in the insular populations, with the exceptions of Krk and Cres. On the islands of Mljet and Lastovo it is between 4.4% and 5.9%, while in Dubrovnik it is almost absent.
Other mtDNA haplogroups with notable local peaks include the HV subclades, which have low frequencies on the mainland and coast, average frequencies on the islands, and high frequencies in Dubrovnik and on Brač. Haplogroup N1a on Cres is so far the northernmost finding of this branch in Europe, and its haplotypes indicate a relatively recent founder effect; it is a characteristic haplogroup of the early farmers. Haplogroup F is almost absent elsewhere but peaks at 8.3% on Hvar. Haplogroup I is notable on Krk; its subhaplogroups separated around the LGM.

Abstract and data

For decades the Croatian insular populations have been studied because their isolation makes it possible to trace micro-evolutionary processes and to understand the evolutionary forces, such as genetic drift, founder effects and population bottlenecks, which shaped the contemporary population. The results so far indicate that gene flow and the influx of women to the islands were limited. The continental population of the mountainous Žumberak region can also be considered a moderate genetic isolate, because it showed only a loose affinity with the Uskoks' proposed region of origin and with its current closest neighbors. In the population of the island of Krk, high-resolution mtDNA analysis showed evidence that the settlements of Omišalj, Vrbnik, and Dobrinj form a joint cluster of early Slavic settlements, while the Poljica and Dubašnica regions form a separate cluster founded by Slavic and Romanian migrants from the Velebit hinterland who arrived in the 15th century. The population of the island of Mljet reflects demographic and historical events such as the island's use as a quarantine station and, along with the populations of Vis and Lastovo, shows the practice of consanguinity and inbreeding associated with a lack of genetic diversity, which makes these populations suitable for genetic-epidemiological research.
In a 2004 mtDNA analysis, one cluster was formed by the populations of the islands of Hvar, Krk and Brač, and a second cluster included the Croatian mainland and coast, while the island of Korčula stood apart because of its exceptionally high frequency of haplogroup H. In a 2009 interpopulation principal component analysis (PCA) of mtDNA subhaplogroups, the insular populations of Krk, Ugljan, Korčula, Brač and Hvar clustered together, implying close maternal lineages, with Vis close to them, but Cres and Rab occupied outlying positions separate from both the cluster and each other. In a 2014 mtDNA PCA, the populations of eastern and southern Croatia clustered together with Bosnia and Herzegovina, while those of western and northern Croatia clustered with Slovenia. As the Slovenian population does not belong to the Southeast European cluster, this is considered a possible result of input from different migration waves of Slavs in the Middle Ages.
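| Population | Samples | Source | H | HV | J | T | K | U* | U1 | U2 | U3 | U4 | U5 | U6 | U7 | U8 | R | N | I | W | X | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Croatia | 488 | Šarac et al. | 45.29 | 4.07 | 9.83 | 5.98 | 4.30 | 0 | 1.23 | 2.66 | 1.43 | 2.66 | 10.06 | 0.20 | 0.41 | 0.20 | 0 | 0.82 | 2.61 | 1.84 | 1.84 | 4.29 |
| East Croatia | 61 | Šarac et al. | 49.18 | 11.48 | 9.84 | 3.28 | 4.92 | 0 | 0 | 0 | 1.64 | 1.64 | 9.84 | 0 | 0 | 0 | 0 | 1.64 | 0 | 4.92 | 1.64 | 0 |
| North Croatia | 155 | Šarac et al. | 41.77 | 5.06 | 14.56 | 10.76 | 3.16 | 0 | 0.63 | 1.90 | 0.63 | 3.16 | 11.39 | 0.63 | 0 | 0.63 | 0 | 0 | 1.27 | 2.53 | 0.63 | 0.63 |
| West Croatia | 209 | Šarac et al. | 46.41 | 11.48 | 6.70 | 2.39 | 5.74 | 0 | 2.39 | 4.31 | 1.91 | 2.39 | 6.22 | 0 | 0.96 | 0 | 0 | 0.96 | 4.78 | 0 | 2.39 | 0 |
| South Croatia | 63 | Šarac et al. | 49.21 | 3.17 | 7.94 | 9.52 | 3.17 | 0 | 0 | 1.59 | 1.59 | 3.17 | 12.70 | 0 | 0 | 0 | 0 | 1.59 | 1.59 | 1.59 | 3.17 | 0 |
| Croatia | 200 | Barbarić et al. | 25.5 | 11.5 | 7.5 | 10 | 7.5 | 0 | 2 | 4 | 2.5 | 2.5 | 10 | 0 | 1 | 0 | 7.5 | 2 | 3 | 1 | 2 | 0.5 |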

Autosomal DNA

According to a 2013 autosomal identity-by-descent (IBD) survey "of recent genealogical ancestry over the past 3,000 years at a continental scale", speakers of the Serbo-Croatian language share a very high number of common ancestors, dated to the Migration Period approximately 1,500 years ago, with the Poland and Romania-Bulgaria clusters, among others in Eastern Europe. The survey concluded that this was caused by the Hunnic and Slavic expansion, which involved a "relatively small population that expanded over a large geographic area", particularly "the expansion of the Slavic populations into regions of low population density beginning in the sixth century", and that the pattern is "highly coincident with the modern distribution of Slavic languages".
According to a 2014 autosomal analysis of the Western Balkans, the Croatian population shows genetic uniformity with other South Slavic populations. The Croatians and Bosnians were closer to East European populations and largely overlapped with Hungarians from Central Europe. In a 2015 analysis, they formed a western South Slavic cluster with the Bosnians and Slovenians, as opposed to an eastern cluster formed by Macedonians and Bulgarians, with Serbians in the middle. The western cluster leans toward Hungarians, Czechs, and Slovaks, while the eastern cluster leans toward Romanians and, to some extent, Greeks. In a 2018 analysis of the Slovenian population, the Croatian population again clustered with Slovenians and Hungarians and was close to Czechs. The population of Croatia mostly shares common ancestry with Eastern, Western, and Southern Europeans, and has almost no relation to isolated populations such as the Sardinians and the Basques.
A 2016 whole-exome sequencing study of 176 individuals from the island of Vis confirmed the isolate status of the island's population and revealed a "pattern of loss-of-function mutations, which resembles the trails of adaptive evolution".