Arabic script


The Arabic script is a writing system used for writing Arabic and several other languages of Asia and Africa, such as Persian, Kurdish, Sindhi, Balochi, Pashto, Lurish, Urdu, Kashmiri and Mandinka, among others. Until the 16th century, it was also used to write some texts in Spanish. Additionally, prior to the language reform in 1928, it was the writing system of Turkish. It is the second-most widely used writing system in the world by the number of countries using it and the third by the number of users, after the Latin and Chinese scripts.
The Arabic script is written from right to left in a cursive style, in which most of the letters are written in slightly different forms according to whether they stand alone or are joined to a following or preceding letter. The basic letter form remains unchanged. In most cases, the letters transcribe consonants or consonants and a few vowels, so most Arabic alphabets are abjads. Additionally, it does not have capital letters.
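The positional behaviour described above can be observed in Unicode, which encodes each letter once and also carries legacy "presentation form" code points for its contextual shapes. The following is a minimal sketch using only Python's standard `unicodedata` module. The letter bāʼ (ب) and the names `BEH_FORMS` and `base_letter` are illustrative choices, not taken from the text above.

```python
import unicodedata

# Presentation forms of the letter beh (U+0628), chosen only as an illustration.
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def base_letter(form: str) -> str:
    """Fold a contextual presentation form back to its base letter via NFKC."""
    return unicodedata.normalize("NFKC", form)


for position, form in BEH_FORMS.items():
    # Each shape has its own name but normalizes to the same base letter.
    print(position, unicodedata.name(form), "->", unicodedata.name(base_letter(form)))

# Bidirectional class "AL" marks the letter as right-to-left Arabic.
print(unicodedata.bidirectional("\u0628"))
```

In ordinary text only the base code point is stored and the rendering engine picks the joined shape, which is one way of seeing that the basic letter form remains unchanged.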
The script was first used to write texts in Arabic, most notably the Qurʼān, the holy book of Islam. With the spread of Islam, it came to be used as the primary script for many language families, leading to the addition of new letters and other symbols, with some versions, such as Kurdish, Uyghur and old Bosnian being abugidas or true alphabets. It is also the basis for the tradition of Arabic calligraphy.

Languages written with the Arabic script

Overview

The Arabic script has been adapted for use in a wide variety of languages besides Arabic, including Persian, Malay and Urdu, which are not Semitic. Such adaptations may feature altered or new characters to represent phonemes that do not appear in Arabic phonology. For example, the Arabic language lacks a voiceless bilabial plosive (the [p] sound), so many languages add their own letter to represent it in the script, though the specific letter used varies from language to language. These modifications tend to fall into groups: Indian and Turkic languages written in the Arabic script tend to use the Persian modified letters, whereas the languages of Indonesia tend to imitate those of Jawi. Scholars call the modified version of the Arabic script originally devised for Persian the Perso-Arabic script.
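As a small illustration of how such adaptations diverge, the sketch below lists two letters commonly used for the voiceless bilabial plosive: the Persian-derived پ and the Jawi ڤ. The pairing of languages with code points is an assumption drawn from general knowledge of these orthographies rather than from the text above, and the name `P_LETTERS` is hypothetical.

```python
import unicodedata

# Language -> letter used for the [p] sound (illustrative assumption, not exhaustive).
P_LETTERS = {
    "Persian": "\u067E",       # three dots below the beh shape
    "Urdu": "\u067E",          # follows the Persian letter
    "Jawi (Malay)": "\u06A4",  # three dots above the feh shape
}

for language, letter in P_LETTERS.items():
    print(f"{language}: {letter} U+{ord(letter):04X} {unicodedata.name(letter)}")
```

Both letters are built on existing Arabic base shapes with extra dots, but on different shapes, which is why text in one tradition cannot simply be read with the conventions of the other.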
In the cases of Bosnian, Kurdish, Kashmiri and Uyghur writing systems, vowels are mandatory. The Arabic script can therefore be used in both abugida and abjad forms, although it is often strongly, if erroneously, connected to the latter due to it being originally used only for Arabic.
Use of the Arabic script in West African languages, especially in the Sahel, developed with the spread of Islam. To a certain degree the style and usage tend to follow those of the Maghreb. Additional diacritics have come into use to facilitate the writing of sounds not represented in the Arabic language. The term ʻAjamī, which comes from the Arabic root for "foreign," has been applied to Arabic-based orthographies of African languages.

Table of writing styles

Script or style | Alphabet | Language | Region | Derived from | Comment
Naskh | Arabic & others | Arabic & others | Every region where Perso-Arabic scripts are used | | Sometimes refers to a very specific calligraphic style, but sometimes used to refer more broadly to almost every font that is not Kufic or Nastaliq.
Nastaliq | Urdu, Persian, & others | Urdu, Persian, & others | Southern and Western Asia | Taliq | Used for almost all modern Urdu text, but only occasionally used for Persian.
Taliq | Persian | Persian | | |
Kufic | Arabic | Arabic | Middle East and parts of North Africa | |
Rasm | Restricted Arabic alphabet | Arabic | Mainly historical | | Omits all diacritics including i'jam. Digital replication usually requires some special characters. See the Wiktionary entry for ٮ.
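
The Rasm row notes that digital replication needs special characters. The following is a rough sketch of that idea: it drops vowel marks and replaces dotted letters with dotless skeleton letters such as ٮ (U+066E). The mapping is a simplified, position-independent subset chosen for illustration. It ignores hamza and the position-dependent behaviour of letters such as nūn and yāʼ, and the names `SKELETON` and `to_rasm` are hypothetical.

```python
import unicodedata

# Dotted letter -> dotless skeleton letter (simplified illustrative subset).
SKELETON = {
    "\u0628": "\u066E",  # beh   -> dotless beh
    "\u062A": "\u066E",  # teh   -> dotless beh
    "\u062B": "\u066E",  # theh  -> dotless beh
    "\u062C": "\u062D",  # jeem  -> hah
    "\u062E": "\u062D",  # khah  -> hah
    "\u0630": "\u062F",  # thal  -> dal
    "\u0632": "\u0631",  # zain  -> reh
    "\u0634": "\u0633",  # sheen -> seen
    "\u0636": "\u0635",  # dad   -> sad
    "\u0638": "\u0637",  # zah   -> tah
    "\u063A": "\u0639",  # ghain -> ain
    "\u0641": "\u06A1",  # feh   -> dotless feh
    "\u0642": "\u066F",  # qaf   -> dotless qaf
}


def to_rasm(text: str) -> str:
    """Return an approximate rasm: no combining marks, no consonant dots."""
    out = []
    for ch in text:
        if unicodedata.combining(ch):
            continue  # skip vowel marks and other combining diacritics
        out.append(SKELETON.get(ch, ch))
    return "".join(out)


# beh + fatha + yeh + sukun + teh; yeh is left as-is by this simplified mapping.
print(to_rasm("\u0628\u064E\u064A\u0652\u062A"))
```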

Table of alphabets

Current use

Today Iran, Afghanistan, Pakistan, India, and China are the main non-Arabic speaking states using the Arabic alphabet to write one or more official national languages, including Azerbaijani, Baluchi, Brahui, Persian, Pashto, Central Kurdish, Urdu, Sindhi, Kashmiri, Punjabi and Uyghur.
An Arabic alphabet is currently used for the following languages:

Middle East and Central Asia

In the 20th century, the Arabic script was generally replaced by the Latin alphabet in the Balkans, parts of Sub-Saharan Africa, and Southeast Asia, while in the Soviet Union, after a brief period of Latinisation, use of Cyrillic was mandated. Turkey changed to the Latin alphabet in 1928 as part of an internal Westernizing revolution. After the collapse of the Soviet Union in 1991, many of the Turkic languages of the ex-USSR attempted to follow Turkey's lead and convert to a Turkish-style Latin alphabet. However, renewed use of the Arabic alphabet has occurred to a limited extent in Tajikistan, whose language's close resemblance to Persian allows direct use of publications from Afghanistan and Iran.
Most languages of the Iranian language family continue to use the Arabic script, as do the Indo-Aryan languages of Pakistan and of Muslim populations in India. However, the Bengali language of India and Bangladesh was never written in the Arabic script; it has been written in the Bengali alphabet since its inception.

Africa

As of Unicode 13.0, the following ranges encode Arabic characters:
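
A minimal sketch of how such ranges can be used in practice is given here. The table covers only five principal Arabic blocks as they are commonly listed in the Unicode block charts and should not be read as the complete list; the names `ARABIC_BLOCKS` and `arabic_block` are hypothetical.

```python
from typing import Optional

# (block name, first code point, last code point); a partial, illustrative list.
ARABIC_BLOCKS = [
    ("Arabic", 0x0600, 0x06FF),
    ("Arabic Supplement", 0x0750, 0x077F),
    ("Arabic Extended-A", 0x08A0, 0x08FF),
    ("Arabic Presentation Forms-A", 0xFB50, 0xFDFF),
    ("Arabic Presentation Forms-B", 0xFE70, 0xFEFF),
]


def arabic_block(ch: str) -> Optional[str]:
    """Return the name of the listed Arabic block containing ch, or None."""
    code_point = ord(ch)
    for name, first, last in ARABIC_BLOCKS:
        if first <= code_point <= last:
            return name
    return None


for sample in ["\u0628", "\u0750", "\uFE91", "A"]:
    print(f"U+{ord(sample):04X}", arabic_block(sample))
```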

Pronunciation of the Most Common Non-Classical Arabic Consonant [Phonemes] / [Graphemes]

Letter construction

Most languages that use alphabets based on the Arabic alphabet use the same base shapes, and most of their additional letters are built by adding diacritics to existing Arabic letters. Some stylistic variants in Arabic have distinct meanings in other languages. For example, variant forms of kāf ك ک ڪ are used in some languages and sometimes have specific usages. In Urdu and some neighbouring languages the letter Hā has diverged into two forms, ھ dō-čašmī hē and ہ ہـ ـہـ ـہ gōl hē, while a variant form of ي referred to as baṛī yē ے is used at the end of some words.
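That these variants are treated as separate letters, not just font styles, can be checked against their Unicode names. The sketch below assumes the glyphs printed above correspond to the code points listed; the grouping labels and the name `VARIANTS` are hypothetical.

```python
import unicodedata

# Variant letters grouped by the base letter they derive from (assumed code points).
VARIANTS = {
    "kaf forms": ["\u0643", "\u06A9", "\u06AA"],
    "heh forms (Urdu)": ["\u06BE", "\u06C1"],
    "yeh forms": ["\u064A", "\u06D2"],
}

for group, letters in VARIANTS.items():
    for letter in letters:
        # Each variant has its own code point and its own character name.
        print(f"{group}: {letter} U+{ord(letter):04X} {unicodedata.name(letter)}")
```

Because each variant is a distinct code point, text that mixes them will not compare equal or sort together unless an application folds them explicitly.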

Table of Letter Components

Abbreviations used below

A = The letter is used for most languages and dialects with writing systems based on Arabic.
MSA = Letters used in Modern Standard Arabic.
CA = Letters used in Classical Arabic.
AD = Letters used in some regional Arabic Dialects.
"Arabic" = Letters used in Classical Arabic, Modern Standard Arabic, and most regional dialects.
"Farsi" = Letters used in modern Persian.
FW = Foreign words: the letter is sometimes used to spell foreign words.
SV = Stylistic variant: the letter is used interchangeably with at least one other letter depending on the calligraphic style.
AW = Arabic words: the letter is used in additional languages to spell Arabic words.

Table

Row groups, by the mark added to the base letter: no additions; dots (1, 2, 3, 4, or different dots above and below); ring; line; numeral; arrows; hamza; other semi-optional vowels. (Table body not reproduced.)

Footnotes

The i'jam diacritic characters are illustrative only; in most typesetting the combined characters in the middle of the table are used. The characters used to illustrate the consonant diacritics are from the Unicode set "Arabic pedagogical symbols". The "Arabic Tatweel Modifier Letter" character used to show the positional forms does not work in some Nastaliq fonts.
For most letters the isolated form is shown; for select letters all forms are shown.
Urdu Choti Yē has 2 dots below in the initial and middle positions only. The standard Arabic version ي يـ ـيـ ـي always has 2 dots below.
These characters are used by most languages that use writing systems based on Arabic, though sometimes only in foreign words.
A Wasala diacritic Unicode character has been proposed but not yet released.