Sotho phonology



The phonology of Sesotho and those of the other Sotho–Tswana languages are radically different from those of "older" or more "stereotypical" Bantu languages. Modern Sesotho in particular has very mixed origins, having inherited many words and idioms from non-Sotho–Tswana languages.
There are in total 39 consonantal phonemes and 9 vowel phonemes. The consonants include a rich set of affricates and palatal and postalveolar consonants, as well as three click consonants.

Historical sound changes

Probably the most radical sound innovation in the Sotho–Tswana languages is that the Proto-Bantu prenasalized consonants have become simple stops and affricates. Thus isiZulu words such as entabeni, impuphu, ezinkulu, ukulanda, ukulamba, and ukuthenga are cognates to Sesotho thabeng, phofo, tse kgolo, ho lata, ho lapa, and ho reka, respectively.
This is further intensified by the law of nasalization and nasal homogeneity, making derived and imported words have syllabic nasals followed by homogeneous consonants, instead of prenasalized consonants.
Another important sound change in Sesotho which distinguishes it from almost all other Sotho–Tswana languages and dialects is the chain shift from and to and .
In certain respects, however, Sesotho is more conservative than other Sotho–Tswana languages. For example, the language still retains the difference in pronunciation between,, and. Many other Sotho–Tswana languages have lost the fricative, and some Northern Sotho languages, possibly influenced by Tshivenda, have also lost the lateral affricate and pronounce all three historical consonants as .
The existence of ejective consonants is highly unusual for a Bantu language and is thought to be due to Khoisan influence. These consonants occur in the Sotho–Tswana and Nguni languages, and the ejective quality is strongest in isiXhosa, which has been greatly influenced by Khoisan phonology.
As with most other Bantu languages, almost all palatal and postalveolar consonants are due to some form of palatalization or other related phenomena which result from an approximant or vowel being "absorbed" into another consonant.
The Southern Bantu languages have lost the Bantu distinction between long and short vowels. In Sesotho the long vowels have simply been shortened without any other effects on the syllables, while sequences of two dissimilar vowels have usually resulted in the first vowel being "absorbed" into the preceding consonant, causing changes such as labialization and palatalization.
As with most Southern African Bantu languages, the "composite" or "secondary" vowels *e and *o have become and. These usually behave as two phonemes, although there are enough exceptions to justify the claim that they have become four separate phonemes in the Sotho–Tswana languages.
Additionally, the first-degree and second-degree vowels have not merged as in many other Bantu languages, resulting in a total of 9 phonemic vowels.
Almost uniquely among the Sotho–Tswana languages, Sesotho has adopted clicks. There is one place of articulation, alveolar, and three manners and phonations: tenuis, aspirated, and nasalized. These most probably came with loanwords from the Khoisan and Nguni languages, though they also occur in various ideophones and in various words which do not exist in these languages.
These clicks also appear in environments which are rare or non-existent in the Nguni and Khoisan languages, such as a syllabic nasal followed by a nasalized click, a syllabic nasal followed by a tenuis click, and a syllabic nasal followed by an aspirated click.

Vowels

Sesotho has a large inventory of vowels compared with many other Bantu languages. However, the nine phonemic vowels are collapsed into only five letters in the Sesotho orthography. The two close vowels i and u are very high and are better approximated by French vowels than English vowels. That is especially true for, which, in English, is often noticeably more front and can be transcribed as or in the IPA; that is absent from Sesotho.

Consonants

The Sotho–Tswana languages are peculiar among the Bantu family in that most do not have any prenasalized consonants and have a rather large number of heterorganic compounds. Sesotho, uniquely among the recognised and standardised Sotho–Tswana languages, also has click consonants, which were acquired from Khoisan and Nguni languages.
  1. d is an allophone of l, occurring only before the close vowels. Dialectal evidence shows that, in the Sotho–Tswana languages, l was originally pronounced as a retroflex flap before the two close vowels.
Sesotho makes a three-way distinction between lightly ejective, aspirated and voiced stops in several places of articulation.
Place of articulation | IPA | Notes | Orthography | Example
bilabial |  | unaspirated: spit | p | pitsa
bilabial |  |  | ph | phuputso
bilabial |  | this consonant is fully voiced | b | lebese
alveolar |  | unaspirated: stalk | t | botala
alveolar |  |  | th | tharollo
alveolar |  | an allophone of l, occurring only before the close vowels | d | Modimo
velar |  | unaspirated: skill | k | boikarabelo
velar |  | fully aspirated: kill; occurring mostly in old loanwords from Nguni languages and in ideophones | kh | lekhokho

Sesotho possesses four simple nasal consonants. All of these can be syllabic and the syllabic velar nasal may also appear at the end of words.
Place of articulation | IPA | Notes | Orthography | Example
bilabial |  |  | m | ho mamaretsa
bilabial |  | syllabic version of the above | m | mpa
alveolar |  |  | n | lenaneo
alveolar |  | syllabic version of the above | n | nna
alveolo-palatal |  | a bit like Spanish el niño | ny | ho nyala
alveolo-palatal |  | syllabic version of the above | n | nnyeo
velar |  | can occur initially | ng | lengolo
velar |  | syllabic version of the above | n | ho nka

The following approximants occur. All instances of and most probably come from original close,,, and vowels or Proto-Bantu *u, *i, *û, and *î.
Note that when w appears as part of a syllable onset, this actually indicates that the consonant is labialized.
Place of articulation | IPA | Notes | Orthography | Example
labial-velar |  |  | w | sewa
lateral |  | never occurs before close vowels, where it becomes d | l | selepe
lateral |  | a syllabic version of the above; note that if the sequence ll is followed by a close vowel (i or u), the second l is pronounced normally, not as d | l | mollo
palatal |  |  | y | ho tsamaya

The following fricatives occur. The glottal fricative is often voiced between vowels, making it barely noticeable. The alternative orthography used for the velar fricative is due to some loanwords from Afrikaans and ideophones which were historically pronounced with velar fricatives, distinct from the velar affricate. The voiced postalveolar affricate sometimes occurs as an alternative to the fricative.
Place of articulation | IPA | Notes | Orthography | Example
labiodental |  |  | f | ho fumana
alveolar |  |  | s | Sesotho
postalveolar |  |  | sh | Moshweshwe
postalveolar |  |  | j | mojalefa
velar |  |  | kg; also g in Gauta and some ideophones such as gwa | sekgo
glottal |  |  | h | ho aha

There is one trill consonant. Originally, this was an alveolar rolled lingual, but today most individuals pronounce it at the back of the tongue, usually at the uvular position. The uvular pronunciation is largely attributed to the influence of French missionaries at Morija in Lesotho. Just like the French version, the position of this consonant is somewhat unstable and often varies even in individuals, but it generally differs from the "r"'s of most other South African language communities. The most stereotypical French-like pronunciations are found in certain rural areas of Lesotho, as well as some areas of Soweto.
Place of articulation | IPA | Notes | Orthography | Example
uvular |  | soft Parisian-type r | r | moriri

Sesotho has a relatively large number of affricates. The velar affricate, which was standard in Sesotho until the early 20th century, now only occurs in some communities as an alternative to the more common velar fricative.
Place of articulation | IPA | Notes | Orthography | Example
alveolar |  |  | ts | ho tsokotsa
alveolar |  | aspirated | tsh | ho tshoha
lateral |  |  | tl | ho tlatsa
lateral |  | occurs only as a nasalized form of hl or as an alternative to it | tlh | tlhaho
postalveolar |  |  | tj | ntja
postalveolar |  |  | tjh | ho ntjhafatsa
postalveolar |  | this is an alternative to the fricative | j | ho ja
velar |  | alternative to the velar fricative | kg | kgale

The following click consonants occur. In common speech they are sometimes substituted with dental clicks. Even in standard Sesotho the nasal click is usually substituted with the tenuis click. is also used to indicate a syllabic nasal followed by an ejective click, while is used for a syllabic nasal followed by a nasal click.
Place of articulation | IPA | Notes | Orthography | Example
postalveolar |  | ejective | q | ho qoqa
postalveolar |  | nasal; this is often pronounced as an ejective click | nq | ho nqosa
postalveolar |  | aspirated | qh | leqheku

The following heterorganic compounds occur. They are often substituted with other consonants, although there are a few instances when some of them are phonemic and not just allophonic. These are not considered consonant clusters.
In non-standard speech these may be pronounced in a variety of ways. bj may be pronounced and pj may be pronounced. pj may also sometimes be pronounced, which may alternatively be written ptj, though this is not to be considered standard.
Place of articulation | IPA | Notes | Orthography | Example
bilabial-palatal |  | alternative tj | pj | ho pjatla
bilabial-palatal |  | aspirated version of the above; alternative tjh | pjh | mpjhe
bilabial-palatal |  | alternative j | bj | ho bjarana
labiodental-palatal |  | only found in short passives of verbs ending with fa; alternative sh | fj | ho bofjwa

Syllable structure

Sesotho syllables tend to be open, with syllabic nasals and the syllabic approximant l also allowed. Unlike almost all other Bantu languages, Sesotho does not have prenasalized consonants.
  1. The onset may be any consonant, a labialized consonant, an approximant, or a vowel.
  2. The nucleus may be a vowel, a syllabic nasal, or the syllabic l.
  3. No codas are allowed.
The possible syllables are:
Note that heterorganic compounds count as single consonants, not consonant clusters.
Additionally, the following phonotactic restrictions apply:
  1. A consonant may not be followed by the palatal approximant.
  2. Neither the labio-velar approximant nor a labialized consonant may be followed by a back vowel at any time.
Syllabic l occurs only due to a vowel being elided between two l's:
There are no contrastive long vowels in Sesotho, the rule being that juxtaposed vowels form separate syllables. Originally there might have been a consonant between vowels which was eventually elided that prevented coalescence or other phonological processes.
Other Bantu languages have rules against vowel juxtaposition, often inserting an intermediate approximant if necessary.
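The syllable template and phonotactic restrictions described in this section can be illustrated with a small sketch that splits an orthographic word into syllables. It is only an approximation under stated assumptions: the grapheme inventory is copied from the orthography columns of the consonant tables in this article (plus hl, which the affricate table mentions), the five vowel letters stand in for all nine vowel phonemes, and the names graphemes and syllabify are hypothetical. A nasal is treated as syllabic whenever it stands before another consonant, and ng also when it is word-final.

```python
VOWELS = set("aeiou")
BACK_VOWELS = {"o", "u"}
NASALS = {"m", "n", "ny", "ng"}

# Consonant graphemes from the article's tables (assumed inventory),
# sorted so that the longest spelling is tried first (tsh before ts).
CONSONANTS = sorted(
    [
        "p", "ph", "b", "t", "th", "d", "k", "kh",
        "m", "n", "ny", "ng", "w", "l", "y",
        "f", "s", "sh", "j", "kg", "h", "hl", "r",
        "ts", "tsh", "tl", "tlh", "tj", "tjh",
        "q", "nq", "qh", "pj", "pjh", "bj", "fj",
    ],
    key=len,
    reverse=True,
)


def graphemes(word):
    """Split an orthographic word into vowel and consonant graphemes."""
    word = word.lower()
    tokens, i = [], 0
    while i < len(word):
        if word[i] in VOWELS:
            tokens.append(word[i])
            i += 1
            continue
        match = next((c for c in CONSONANTS if word.startswith(c, i)), None)
        if match is None:
            raise ValueError(f"unknown grapheme at {word[i:]!r}")
        tokens.append(match)
        i += len(match)
    return tokens


def syllabify(word):
    """Return the syllables of a word; raise ValueError if the template is violated."""
    tokens = graphemes(word)
    syllables, onset = [], ""
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        before_consonant = nxt is not None and nxt not in VOWELS and nxt != "w"
        if tok in VOWELS:
            # w and labialized consonants may not precede a back vowel.
            if onset.endswith("w") and tok in BACK_VOWELS:
                raise ValueError("w or a labialized consonant before a back vowel")
            syllables.append(onset + tok)
            onset = ""
        elif not onset and tok in NASALS and (
            before_consonant or (tok == "ng" and nxt is None)
        ):
            syllables.append(tok)  # syllabic nasal
        elif not onset and tok == "l" and nxt == "l":
            syllables.append(tok)  # syllabic l before another l
        elif not onset or (tok == "w" and not onset.endswith("w")):
            onset += tok  # plain onset, or labialization written with w
        else:
            raise ValueError(f"unexpected consonant sequence in {word!r}")
    if onset:
        raise ValueError(f"{word!r} would end in a coda")
    return syllables
```

With these assumptions, syllabify("mollo") gives mo, l, lo and syllabify("thabeng") gives tha, be, ng, while juxtaposed vowels, as in boikarabelo, come out as separate syllables.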

Phonological processes

Vowels and consonants very often influence one another, resulting in predictable sound changes. Most of these changes are either vowels changing vowels, nasals changing consonants, or approximants changing consonants. The sound changes are nasalization, palatalization, alveolarization, velarization, vowel elision, vowel raising, and labialization. Sesotho nasalization and vowel raising are especially unusual since, unlike most processes in most languages, they actually decrease the sonority of the phonemes.
Nasalization is a process in Bantu languages by which, in certain circumstances, a prefixed nasal becomes assimilated to a succeeding consonant and causes changes in the form of the phone to which it is prefixed. In the Sesotho language series of articles it is indicated by.
In Sesotho it is a fortition process and usually occurs in the formation of class 9 and 10 nouns, in the use of the objectival concord of the first person singular, in the use of the adjectival and enumerative concords of some noun classes, and in the forming of reflexive verbs.
Very roughly speaking, voiced consonants become devoiced and fricatives lose their fricative quality.
Vowels and the approximant get a in front of them
The syllabic nasal causing the change is usually dropped, except for monosyllabic stems and the first person objectival concord. Reflexive verbs don't show a nasal.
Other changes may occur due to contractions in verb derivations:
Nasal homogeneity consists of two points:
  1. When a consonant is preceded by a nasal it will undergo nasalization, if it supports it.
  2. When a nasal is immediately followed by another consonant with no vowel between them, the nasal will change to a nasal in approximately the same position as the following consonant, after the consonant has undergone nasal permutation. If the consonant is already a nasal then the previous nasal will simply change to the same nasal.
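The second point of nasal homogeneity can be sketched as a lookup from the place of articulation of the following consonant to the syllabic nasal that precedes it. This is a minimal illustration, not a full account: the consonant list is a small subset taken from the tables in this article, the function names homorganic_nasal and prefix_nasal are hypothetical, and the consonant passed in is assumed to have already undergone nasal permutation. The spellings follow the nasal table, in which only the bilabial syllabic nasal is written m and all others are written n.

```python
# Place of articulation for a few consonants (orthographic spellings),
# copied from the tables in this article. The selection is illustrative.
PLACE = {
    "p": "bilabial", "ph": "bilabial", "b": "bilabial", "m": "bilabial",
    "t": "alveolar", "th": "alveolar", "d": "alveolar", "n": "alveolar",
    "ts": "alveolar", "tsh": "alveolar",
    "ny": "alveolo-palatal",
    "k": "velar", "kh": "velar", "kg": "velar", "ng": "velar",
}

# Spelling of the syllabic nasal at each place, as in the nasal table.
SYLLABIC_NASAL_SPELLING = {
    "bilabial": "m",
    "alveolar": "n",
    "alveolo-palatal": "n",
    "velar": "n",
}


def homorganic_nasal(consonant):
    """Return (place, spelling) of the syllabic nasal used before a consonant."""
    place = PLACE[consonant]
    return place, SYLLABIC_NASAL_SPELLING[place]


def prefix_nasal(onset, rest):
    """Spell a stem with a homorganic syllabic nasal placed before its onset."""
    _, letter = homorganic_nasal(onset)
    return letter + onset + rest
```

For instance, prefix_nasal("p", "a") spells mpa and prefix_nasal("k", "a") spells nka, matching the examples in the nasal table; in the second case the nasal is velar in pronunciation even though it is written n.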
----
Palatalization is a process in certain Bantu languages where a consonant becomes a palatal consonant.
In Sesotho it usually occurs with the short form of passive verbs and the diminutives of nouns, adjectives, and relatives.
For example:
----
Alveolarization is a process whereby a consonant becomes an alveolar consonant. It occurs in noun diminutives, the diminutives of colour adjectives, and in the pronouns and concords of noun classes with a di- or di- prefix. This results in either or.
Examples:
Other changes may occur due to phonological interactions in verbal derivatives:
The alveolarization which changes Sesotho to is by far the most commonly applied phonetic process in the language. It's regularly applied in the formation of some class 8 and 10 concords and in numerous verbal derivatives.
----
Velarization in Sesotho is a process whereby certain sounds become velar consonants due to the intrusion of an approximant. It occurs with verb passives, noun diminutives, the diminutives of relatives, and the formation of some class 1 and 3 prefixes.
For example:
----
Elision of vowels occurs in Sesotho less often than in those Bantu languages which have vowel "pre-prefixes" before the noun class prefixes, but there are still instances where it regularly and actively occurs.
There are two primary types of regular vowel elision:
  1. The vowels,, and may be removed from between two instances of, thereby causing the first to become syllabic. This actively occurs with verbs, and has historically occurred with some nouns.
  2. When forming class 1 or 3 nouns from noun stems beginning with the middle is removed and the is contracted into the, resulting in. This actively occurs with nouns derived from verbs commencing with and has historically occurred with many other nouns.
For example:
----
Vowel raising is an uncommon form of vowel harmony where a non-open vowel is raised in position by a following vowel at a higher position. The first variety — in which the open-mid vowels become close-mid — is commonly found in most Southern African Bantu languages. In the 9-vowel Sotho–Tswana languages, a much less common process also occurs where the near-close vowels become raised to a position slightly lower than the close vowels without ATR.


Mid vowel raising is a process where becomes and becomes under the influence of close vowels or consonants that contain "hidden" close vowels.
These changes are usually recursive to varying depths within the word, though, being a left spreading rule, it is often bounded by the difficulty of "foreseeing" the raising syllable:
Additionally, a right-spreading form occurs when a close-mid vowel is on the penultimate syllable and, due to some inflection or derivational process, is followed by an open-mid vowel. In this case the vowel on the final syllable is raised. This does not happen if the penultimate syllable is close.
These vowels can occur phonemically, however, and may thus be considered to be separate phonemes:


Close vowel raising is a process which occurs under much less common circumstances. Near-close becomes and near-close becomes when immediately followed by a syllable containing the close vowels or. Unlike mid vowel raising, this process is not iterative and is only caused directly by the close vowels.
Since these changes are allophonic, the Sotho–Tswana languages are rarely said to have 11 vowels.
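The left-spreading, iterative character of mid vowel raising can be sketched over a list of syllable nuclei. The sketch makes several simplifying assumptions that go beyond the text: the IPA symbols ɛ, ɔ, e, o, i and u are supplied here for the open-mid, close-mid and close vowels; only close vowels and vowels already raised in the same pass trigger raising; any other vowel blocks the spread; and "hidden" close vowels in consonants, the right-spreading variant and the bounded depth of the rule are all ignored. The function name raise_mid_vowels is hypothetical.

```python
RAISE_MID = {"ɛ": "e", "ɔ": "o"}  # open-mid -> close-mid (assumed symbols)
CLOSE = {"i", "u"}


def raise_mid_vowels(nuclei):
    """Apply leftward, iterative mid vowel raising to a list of syllable nuclei."""
    result = list(nuclei)
    trigger = False  # does the syllable to the right license raising?
    # Walk from the last syllable to the first, since the rule spreads leftward.
    for i in range(len(result) - 1, -1, -1):
        vowel = result[i]
        if vowel in RAISE_MID and trigger:
            result[i] = RAISE_MID[vowel]
            trigger = True  # a raised vowel passes the raising further left
        else:
            trigger = vowel in CLOSE
    return result
```

Under these assumptions, the nuclei ɛ, ɔ, i come out as e, o, i, whereas in ɛ, a, i the intervening a stops the spread and the first vowel stays open-mid.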
----
Labialization is a modification of a consonant due to the action of a bilabial element which persists throughout the articulation of the consonant and is not merely a following semivowel. This labialization results in the consonant being pronounced with rounded lips and with attenuated high frequencies.
It may be traced to an original or being "absorbed" into the preceding consonant when the syllable is followed by another vowel. The consonant is labialized and the transition from the labialized syllable onset to the nucleus vowel sounds like a bilabial semivowel. Unlike in languages such as Chishona and Tshivenda, Sesotho labialization does not result in "whistling" of any consonants.
Almost all consonants may be labialized, the exceptions being labial stops and fricatives, the bilabial and palatal nasals, and the voiced alveolar allophone of l. Additionally, syllabic nasals and the syllabic l are never directly labialized. Note that the unvoiced heterorganic doubled articulant fricative only occurs labialized.
Due to the inherent bilabial semivowel, labialized consonants never appear before back vowels:

Tonology

Sesotho is a tonal language with two contrasting tones, low and high. Closer analysis suggests, however, that only the high tones are explicitly specified on syllables in the speaker's mental lexicon, and that low tones appear when a syllable is tonally under-specified. Unlike the tonal systems of languages such as Mandarin, where each syllable has an essentially fixed tone, the tonal systems of the Niger–Congo languages are more complex: several "tonal rules" manipulate the underlying high tones before the words are spoken. These include special rules which, like the grammatical or syntactic rules that operate on words and morphemes, may change the tones of specific words depending on the meaning one wishes to convey.

Stress

The word stress system of Sesotho is quite simple. Each complete Sesotho word has exactly one main stressed syllable.
Except for the second form of the first demonstrative pronoun, certain formations involving certain enclitics, polysyllabic ideophones, most compounds, and a handful of other words, the main stress falls on the penult.
The stressed syllable is slightly longer and has a falling tone. Unlike in English, stress does not affect vowel quality or height.
This type of stress system occurs in most of those Eastern and Southern Bantu languages which have lost contrastive vowel length.
The second form of the first demonstrative pronoun has the stress on the final syllable. Some enclitics can leave the stress of the original word in place, causing the resultant word to have the stress on the antepenultimate syllable. Ideophones, which tend not to obey the phonetic laws which the rest of the language abides by, may also have irregular stress.
There is even at least one minimal pair: the adverb fela has regular stress, while the conjunctive fela has stress on the final syllable. This is certainly not enough evidence to justify making the claim that Sesotho is a stress accent language, though.
Because the stress falls on the penultimate syllable, Sesotho, like other Bantu languages, tends to avoid monosyllabic words and often employs certain prefixes and suffixes to make the word disyllabic.
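The default stress rule described in this section is simple enough to state as a short sketch. The function name stressed_syllable and the final_stress flag are hypothetical; the flag stands in for lexical exceptions such as the conjunctive fela, and the antepenultimate stress produced by some clitics is not modelled. Treating a monosyllabic word as stressed on its only syllable is likewise an assumption made here.

```python
def stressed_syllable(syllables, final_stress=False):
    """Return the index of the main stressed syllable in a list of syllables."""
    if not syllables:
        raise ValueError("a word needs at least one syllable")
    if final_stress or len(syllables) == 1:
        return len(syllables) - 1  # exceptional final stress, or a monosyllable
    return len(syllables) - 2  # regular penultimate stress
```

For moriri, syllabified as mo, ri, ri, the function returns index 1. For fela it returns index 0 for the adverb and, with final_stress=True, index 1 for the conjunctive.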