Chinese character classification

All Chinese characters are logograms, but several different types can be identified, based on the manner in which they are formed or derived. There are a handful which derive from pictographs and a number which are ideographic in origin, including compound ideographs, but the vast majority originated as phono-semantic compounds. The other categories in the traditional system of classification are rebus or phonetic loan characters and "derivative cognates". Modern scholars have proposed various revised systems, rejecting some of the traditional categories.
In older literature, Chinese characters in general may be referred to as ideograms, due to the misconception that characters represent ideas directly, whereas in fact they do so only through association with the spoken word.

Traditional classification

Traditional Chinese lexicography divided characters into six categories. This classification is known from Xu Shen's second-century dictionary Shuowen Jiezi, but did not originate there. The phrase liushu ("six writings") first appeared in the Rites of Zhou, though it may not have originally referred to methods of creating characters. When Liu Xin edited the Rites, he glossed the term with a list of six types without examples.
Slightly different lists of six types are given in the Book of Han of the first century CE, and by Zheng Zhong in his first-century commentary on the Rites of Zhou, as quoted by Zheng Xuan.
Xu Shen illustrated each of Liu's six types with a pair of characters in the postface to the Shuowen Jiezi.
The traditional classification is still taught but is no longer the focus of modern lexicographic practice. Some categories are not clearly defined, nor are they mutually exclusive: the first four refer to structural composition, while the last two refer to usage. For this reason, some modern scholars view them as six principles of character formation rather than six types of characters.
The earliest significant, extant corpus of Chinese characters is found on turtle shells and the bones of livestock, chiefly the scapula of oxen, for use in pyromancy, a form of divination. These ancient characters are called oracle bone script. Roughly a quarter of these characters are pictograms while the rest are either phono-semantic compounds or compound ideograms. Despite millennia of change in shape, usage and meaning, a few of these characters remain recognizable to the modern reader of Chinese.
At present, more than 90% of Chinese characters are phono-semantic compounds, constructed out of elements intended to provide clues to both the meaning and the pronunciation. However, as both the meanings and pronunciations of the characters have changed over time, these components are no longer reliable guides to either meaning or pronunciation. The failure to recognize the historical and etymological role of these components often leads to misclassification and false etymology. A study of the earliest sources is often necessary for an understanding of the true composition and etymology of any particular character. Reconstructing Middle and Old Chinese phonology from the clues present in characters is part of Chinese historical linguistics. In Chinese, this field is called yinyunxue (音韻學).

Pictograms

Roughly 600 Chinese characters are pictograms – stylised drawings of the objects they represent. These are generally among the oldest characters. A few, indicated below with their earliest forms, date back to oracle bones from the twelfth century BCE.
These pictograms became progressively more stylized and lost their pictographic flavor, especially as they made the transition from the oracle bone script to the Seal Script of the Eastern Zhou, but also to a lesser extent in the transition to the clerical script of the Han Dynasty. The table below summarises the evolution of a few Chinese pictographic characters.

Simple ideograms

Ideograms express an abstract idea through an iconic form, including iconic modification of pictographic characters. In the examples below, low numerals are represented by the appropriate number of strokes, directions by an iconic indication above and below a line, and the parts of a tree by marking the appropriate part of a pictogram of a tree.

Character	一	二	三	上	下	本	末
Pinyin	yī	èr	sān	shàng	xià	běn	mò
Translation	one	two	three	up	below	root	apex

N.B.:

本 běn - a tree with the base indicated by an extra stroke.
末 mò - the reverse of 本 běn, a tree with the top highlighted by an extra stroke.

Compound ideographs

Compound ideographs, also called associative compounds or logical aggregates, combine two or more pictographic or ideographic characters to suggest the meaning of the word to be represented.
In the postface to the Shuowen Jiezi, Xu Shen gave two examples:

武 "military", formed from 戈 "dagger-axe" and 止 "foot"
信 "truthful", formed from 人 "person" (later reduced to 亻) and 言 "speech"

Other characters commonly explained as compound ideographs include:

林 lín, composed of two trees
森 sēn, composed of three trees
休 xiū, depicting a man by a tree
采 cǎi, depicting a hand on a bush
看 kàn, depicting a hand above an eye
莫 mù, depicting the sun disappearing into the grass, originally written as 茻 "thick grass" enclosing 日

Many characters formerly classed as compound ideographs are now believed to have been mistakenly identified. For example, Xu Shen's example 信, representing the word xìn < *snjins "truthful", is now usually considered a phono-semantic compound, with 人 rén < *njin as phonetic and 言 "speech" as signific. In many cases, reduction of a character has obscured its original phono-semantic nature. For example, the character 明 "bright" is often presented as a compound of 日 "sun" and 月 "moon". However, this form is probably a simplification of an attested alternative form 朙, which can be viewed as a phono-semantic compound.
Peter Boodberg and William Boltz have argued that no ancient characters were compound ideographs. Boltz accounts for the remaining cases by suggesting that some characters could represent multiple unrelated words with different pronunciations, as in Sumerian cuneiform and Egyptian hieroglyphs, and that the compound characters are actually phono-semantic compounds based on an alternative reading that has since been lost. For example, the character 安 ān < *ʔan "peace" is often cited as a compound of 宀 "roof" and 女 "woman". Boltz speculates that the character 女 could represent both the word nǚ < *nrjaʔ "woman" and the word ān < *ʔan "settled", and that the roof signific was later added to disambiguate the latter usage. In support of this second reading, he points to other characters with the same 女 component that had similar Old Chinese pronunciations: 妟 yàn "tranquil", 奻 nuán "to quarrel" and 姦 jiān < *kran "licentious". Other scholars reject these arguments for alternative readings and consider other explanations of the data more likely, for example viewing 妟 as a reduced form of 晏, which can be analysed as a phono-semantic compound with 安 as phonetic. They consider the characters 奻 and 姦 to be implausible phonetic compounds, both because the proposed phonetic and semantic elements are identical and because the widely differing initial consonants *ʔ- and *n- would not normally be accepted in a phonetic compound. Notably, Christopher Button has shown how more sophisticated palaeographical and phonological analyses can account for Boodberg's and Boltz's proposed examples without relying on polyphony.
While compound ideographs are a limited source of Chinese characters, they form many of the kokuji created in Japan to represent native words.
Examples include:

働 hatara(ku) "to work", formed from 人 "person" and 動 "move"
峠 tōge "mountain pass", formed from 山 "mountain", 上 "up" and 下 "down"

As Japanese creations, such characters had no Chinese or Sino-Japanese readings, but a few have been assigned invented Sino-Japanese readings. For example, the common character 働 has been given the reading dō, and was even borrowed into written Chinese in the 20th century with the reading dòng.

Rebus (phonetic loan) characters

Jiajie are characters that are "borrowed" to write another homophonous or near-homophonous morpheme. For example, the character 來 was originally a pictogram of a wheat plant and meant *m-rˁək "wheat". As this was pronounced similarly to the Old Chinese word *mə.rˁək "to come", 來 was also used to write this verb. Eventually the more common usage, the verb "to come", became established as the default reading of the character 來, and a new character, 麥, was devised for "wheat". When a character is used as a rebus this way, it is called a 假借字 jiǎjièzì, translatable as "phonetic loan character" or "rebus" character.
As in Egyptian hieroglyphs and Sumerian cuneiform, early Chinese characters were used as rebuses to express abstract meanings that were not easily depicted. Thus many characters stood for more than one word. In some cases the extended use would take over completely, and a new character would be created for the original meaning, usually by modifying the original character with a radical. For instance, 又 yòu originally meant "right hand; right" but was borrowed to write the abstract word yòu "again; moreover". In modern usage, the character 又 exclusively represents yòu "again" while 右, which adds the "mouth radical" to 又, represents yòu "right". This process of graphic disambiguation is a common source of phono-semantic compound characters.

Pictograph or ideograph	Rebus word	Original word	New character for original word
四	sì "four"	sì "nostrils"	泗
枼	yè "flat, thin"	yè "leaf"	葉
北	běi "north"	bèi "back"	背
要	yào "to want"	yāo "waist"	腰
少	shǎo "few"	shā "sand"	沙 and 砂
永	yǒng "forever"	yǒng "swim"	泳

While the word jiajie dates from the Han Dynasty, the related term tongjia is first attested from the Ming Dynasty. The two terms are commonly used as synonyms, but there is a linguistic distinction: a jiajiezi is a phonetic loan character for a word that did not originally have a character, whereas a tongjia is an interchangeable character used in place of an existing homophonous character. According to Bernhard Karlgren, "One of the most dangerous stumbling-blocks in the interpretation of pre-Han texts is the frequent occurrence of [jiajie], loan characters."

Phono-semantic compound characters

形聲 xíng shēng or 諧聲 xié shēng

These form over 90% of Chinese characters. They were created by combining two components:

a phonetic component on the rebus principle, that is, a character with approximately the correct pronunciation.
a semantic component, also called a determinative, one of a limited number of characters which supplied an element of meaning. In most cases this is also the radical under which a character is listed in a dictionary.

As in ancient Egyptian writing, such compounds eliminated the ambiguity caused by phonetic loans.
This process can be repeated, with a phono-semantic compound character itself being used as a phonetic in a further compound, which can result in quite complex characters, such as 劇.
Often, the semantic component is on the left, but many other arrangements are possible.

Examples

As an example, a verb meaning "to wash oneself" is pronounced mù. This happens to sound the same as the word mù "tree", which was written with the simple pictograph 木. The verb mù could simply have been written 木, like "tree", but to disambiguate, it was combined with the character for "water", giving some idea of the meaning. The resulting character eventually came to be written 沐 mù. Similarly, the water determinative was combined with 林 lín to produce the water-related homophone 淋 lín.

Determinative	Rebus	Compound
氵 "water"	木 mù	沐 mù "to wash oneself"
氵 "water"	林 lín	淋 lín "to pour"

However, the phonetic component is not always as meaningless as this example would suggest. Rebuses were sometimes chosen that were compatible semantically as well as phonetically. It was also often the case that the determinative merely constrained the meaning of a word which already had several. 菜 cài "vegetable" is a case in point. The determinative 艹 for plants was combined with 采 cǎi "harvest". However, 采 cǎi does not merely provide the pronunciation. In classical texts it was also used to mean "vegetable". That is, 采 underwent semantic extension from "harvest" to "vegetable", and the addition of 艹 merely specified that the latter meaning was to be understood.

Determinative	Rebus	Compound
艹 "plant"	采 cǎi "harvest"	菜 cài "vegetable"
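
The structure of such compounds can be sketched as a small record pairing a determinative with a phonetic. The following is only an illustrative Python model, not a standard library or dataset: the class name `Compound`, the helper names, and the three-way labels are assumptions made for this sketch, and the readings are the modern Mandarin ones for the three examples discussed in this section.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record for one phono-semantic compound."""
    char: str              # the compound character
    determinative: str     # semantic component
    phonetic: str          # component borrowed for its sound (rebus)
    reading: str           # modern Mandarin reading of the compound
    phonetic_reading: str  # modern Mandarin reading of the phonetic alone


def strip_tone(pinyin: str) -> str:
    """Drop tone diacritics, e.g. 'cài' -> 'cai'.

    Simplification: this also flattens ü to u, which is acceptable
    for the examples below but not for pinyin in general.
    """
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def phonetic_fit(c: Compound) -> str:
    """Classify how well the phonetic predicts the compound's reading."""
    reading = unicodedata.normalize("NFC", c.reading)
    hint = unicodedata.normalize("NFC", c.phonetic_reading)
    if reading == hint:
        return "exact"
    if strip_tone(reading) == strip_tone(hint):
        return "tone differs"
    return "diverged"


# Examples taken from this section.
EXAMPLES = [
    Compound("沐", "氵", "木", "mù", "mù"),
    Compound("淋", "氵", "林", "lín", "lín"),
    Compound("菜", "艹", "采", "cài", "cǎi"),
]

for c in EXAMPLES:
    print(c.char, "=", c.determinative, "+", c.phonetic, "->", phonetic_fit(c))
```

Under these assumptions, 沐 and 淋 are classified as "exact" and 菜 as "tone differs", which mirrors the point that the phonetic is a clue to the reading rather than a guarantee.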

Sound change

Originally characters sharing the same phonetic had similar readings, though they have now diverged substantially. Linguists rely heavily on this fact to reconstruct the sounds of Old Chinese. Contemporary foreign pronunciations of characters are also used to reconstruct historical Chinese pronunciation, chiefly that of Middle Chinese.
When people try to read an unfamiliar two-part character, they will typically follow the rule of thumb to "read the side" and take one component to be a phonetic, which often results in errors.
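
The "read the side" rule of thumb can be sketched as a lookup over a character's components. This is a deliberately naive illustration: the mini-lexicon `KNOWN_READINGS`, the set `COMMON_DETERMINATIVES`, and the function name `read_the_side` are all hypothetical, and the component lists are supplied by hand rather than derived from any real decomposition database.

```python
from typing import Optional, Sequence

# Hypothetical mini-lexicon: modern Mandarin readings of simple characters.
KNOWN_READINGS = {"木": "mù", "林": "lín", "人": "rén", "言": "yán"}

# Components a reader would usually treat as determinatives and skip.
COMMON_DETERMINATIVES = {"氵", "艹", "亻"}


def read_the_side(components: Sequence[str]) -> Optional[str]:
    """Guess a reading by taking the first familiar non-determinative
    component to be the phonetic. Returns None if nothing is familiar."""
    for part in components:
        if part in COMMON_DETERMINATIVES:
            continue
        if part in KNOWN_READINGS:
            return KNOWN_READINGS[part]
    return None


# 沐 = 氵 + 木: the guess matches the actual reading mù.
print(read_the_side(["氵", "木"]))

# 信 = 亻 + 言: the guess is yán, but the actual reading is xìn.
print(read_the_side(["亻", "言"]))
```

The second call shows the kind of error described above: the component taken to be the phonetic is not the historical phonetic, and even the historical one (人 rén) no longer matches the modern reading.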

Simplification

Since the phonetic elements of many characters no longer accurately represent their pronunciations, when the People's Republic of China simplified characters, they often substituted a phonetic that was not only simpler to write, but more accurate for a modern reading in Mandarin as well. This has sometimes resulted in forms which are less phonetic than the original ones in varieties of Chinese other than Mandarin.

Derivative cognates

The derivative cognate is the smallest category and also the least understood. In the postface to the Shuowen Jiezi, Xu Shen gave as an example the characters 考 kǎo "to verify" and 老 lǎo "old", which had similar Old Chinese pronunciations and may have had the same etymological root, meaning "elderly person", but became lexicalized into two separate words. The term does not appear in the body of the dictionary, and may have been included in the postface out of deference to Liu Xin. It is often omitted from modern systems.

Modern classifications

The liushu had been the standard classification scheme for Chinese characters since Xu Shen's time. Generations of scholars modified it without challenging the basic concepts. Tang Lan was the first to dismiss the liushu, offering his own sanshu ("three principles"), namely xiàngxíng (form-representing), xiàngyì (meaning-representing) and xíngshēng (phono-semantic). This classification was later criticised by Chen Mengjia and Qiu Xigui, each of whom offered his own sanshu.
