Written Cantonese


Written Cantonese is the written form of Cantonese, and the most complete written form of Chinese after those for Mandarin Chinese and Classical Chinese. Written Chinese was originally developed for Classical Chinese, which was the main literary language of China until the 19th century. Written vernacular Chinese first appeared in the 17th century, and a written form of Mandarin became standard throughout China in the early 20th century. While the Mandarin form can in principle be read and spoken word for word in other Chinese varieties, it is poorly intelligible, or even incomprehensible, to non-Mandarin speakers because of differences in idioms, grammar and usage. Modern Cantonese speakers have therefore developed their own written script, sometimes creating new characters for words that either do not exist or have been lost in standard Chinese.
With the advent of the computer and the standardization of character sets specifically for Cantonese, many printed materials in predominantly Cantonese-speaking areas of the world are now written in these Cantonese characters to cater to the local population.

History

Before the 20th century, the standard written language of China was Classical Chinese, whose grammar and vocabulary are based on Old Chinese, the language used in ancient China. However, while this written standard remained essentially static for over two thousand years, the actual spoken language diverged further and further from it. Some writings based on local vernacular speech did exist, but these were rare. In the early 20th century, Chinese reformers like Hu Shih saw the need for language reform and championed the development of a vernacular that allowed modern Chinese speakers to write the language the same way they speak it. The vernacular language movement took hold, and the written language was standardised as vernacular Chinese. Mandarin was chosen as the basis for the new standard.
The standardisation and adoption of written Mandarin pre-empted the development and standardisation of vernaculars based on other varieties of Chinese. No matter which dialect one spoke, one still wrote in standardised Mandarin for everyday writing. However, Cantonese is unique amongst the non-Mandarin varieties in having a widely used written form. Cantonese-speaking Hong Kong was a British colony isolated from mainland China until 1997, so most Hong Kong citizens do not speak Mandarin, and written Cantonese developed there as a means of informal communication. Still, Cantonese speakers must use standard written Chinese, or even literary Chinese, in most formal written communications, since written Cantonese may be unintelligible to speakers of other varieties of Chinese.
Historically, written Cantonese has been used in Hong Kong for legal proceedings in order to write down the exact spoken testimony of a witness, instead of paraphrasing spoken Cantonese into standard written Chinese. Its popularity and usage have been rising in the last two decades, with the late Wong Jim being one of the pioneers of its use as an effective written language. Written Cantonese has become quite popular in certain tabloids, online chat rooms, instant messaging, and even social networking websites. This has become even more evident since the rise of localism in Hong Kong in the 2010s, as localist media outlets publish their articles in Cantonese. Although most foreign movies and TV shows are subtitled in Standard Chinese, some, such as The Simpsons, are subtitled using written Cantonese. Newspapers have the news section written in Standard Chinese, but they may have editorials or columns that contain Cantonese discourses, and Cantonese characters are increasing in popularity on advertisements and billboards.
It has been stated that written Cantonese remains limited outside Hong Kong, including other Cantonese-speaking areas in Guangdong Province. However, colloquial Cantonese advertisements are sometimes seen in Guangdong, suggesting that written Cantonese is widely understood and is regarded favourably, at least in some contexts.
Some sources will use only colloquial Cantonese forms, resulting in text similar to natural speech. However, it is more common to use a mixture of colloquial forms and standard Chinese forms, some of which are alien to natural speech. Thus the resulting "hybrid" text lies on a continuum between two norms: standard Chinese and colloquial Cantonese as spoken.

Cantonese characters

Early sources

The scripts for Cantonese opera are a good source of well-documented written Cantonese words. "Readings in Cantonese colloquial: being selections from books in the Cantonese vernacular with free and literal translations of the Chinese character and romanized spelling" by James Dyer Ball has a bibliography of printed works available in Cantonese characters in the last decade of the nineteenth century. A few libraries have collections of so-called "wooden fish books" written in Cantonese characters. Facsimiles and plot summaries of a few of these have been published in Wolfram Eberhard's "Cantonese Ballads". See also "Cantonese love-songs, translated with introduction and notes" by Cecil Clementi, or a newer translation of these by Peter T. Morris in "Cantonese love songs: an English translation of Jiu Ji-yung's Cantonese songs of the early 19th century". Cantonese character versions of the Bible, Pilgrim's Progress, and Peep of Day, as well as simple catechisms, were published by mission presses. The special Cantonese characters used in all of these were not standardized and show wide variation.

Characters today

Written Cantonese contains many characters not used in standard written Chinese, both to transcribe words not present in the standard lexicon and to write some words from Old Chinese whose original forms have been forgotten. The government of Hong Kong attempted to standardize this character set in the 1990s, culminating in the release of the Hong Kong Supplementary Character Set for use in electronic communication. Despite this, there is still significant disagreement about which characters are correct in written Cantonese: many Cantonese words are descendants of Old Chinese words, but they are being replaced by newly invented Cantonese forms owing to the Hong Kong Government's lack of knowledge about some of these words.
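Whether a given Cantonese character is available in a legacy character set can be checked programmatically. The following is a minimal sketch, assuming Python's standard `big5` and `big5hkscs` codecs as stand-ins for plain Big5 and Big5 with the Hong Kong Supplementary Character Set; how closely these codecs track any particular release of the set is an assumption, and the helper name `encodable` and the sample list are illustrative only.

```python
def encodable(char: str, codec: str) -> bool:
    """Return True if `char` can be represented in the given codec."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Characters quoted in this article; illustrative, not exhaustive.
SAMPLE = ["係", "唔", "佢", "哋", "嘅", "冇", "嚟"]

for ch in SAMPLE:
    # Report the code point and whether each codec can encode the character.
    print(
        ch,
        f"U+{ord(ch):04X}",
        "big5:", encodable(ch, "big5"),
        "big5hkscs:", encodable(ch, "big5hkscs"),
    )
```

A character that is reported as encodable only under `big5hkscs` would be one that needed the supplementary set in Big5-based environments; in Unicode-based text all of the sample characters are available.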

Vocabulary

General estimates of vocabulary differences between Cantonese and Mandarin range from 30 to 50 percent. Donald B. Snow, the author of Cantonese as Written Language: The Growth of a Written Chinese Vernacular, wrote that "It is difficult to quantify precisely how different" the two vocabularies are. Snow wrote that the different vocabulary systems are the main difference between written Mandarin and written Cantonese. Ouyang Shan made a corpus-based estimate concluding that one third of the lexical items used in regular Cantonese speech do not exist in Mandarin, but that between the formal registers the differences were smaller. He analyzed a radio news broadcast and concluded that of its lexical items, 10.6% were distinctly Cantonese. Here are examples of differing lexical items in a sentence:
| Gloss | Written Cantonese | Standard Written Chinese |
| is | 係 haih | 是 sih |
| not | 唔 m̀h | 不 bāt |
| they/them | 佢哋 keúih-deih | 他們 tā-mùhn |
| (possessive marker) | 嘅 ge | 的 dīk |
| Is it theirs? | 係唔係佢哋嘅? haih-m̀h-haih keúih-deih ge? | 是不是他們的? Sih-bāt-sih tā-mùhn dīk? |

In the above table, the two Chinese sentences are grammatically identical: both use an A-not-A question to ask "Is it theirs?". The characters are all different, though they correspond one to one.
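Because the lexical items in this example correspond one to one, the Cantonese sentence can be converted to the Standard Chinese one by simple substitution. The sketch below illustrates this with only the four pairs from the table above; the function name `to_standard_chinese` is hypothetical, and such a word-for-word mapping is not a general translation method, since real sentences also differ in grammar and usage.

```python
# Substitution pairs taken from the example table above.
CANTONESE_TO_STANDARD = {
    "係": "是",
    "唔": "不",
    "佢哋": "他們",
    "嘅": "的",
}


def to_standard_chinese(text: str) -> str:
    """Replace each listed Cantonese item with its Standard Chinese counterpart."""
    # Try longer keys first so multi-character items such as 佢哋 match whole.
    keys = sorted(CANTONESE_TO_STANDARD, key=len, reverse=True)
    result = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                result.append(CANTONESE_TO_STANDARD[key])
                i += len(key)
                break
        else:
            # Punctuation and unlisted characters pass through unchanged.
            result.append(text[i])
            i += 1
    return "".join(result)


print(to_standard_chinese("係唔係佢哋嘅?"))  # prints 是不是他們的?
```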

Cognates

There are certain words that share a common root with standard written Chinese words. However, because they have diverged in pronunciation, tone, and/or meaning, they are often written using a different character. One example is the doublet 來 lòih and 嚟 lèih, meaning "to come." Both share the same meaning and usage, but because the colloquial pronunciation differs from the literary pronunciation, they are represented using two different characters. Some people argue that representing the colloquial pronunciation with a different character is superfluous, and would encourage using the same character for both forms since they are cognates.

Native words

Some Cantonese words have no equivalents in Mandarin, though equivalents may exist in Classical Chinese or other varieties of Chinese. Cantonese writers have from time to time invented or borrowed a new character when they were not aware of the original one. For example, some suggest that the common word 靚, which means "pretty" in Cantonese but "to look into the mirror" in Mandarin, is in fact the character 令.
Today those characters can mainly be found in ancient rime dictionaries such as Guangyun. Some scholars have made some "archaeological" efforts to find out what the "original characters" are. Often, however, these efforts are of little use to the modern Cantonese writer, since the characters so discovered are not available in the standard character sets provided to computer users, and many have fallen out of usage.
In Southeast Asia, Cantonese speakers may adopt local Malay words into their daily speech, for example using the term 鐳 for "money" where Hong Kong Cantonese speakers would say 錢.

Particles

Cantonese particles may be added to the end of a sentence or suffixed to verbs to indicate aspect. There are many such particles.

Loanwords

Some Cantonese loanwords are written in existing Chinese characters.
| Written form of Cantonese | Jyutping | English word | Written form of Mandarin |
| 巴士 | baa1 si2 | bus | 公車 / 公共汽車、公交车 |
| 的士 | dik1 si2 | taxi | 計程車 / 出租車 / 德士 |
| 多士 | do1 si6 | toast | 吐司 |
| 朱古力 | zyu1 gu1 lik1 | chocolate | 巧克力 |
| 三文治 | saam1 man4 zi6 | sandwich | 三明治 |
| 士多 | si6 do1 | store | 商店 |
| 士巴拿 | si6 baa1 naa2 | spanner /ˈspæn.ə/ | 扳手 |
| 士多啤梨 | si6 do1 be1 lei2 | strawberry | 草莓 |
| 啤梨 | be1 lei2 | pear | 梨子 |
| 沙士 | saa1 si6 | SARS | 嚴重急性呼吸道症候群 / 非典 |
| 拜拜 | baai1 baai3 | bye bye | 再見 |
| BB | bi4 bi1 | baby | 嬰兒 |
| 菲林 | fei1 lam2 | film | 膠卷 |
| 菲屎 | fei1 si2 | face /feɪs/ | 面子 |
| 三文魚 | saam1 man4 jyu4 | salmon | 鮭魚 |
| 沙律 | saa1 leot6 | salad /ˈsæləd/ | 沙拉 |
| 呔 | taai1 | 1. tire /ˈtaɪ̯ə/ 2. tie /taɪ/ | 1. 輪胎 2. 領帶 |
| 褒呔 | bou1 taai1 | bowtie /bəʊˈtaɪ/ | 蝴蝶型領結 |
|  | fei1 | fee /fiː/ |  |
|  | bo1 | ball /bɔːl/ |  |
| 哈囉 | haa1 lou3 | hello /həˈləʊ/ | 您好 |
| 迷你 | mai4 nei2 | mini /ˈmɪni/ |  |
| 摩登 | mo1 dang1 | modern /ˈmɒdən/ | 時尚、現代 |
| 肥佬 | fei4 lou2 | fail /feɪl/ | 不合格 |
| 咖啡 | gaa3 fe1 | coffee /ˈkɒfi/ | 咖啡 |
| OK | ou1 kei1 | okay /ˌəʊˈkeɪ/ | 可以 |
|  | kaak1 | card /kɑːd/ |  |
| 啤牌 | pe1 paai2 | poker /ˈpəʊkə/ | 樸克 |
|  | gei1 | gay /ɡeɪ/ | 同性戀 |
| 撻 | taat1 | tart /tɑːt/ |  |
| 可樂 | ho2 lok6 | cola /ˈkəʊ.lə/ | 可樂 |
| 檸檬 | ning4 mung1 | lemon /ˈlɛmən/ | 檸檬 |
| 扑成 | buk1 sing4 | boxing /ˈbɒksɪŋ/ | 拳擊 |
| 刁時 | diu1 si2 | deuce | 平分 |
| 干邑 | gon1 jap1 | cognac | 法國白蘭地酒 |
| 沙展 | saa1 zin2 | sergeant | 警長 |
| 士碌架 | si3 luk1 gaa2 | snooker | 彩色檯球 |
| 士撻 | si3 taat1 | starter | 啟輝器 |
| 士啤 | si3 be1 | spare | 後備,備用 |
| 士啤呔 | si3 be1 taai1 | spare tire (often used to describe people with waist and abdomen fat) | 備用輪胎 |
| 士的 | si3 dik1 | stick | 手杖,拐杖 |
| 士多房 | si3 do1 fong4 | storeroom | 貯藏室 |
| 山埃 | saan1 aai1 | cyanide | 氰化物 |
|  | caa1 | charge | 充電 |
| 六式碼 | luk3 sik1 maa2 | Six Sigma | 六西格瑪 |
| 天拿水 | tin1 naa4 seoi2 | thinner | 稀釋劑,溶劑 |
| 比高 | bei2 gou1 | bagel | 過水麵包圈 / 貝果 |
| 比堅尼 | bei2 gin1 nei4 | bikini | 比基尼泳裝 |
| 巴士德消毒 | baa1 si1 dak1 siu1 duk6 | pasteurized | 用巴氏法消毒過的 |
| 巴打 | baa1 daa2 | brother | 兄弟 |
| 巴黎帽 | baa1 lai4 mou2 | beret | 貝雷帽 |
| 巴仙 | baa1 sin1 / pat6 sen1 | percent | 百分之 |
| 古龍水 | gu2 lung4 seoi2 | cologne | 科隆香水 |
| 布冧 | bou3 lam1 | plum | 洋李,李子,梅 |
| 布甸 | bou3 din1 | pudding | 布丁 |
| 打令 | daa1 ling2 | darling | 心愛的人 |
| 打比 | daa2 bei2 | derby | 德比賽馬 |
| 卡 | kaa1 | car | (火車)車廂 |
| 卡式機 | kaa1 sik1 gei1 | cassette | 盒式錄音機 |
| 卡士 | kaa1 si2 | 1. cast 2. class | 1. 演員陣容 2. 檔次,等級;上品,高檔,有品味 |
| 卡通 | kaa1 tung1 | cartoon | 動畫片,漫畫 |
| 卡巴 | kaa1 baa1 | kebab | 烤腌肉串 |
| 甲巴甸 | gaap3 baa1 din1 | gabardine | 華達呢 |
|  | le1 | level | 級,級別 |
| 叻㗎 | lek1 gaa4 | lacquer | 清漆 |
| 仙 | sin1 | cent |  |
| 他菲亞酒 | taa1 fei1 aa3 zau2 | tafia | 塔非亞酒 |
| 冬甩 | dung1 lat1 | doughnut | 炸麵餅圈 |
| 奶昔 | naai2 sik1 | milkshake | 牛奶冰淇淋 |
| 安士 | on1 si2 | ounce | 盎司,英兩,啢 |
| 安哥 | on1 go1 | encore | 再來一個,再演奏(Song)一次 |

Cantonese character formation

Cantonese characters, as with regular Chinese characters, are formed in one of several ways:

Borrowings

Some characters already exist in standard Chinese, but are simply reborrowed into Cantonese with new meanings. Most of these tend to be archaic or rarely used characters. An example is 仔, which has the original meaning of "young animal" and is used to represent the Cantonese word for "child", where standard Chinese uses the character 子.

Marked phonetic loans

Many characters used in Cantonese writings are formed by putting a mouth radical on the left hand side of another better-known character, usually a standard Chinese character. This indicates that the new character sounds like the standard character, but is only used phonetically in the Cantonese context. The characters which are commonly used in Cantonese writing include:
| Character | Romanization | Notes | Standard Chinese equivalent |
|  | gaa | function word |  |
|  | háah/háa | function word |  |
|  | yaa/yaah | function word |  |
|  | āak | v. cheat, hoax |  |
| 噉 | gám | function word: like this, e.g., 噉就死喇 | 這樣 |
| 咁 | gam | function word: like this, e.g., 咁大件 | 這麼 |
|  |  | function word; indicates past tense |  |
|  |  | function word; also a contraction of 乜嘢 |  |
| 嗮 | saai | function word; indicates completion, e.g., 搬嗮 "moved all, finished moving" | 掉, 完 |
| 哋 | deih | function word; indicates plural form of a pronoun |  |
| 呢 | nī/nēi | adv. this, these |  |
| 唔 | m̀h | adv. not, no, cannot; originally a function word |  |
|  | lāang | function word |  |
|  | āam | adv. just, nearly |  |
|  | āam | adv. correct, suitable |  |
| 啲 | dī/dīt | genitive, similar to 's but pluralizing, i.e., 呢個 this → 呢啲 these; 快點 = 快啲 = "hurry!" | 的, 些, 點 |
|  | yūk | v. to move |  |
|  | hái | prep. at, in, during |  |
|  |  | adv. that, those |  |
| 嘅 | ge | genitive, similar to 's; sometimes function word | 之, 的 |
|  | māk | n. mark, trademark; transliteration of "mark" |  |
|  | laak | function word |  |
|  | laa | function word |  |
|  | yéh | n. thing, stuff | 東西, 事物 |
|  | sāai | v. to waste | 浪費 |
| 嚟 | lèih/làih | v. to come; sometimes function word |  |
|  | háaih | function word |  |
|  | gauh | function word: a piece of |  |
|  | lō/lo | function word |  |
|  | táu | v. to rest |  |
|  | haam | v. to cry |  |
|  | maih/máih | v. not be, contraction of 唔係 m̀h haih, used following 係 in yes-no questions; also other uses | 否, 非 |
|  |  | final particle expressing consent and denial, liveliness and irritation, etc. |  |

There is evidence that the mouth radical in such characters can, over time, be replaced by a signific, a component which indicates the meaning of the character. The new character is then a semantic compound. For instance, 冧, written with the signific 冖, is instead written in older dictionaries as 啉, with the mouth radical.
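The mouth-radical construction described in this section can be written down with Unicode ideographic description sequences, in which the operator ⿰ (U+2FF0) means "left component, then right component". The sketch below is illustrative only: the pairs in `PHONETIC_LOANS` are hand-entered for characters that appear in this article, not drawn from any database, and the function name `describe` is hypothetical.

```python
MOUTH = "口"
LEFT_RIGHT = "\u2FF0"  # ⿰ IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT

# Cantonese character -> the better-known character it borrows its sound from.
# Hand-entered illustrative pairs (an assumption of this sketch).
PHONETIC_LOANS = {
    "哋": "地",
    "唔": "吾",
    "啲": "的",
    "嘅": "既",
    "嚟": "黎",
    "咁": "甘",
    "噉": "敢",
}


def describe(char: str) -> str:
    """Return the description sequence: mouth radical on the left, phonetic on the right."""
    phonetic = PHONETIC_LOANS[char]
    return f"{LEFT_RIGHT}{MOUTH}{phonetic}"


for char in PHONETIC_LOANS:
    print(char, "=", describe(char))
```

Each printed sequence has the same shape, the operator followed by 口 and the phonetic component, which reflects how regular this way of forming characters is.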

Derived characters

Other common characters are unique to Cantonese or are different from their Mandarin usage, including 乜, 冇, 仔, 佢, 佬, 俾 and 靚.
The words represented by these characters are sometimes cognates with pre-existing Chinese words. However, their colloquial Cantonese pronunciations have diverged from formal Cantonese pronunciations. For example, 無 is normally pronounced mòuh in literature. In spoken Cantonese, 冇 has the same usage, meaning, and pronunciation as 無, except for tone. 冇 represents the spoken Cantonese form of the word "without", while 無 represents the word used in Classical Chinese and Mandarin. However, 無 is still used in some instances in spoken Cantonese, such as 無論如何. Another example is the doublet 來/嚟, which means "come". 來 is used in literature; 嚟 is the spoken Cantonese form.

Workarounds

Though most Cantonese words can be found in the current encoding system, input workarounds are commonly used by those not familiar with them. Some Cantonese writers use simple romanization, symbols, homophones, and Chinese characters which have different meanings in Mandarin. For example,