Chinese Language

The Chinese language (汉语/漢語, 华语/華語, or 中文; pinyin: hànyǔ, huáyǔ, or zhōngwén) is a member of the Sino-Tibetan family of languages. Although most Chinese view the many varieties of spoken Chinese as a single language, the variations in spoken language are comparable to those of Romance languages; the written language has also changed over time, though far more slowly than the spoken language, and hence has been able to transcend much of the variation in spoken language.

The terms and concepts used by Chinese to think about language are different from those used in the West, partly because of the unifying effects of the Chinese characters used in writing, and partly because of differences in the political and social development of China in comparison with Europe. Whereas after the fall of the Roman Empire, Europe fragmented into small nation-states, the identities of which were often defined by language, China was able to preserve cultural and political unity through the same period.

Chinese (汉语/漢語; 中文)
Note: Not all linguists consider spoken Chinese a single language.
Spoken in: China (the PRC and the ROC), Singapore, Indonesia, Malaysia, and other Chinese communities around the world
Total speakers: 1.2 billion
Ranking: 1 (if considered a single language)

Official status
Official language of: PRC, ROC, Singapore
Regulated by: in the PRC, various agencies

Language codes
ISO 639-1: zh
RFC 3066: zh
ISO 639-2(B): chi
ISO 639-2(T): zho

One major difference between Chinese and Western concepts of language is that Chinese makes a sharp distinction between written language (wen) and spoken language (yu). This extends to the distinction between the written word (zi) and the spoken word (hua). The concept of a distinct and unified combination of both written and spoken forms of language is much weaker in Chinese than in the West. There are many varieties of spoken Chinese, the most prominent of which is Mandarin, but only one uniform written script. (See the section on written Chinese below.)

Spoken Chinese is a tonal language related to Tibetan and Burmese, but genetically unrelated to other neighbouring languages, such as Korean, Vietnamese, Thai, and Japanese. However, these languages were strongly influenced by Chinese in the course of history, linguistically and also extralinguistically. Korean and Japanese both have writing systems employing Chinese characters, which are called Hanja and Kanji, respectively. In North Korea, Hanja has been completely discontinued and Hangul is the sole script in use, while in South Korea, Hanja is used as a form of bold face. Like those two languages, Vietnamese contains many Chinese loanwords and was formerly written with Chinese characters.

About one-fifth of the world speaks some form of Chinese as its native language, making it the language with the most native speakers. The Chinese language (spoken in its standard Mandarin form) is the official language of the People's Republic of China and the Republic of China, one of four official languages of Singapore, and one of six official languages of the United Nations.

Table of contents

1 Spoken Chinese
2 Written Chinese
3 Development of Chinese
4 Related topics
5 References
6 External links

Spoken Chinese

Main article: Chinese spoken language

China language map

Most linguists classify all of the variations of Chinese as part of the Sino-Tibetan language family and believe that there was an original language, analogous to Proto-Indo-European, from which the Sinitic and Tibeto-Burman languages descended. The relations between Chinese and the other Sino-Tibetan languages are still unclear and an area of active research, as is the attempt to reconstruct proto-Sino-Tibetan. The main difficulty in this effort is that, while there is very good documentation that allows us to reconstruct the ancient sounds of Chinese, there is no written documentation concerning the division between proto-Sino-Tibetan and Chinese. In addition, many of the languages that would allow us to reconstruct proto-Sino-Tibetan are very poorly documented or understood.

Chinese linguistics map 2

The maps above and to the right depict the subdivisions ("languages" or "dialect groups") within Chinese. The seven main groups are Mandarin (represented by the lines drawn from Beijing), Wu, Xiang, Gan, Hakka, Cantonese (Yue), and Min (which linguists further divide into five to seven subdivisions, all mutually unintelligible). Linguists who distinguish ten instead of seven major groups separate Jin from Mandarin, Pinghua from Yue, and Hui from Wu. There are also many smaller groups that confound efforts at classification, such as: Dungan, a dialect of northwestern Mandarin spoken among Chinese-descended Muslims in Kyrgyzstan; Danzhou-hua, spoken on Hainan Island; Xiang-hua 乡话 (not to be confused with Xiang 湘), spoken in western Hunan; and Shaozhou-Tuhua, spoken in northern Guangdong. (An informative article written in Chinese may be found at [1].)

In addition to the previously noted divisions, there are also Putonghua and Guoyu, the official languages of the People's Republic of China and the Republic of China, respectively. Both are based on the Mandarin dialect spoken in Beijing and are intended to serve all of China as a common language of communication. This common Chinese language (as the two are often called) is therefore the language of government, of the media, and of instruction in schools.

There is a lot of controversy around the terminology used to describe the subdivisions of Chinese, with some preferring to call Chinese a language and its subdivisions dialects, and others preferring to call Chinese a language family and its subdivisions languages. There is more on this debate later on. On the other hand, even though Dungan is very closely related to Mandarin, not many people consider it "Chinese", because it is written in Cyrillic and spoken by people outside of China who are not considered Chinese in any sense.

It is common for speakers of Chinese to be able to speak several varieties of the language. Typically in southern China, a person will be able to speak the official Putonghua and the local dialect, and will occasionally also speak or understand another regional dialect, such as Cantonese. Such polyglots frequently code-switch between Putonghua and the local dialect, depending on the situation. Sometimes elements of different dialects are mixed, depending on geographical influence. A person living in Taiwan, for example, will commonly mix pronunciations, phrases, and words from Mandarin and Minnan, and this mixture is considered socially appropriate under many circumstances.

Is Chinese a Language or a Family of Languages?

Spoken Chinese comprises many regional and mutually unintelligible variants. In the West, many people are familiar with the fact that the Romance languages all derive from Latin and so have many underlying features in common while being mutually unintelligible. The linguistic evolution of Chinese is similar, while the socio-political context is quite different.

In Europe, political fragmentation created independent states which are roughly the size of Chinese provinces. This created a political desire to create separate cultural and literary standards between nation-states and to standardize the language within a nation-state. In China, a single cultural and literary standard continued to exist while at the same time there was no great desire to standardize the spoken language between different cities and counties. This has created a linguistic context that is very different from that of Europe, and this has profound implications for how to describe spoken variations of Chinese.

For example, in Europe, the language of a nation-state was usually standardized to be similar to that of the capital, making it easy, for example, to classify a language as French or Spanish. This had the effect of sharpening linguistic differences. A farmer on one side of the border would start to model his speech after Paris while a farmer on the other side would model his speech after Madrid. In China, this standardization did not happen, and so even categorizing variations can be difficult, in part because different dialects merge into each other. As a result, linguists will disagree among themselves as to classification.

As a result of the above, Chinese people generally consider Chinese to be a single language. To describe dialects, Chinese people typically refer to the speech (hua) of a location, for example Beijing hua (北京話) for the speech of Beijing or Shanghai hua (上海話) for the speech of Shanghai, with no lay awareness that these various hua are then grouped into "languages" on the basis of mutual intelligibility. So although many parts of north China are quite homogeneous in language, while in parts of south China major cities can have dialects that are only marginally intelligible even to close neighbours, these are all regarded as hua: equal subvariations under a single Chinese language.

Due to this "self-perception" of a single Chinese language by the majority of its speakers, there are many linguists who follow this definition, and regard Chinese as a single language and its variations as dialects; others follow the intelligibility requirement and consider Chinese to be a group of anywhere from seven to seventeen related "languages", since these languages are not at all mutually intelligible, and show variation comparable to the Romance languages.

It is to be noted that this distinction can have some political overtones. Describing Chinese as different languages can imply that China should actually be several different nations, and that the Hàn (Chinese) race is in fact several different races. For this reason, some Chinese are uncomfortable with the idea that Chinese is not a single language, as this perception might legitimize secessionist movements. On the other hand, supporters of Taiwanese independence also tend to be strong promoters of Min- and Hakka-language education.

However, the linkages between ethnicity, politics, and language can be complex. For example, many Wu, Min, Hakka, and Cantonese speakers, who consider their own tongues to be separate spoken languages, and the Chinese race to be a single entity, do not consider these two positions to be contradictory. Moreover, the government of the People's Republic of China officially states that China is a multinational nation, and that the term Chinese incorporates groups that do not natively speak Chinese at all. (Those that do speak Chinese are called Han Chinese — an ethnic and cultural concept, not a political one.) Similarly on Taiwan, one can find supporters of Chinese unification who are also interested in promoting local language, and supporters of Taiwan independence who have little interest in the topic.

Written Chinese

The Chinese written language employs the Han characters (漢字, pinyin hànzì), which are named after the Han culture to which they are largely attributed. Chinese characters appear to have originated in the Shang dynasty as pictograms depicting concrete objects. The first examples we have of Chinese characters are inscriptions on oracle bones, which are occasionally sheep scapulae but mostly turtle plastrons (lower shells) used for divination. Over the course of the Zhou and Han dynasties, the characters became more and more stylized. Additional components were also added, so that many characters contain one element that gives (or at least once gave) a fairly good indication of the pronunciation, and another component (the so-called "radical") that indicates the general category of meaning to which the character belongs. In the modern Chinese languages, the majority of characters are phonetically based rather than logographically based. An example is the character for the word 按 àn, which means "to press down." It contains 安 ān (peace), which serves as its phonetic component, and 手 shǒu (hand), which indicates that the action is frequently one done with the hand.

Many styles of Chinese calligraphic writing developed over the centuries, such as zhuanshu (篆書, seal-script), caoshu (草書, grass script), lishu (隸書, official script) and kaishu (楷書, standard script).

In Japan and Korea, Han characters were adopted and integrated into their languages and became Kanji and Hanja, respectively. Japan still uses Kanji as an integral part of its writing system; however, Korea's use of Hanja has diminished (indeed, it is not used at all in North Korea).

In the field of software and communications internationalization, CJK is a collective term for Chinese, Japanese, and Korean, and the rarer CJKV a collective term for the same plus Vietnamese. All of these are commonly described as double-byte languages, because their character sets contain far more than the 256 characters that a single byte can represent. The computerized processing of Chinese characters raises special issues in both input and character encoding, as the standard 100+ key keyboards of today's computers cannot provide a separate key for each character.
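As a rough illustration of the encoding issue, the following Python sketch compares how many bytes one Han character occupies in a few encodings. The text above does not name any particular encoding; GB2312, Big5, and UTF-8, the sample character, and the helper name `encoded_lengths` are all assumptions chosen for the example.

```python
# Compare the encoded size of one Han character in several encodings.
# The encodings and the sample character are illustrative choices,
# not taken from the article.
SAMPLE = "中"  # U+4E2D


def encoded_lengths(ch, encodings=("gb2312", "big5", "utf-8")):
    """Return a dict mapping each encoding name to the byte length of ch."""
    return {name: len(ch.encode(name)) for name in encodings}


lengths = encoded_lengths(SAMPLE)
for name, size in lengths.items():
    print(f"{name}: {size} byte(s)")

# An ASCII letter, by contrast, fits in one byte in all three encodings.
ascii_lengths = encoded_lengths("A")
```

In this sketch the two legacy national encodings use two bytes for the character and UTF-8 uses three, so "double-byte" is best read as a description of the older encodings rather than of every scheme.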

The Chinese writing system is mostly logographic, i.e., each character expresses a monosyllabic word part, also known as a morpheme. This is helped by the fact that 90%+ of Chinese morphemes are monosyllabic. The majority of modern words, however, are multisyllable and multigraphic. Multisyllabic words have a separate logogram for each syllable. Some, but not all, Han characters are ideographs, but most Han Chinese characters have forms that were based on their pronunciation rather than their meanings, so they do not directly express ideas.

Character forms

There are currently two standards for printed Chinese characters. One is the Traditional system, used in Hong Kong, Macau, and Taiwan. The other is the Simplified system (developed by the PRC government in the 1950s), used in Mainland China and Singapore, which adopts simplified forms for many of the more complicated characters. Most simplified versions were derived from historically established, though obscure, simplifications. In Taiwan, many simplifications are used when characters are handwritten, but in printing traditional characters are the norm. In addition, most Chinese use some personal simplifications.
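For readers who handle Chinese text in software, one practical consequence of the two standards is that a traditional character and its simplified counterpart are stored as different characters. The Python sketch below shows this using Unicode code points; Unicode is not discussed in the text above, and the character pairs and the helper name `code_point` are illustrative assumptions.

```python
# Traditional and simplified forms of a character are separate code points.
# The pairs are illustrative; they are the forms that appear in the
# article's own name for the language (漢語 / 汉语).
PAIRS = [("漢", "汉"), ("語", "语")]


def code_point(ch):
    """Format a character's Unicode code point as U+XXXX."""
    return f"U+{ord(ch):04X}"


for trad, simp in PAIRS:
    print(f"{trad} {code_point(trad)} -> {simp} {code_point(simp)}")
```

Because the two forms do not compare equal as strings, software that must match text across both standards generally needs an explicit conversion table.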

Relationship between spoken and written Chinese

The relationship between the Chinese spoken and written languages is somewhat complex. This complexity is compounded by the fact that the numerous variations of spoken Chinese have gone through centuries of evolution since at least the late-Han dynasty. However, written Chinese has changed much less than the spoken language.

Until the 20th century, most formal Chinese writing was done in wenyan, translated as Classical Chinese or Literary Chinese, which was very different from any of the spoken varieties of Chinese in much the same way that Classical Latin is different from modern Romance languages. Chinese characters that are closer to the spoken language were used to write informal works such as colloquial novels.

Since the May Fourth Movement (1919), the formal standard for written Chinese has been baihua, or Vernacular Chinese, the grammar and vocabulary of which are similar, but not identical, to the grammar and vocabulary of modern spoken Mandarin. Although few new works are written in classical Chinese, the ability to read classical Chinese is taught in middle and high school and forms part of college entrance examinations.

Chinese characters are understood as morphemes that are independent of phonetic change. Thus, although the number one is "yi" in Mandarin, "yat" in Cantonese and "tsit" in Hokkien, they derive from a common ancient Chinese word and still share an identical character: 一. Nevertheless, the orthographies of Chinese dialects are not identical. The vocabularies used in the different dialects have also diverged. In addition, while literary vocabulary is often shared among all dialects (at least in orthography; the readings are different), colloquial vocabularies are often different.
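The idea that a character is independent of its pronunciation can be sketched as a simple lookup table in Python. This is only a toy model: the readings for 一 are the ones quoted above, while the names `READINGS` and `read_aloud` and the "?" placeholder for unknown characters are hypothetical choices made for the example.

```python
# One written character, several spoken readings.
# The romanizations are those given in the text; the structure is a toy model.
READINGS = {
    "一": {"Mandarin": "yi", "Cantonese": "yat", "Hokkien": "tsit"},
}


def read_aloud(text, variety):
    """Return the reading of each character in the given variety.

    Characters or varieties missing from READINGS yield "?".
    """
    return [READINGS.get(ch, {}).get(variety, "?") for ch in text]


for variety in ("Mandarin", "Cantonese", "Hokkien"):
    print(variety, read_aloud("一", variety))
```

The written key stays the same in every row; only the value chosen by the speaker's variety changes.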

The complex interaction between the Chinese written and spoken languages can be illustrated with Cantonese. There are two standard forms used in writing Cantonese: formal written Cantonese and colloquial written Cantonese. Formal written Cantonese is very similar to written Mandarin and can be read by a Mandarin speaker without much difficulty. However, formal written Cantonese is rather different from spoken Cantonese. Colloquial written Cantonese is more similar to spoken Cantonese but is largely unreadable by an untrained Mandarin speaker.

Cantonese is unique among non-Mandarin regional languages in having a widely used written standard. The other regional languages do not have widely used alternative written standards, but many have local characters or use characters that are archaic in "baihua".

Classification of writing styles

One can classify Chinese writing into four basic types:

  • baihua (白話) (Vernacular Chinese)
  • wenyan (文言) (Classical Chinese)
  • "Written colloquial Chinese" - In particular, written colloquial Cantonese. Cantonese is unique in that it has a commonly used written character system that is different from "baihua" or "wenyan". Colloquial Chinese usually involves the use of "dialectal characters".
  • Poems and other Chinese constrained writings.

As with other aspects of the Chinese language, the contrast between different written standards is not sharp, and there can be a socially accepted continuum between them. For example, in writing an informal love letter, one may use informal baihua. In writing a newspaper article, the language used is different and begins to include aspects of wenyan. In writing a ceremonial document, one would use even more wenyan. The language used in the ceremonial document may be completely different from that of the love letter, but there is a socially accepted continuum between the two. Pure wenyan, however, is rarely used.

Development of Chinese

Categorization of the development of Chinese is subject to scholarly debate. One of the first systems was devised by the Swedish linguist Bernhard Karlgren; what follows is a modern revision of his system.

Old Chinese, sometimes known as 'Archaic Chinese', was the language common during the early and middle Zhou Dynasty (11th to 7th centuries B.C.), texts of which include inscriptions on bronze artifacts, the poetry of the Shijing, the history of the Shujing, and portions of the Yijing (I Ching). Work on reconstructing Old Chinese started with Qing dynasty philologists. The phonetic elements found in the majority of Chinese characters also provide hints to their Old Chinese pronunciations. Old Chinese was not wholly uninflected. It possessed a rich sound system in which aspiration or rough breathing differentiated the consonants.

Middle Chinese was the language used during the Sui, Tang, and Song dynasties (7th through 10th centuries A.D.). It can be divided into an early period, to which the 切韻 'Qieyun' rhyme dictionary (A.D. 601) relates, and a late period in the 10th century, which the 廣韻 'Guangyun' rhyme dictionary reflects. Bernhard Karlgren called this phase 'Ancient Chinese'. Linguists are confident that they have a good reconstruction of how Middle Chinese sounded. The evidence for the pronunciation of Middle Chinese comes from several sources: modern dialect variations, rhyming dictionaries, and foreign transliterations. Just as Proto-Indo-European can be reconstructed from modern Indo-European languages, so can Middle Chinese be reconstructed from modern dialects. In addition, ancient Chinese philologists devoted a great amount of effort to summarizing the Chinese phonetic system through "rhyming tables", and these tables serve as a basis for the work of modern linguists. Finally, Chinese phonetic translations of foreign words also provide plenty of clues about the nature of Middle Chinese phonetics. However, all reconstruction is tentative; scholars have shown, for example, that trying to reconstruct modern Cantonese from the rhymes of modern Cantopop would give a very inaccurate picture of the language.

The development of the spoken Chinese languages from early historical times to the present has been complex. The language tree shown here shows how the present main divisions of the Chinese language developed out of an early common language. Comparison with the map above will give some idea of the complexities that have been left out of the tree. For instance, the Min language that is centered in Fujian Province contains five subdivisions, and the so-called northern language (which is called Mandarin in the West), also contains named subdivisions, such as Yunnan hua and Sichuan hua.

Chinese language tree

Most Chinese living in northern China, in Sichuan, and, actually, in a broad arc from the northeast (Manchuria) to the southwest (Yunnan), use various Mandarin dialects as their home language. (See the three regions colored yellow and brown in the map above.) The prevalence of Mandarin throughout northern China is largely the result of geography, namely the plains of north China. By contrast, the mountains and rivers of southern China have promoted linguistic diversity. The presence of Mandarin in Sichuan is largely due to a plague in the 12th century. This plague, which may have been related to the Black Death, depopulated the area, leading to later settlement from north China.

Until the mid-20th century, most Chinese living in southern China did not speak any Mandarin. However, despite the mix of officials and commoners speaking various Chinese dialects, Beijing Mandarin became dominant at least during the officially Manchu-speaking Qing Empire. From the 17th century, the Empire set up Orthoepy Academies (正音書院 Zhengyin Shuyuan) in an attempt to make pronunciation conform to the Beijing standard, but these attempts had little success.

This situation changed with the creation (in both the PRC and the ROC) of an elementary school education system committed to teaching Mandarin. As a result, Mandarin is now spoken fluently by most people in Mainland China and in Taiwan. In Hong Kong, the language of education and formal speech remains Cantonese, but Mandarin is becoming increasingly influential.

Related topics

References

  • Hannas, William C. 1997. Asia's Orthographic Dilemma. Honolulu: University of Hawaii Press. ISBN 082481892X (paperback); ISBN 0824818423 (hardcover)
  • DeFrancis, John. 1990. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press. ISBN 0824810686
  • Norman, Jerry. 1988. Chinese. New York, NY: Cambridge University Press. ISBN 0521228093 (hardcover).

External links