I got a question from a reader asking what ‘the main Chinese language’ is, and my first instinct was to just send an email back saying “Mandarin Chinese”. I think that is the answer the reader wanted - they were planning to start learning Chinese but weren’t totally clear which language that actually meant. However, I thought the question of ‘the main Chinese language’ actually deserved a bit more space here.
This is actually quite a thorny question, which many people who haven’t studied Chinese may not realise. There are so many ways to approach the question that it’s difficult to even know where to start! Here’s a quick summary of the points I want to make:
- The term ‘Mandarin’ in English actually tends to refer to something called ‘Standard Chinese’.
- ‘Standard Chinese’ is based on the different varieties of Mandarin Chinese in various ways, and they are just one part of a huge overall picture.
- ‘Standard Chinese’ is more artificial than many people realise.
- The whole thing is wrapped up in political and geographic controversy.
First let’s have a look at how Standard Chinese fits into a bigger picture of Chinese languages. In summary it looks something like this:
- Sino-Tibetan languages
  - Sinitic languages
    - Han languages
      - Written vernacular Chinese → Standard Chinese grammar
      - Mandarin languages → Standard Chinese vocabulary
        - Beijing Mandarin → Standard Chinese phonology
This can be really confusing when the thought in your head is “I want to learn Chinese”. When you say that, what you probably mean is “I want to learn Standard Chinese”. I’ve used the arrows above to try to indicate a ‘based on’ relationship. As you can see, Standard Chinese grammar is _based on_ written vernacular Chinese, its vocabulary is based on the Mandarin languages as a group, and its phonology is based on the Beijing version of Mandarin Chinese.
One dubious thing I did in the list above was to include written vernacular Chinese as part of the ‘Han languages’ section. I’m not sure how accurate this is, but I did it to try and make the situation a bit easier to understand. Written Chinese in its various forms is pretty separate from the spoken languages and has had quite an independent existence throughout its history, being used by a huge variety of groups in a huge variety of locations.
Already it should be very clear that any term like ‘the main Chinese language’ just cannot capture this situation. The ‘main Chinese language’ is probably Standard Chinese, but the situation is much more complicated than that. Let’s go through the levels involved and see if we can get a clearer picture.
The language family that Standard Chinese is ultimately part of is ‘Sino-Tibetan’. Comparable families include huge categories like ‘Indo-European’ (most languages in Europe plus huge swathes of western, central and southern Asia) and ‘Afro-Asiatic’ (much of the northern half of Africa and a big chunk of the Middle East). You can see the geographic region covered by Sino-Tibetan languages on this map in pink.
As you can see, these are vast categories covering hundreds of languages and hundreds of millions of speakers. The Sino-Tibetan family that includes Chinese is the second biggest language family by number of native speakers; only Indo-European is bigger.
Within the Sino-Tibetan language family is a group called Sinitic languages, or Chinese languages. This term captures every language you might consider to be ‘Chinese’, and accounts for a huge majority of the Sino-Tibetan family (something like 94% according to Ethnologue). In other words, the Sinitic or Chinese languages are everything in the Sino-Tibetan family apart from the Tibetan side, as you might expect.
Within the Chinese language family, there’s a group called Han. (Note: Han and Sinitic languages may be the same thing.) This is actually better described as originally being an _ethnic_ term rather than a linguistic one, as Han is an ethnicity. The term “Han language” ( 汉语 Hànyǔ ) is often used in Chinese to refer to this group of languages (or Standard Chinese in a lot of cases).
Because the Han ethnicity is the largest in China (the largest in the world, actually), this is the dominant group amongst Chinese languages as a whole, but it’s still huge and contains massive variety. Han languages include several major groups such as Wu (includes Shanghainese), Yue (includes or refers to Cantonese), Min (includes some Taiwanese languages), Xiang (includes Hunanese), Hakka (includes some languages spoken in southern China, Taiwan, Malaysia and elsewhere) and Gan (includes languages spoken in Jiangxi). All of those contain sub-categories and sub-languages. The languages I’ve mentioned in brackets are not exhaustive, but just give you some idea of where these languages are used.
The other issue we’ve now encountered is at what point in this cascade of categories do we start using the term ‘dialect’ and stop using the term ‘language’? The problem here is that these terms are extremely controversial and debated by linguists, and there are multiple camps inside and outside China arguing over which Chinese languages are actually dialects and vice versa. I’ve tried to keep using ‘language’ throughout, even for similar and mutually intelligible languages, just to keep things consistent and reinforce the point that we’re dealing with massive variety here.
You’ll probably find that most people in China refer to anything other than Standard Chinese as a 方言. Literally this is ‘local language’, and is usually translated into English as ‘dialect’. Clearly this reduces the status of the language being described, so I’ve tried to avoid using the term here. A lot of people in China, both foreigners and Chinese people, can be quite disparaging towards 方言 and see them as ‘lower status’ in comparison to Standard Chinese.
Written vernacular Chinese
As I mentioned above, this is a dubious inclusion as we’re dealing with spoken languages here, not written ones. I wanted to include written Chinese, though, because the grammar of Standard Chinese is based on the grammar of written vernacular Chinese. This written language was developed and standardised from the late nineteenth century onwards, setting a theme for language standardisation and design that’s important for understanding what Standard Chinese actually is.
Now we’re back on track with actual spoken languages. Despite the term ‘Mandarin’ in English seeming to refer to one language, Mandarin is actually yet another group of languages (or dialects, or varieties, depending on who you ask). We are finally at a group of very similar languages, though.
If you’ve been learning ‘Mandarin’ (i.e. Standard Chinese), you might still have some difficulty communicating with speakers of many of the Mandarin languages. They would almost certainly understand you speaking in Standard Chinese, but there’s a good chance you wouldn’t understand them if they spoke in their native Mandarin language. As native speakers, though, speakers of all Mandarin languages can communicate with each other without major issues, even if they have clearly distinct accents and quite a lot of differences in vocabulary.
In Chinese, the Mandarin languages are often referred to as ‘Northern languages’ ( 北方话 běifānghuà ). Speakers of these languages have a much easier time speaking Standard Chinese than people from the south do, because their native language is much closer to Standard Chinese. Southerners are sometimes ridiculed for their accents when speaking Standard Chinese; speakers of the northern (Mandarin) languages much less so.
The Mandarin languages have massive variety in vocabulary, and Standard Chinese draws on all of them for its vocabulary. The result is that the vocabulary of Standard Chinese is something of a Frankenstein vocabulary that overlaps only in certain places with each Mandarin language. If you’ve learnt to speak Standard Chinese, you’ll find that you understand a lot of the words used by speakers of Mandarin languages, but they’ll be mixed in with a lot of words that aren’t in your textbook or dictionary. And of course, the accent and pronunciation will be different to what you’re used to. Note that in Beijing and north-eastern China, the Mandarin languages are very similar to Standard Chinese, but not totally the same.
So far we’ve got two out of the three things we need for Standard Chinese: grammar, from written vernacular Chinese, and vocabulary, from the various Mandarin languages. The final thing we need to add is the standardised phonology, which is based on…
Beijing Mandarin
Within the Mandarin language group is the version spoken in Beijing. This gets variously referred to as a language, a variety, a dialect or just an accent, depending on how much status someone wants to give it. In any case, the phonology / pronunciation of Standard Chinese is heavily based on the Mandarin spoken in Beijing. If you’ve learnt Standard Chinese then you’ve learnt something very similar to what native Beijingers speak, but in a sanitized (if I can use that term), standardised form.
Now we’ve got the three ingredients necessary to bake the Standard Chinese cake: grammar, vocabulary and phonology. However, the ‘baking process’ is very significant and much more influential than a lot of people who study Standard Chinese might realise.
Standard Chinese is an artificial language
Saying Standard Chinese is artificial might seem like quite a bold statement if you’re not aware of this, but really it’s quite an accurate way to describe the language. Standard Chinese has never been a language that developed entirely organically. Instead it’s very much a planned language that was designed by language experts from inside (and even outside) China.
As you’ve seen above, it’s certainly based on more organic varieties of Chinese, but ultimately, the language used in Chinese textbooks, on TV, on the radio and so on is always an attempt at a standardised ideal. The Chinese terms for Standard Chinese make this much clearer than ‘Mandarin’ does in English: 普通话 (pǔtōnghuà) - “common speech” - and 国语 (guóyǔ) - “national language”. Seeing these terms, it becomes clear that Standard Chinese is something pretty separate to any of the organic languages its designers based it on.
The political controversy of ‘the main Chinese language’
As you might imagine, all of this is hugely controversial, at the very least for people interested in linguistics. Aside from what to classify as a language, a dialect and an accent, there’s the issue of who gets to decide what Standard Chinese is, what ‘the main Chinese language’ is, why they have that right and whether or not they should continue to have it. Many languages and the people that use them face these issues, and Chinese is no exception (English is often a happy exception to this - I can’t really think of any authority that can claim any comprehensive power over something called ‘Standard English’).
Politically, Standard Chinese is controversial because it’s pretty much been designed and applied top-down by the central government to every place in China, regardless of what local languages exist there and what might be most beneficial for speakers there. This is obviously quite convenient and effective for uniting the country linguistically, but it also creates all sorts of status, prestige and equal opportunity issues for the local languages and their speakers.
The issue of national unity and a singular Chinese identity is one of the biggest topics out there, so I won’t go into it here other than to point out that Standard Chinese might be seen as the linguistic arm of the same effort from the central government that seeks to establish a single thread of ‘China’ in history, culture, politics, ethnicity and so on. When such an effort is applied so broadly to over a billion people, you can see why it’s so controversial.
Please do share your thoughts on all this in the comments. I’m no expert, but I’ve tried to put my opinions and understanding in writing here; feel free to do the same in the comments. I’m sure many of you understand these issues far better than I do.