Mastering tones is one of the most challenging tasks for learners of Mandarin and other Chinese topolects. This is largely because tones are an integral part of pronunciation, yet they feel like something added on to syllables and are completely alien to most foreigners.
There’s not much I can do about that (except highlight that tones are not optional), but I can do something about the misleading and confusing explanations of how tones interact with each other.
Teaching you how to fish and serving fish for dinner
Here on Hacking Chinese, I normally focus on how to learn rather than actually teaching you the language directly. Give a man a fish and you feed him for a day: teach him to fish and you feed him for a lifetime.
The reasoning behind this is that there are already many people who teach the language and do so well, but very few who talk about how to learn. However, I make exceptions to this when I feel that there’s a genuine need for clarifying things.
The best example of this to date is my article about how the third tone works, which remains one of the most popular articles on the site. This is probably because most textbooks and teachers fail to explain how it works, so students need to find that out on their own.
Optional and obligatory tone change rules in Mandarin
In this article, I will talk about tones and tone changes in Mandarin, which is another common source of confusion. Learning the theory about basic tone changes is really easy; mastering them in practice can take years. If an incorrect explanation or instruction sends you down the wrong path (as was the case when I learnt the third tone, for example), you might never get it right.
Part of the confusion stems from the fact that some teachers and textbooks throw in extra rules that actually confuse students more than they help them. Thus, in this post I have split the tone change rules into two categories: obligatory and optional.
- Obligatory tone change rules are those that you have to apply when you speak Mandarin, otherwise you will pronounce things incorrectly and the risk of misunderstanding increases. These are rules that all native speakers follow all the time, but which we as foreigners struggle with. A good example is the rule saying that a third tone changes to a rising tone in front of another third tone, as in 你好 (nǐhǎo).
- Optional tone change rules are those that native speakers often apply, but which are not strictly speaking necessary. If you don’t follow them, you will sound a bit unnatural, but you won’t really be wrong. Furthermore, these changes are usually the result of speaking more quickly, so as a student becomes more fluent, she’s likely to pick them up automatically without focusing on them directly. A good example is that several consecutive fourth tones are seldom pronounced as full fourth tones. If you have achieved basic fluency in Mandarin, you probably don’t pronounce them as full tones even if you’ve never heard of the rule.
When you read the summary below, keep this division in mind. If you’re new to learning Mandarin, you only need to care about the obligatory tone change rules; you can safely ignore the rest. I remember freaking out when I occasionally came across additional tone change rules I had never heard of.
Obligatory tone change rules
The tone change rules you absolutely have to learn aren’t that many and aren’t that complicated. But you do have to learn them. If you don’t, you’ll regret it later, I promise.
When I say learn them, I don’t mean being able to recite the rules, I mean being able to nail them every single time, at least in two-syllable words, but preferably in sentences too, although that usually comes naturally later if you have the tone pairs down.
- Third tone changes – Before a first, second or fourth tone, the third tone is pronounced as a low, falling tone. Before another third tone, the first third tone is changed into a second tone. In isolation or when stressed, the third tone is usually pronounced as a falling-rising tone.
- Tone changes of 一 (yī) – In isolation, or when used as an ordinal number, 一 is pronounced as a first tone. In other situations, it’s pronounced as a second tone before a fourth tone, and as a fourth tone before anything but a fourth tone.
- Tone changes of 不 (bù) – This character behaves like 一 (yī), except that it doesn’t have the first tone pronunciation at all. So, the basic pronunciation is fourth tone, but if the following tone is a fourth tone as well, 不 changes to a second tone.
- The neutral tone – This is not a tone change in the same sense as the others, but I might as well include it. The neutral tone (written without a tone mark) has no tone of its own; it’s entirely derived from the preceding syllable. It’s generally slightly lower than the end of the preceding tone, except after a third tone, when it becomes a high tone.
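If you like seeing rules spelled out step by step, the first three rules above can be sketched in a few lines of Python. This is only an illustration, not a complete model of Mandarin: the function name `spoken_tones` and the labels `"3-low"` and `"3-full"` are made up for this example. The sketch looks only at the dictionary tone of the next syllable and ignores word boundaries. It also doesn't know when 一 is used as an ordinal, and it assumes that a third tone before a neutral tone is pronounced low, which the rules above don't cover.

```python
def spoken_tones(syllables):
    """Return a spoken-tone label for each (character, dictionary_tone) pair.

    Dictionary tones: 1-4 are the full tones, 0 is the neutral tone.
    Labels returned: "0", "1", "2", "4", plus "3-low" (low, falling) and
    "3-full" (falling-rising) for the two third-tone variants.
    """
    result = []
    for i, (char, tone) in enumerate(syllables):
        # Dictionary tone of the following syllable, or None at the end.
        next_tone = syllables[i + 1][1] if i + 1 < len(syllables) else None

        if char == "一":
            if next_tone is None:
                spoken = "1"       # in isolation or at the end: first tone
            elif next_tone == 4:
                spoken = "2"       # before a fourth tone: second tone
            else:
                spoken = "4"       # before anything else: fourth tone
        elif char == "不":
            spoken = "2" if next_tone == 4 else "4"
        elif tone == 3:
            if next_tone == 3:
                spoken = "2"       # before another third tone: rising
            elif next_tone is None:
                spoken = "3-full"  # in isolation or at the end: falling-rising
            else:
                spoken = "3-low"   # elsewhere (assumption: also before neutral)
        else:
            spoken = str(tone)
        result.append(spoken)
    return result


print(spoken_tones([("你", 3), ("好", 3)]))  # 你好
print(spoken_tones([("不", 4), ("是", 4)]))  # 不是
print(spoken_tones([("一", 1), ("年", 2)]))  # 一年
```

Because the sketch checks one syllable ahead and nothing else, a run of third tones comes out as second tones all the way up to the last one, which is closer to fast speech than to careful, word-by-word pronunciation.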
If your textbook and/or teacher didn’t tell you this, you should change textbook and/or teacher. These rules are extremely basic and you absolutely have to learn them. Before we move into territory I think most readers might be unaware of, I’d like to shamelessly recommend my pronunciation course:
Hacking Chinese Pronunciation: Speaking with Confidence
My hope is that you find this article helpful. If you want to learn more about Mandarin pronunciation, including tones, check out my pronunciation video course. It covers everything you need to know to speak clearly and with confidence!
Optional tone change rules
Apart from the above obligatory tone change rules, you will sometimes see a number of other tone change rules, mostly applying to second and fourth tones. As I said in the introduction, you don’t need to study and learn these! While it’s good to know about them, you should ignore them for now unless you have a specific interest in pronunciation or phonology. I include them here because I receive questions about them and some of you might not know about them at all.
Here are some examples (there are many more). In these examples, I use numbers to represent tone height: 1 means low, 5 high:
- When a fourth tone is followed by another fourth tone, the first is usually not pronounced with a full fall from high to low (51), but rather from high to mid tone height (53). For example, dòngwù ought to be dòng (51) wù (51), but often turns into dòng (53) wù (51).
- When several second tones appear in a row, the middle one (and sometimes also the first one if speech is very fast) turns into a flat, high tone. For example, my name ought to be líng (35) yún (35) lóng (35), but turns into líng (35) yún (55) lóng (35).
- When a second tone appears between a first tone and something else, the second tone turns into a first tone. For example, xīyángshēn (ginseng) is often pronounced as xīyāngshēn and yīniánjí (grade one) turns into yīniānjí.
For more of these, see the reference list at the end.
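To make the tone-height numbers concrete, here is a rough Python sketch of the three example rules above. The names `CITATION` and `fast_speech_contours` are invented for this illustration, and the input is limited to the four full tones (1 to 4, no neutral tone). The sketch applies only these three optional rules, so it leaves out the obligatory third tone changes, and it treats "something else" in the last rule as "any following syllable", which is my reading rather than a precise definition.

```python
# Full pronunciation of each tone written as tone heights (1 = low, 5 = high).
CITATION = {1: "55", 2: "35", 3: "214", 4: "51"}


def fast_speech_contours(tones):
    """Map a list of dictionary tones (1-4) to tone-height contours,
    applying the three optional changes described above."""
    contours = []
    for i, tone in enumerate(tones):
        prev_tone = tones[i - 1] if i > 0 else None
        next_tone = tones[i + 1] if i + 1 < len(tones) else None
        contour = CITATION[tone]

        if tone == 4 and next_tone == 4:
            # Fourth tone before another fourth tone: shorter fall.
            contour = "53"
        elif tone == 2 and prev_tone == 2 and next_tone == 2:
            # Second tone in the middle of a run of second tones: flat, high.
            contour = "55"
        elif tone == 2 and prev_tone == 1 and next_tone is not None:
            # Second tone between a first tone and another syllable.
            contour = "55"
        contours.append(contour)
    return contours


print(fast_speech_contours([4, 4]))     # dòngwù
print(fast_speech_contours([2, 2, 2]))  # líng yún lóng
print(fast_speech_contours([1, 2, 1]))  # xīyángshēn
```

Note that every rule only looks at the immediate neighbours of a syllable, which fits the idea that these changes come from context rather than from something you need to memorise word by word.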
The thing here is that these changes are quite natural. While I can’t back this up with empirical research, I bet that if we took a large group of students and had them study Mandarin full time for a year or two, and never mentioned these optional rules, they would still follow them, without really trying to and probably without noticing it.
That’s because they are very natural changes based on the context the tones appear in. Of course, it could be argued that all tone changes are like this, but the research on tone acquisition tells us with all necessary clarity that most students don’t learn the third tone changes automatically. In fact, I often hear students who have studied for years and still can’t pronounce 美国. Sad but true.
An example where English and Mandarin work the same way
Let’s look at something similar, but which isn’t related to tones. Sounds influence each other when they appear next to each other in languages in general, and one of the ways they can do so is to blend together or at least become more similar. This is called “assimilation” in phonology and is very common in rapid speech, even if native speakers usually aren’t aware of it.
For example, say 面包 (miànbāo), “bread”, and pay attention to the final consonant in the first syllable, -n. Don’t just look at the Pinyin; say it in a relaxed, fluid manner (don’t enunciate in an exaggerated way). You may or may not have noticed that the consonant is [m], not [n] as the spelling indicates. That’s because [b] in the following syllable is a bilabial plosive (a stop produced with both lips), and [m] is also bilabial (but not a plosive). What happens is that the sounds are pronounced in a similar way to make the word easier to say. In fact, pronouncing it with a clear “n” is harder and feels awkward.
For the same phenomenon in English, check how you pronounce the “n” in “input”! This is one of the common cases of assimilation in English, i.e. that [n] turns to [m] when followed by a [b] or [p]. You might not think that you do that, but it’s quite likely that you do when speaking quickly. Naturally, I can’t say how you actually pronounce this word in English, I’m just saying that this is a well-documented case of assimilation in English. For more about assimilation than you’re probably interested in, check Recasens (2018).
So, the point is that while you can say “miàn-bāo” and “input”, it’s easier to say “miàm-bāo” and “im-put”. The same goes for saying 第二次世界大战 with full fourth tones on every syllable, or saying my name in Chinese (凌雲龍) with full second tones on every syllable. Doing so is not wrong, and when enunciating clearly, it’s not uncommon, but in normal, rapid speech, sounds do influence each other a lot.
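The [n] to [m] change can be written as a tiny rule too. The Python sketch below is a toy that works on spelling, not on actual sounds: the names `BILABIAL_INITIALS` and `assimilate_final_n` are hypothetical, and it simply assumes that a syllable written with a final "n" followed by a syllable written with an initial "b" or "p" is a case where this assimilation can happen.

```python
# Letters that spell a bilabial stop at the start of the next syllable.
BILABIAL_INITIALS = ("b", "p")


def assimilate_final_n(syllables):
    """Replace a syllable-final "n" with "m" when the next syllable
    starts with "b" or "p". Works on spelling only."""
    result = list(syllables)
    for i in range(len(result) - 1):
        if result[i].endswith("n") and result[i + 1].startswith(BILABIAL_INITIALS):
            result[i] = result[i][:-1] + "m"
    return result


print(assimilate_final_n(["miàn", "bāo"]))
print(assimilate_final_n(["in", "put"]))
```

Syllables ending in "ng" are left alone because they end in the letter "g", and nothing changes unless the following syllable starts with "b" or "p".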
There are two types of tone change rules in Mandarin from a student perspective. The first type, obligatory tone change rules, is essential, and you absolutely have to master all these cases to perfection. The second type, optional tone change rules, is also important if you want to sound natural when you speak, but most people tend to get it right without really trying. So don’t freak out about them: know that they’re there, speak and mimic more, and you’ll be fine!
Recasens, D. (2018). The production of consonant clusters: Implications for phonology and sound change (Vol. 26). Walter de Gruyter GmbH & Co KG.
Aren’t some of the “tones” you are talking about as optional changes reflections more of stress than of tone? (E.g. your Chinese name 凌雲龍)
Yes, I think it’s accurate to say that it’s related to stress, which is also the main explanation given by e.g. Duanmu San, but that doesn’t mean that the tones don’t change? So, it’s not either or, it’s more that the tone changes arise because of stress-related factors. Also, if it was only related to stress, then we would assume that similar words with other tones would also change, which doesn’t seem to be the case.
In your example of 3 equal tones in a row, I was taught that the middle one would often turn into a neutral tone (not only unstressed, but follow the rules of the neutral tone). Have you noticed this?
Another example of an optional tone change seems to me to be successive first tones e.g. 新鲜 where the 鲜 is about a tone lower than the 新 – here by “tone” I mean “a whole tone; the sum of two semitones” in the sense of Western classical music. Do you agree that often, in a series of first tones, successive tones are pronounced lower and lower?
I am having trouble with unstressed second syllable tone changes. For example, Skritter says 不是 is 2 and neutral. The change to second tone for 不 follows the normal rule, but why does 是 become neutral? Another example is 可是. Here 可 is third tone, and 是 is fourth! Right now, I just write them down as I get them wrong in Skritter and hope to see the general rule with enough examples.
Oh, and I positively do NOT say ‘imput.’
I can’t comment on you specifically, of course, but it’s very common in natural speech to say “imput” and I think most native speakers do. Naturally, if you just now say the word aloud to test, that doesn’t work because that’s not natural speech. It would have to be in a conversation where you don’t think about what you’re saying. The process is called assimilation and is very common, much more common than ordinary (meaning no linguistics training) native speakers realise. Again, I can’t of course know how you say it, but the general pattern is certainly true.
Thanks. I am familiar with assimilation. By this reasoning, ‘nb’ should elicit assimilation even more strongly. I hear assimilation only rarely, and it is easy to find examples that are contrastive. Pinprick, tinpot, tin pan: none of these normally show assimilation. Nor would ‘tanboy’ (contrasting with tomboy), pen box, unpin or unbin. A contrastive pair: ‘in bed’ and ‘imbed’.
I’m not sure we’re looking at the same thing, but both those are with fourth tones in Skritter (as they should be). To answer your question in general, what is pronounced as a neutral tone, or “pronounced lightly” is different from person to person and region to region. People in Taiwan often pronounce full tones on almost everything, whereas some people in northern China seem to have a neutral tone on the second syllable of every single word. It’s generally not a problem, so if you say dōngxī for “things” or dōngxi doesn’t matter. The first is common in e.g. Taiwan, the second is Mainland standard. There are very, very few cases where there’s a potential difference, the one just mentioned being one (so dōngxī means “east west” for people who normally say dōngxi).
Hmmm. Maybe my question is, how does Skritter decide these differences? 但是 is 4, 4, while 要是 is 4, 0. In both examples, the first word is tone four and the second word is the same, 是. But Skritter has the second example with a different tone than the first.
I found a very good site on assimilation, with videos illustrating the examples.
Note that all of these examples are British. The speaker has a beautiful southern accent, but that is not how Americans speak. I’m not saying assimilation doesn’t occur in American English; it does, but clearly not to the same degree as in England.
Skritter uses the standardised pronunciation (Mainland) as far as possible (and if it doesn’t you should report it). The standard pronunciation for 要是 is yao4shi0, so that’s what’s in the app. If you’re asking how to know this as a student, you can’t, you have to learn that along with tones in general. There are some patterns, but you can’t predict which words have neutral tones without looking them up. Note that in most cases where there is a neutral tone, it’s often acceptable (sometimes equally acceptable) to pronounce it with full tones.
Hello, the rules, I’ve read for third tone are confusing.
I know that it is a dipping rising in isolation, or becomes rising if before another 3rd.
But according to an IPA guide, in speech the third tone is a low falling tone (21) if it is the first word, a low flat tone (11) if it is sentence medial, and low with a slight rise (12) if it is sentence final.
This is hard to remember, and the diacritic for the third tone throws me off when I am trying to read pinyin. Is there a way to practice the above rules? How do I not let the diacritic get in the way? The sight of it makes me instinctively think of dipping and rising.
Good question! My advice is to not care about minor differences like that. I teach students that the third tone is a low tone, but that it changes to a rising tone in front of another third tone. I then also say that it optionally has a rise when stressed heavily. That’s it. The rest are fine details I’m not even sure are true. Can you even pronounce a 11 low tone in the middle of a sentence? Probably not, unless the preceding tone ends low. Same for final position. Anyway, just emphasise that it’s low, except before another third tone, and most of the rest follows naturally, in my experience!
Linguists also share this confusion, I googled around a bit:
“Tones typically have a slight purely-phonetic drop at the end in citation form. It is therefore likely that a tone with a drop of one unit (54, say, or 21) is not distinct from a level tone (a 55 or 22); on the other hand, what one author hears as a significant drop (53 or 31) may be perceived by another as a smaller drop so it is often ambiguous whether a transcription like 54 or 21 is a level or contour tone. Similarly, a slight drop before a rise, such as a 214, may be from the speaker approaching the target tone and so may also not be distinctive (from 14).”
The origin of tones interests me too: they’re not musical notes but the leftover effects on vowels from lost or merged consonants. Now I remember my English teacher in high school once pointed out that the vowel in bad is lower and longer than in bat, so even when the final consonants merge in fast speech, you can still tell them apart.
The above seems like simple trivia, but it changed how I think of tones. You are right, anyone who thinks “tone deafness” is an excuse to not study tones is incorrect.
Just a short question to make sure I got everything about that 3rd tone change right. It doesn’t matter if the two 3rd tone syllables belong to one word, right? So if I have a 3rd tone-heavy sentence like
我想了解一下。 Wǒ xiǎng liǎojiě yīxià.
with all possible tone changes applied the pronunciation would be
Wo2 xiang2 liao2jie3 yi2xia4?
Because the examples for tone changes are usually words or fixed expressions I had assumed that the tone combinations given above have to appear within one word/expression to trigger a tone change. So when speaking I basically have to think one syllable ahead before using a certain tone?
Thank you for this and the various other informative articles 🙂 And for the replies to comments as well!
Yes, correct. The basic rule of thumb is to sort by semantic units (words), but these boundaries are ignored as rate of speech increases. Your teacher will probably say wo2 xiang3 liao2jie3, but when speaking at more natural speed, it often is just wo2 xiang2 liao2jie3. The numbers here of course reflect tone changes, but that ought to be obvious. 🙂