Hacking Chinese

A better way of learning Mandarin

Learning to hear the sounds and tones in Mandarin

When we are born, we perceive the world around us largely without filters and without sorting things into categories. In order to make sense of the world, establishing categories and sorting our impressions into them is necessary, not least when it comes to languages and the sounds they consist of.

Without sorting the incoming information into meaningful categories, communication would not be possible. Learning our first language is therefore about figuring out which features are important and which can be ignored. We also need to learn which categories are used in our native language and where the boundaries between them are.

For infants, this process is automatic and almost always successful. That’s not the case for adult learners. In this article, I’m going to introduce two methods to learn to hear the difference between new sounds, as well as my research project to help you learn to hear the tones in Mandarin.

Speech sounds and categories

Even though the sound structure (phonology) of languages is surprisingly regular, there are of course differences in how sounds are sorted. A “p” in one language might sound like a “b” in another; vowels are often even more complex. Another very good example of this is tone, as it appears in languages such as Mandarin and most other Chinese dialects.

If you come from a non-tonal language background, you need to establish new phonological categories. You need to learn the difference between the tones and how to accurately tell them apart. Even if you speak another tonal language, you still need to adjust your categories to fit Mandarin, which might in some cases even be harder than learning the tones from scratch.

Learning to hear new sounds as an adult

Establishing new phonological categories as an adult is not easy, but it can clearly be done; otherwise, no native English speakers would be able to hear or pronounce the tones in Mandarin, which is obviously not the case. Some people are very good at hearing the differences between new speech sounds and quickly form categories resembling those of a native speaker.

But what about the rest? Can your ability to hear sounds be systematically trained? If so, how? What factors influence our ability to establish new categories for speech sounds as adult learners?

This is the topic of my research. Since I’m mostly interested in Mandarin, and tones constitute a unique challenge for adult learners, this is what I have chosen to focus on. In short, I want to find a scientifically sound way of helping students to form the correct tonal categories so they can identify the tones in Mandarin.

How to learn to hear the tones in Mandarin

In the meantime, I want to discuss two methods that have proved successful when it comes to teaching adults how to identify new speech sounds when learning a second language. Both these methods were originally researched for features other than tone (such as voice onset time, meaning aspiration and voicing; and the r/l distinction for Japanese learners of English), but they have since been tried with Mandarin tones as well.

Variation of training audio

One of the most successful methods relies on systematic feedback using audio produced by several native speakers. Compare this with the normal foreign-language classroom situation where one or at most two native speakers are heard, rarely in a systematic fashion.
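To make "systematic" a bit more concrete, here is a minimal Python sketch of what a balanced training session with immediate feedback could look like. It is only an illustration of the general idea, not the implementation of any particular course; the speaker labels, the number of repetitions and the function names are all assumptions made up for this example.

```python
import itertools
import random

TONES = [1, 2, 3, 4]  # the four basic tones


def build_trials(speakers, repetitions=2, seed=0):
    """Return a shuffled list of (speaker, tone) trials.

    Every speaker says every tone equally often, so the learner
    hears each tone in several different voices.
    """
    trials = list(itertools.product(speakers, TONES)) * repetitions
    random.Random(seed).shuffle(trials)
    return trials


def feedback(correct_tone, answer):
    """Tell the learner right away whether the answer was right."""
    if answer == correct_tone:
        return "Correct"
    return f"Incorrect: that was tone {correct_tone}"


# Hypothetical speaker labels, purely for illustration.
speakers = ["speaker_a", "speaker_b", "speaker_c"]
trials = build_trials(speakers)
```

In a classroom with one teacher, the list of speakers would have a single entry; the point of the method is that it has several.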

If you remember what I wrote in the introduction, it’s not hard to understand why it’s a good idea to use several different voices. Human speech contains a huge amount of information, but most of it is not used to determine the meaning of what is said.

For example, when it comes to tones, a child’s pronunciation is higher than an adult’s, and most women have a higher pitch than most men. Yet this does not lead to difficulties in hearing the tones, at least not for native speakers. The same can be said for other factors that make voices sound very different, even though they are obviously saying the same thing.

If your brain knows which sounds it’s hearing and also hears these sounds from a number of different speakers, it will have the information necessary to figure out which parts are crucial for understanding and which are just down to personal differences between speakers and can be ignored. For instance, absolute pitch height is not crucial for identifying tones; it’s how the pitch changes that matters. All fourth tones are falling, but they start and end at different heights depending on the length and thickness of the speaker’s vocal cords.
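The point about absolute pitch versus pitch change can be shown with a small Python sketch. It converts a pitch contour from hertz to semitones relative to that contour's own average, which is one common way of removing speaker differences; I'm not claiming this is how the brain does it. The pitch values below are invented for illustration, not measurements.

```python
import math


def relative_contour(f0_hz):
    """Express a pitch contour in semitones relative to its own mean.

    Working on a logarithmic scale means that a voice that is twice
    as high in hertz produces the same relative contour.
    """
    logs = [math.log2(f) for f in f0_hz]
    mean_log = sum(logs) / len(logs)
    return [12 * (x - mean_log) for x in logs]


# Invented fourth-tone contours (in Hz) for a low and a high voice.
low_voice = [160, 140, 120, 100, 80]
high_voice = [320, 280, 240, 200, 160]

low_relative = relative_contour(low_voice)
high_relative = relative_contour(high_voice)

# After normalisation, the two falling contours line up,
# even though no single raw value was shared between the speakers.
same_shape = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(low_relative, high_relative)
)
print(same_shape)
```

The raw numbers for the two voices barely overlap, yet the normalised contours fall by the same amount, which is what makes both of them fourth tones.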

This is the principle I’ve built the tone-training course on. In essence, it puts the learner through systematic exposure to tones spoken by different speakers, gradually helping the student to form the correct categories for the basic tones in Mandarin. Again, if you’re interested, please check this article.

My guess is that many readers will already be too advanced, but then I hope you’re willing to help me out anyway. You probably know other people who want to start learning or who need help with their tones. Could you please help me spread the word?

My own research: Helping you to hear the tones

This is a slow process since each step needs to be supported by evidence from either previous studies or my own research, but I’m certainly not starting from scratch; this topic has been researched for several decades.

In order to advance our understanding of how tone learning works and thereby help students to learn more efficiently, I have created a tone-training course, together with Kevin Bullaughey over at WordSwing. Here’s some information about the course:

  • It targets any learner who has not yet mastered basic tones
  • It can be used on your computer or phone, as long as you have internet access
  • It’s built on previous research into tone learning
  • It’s effective (based on pilot-study data)
  • It’s free

The course will be available soon, hopefully next week. You can sign up here if you want to be notified when we’re ready to receive students:

The course is now available. Read more here:

The tone training course is now open

Learning by exaggeration

Above, I mentioned that there are two methods that seem to work well. The other one relies on something which should be familiar to most readers: child-directed speech (CDS). That’s the exaggerated, slightly high-pitched language adults instinctively use when they speak to children. This also works for adults, or at least the exaggeration does.

The idea is that if the learner needs to identify a boundary between two distinct sounds, it makes sense to first learn to identify two very exaggerated versions that will be impossible to miss. For example, if you pronounce Pinyin “p” with so much aspiration that you knock the student off his or her feet, the student is unlikely to miss the point. Then, you gradually decrease the exaggeration until you reach more natural-sounding aspiration.

Let’s look at an example with tones, namely the difference between the second and third tones, which is the most difficult one for most learners. Both tones have a dip; the main difference is that the third tone rises much later. By starting out with second tones that rise very, very early and third tones that rise very, very late, you can help the student identify the crucial component (the turning point). Then you gradually reduce the exaggeration and approach the actual boundary between the two.
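The gradual reduction can be sketched in a few lines of Python. Here the turning point is expressed as a fraction of the syllable's duration (0 is the start, 1 is the end), and each training stage moves it a little closer to a natural value. All the numbers are assumptions chosen to illustrate the principle; they are not measured values for real second and third tones.

```python
def fading_schedule(exaggerated, natural, steps):
    """Move linearly from an exaggerated value to a natural one.

    Returns one value per training stage; the first is the
    exaggerated value and the last is the natural one.
    """
    if steps < 2:
        raise ValueError("need at least two stages")
    return [
        exaggerated + (natural - exaggerated) * i / (steps - 1)
        for i in range(steps)
    ]


# Hypothetical turning points, as a fraction of syllable duration.
tone2_stages = fading_schedule(exaggerated=0.05, natural=0.20, steps=5)
tone3_stages = fading_schedule(exaggerated=0.90, natural=0.60, steps=5)

# The distance between the two tones shrinks from stage to stage,
# so the contrast starts out obvious and ends up realistic.
gaps = [t3 - t2 for t2, t3 in zip(tone2_stages, tone3_stages)]
for stage, gap in enumerate(gaps, start=1):
    print(f"stage {stage}: gap {gap:.2f}")
```

A teacher or language exchange partner does the same thing by ear: start with versions nobody could confuse, then make them a little less extreme each round.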

This method works particularly well if you have already identified two similar sounds that you find difficult to distinguish, but it’s harder to use as a general approach for learning new sounds (you need to contrast two sounds and it’s not always obvious which sounds you should combine). Still, the principle of exaggeration works in general and can be used with teachers or language exchange partners.


As an adult learner of a tonal language like Mandarin, you need systematic input from a number of different speakers. That will allow you to gradually identify what part of the input is essential for determining meaning and what can be safely ignored. I hope you are willing to help me with this research project, either by joining yourself or by spreading the word!

Learn more about how to join the course.



  1. 本岸哩 says:

    This looks pretty interesting. I’m sort of in a middle way to understand tones, but quite far from using them correctly all the time without having to recall each one, and certainly would benefit from some more basics, I’m going to give it a try! Thanks for the chance!

  2. Vanessa Candle says:

    Thanks a lot for such useful guide!

  3. AD says:

    Well, if I try to learn I’ll be a beginner, at least beyond hearing my preschool best friend’s parents speaking when I was like 2-5, but I don’t recall ever trying to learn then or at other times I was passively exposed as a child. I can be fresh data for the project. I am curious myself because I have a musical background and a good ear for it, but I also have CAPD, which is more about word differentiation, as well as autism and learning disabilities. It will be fascinating to see whether this makes learning a tonal language easier or harder; I always found reading, writing, and even speaking easier than hearing, regardless of language. But perhaps tonal phrases will prove easier than I anticipate and I’ll be pleasantly surprised that isn’t the trickiest part.

    Curiosity is a good motivator. Time to invest is the antagonistic factor.
