Hacking Chinese

A better way of learning Mandarin

Why is listening in Chinese so hard?

Many people who know only a little about learning Chinese think that either pronunciation or characters is the most difficult part of the language. That’s understandable, because tones can be tricky in the beginning and there are so many characters to learn.

But the skill most learners struggle with is actually listening, not speaking, reading or writing. This is based on a survey I ran here on Hacking Chinese a few years ago, as well as my own experience as both a learner and a teacher.

In this article, I will explain why listening is so hard. There are some obvious reasons I’m sure most of you realise without me having to point them out, but there are also some less obvious reasons I didn’t realise until grad school and after many years of studying.


Why listening is hard in general

Before we get to the question of why listening is hard in Chinese in particular (and it is, wait for it), let’s briefly discuss some things that make listening hard in general, that is, listening in any language. The main reason listening is hard is the lack of control. As a listener, you can’t control the rate of speech, the words and grammar used, or anything else.

In a conversation, you can use some communication strategies to influence these factors by telling the speaker that you don’t understand, asking her to slow down and so on, but your control is still limited compared to when you speak yourself. In that sense, speaking is much easier as you can stick to things you know.

Compared with reading, listening poses extra challenges because it usually can’t be slowed down or paused at will, at least not in the wild. When reading, you can stop, reread or otherwise control the speed.

Even if the overall speed is fixed, such as in an exam, you can still decide to dwell on a tricky character or reread a difficult sentence, and then make up for that by speeding up on easier passages. You can’t do any of that when listening.

Finally, listening is far more complex than most people realise, especially if you leave the classroom. If you only listen to your teacher, textbook audio and a few podcasts, you might get the impression that spoken language is well-ordered, neatly sorted into proper sentences and following the rules you’ve learnt.

You couldn’t be more wrong!

Normal people don’t speak in complete sentences, and they occasionally violate (almost) every rule you’ve learnt, so it’s no wonder students often get a shock from a dose of real-world Mandarin. If you want to read more about how native speakers are often “wrong”, I suggest you check out this article next:

Can native speakers be wrong about Chinese grammar and pronunciation?

Why listening in Chinese is particularly hard

There are a few reasons to believe that listening in Chinese is indeed harder than in many other languages. I don’t mean to say that Chinese is the hardest language, or harder than any particular language; I just want to point out a few things that objectively make it more difficult to understand spoken Chinese.

Reason 1: Limited number of sounds and syllables

Mandarin has, relatively speaking, very few sounds and very few syllables. This is neat when you learn pronunciation, because once you learn those sounds and syllables, you have most of what you need, and while there might still be things to polish and prosody to master, it’s still much easier than in some other languages.

However, having few syllables makes listening much harder. Many students say that they feel that everything sounds the same.

And they’re right, up to a point: things do sound more similar. Mandarin has slightly over one thousand unique syllables, whereas English has almost ten times as many. If you can’t identify the tones properly, the number of syllables effectively drops to just a few hundred, further aggravating the “everything sounds the same” syndrome.

If you think tones are hard to hear, I have discussed that issue in a separate article:


The low number of syllables is due to the strictly limited syllable structure. No complicated consonant clusters are allowed (compare with English “street” and “months”), syllables can only end in vowels or a few consonants, there are few vowel sounds overall (Swedish has twice the number, just to mention one example), and so on.
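To make the effect of missing tones a bit more concrete, here is a minimal Python sketch. It assumes a tiny, hand-picked sample of pinyin syllables written with tone numbers (not a full syllable inventory), and the helper name `strip_tone` is made up for this illustration. It simply counts how many distinct syllables remain once the tone is ignored.

```python
# Toy sample of pinyin syllables with tone numbers (1-4).
# Assumption: a hand-picked illustration, not the full Mandarin inventory.
sample = [
    "ma1", "ma2", "ma3", "ma4",
    "shi1", "shi2", "shi3", "shi4",
    "mai3", "mai4",
    "tang1", "tang2",
]


def strip_tone(syllable: str) -> str:
    """Remove a trailing tone digit, e.g. 'ma3' -> 'ma' (hypothetical helper)."""
    return syllable.rstrip("12345")


# Distinct syllables when tones are heard correctly.
with_tones = set(sample)

# Distinct syllables when the listener cannot tell tones apart.
without_tones = {strip_tone(s) for s in sample}

print(len(with_tones))     # 12
print(len(without_tones))  # 4
```

In this toy sample, twelve distinct syllables collapse into four once tones are ignored. The real numbers are different, of course, but the direction of the effect is the same.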

Reason 2: Relying on context puts the burden on the listener

The world’s languages differ in how they encode information. In some languages, lots of information is encoded on the word level, meaning that you can look at a word and know a great deal about the context just by how it’s inflected, which prefixes and suffixes are attached and so on.

For example, in English, we know that “her” is the object of a sentence without knowing what the sentence is, so “I love her” and “her I love” both mean that there is a female person I love. By looking at the verb, we can also see that it’s present tense and that the subject is not third person, because then it would have been “loves”, not “love”.

This creates a whole lot of redundancy in English sentences (and even more so in languages like Latin), meaning that the same information is encoded more than once.

In the sentence “the neighbour’s dog bit me yesterday”, we know that it’s past tense both from the inflection of the verb, “bit”, and by the fact that “yesterday” is included in the sentence.

Chinese doesn’t work like that.

Words often sound the same in singular and plural, and there are no tenses. You can’t hear the difference between “he” and “she” (both are pronounced tā), and the same word can sound the same whether it’s a noun, verb or adjective. There are some particles that compensate a bit, but they are not even close to providing the amount of redundancy we normally see in English sentences.

Now, it would be wrong to think that this information is just lost in Chinese. Instead, most of it can be inferred from context. As I pointed out in the example with the neighbour’s dog above, we don’t have to inflect the verb to know that it’s past tense; we know that because it happened “yesterday”. Actually, we don’t even need to include the word “yesterday” in the same sentence if we are talking about other things that also happened yesterday just before we relate the incident with the dog.

This simplicity looks alluring to students at first, because you think “wow, how convenient, I don’t need to inflect verbs, and I don’t need to learn five different tenses”, but later, you realise that this is actually a double-edged sword, because it also means that when someone else is talking, you have to infer these things from context.

This is easy if you grew up with the language, but not for people who learn it as a second language. In fact, it can be really hard, especially when combined with the other factors mentioned in this article.

Reason 3: Regional variants and accented Mandarin

Many people in China don’t speak Mandarin as their first language. This means that people you meet (depending on where you go, of course) may have learnt Mandarin in school, heard it on radio and TV and perhaps spoken it in formal education or when travelling, but they don’t speak it at home or with their friends.

When they speak Mandarin with you, they do so with an accent, borrowing vocabulary, grammatical structures and, most importantly, pronunciation, from their native tongue.

This is not to be confused with what are often called “dialects”, such as Cantonese or Hokkien. For listening purposes, these might as well be treated as separate languages. Instead, I mean that when native speakers of these dialects speak Mandarin, they borrow features from Cantonese or Hokkien, and use them when they speak Mandarin. If you want to read more about this, here’s an article for you:

Learning to understand regionally accented Mandarin

The point is that apart from understanding clearly pronounced, standard Mandarin, you also need to deal with dozens of common variations and accents, many of which merge sounds, reducing the number of syllables further, thus making listening even harder.

This is of course not unique to learning Chinese, but it is certainly a more pronounced (hehe) challenge compared with many other languages, especially smaller ones. I mean, there are different accents in Sweden too, and some of them sound really different, but there are only a few of them. There are only about ten million people speaking Swedish.

More than a billion people live in China and the diversity is so enormous it’s hard to grasp. English, Spanish and Arabic of course also have vast differences in how they are spoken in different areas, so the challenge is real there, too.

Conclusion: Listening in Chinese really is very hard

So no, you’re not just imagining it: listening in Chinese is difficult. Listening in a foreign language is in itself hard because of the lack of control over speed and the language used, but Chinese is hard beyond that.

The fact that there are few sounds and few syllables means it really is easier to confuse them, and if you find tones hard to hear, then the problem is compounded.

As if that weren’t enough, Chinese relies a lot on context rather than redundancy in inflections and articles, which is fine if you’re a native speaker, but not so much if you’re a second language learner.

Finally, to add insult to injury, many people you speak with in the real world will speak with an accent, further blurring the boundaries between sounds and messing with your hard-earned listening skills.

I realise that this is not very encouraging, so to end on a slightly more positive note, let me say that listening really is a matter of practice. You will learn these things by listening more, and you will learn far more that way than by taking classes or just living in China. Listen as much as you can, all the time, every day. If you want more practical advice, there’s a whole category of articles about listening ability here. Enjoy and good luck!

Improving listening ability on Hacking Chinese



  1. Dufeizi says:

    Dear Olle,

    I enjoyed listening to this article during dinner quite a lot! I would like to add that, for an advanced learner who occasionally tunes in to Chinese news or similar shows, contractions of terms that the average Chinese person might be familiar with are yet another hurdle to listening.

    1. Stephen Holtom says:

      Amen to that. And so often it seems a word or phrase that is relatively unambiguous gets shortened into something easily confused with other things.

    2. Olle Linge says:

      Yes, contractions are a real problem, especially in anything formal! I spent a lot of time listening to news broadcasts for a while and most of the passages I gave up on and had to look up were contractions, abbreviations or other succinct ways of expressing something that I would easily have understood in the full form. The more formal it is, the worse it gets!

  2. Paul underwood says:

    Thanks Olle, glad to hear it’s not just me. I’ve been living in China and learning Chinese for two years and still hardly understand a thing….

  3. “it is certainly a more pronounced (hehe) challenge”

    You rang?

    I never thought these things mattered too much but they’re true and may affect me more than I realize. Compared to my better listening of French and Japanese, which at times has even been better than my speaking, I did find it a bit odd that my Chinese listening skill is so much lower than my speaking. A lack of vocabulary and fully understanding the vocabulary I do know makes it worse. For example, I often run into the situation where they say a word similar to one I know, maybe even sharing a character. If I understand that character, I’ll often make the inference and get a general idea of what that word was, but if I don’t, I’m stuck. Of course, rarely is a single word the big issue, but this is the example that came to me.

    As for ideas on how to get better, I’d personally recommend learning and trying to copy some accents, especially if you’re at least at a level to manage some basic conversation. I limit my erhua to only northerners / those who speak with it. When I went to Taiwan, I knew the accent and vocab would be an issue so I spoke in their accent and used their words.

    Instead of letting your brain get surprised when you encounter non-standard speech, warm up, adapt, and momentarily live in it so your brain is accustomed and ready.

    1. Olle Linge says:

      Yes, vocabulary is, I would say, the biggest problem with listening ability. In fact, that was the conclusion of an old series of articles I wrote about listening ability, where I analysed problems, one of which is “lack of vocabulary”. This is, strictly speaking, not related to listening ability directly, but that’s where the problem is most noticeable. You can check out the article series here:

      Chinese listening strategies: Problem analysis

      Regarding how to get better, mimicking and otherwise familiarising oneself with other dialects is essential. I think the biggest shock is if you’ve had only one teacher in your home country, in which case you’re in for a very rough awakening. This happens to me all the time, which is somewhat annoying, but still understandable. I mean, when travelling in China, I can communicate flawlessly with person #1, but then person #2 comes along and suddenly I’m kicked back to some kind of lower intermediate listening level. They usually understand everything I say, but I feel so bad that I don’t understand what they’re saying. This will of course always happen and it happens to native speakers too, but it can still be frustrating. Cab drivers are a good example. 🙂

      1. Richard Pohl says:

        I noticed that cab problem happening even to a native speaker: she said in annoyance to a Didi driver on the phone that she could not understand his putonghua, and later even complained to me that people from that province speak very weird Mandarin (she was from Haerbin and the driver was from Xian).

  4. 叶老师 says:

    thanks for the recording, it was wonderful! i’m usually too lazy to read that much of a text but this time i was listening on the way to work and it was great, thanks for an insightful and wonderful article:)

    1. Olle Linge says:

      I love audio recordings for this reason. I consume enormous amounts of audio material (audio books, audio versions of certain magazines, podcasts). I want to be able to provide this for Hacking Chinese, just need to know that enough people really want it. We’ll see!

  5. Birgit says:

    listened to this article while preparing a meal

    1. Olle Linge says:

      Perfect, 一石二鸟!

  6. Stephen Holtom says:

    It’s been a long road, but my listening is about ready to take the HSK5 exam.
    But still, even at this point, I occasionally miss a whole sentence or even more. Because if I get the context wrong, or fixate on one word (e.g. if a word took a little effort to pull out of the memory banks), I can miss a whole stream of words in the meantime.

  7. Gareth says:

    Every day really is a school day! I always thought it was just because Chinese, sorry, Mandarin, was a tonal language, or is that only partially the reality?

    1. Olle Linge says:

      Well, the article you left a comment to answers your question, so I’m not sure what I can add to make it clearer. 🙂

  8. Fearchar says:

    It struck me some years ago that there is another fundamental difference between the listening skills of a tonal language speaker and those of most Indo-European languages: the former are listening for vowels and the latter, by and large, for consonants. That’s why native Mandarin speakers often drop consonants, particularly endings such as “-ed” in English. That lack of attention to the detail of consonants is the corollary of our relatively poor ability to manipulate tones.

    1. Olle Linge says:

      I don’t know about the first bit (i.e. I have no idea), but I’m sceptical about the second part. I think the reason they (mainly referring to speakers of Chinese here) drop endings is because of syllable structure more than anything else. If you come from a language that only allows three consonant sounds at the end of syllables and no complicated consonant clusters, it’s going to be hard to pronounce words like “talked” or “practised”. This can be heard sometimes when people insert extra vowels to conform to the syllable structure of their native language. An example would be a cluster like “st” in “study”, pronounced as “sətudy” to avoid the cluster. The same thing can be achieved in words like “talked” by simply omitting the ending.

      I guess this would be possible to test by checking if they drop “-ed” more often when it’s part of a consonant cluster compared to if it appears directly after a vowel. I have never really studied this topic, though, maybe there already are such studies?

      Another factor worth considering is grammar and morphology. Native speakers of Chinese are not used to the whole idea of adding seemingly arbitrary, small sounds to the end of words to make them fit in a sentence. Sure, we have some particles in Chinese too, but I think we can agree that’s not really the same thing.

  9. 杰克 says:

    Great article, it’s reassuring I’m not the only one that finds this. Another factor you didn’t mention here (although it’s related to the redundancy issue) is that Chinese is an incredibly concise language. The vast majority of words are no longer than two syllables, and the sentence structure is quite simple and parsimonious. This means that when you are listening to a stream of speech in Chinese, much more semantic information is encoded in a much smaller space of time than would be the case for a European language, which gives less time for your non-native brain to process it. I find this effect is even more pronounced for more formal Chinese (due to the presence of classical structures).

    I think people often assume that languages with longer words (like Russian or German) are harder to learn, but I think they are actually much easier as it slows down the semantic onslaught when listening, and also gives you more means by which to differentiate words both when hearing them and learning them.
