Hacking Chinese

A better way of learning Mandarin

Vocabulary lists that help you learn Chinese and how to use them

In a previous article, I discussed the pros and cons of using vocabulary lists to learn Chinese. The conclusion was that lists can be useful, but that they generally shouldn’t be where you get most of your new words from.

In this follow-up article, I’d like to continue the discussion and get a little bit more practical, looking at specific kinds of lists and how they should or, as is sometimes the case, shouldn’t be used for learning Chinese.

We’re going to look at the following topics:

  • Frequency lists
  • Textbook vocabulary lists
  • Proficiency test lists
  • Special purpose lists
  • Thematic lists

That’s quite a lot for one article, so let’s get started:

Frequency lists

You might have heard that 80 % of normal language use is made up of only 20 % of the words (or some variation of this). To illustrate, I wrote an article here on Hacking Chinese about which words you should learn, and did so using only the 1000 most common words in English. Many readers didn’t even notice. You can achieve a lot with just 1000 words, provided you use the right words.

The problem is that after learning this, students of Chinese head to the search engine of their choice, grab the first frequency list they can find and start learning words from it, hoping this will lead to big improvements. This probably won’t happen, and there are several reasons for this.

  • Most frequency lists you find are character lists. While those give you some words, they mostly give you only parts of words, and we all know that knowing the characters doesn’t mean you know all the words that can be written with them.
  • Frequency lists need to be based on something, and most lists you will find are based on written language, not spoken. Thus, they contain lots of characters/words that are very common in newspapers, but will do little for your spoken proficiency.
  • A frequency list is still just a list. The reason I was able to write the article mentioned above using only 1000 words is that I know the words very well, I know how to use them in context, I know how to work with them and creatively use them to express what I want. You will not get that from a list; it will just help you identify the words (that is, if you choose the right list).

I won’t describe the various frequency lists in detail in this article, but if there’s interest, I might write a follow-up article about that later (edit: this article has now been written; you can check it out here). For now, though, you need to know that for a frequency list to be useful, you have to choose one that is based on material close to what you yourself are aiming for.

For example, if you want spoken language, you might want to check out a frequency list based on subtitles from TV and film (which is scripted spoken language, but still much better than novels or newspaper articles). If your goal is online communication and chatting, or just informal written language in general, you could use a list based on text on Weibo.

Please also note that the value of a frequency list diminishes the farther down the list you get. The top words are going to be really useful, but as you get to more infrequent vocabulary, what words are actually useful will be a lot more dependent on context.
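If you want to see for yourself how quickly the returns diminish, you can measure how much of a text the top words on a list actually cover. Below is a rough sketch in Python, not a finished tool. It assumes the text has already been split into words (Chinese is written without spaces, so a real text would need to go through a word segmenter first), and both the function name `coverage_of_top_n` and the tiny sample sentence are made up for illustration.

```python
from collections import Counter


def coverage_of_top_n(tokens, n):
    """Return the share of all tokens covered by the n most frequent words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(n) returns up to n (word, count) pairs, highest counts first
    covered = sum(count for _, count in counts.most_common(n))
    return covered / total


# Hypothetical, pre-segmented sample; a real corpus would be far larger.
sample = "我 喜欢 吃 苹果 我 喜欢 喝 茶 我 不 喜欢 咖啡".split()

for n in (1, 2, 3, 5):
    print(n, round(coverage_of_top_n(sample, n), 2))
```

In this toy sample, the two most common words account for half of all tokens, and each additional word adds less than the one before it. Run on a large corpus that matches your goals (subtitles, Weibo posts and so on), the same calculation shows roughly where a given list stops paying off.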

The character frequency lists you see floating around are almost never useful for anything but plugging gaps in your knowledge and perhaps as a way to boost your word deciphering ability long term. I’ve written about these topics already.

Textbook vocabulary lists

So, how do textbook vocabulary lists compare to frequency lists? That question is very hard to answer, because it depends on the textbook and how thorough the textbook authors were.

If done right, a textbook should of course contain the words that are most useful for the target student group. Therefore, in the best of worlds, a textbook vocabulary list would be the ultimate source of vocabulary since it contains content curated by teachers, but based on frequency data.

Sadly, all the textbooks I’ve used so far fall short on this point. They all contain words that are neither useful nor high frequency, put some very useful words late in the textbook and some not so useful words early on. Sometimes this seems to be done for no other reason than to annoy the students (although to be serious, I think it’s more often a result of native speaking teachers having a bad grasp of what is useful for students). If you know of a textbook that does this well, please leave a comment!

As was the case for frequency lists, textbook word lists become less and less useful as the chapters go by. The first book typically only consists of really high-frequency words, whereas the vocabulary choice in book five looks more or less random. This is part of the reason why I recommend using more than one textbook. This also avoids the illusion of advanced learning, i.e. when you finish the fifth book in a series and learn some really fancy expressions, but can’t handle basic stuff which isn’t in your textbook.

Proficiency test lists

The two major proficiency tests for Chinese, HSK (Mainland) and TOCFL (Taiwan), both publish lists of vocabulary they recommend you be familiar with when taking the test at various levels. These lists fall somewhere in between textbook lists and frequency lists.

They are similar to textbook lists in that they are curated with a second language learner in mind, which is great. They are also similar to frequency lists in that they aren’t based on a specific narrative, and are thus less likely to contain words that make sense only in the specific story being told in a textbook.

However, these lists have the same problem as the other two kinds of lists mentioned: they become much less useful the further you progress. You can learn everything in HSK1-3 without needing to worry about whether it’s going to be useful or not, but the HSK6 list contains a lot of words that aren’t all that useful. These lists also omit a lot of important words, which I wrote about here:

What important words are missing from HSK?

Special purpose lists

There are many lists created with a specific purpose in mind.

These may or may not be suitable as sources of new words (some are better used for reference).

When you use lists like these, you have to understand that they are in general only indirectly boosting your proficiency. For example, learning components of characters doesn’t actually help you write or read characters directly, but it does make it a lot easier to learn more characters. In fact, learning more characters might not make you a better reader either, but it certainly makes any subsequent efforts in that direction a lot easier!

In other words, use them, but with a purpose in mind and not as your main source of new vocabulary.

Thematic lists

There are typically two ways to arrange vocabulary related to a certain theme: list by semantic similarity and list by narrative context. An example of the first case would be a list of 25 different kinds of fruit in Chinese; an example of the latter would be vocabulary derived from a description of someone going shopping (which might include names of fruits too).

In general, research shows that the second type of list is clearly superior for learning. In fact, lists that lump together things that are similar in some way (similar meaning, similar pronunciation) are bad because they increase the chance of students mixing things up. Lists based on a narrative context are much better and also place the words in a story, which is easier to remember.

An example of this would be word lists derived from the adventure text games for Chinese learners I’ve created with Kevin over at WordSwing. They contain vocabulary from the narrative in the game, which means all the words are placed in a context and not arbitrarily grouped together because they are closely related in meaning or form. You could use such lists to review what you have learnt in the game, or to preview in order to make playing easier.

However, the first type of list can still be useful for reference, even though it’s bad for bulk learning. As before, it’s also useful for plugging holes in your vocabulary. Some basic familiarity with words can also help you approach a new topic you haven’t dealt with before.

For example, if you want to play a computer game such as StarCraft 2 in Chinese, or watch matches where the commentators speak in Chinese, you’d be well served by having a list of common vocabulary at hand, even if part of that list will just be all the units in the game.

I don’t suggest that you actually study the list to learn it, but it can still be useful! The article about StarCraft 2 linked to above actually contains such a list created by yours truly.


As you can see, there are many ways you can use lists, but each list has its own pros and cons. No list is a panacea and you won’t learn Chinese simply by memorising lists, regardless of how well curated they are. This was the main point in the first article.

However, lists can highlight weaknesses in your knowledge and understanding, as well as help you open doors to new areas of the language. Use them wisely!



  1. Thanks Olle for another great and useful article.

    Currently, I have my own list which grows day by day. My daily process is like using my LinuxOS with a Chinese language setup and chatting with people via the HelloTalk app. This combination alone gave me a great boost in the last 3 weeks. Sure there is always Anki and Skritter aside all of that. 🙂



  2. David says:

    Excellent. The voice of experience and common sense. Many thanks.

  3. Pie bright says:

    I do appreciate because the chinese language is beautiful.

  4. 万博 says:

    That’s a Great list of Chinese Vocabulary. I need to Do hard work to learn Chinese. Thanks For this Great post.

  5. Mark Peterson says:

    The one textbook I know of that does this well – picking words that are useful and/or high-frequency – is John DeFrancis’s Beginning Chinese Reader (and Intermediate and Advanced Chinese Readers). He explicitly introduced characters based on their frequency (e.g. the top 400 characters are introduced in Beginning Chinese Reader), and then useful words based on those characters. Really well done – the problem is that it’s so old that the vocabulary is quite out of date.
