I’ve heard many stories about people in East Asia who try to learn English by memorising dictionaries. Even if it’s true that some people actually do that, I think this somewhat puzzling technique isn’t common in the West. Hearing such stories, it’s easy to shake one’s head and wonder how someone could be so stupid as to think that memorising dictionaries is the same as learning a language.

Then perhaps it comes as a surprise to you that a couple of years ago I spent roughly one hundred hours, spread out over six weeks, learning all the characters in the Far East 3000 Chinese Characters Dictionary. Of course, I knew a lot of them from before, but I learnt a considerable number of new words as well. This article is not about that particular experience or about this dictionary. Any dictionary (or website) based on frequently used characters and/or words will be fine. If you don’t have a book already, I suggest using this online list.

I’ll try to explain why I think that going through such lists is an excellent idea if you do it right and at the right time, and I will also share some thoughts on how to do this without running into some of the problems I did. I never expected this, but the day has come when I actually recommend that other people memorise a dictionary!

Please note that you should only do something like this if you already know the majority of words in the list you want to study. If you use the online list I provided above, you can choose your own number. If you’ve only studied for a year, choose the 1000 most commonly used characters. If you’ve studied for years, choose all 3000. It’s up to you, but I would rather aim slightly too low than too high.

Learning a dictionary isn’t necessarily stupid

First things first, why would memorising a dictionary be a good idea? I’ve argued before that Chinese is a language consisting of many building blocks (see my articles about building a toolkit) and that rather than learning a character as a whole, it’s more fruitful to learn its composition. The same goes for words in Chinese (words consisting of more than one character). By making sure that you know the 3000 most common characters, you gain access to a huge number of new words. By access I mean:

  • You can guess the meaning of a compound word because you know the characters in it
  • You can learn new words more easily, because you know the component characters

I have argued elsewhere that vocabulary is not only king, but god emperor as well. If you don’t feel convinced that vocabulary is extremely important, you should check my article about the importance of knowing many words.

Let’s look closer at the above-mentioned benefits. The first one might be either useless or invaluable depending on the word. Chinese contains lots of synonym compounds (i.e. words that consist of two characters which mean the same thing, such as 快捷 or 馈赠 (饋贈)) and if you know both the characters, you can be pretty sure about what the word means, whereas if you only know one, the meaning might be anything. This is an example where your toolkit allows you to learn words for free, so to speak.

Moreover, there are numerous cases where there is more than one similar way of saying something. For instance, compare 时限 (時限), 期限 and 年限, which are really easy to distinguish if you know what the individual characters mean, but might cause trouble if you don’t. There are of course more examples, but I think this is enough to illustrate the point.

Now, let’s look at a graph I think some of you have seen before:


The picture is from Patrick Zein’s excellent introduction to Chinese (in Swedish, sorry). On the X-axis is number of characters one knows and on the Y-axis is the expected ability to understand written Chinese, assuming that grammar and character combinations are not a problem (which they of course are, but that’s not the issue here).

What does this graph tell us? Basically, it shows that if we know 3000 characters, we will very rarely come upon characters we don’t know when we read normal Chinese text, provided that we know the correct 3000 characters. If you’ve spent lots of time learning characters that aren’t within the 3000 most common, referring to this graph is wrong.
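
The idea behind the graph can be sketched in a few lines of Python. This is only a rough illustration and not the data behind the graph: the function name `coverage_of_top_n` and the toy sample text are assumptions made for the example, and a meaningful estimate would require counting characters in a large corpus.

```python
from collections import Counter


def coverage_of_top_n(text: str, n: int) -> float:
    """Return the share (0.0 to 1.0) of character occurrences in `text`
    that are covered by its n most frequent characters."""
    # Whitespace is ignored; punctuation handling is left out of this sketch.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(n))
    return covered / total


# Hypothetical toy sample, far too small to say anything about real Chinese.
sample = "我是学生我是老师你是学生吗"
for n in (1, 3, 5):
    print(n, round(coverage_of_top_n(sample, n), 2))
```

Even in this tiny sample, a handful of characters accounts for most of what you read, which is the same shape the graph shows on a much larger scale.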

Using frequency lists plugs holes and makes your foundation more solid

Going through lists of words based on frequency allows you to learn characters you should know (because they are common) but have missed because your textbook or teacher hasn’t presented them yet. This means that you broaden your base, including more words that lie outside your textbook and your course. This provides you with a more solid foundation which you can later use to learn more words and understand spoken and written Chinese with more ease.
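
If your frequency list and the characters you already know are available as plain text, finding the holes is a simple comparison. The sketch below assumes exactly that; the name `find_gaps` and the ten-character list are hypothetical and only there to make the example runnable.

```python
def find_gaps(frequency_list, known, top_n):
    """Return the characters among the top_n most frequent ones that are
    missing from `known`, in frequency order."""
    return [ch for ch in frequency_list[:top_n] if ch not in known]


# Hypothetical data: a short frequency-ordered list and a set of known characters.
frequency_list = list("的一是不了人我在有他")
known = set("的是不人我有")

gaps = find_gaps(frequency_list, known, top_n=8)
share_known = 1 - len(gaps) / 8

print(gaps)
print(f"Known share of the top 8: {share_known:.0%}")
```

The known share is a handy check before you start: if it is well below half for the list size you picked, a shorter list is probably the better choice.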

Suggestions and tips

Still, after having said all this, I’d like to say that memorising dictionaries is quite stupid. Of course, you shouldn’t just try to commit everything to memory by rote learning; you should use all the clever hacks I talk about in other articles. You use dictionaries to find commonly used words and to gain information about these words. However, this is not enough. Here is some more advice for you:

  • Be careful: sometimes you just think you know what a character means because it’s so common, but in fact it means something completely different when it’s on its own. Check all characters carefully once. This will either allow you to find flaws in your knowledge, or, if no such flaws are found, it will increase your confidence.
  • Learn at least one example word in which a given character appears, and make a note of this word next to the single character so that when you review it, you can easily see at least one example. Learning words in complete isolation is bad for more than one reason.
  • Don’t feel forced to use the example words in the book. Some dictionaries provide examples that are so rare that some native speakers have never heard of them. Dictionaries tend to focus on accuracy, which isn’t necessarily a good idea. I suggest using an online corpus of examples, such as the one over at nciku.com.
  • Don’t learn the words in alphabetical order, starting from page one and going through the book, because it will be extremely hard to distinguish between one hundred different “shi”. A better way would be to first learn the first character on every page, then the next time learn the second character on every page.
  • Spread it out! Even if you’ve studied for a while, 3000 characters will take a while to go through (100 hours in my case). I managed this by portioning it out, going through a dozen characters at a time whenever I had some time to spare.
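
The last two tips, taking one character from every page in turn and portioning the work out, can be sketched as follows. This assumes you have the dictionary pages as lists of characters; the function names `interleave_pages` and `portions` and the sample pages are made up for the illustration.

```python
from itertools import zip_longest


def interleave_pages(pages):
    """Take the first entry of every page, then the second entry of every
    page, and so on, so that similar-sounding characters are spread out."""
    order = []
    for row in zip_longest(*pages):
        # Shorter pages are padded with None by zip_longest; skip the padding.
        order.extend(ch for ch in row if ch is not None)
    return order


def portions(items, size=12):
    """Split the study order into small batches (a dozen by default)."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical pages from an alphabetically sorted dictionary.
pages = [
    ["是", "时", "事"],  # a page of "shi"
    ["书", "数"],        # a page of "shu"
    ["说"],              # a page of "shuo"
]

order = interleave_pages(pages)
print(order)
print(portions(order, size=4))
```

Shuffling the list would also break up the alphabetical order, but going page by page keeps the procedure easy to follow with a paper dictionary.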

Some final words

In conclusion, memorising dictionaries is not a very good idea in general, but I think there is some merit in studying frequency lists to make sure you know the characters and/or words you really should know. When I did this, I felt that the 3000 characters resulted in a quantum leap in reading comprehension. This will not take care of reading speed, complex grammar or other problems associated with reading ability, but it will enable you to understand many texts you would otherwise have been completely unable to decipher. More importantly, it will make it a lot easier for you to learn more later, given that you now have more building blocks and tools to understand and analyse the language you are learning!


22 Responses to Memorising dictionaries to boost reading ability

  1. gweipo says:

    Actually your post makes absolute sense. When you’re a beginner it is useful to use something like Tuttle’s Learning Chinese characters, which has the first 800 in frequency order with easy memory tricks. I used to cross reference their number to my text book, and then at the end of the first year I just went and learnt the rest that hadn’t been covered.

    I’m now paging through Chinese Character Fast Finder (also Tuttle / Matthews), which is shape / radical based, with 3200 characters, and more useful, like you mention, than something that is alphabetical – and yes I do kind of sit down and learn a page from time to time.

  2. [...] a high number of individual characters (parts of words would be applicable for other languages). I went through the 3000 most common Chinese characters before starting. This turned out to be incredibly useful since I usually only need to combine [...]

  3. Paul says:

    Linked (https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key=0AqJaLYri_ZnYdF9YWG0zR3F5UDN6Nll5Q2d5MUM4UWc&single=true&gid=1&output=html) is a visualisation of character frequency, making the same point as the chart on this page.

    (I can’t work out how to make the chart bigger on google docs, and for some reason it doesn’t like Explorer, but you can drill down a bit, and also hover your mouse if you want to see which character is which.)

  4. Olle Linge says:

    @Paul: Thanks for sharing! What I like most is that this gives me a feeling for what character frequency means. Just viewing the words in a list is one thing and you can see where a certain character is, but this is a lot clearer. Is the size of each cell representative of the character’s frequency? If so, where did you get the frequency data?

  5. Paul says:

    Yes, the size of the cell is proportional to the usage – frequency data is from the Modern Chinese Character Frequency List compiled here: http://lingua.mtsu.edu/chinese-computing/statistics/

    What I would really like to do is to match this up with radicals as well – if anyone knows where to find a text version of characters sorted by radical (i.e. the front index section of a dictionary), I can do this. That way, it can tell you not only which characters you might focus on, but the radicals that are most important to know as well.

  6. Olle Linge says:

    @Paul: This is really cool and it would be even more awesome for radicals. However, is the front page of a dictionary enough? Don’t you want all the characters in the dictionary listed by radical? There should of course be such lists, but I haven’t found anything yet. I’m going to Berlin later tonight and won’t be able to look more until next week when I get back. How about asking for this on http://www.chinese-forums.com? Someone should be able to help you!

  7. Paul says:

    By the front page, what I actually mean is what my paper dictionary calls the 检字表 – the second stage of the character lookup when all the characters are listed by radical then stroke order before it tells you which page to look on. I’ll see how I go hunting..

  8. Olle Linge says:

    Yeah, I know what you mean. I’ll do my best to help, but I probably won’t have time until next week. Let me know if you find anything!

  9. [...] going through frequency lists thoroughly and make sure you know all the individual characters (doing so boosted my reading ability a lot). Still, at any levels, learning words that occur naturally in your environment is a good way [...]

  10. Trystan says:

    Olle – how did you go about entering / uploading the dictionary to Anki?

    The reason I ask is that I would like to do something very similar for a different language. There are no online versions of the dictionary so I can only think of entering the vocabulary manually, which is clearly going to be time consuming.

    Your advice would be massively appreciated!

    • Olle Linge says:

      Hi Trystan,

      The boring but true answer is that I just typed in all the characters, definitions and example sentences. Did it take a lot of time? Yes, of course. Do I think that time was wasted? No, definitely not. Retyping example sentences and making sure they are correct is a way of studying; selecting the right character from a list to match the character in a book is a way of reviewing. And so on. I don’t think creating one’s own word lists by manually inputting words is a bad idea.

  11. Tyson says:

    I downloaded the SUBTLEX chinese frequency list of words found in subtitles of TV shows, matched it up with HSK lists and have been going downwards through those.

    Even in the first 300 i found some gaps which have really helped me… read subtitles! For example 家伙 is like “guy” and used a lot casually and in action movies, but is considered HSK6. But in subtitles it’s as common as 穿 or 写.

    What I do is go through them, and mark how well i know the word on a scale of 1-5. 1 is a blank. 2 is seen before but pretty unfamiliar. 3 is I know the word but not well. 4 is in my SRS but not perfect yet. 5 is I can write this character from memory reliably.

    Words that are 1-3 go into my SRS via an example sentence. I also have a report of all the words rated 1-4 and occasionally check through them to see if they can be promoted. SRS does this too, so it’s almost repetition, but SRS doesn’t know which ones are highest frequency words.

    And for fun, I also calculate the total % so I have a rough idea of the % of words I understand. Right now it’s 70% at 5 although actually there’s quite a few words I haven’t rated yet so perhaps more like 75%-80% are 3-5 when I mark further down.

    • Olle Linge says:

      Sounds interesting! Can you offer some more information about that list? Also, I’m a bit surprised that 傢伙 is HSK6, it’s really quite common!

      • Tyson says:

        You can download the data here: http://expsy.ugent.be/subtlex-ch/

        There is a paper (easy to find online) that explains the data better, after reading the abstract I was happy enough about their segmentation strategy – I’m no linguistics expert, but I was convinced enough to use it.

        So I just load up a big old Excel file and I search away. I’ve sorted it all on frequency and just work my way down the list.

        It’s also fun to look for the most common 3 character and 4 character words (不好意思 is the most common).

        • Adam says:

          How did you match the list up with the HSK lists in excel? I mean, how can you compare what is in both automatically?

  12. Andrew says:

    Hi, Olle – when you were learning those 3000 characters, how did you handle multiple meanings of a character? For example, the CEDICT dictionary may give 5, 10 or even more meanings for a single character. Did you try to remember them all, or only a few most important of them? In the latter case, how many of them, approximately, did you retain for your studies? How did you decide which ones are the most important?

    Thank you in advance for the answer!

    • Olle Linge says:

      There’s usually a core meaning and I would say it’s very rare with characters that have five or ten different meanings. I would focus on this meaning. The goal isn’t to learn all possible meanings of a character, that would take ten times as long and would be quite pointless.

      • Andrew says:

        Olle, thank you for the reply. I fully agree that memorizing all meanings of a character is pointless. That’s why I asked this question in the first place – how to determine which 2-3 meanings are most important, or better yet, which meaning is the core meaning, as you call it.

        You say, it’s rare to see a character with five or ten different meanings. I guess it’s only because you are now way beyond the first 3000 characters and so this is what you usually see for not very frequent characters. However, for more frequent characters it’s not so rare to see many meanings. Just a few examples:

        薄:
        bo2 – meager; slight; weak; ungenerous or unkind; frivolous; to despise; to belittle; to look down on; to approach
        bo4 – peppermint
        bao2 – thin; cold in manner; indifferent; weak; light; infertile

        涂:
        tu2 – to apply (paint); to smear; to daub; to blot out; to scribble; to scrawl; mud; street; way, route, road

        灵:
        ling2 – quick; alert; efficacious; effective; to come true; spirit; departed soul; coffin

        承:
        cheng2 – to bear; to carry; to hold; to continue; to undertake; to take charge; owing to; due to; to receive

        Of the character 灵 I know the meaning “spirit”, which is the sixth one. Which is the core one for this character?

        It is similar for me with the character 承, I only know one of its meanings – “to undertake,” the fifth in the list. But which one is the core one?

      • Andrew says:

        It’s a pity that nobody has answered my question yet, it’s really important for me.

        In the meantime, I’d like to extend my question by giving one more example. (This character is in my current list right now.)

        The character: 逼 [bī]. Dictionaries give me the following meanings:

        1. to force, to compel;
        2. to drive;
        3. to press for, to extort;
        4. to close in on.

        Which meaning (or meanings) should I choose for memorizing?

        Generally, not only do I need an answer to this particular question, but I also would like to understand the general approach, how to select the main meaning(s) of the character. (Please see also my previous comment here.)

        • Olle Linge says:

          Thanks for posting another comment; I might have missed your first one if you hadn’t. I run this site in my spare time and I receive tons of e-mails and comments, and it’s becoming harder and harder to respond to everyone.

          In general, I think you can see that most of the characters you mentioned actually don’t have that many meanings or that they are related in groups.

          For instance, regarding 涂, I see two meanings there, one which is related to writing/smearing/applying stuff to something, and one which is related to road. This character doesn’t have 11 different meanings, it’s just that it’s hard to capture the basic meaning in one English word.

          In general, though, you need a good dictionary, which will either list the most common one first or the original meaning first. I just checked Pleco’s CE dictionary for 逼 and it lists each meaning with examples and translation. I need to verify this, but it looks like the most common one comes first.

          If this doesn’t work, you can take a short cut and see which meaning the character has in the common words you know. For instance, if you learn 逼迫, any good dictionary will tell you that in this word, 逼 means “force, compel”. Learn that and care about the other meanings when you learn words that contain them.

          Hope this helps! There is no quick fix to your problem, but basing it on what you’re actually learning and not focusing too much on learning from dictionaries is probably a good idea. Remember that I combined this approach with a full immersion environment and knew about 2/3 of the characters when starting. I do NOT recommend this as your main approach.

  13. Oh Yonghao says:

    This is the exact book which I used to learn Chinese characters while I was in Taiwan. The method I used was to write every character on the front of a flash card with the pronunciation (I chose to use Zhu Yin instead of Pinyin) and meaning on the back. This took hours, not sure how many, but was well worth it.

    I did do this in alphabetical order which helped me learn how to pronounce characters by looking at them, you’ll find quite a few patterns in there and learn how some parts affect the sound and meaning. By the end I knew how to write any character I saw, and was good at guessing pronunciation and meaning.

    I did not memorize them in alphabetical order, but rather with the flashcards I mixed them up well, and remixed them every week or so as to prevent me from learning them in this stack, a problem I find with learning languages in Rosetta stone is that I know the language while I am in that lesson on the computer.

    Since then I have struggled to find a good list to add to my 3000 base, other than just learning characters that I come across. I do have the little yellow dictionary with over 12,000 characters with the radical and stroke lookup, but as you get higher in the number of characters the usefulness and frequency drops way down.

    • Olle Linge says:

      I have used a frequency list to learn up to around 6000 characters. This is completely useless from a practical point of view and I did this as a challenge and to learn more about learning characters. I strongly believe that learning characters after 2500-3500 should be on a need-to basis only, unless you think it’s fun to learn characters. Most of the characters I learnt this semester have no impact at all on my Chinese proficiency and I might never see some of them ever again. As you say, there are so many infrequent characters used only in place names and so on.
