21 essential dictionaries and corpora for learning Chinese

Most learners of Chinese soon realise that available dictionaries have some serious problems. This is mostly true for Chinese-English (and English-Chinese) dictionaries, but it’s also true for Chinese-Chinese dictionaries (in short, they don’t work very well for learners). This article isn’t about the problem itself though, but how to overcome it. If you want to read about the issue, I suggest you head over to Albert Wolfe’s article about the shortcomings of CE dictionaries.

Image credit: sxc.hu/profile/Lockheed

I have studied Chinese for some time now and have used a number of different dictionaries. The bad news is that I still haven’t found a good dictionary that can do everything I want it to do, but the good news is that I have found several different dictionaries that among them can handle most of the questions I have.

In this article, I will share with you my favourite dictionaries, including why I think they are good, what I use them for and what drawbacks they (all) have. I also hope that you might give me suggestions of dictionaries that might replace those I list below. Note that I’m not looking for dictionaries that can do things that those below can already do well.

My goal here isn’t to give you a list of all available dictionaries. In fact, I have tried to keep the list as short as possible (it’s still quite long). This is because I know most learners are after simple and effective solutions. People who really want to explore other dictionaries will do that without my having to write about it.

This article is about digital resources, so even if I mention offline dictionaries, they are still digital. I haven’t used enough paper dictionaries to evaluate them properly and most learners don’t bother with paper dictionaries today anyway.

I have sorted these sources into the following categories:

  1. Online dictionaries mainly relying on English
  2. Online dictionaries mainly relying on Chinese
  3. Online dictionaries for traditional Chinese
  4. Offline dictionaries you should check out
  5. Online corpora and other sentence sources

Online dictionaries mainly relying on English

  • MDBG
    What I use it for: This is my default dictionary in this category
    Pros: Clean interface, easy to use, handwriting recognition, stroke order, sound
    Cons: Sometimes inadequate English definitions (true for most dictionaries, though)
  • Zhongwen.com
    What I use it for: Etymology, character components, horizontal character learning
    Pros: Click on any character part to view that component, good etymology in English
    Cons: Horrible interface, characters as pictures rather than text you can copy
  • Arch Chinese
    What I use it for: Character components, collocations, word frequency
    Pros: Offers related characters and words sorted by frequency (this is awesome)
    Cons: None, really, this site is great in general
  • HanziCraft
    What I use it for: Breaking down characters for sensible character learning
    Pros: Very easy to use, integrates well, very fast
    Cons: Is still quite new, I haven’t found too many problems though
  • Yellow Bridge
    What I use it for: Same as above
    Pros: Reasonably quick look-up, all information in a tree structure
    Cons: Need to log in for full details (I don’t use that feature, though), much slower than HanziCraft
  • Youdao
    What I use it for: Academic and/or specialist jargon, fixed expressions, dictionary
    Pros: Provides parallel translations, excellent for translation work
    Cons: None, really, this site is very useful

Online dictionaries mainly relying on Chinese

  • Zdic
    What I use it for: This is my main Chinese-Chinese non-traditional dictionary
    Pros: Comprehensive, detailed definitions, English, very detailed single-character information
    Cons: None, really, this is my favourite online Chinese-Chinese dictionary
  • Baidu Dictionary
    What I use it for: Idioms, fixed expressions, other things I can’t find in other dictionaries
    Pros: User-edited, so very comprehensive (think Wikipedia), usually easier than formal dictionaries
    Cons: User-edited, so quality varies, but usually very good

Online dictionaries mainly relying on Chinese (only traditional)

  • Chinese spell checker
    What I use it for: Check the use of character variations in different words
    Pros: The above feature is unique as far as I know, incredibly useful
    Cons: None, really, this site fulfils its function pretty well
  • Character variant dictionary
    What I use it for: Sort out character variants (obviously)
    Pros: This site is indispensable for independent advanced learners
    Cons: It’s sometimes a bit confusing and doesn’t always give clear answers
  • Taiwan Ministry of Education Chinese Dictionary
    What I use it for: Look up words I can’t find anywhere else, single character information
    Pros: Comprehensive and detailed
    Cons: Archaic examples, hard definitions, too detailed (this is not beginner-friendly at all)
  • Taiwan Ministry of Education Elementary School Dictionary
    What I use it for: Single-character definitions and collocations in Chinese
    Pros: Easier to understand than its bigger cousin (see above)
    Cons: Only has single characters
  • Taiwan Ministry of Education Character Stroke Order Dictionary
    What I use it for: Check stroke order, check the current writing standard in Taiwan
    Pros: Detailed, well-structured, comprehensive
    Cons: None, really, it does the job pretty well

Offline dictionaries you should check out (apps)

  • Pleco
    What I use it for: Everything on the move, this is all you need, really
    Pros: Excellent handwriting input, OCR input, flashcards, excellent dictionaries
    Cons: Some functions aren’t free
  • Hanping
    What I use it for: Very similar to Pleco in terms of functionality
    Pros: Cheaper than Pleco
    Cons: Still costs money, fewer features than Pleco

Online corpora and other sentence sources

  • Jukuu
    What I use it for: Sentence mining, gathering large volumes of examples
    Pros: Contains a large number of sentences
    Cons: Sometimes hard to find actual sentences, some results are either only words or fragments
  • Iciba
    What I use it for: Similar to Jukuu above
    Pros: Contains a large number of sentences
    Cons: The English translations are horrible, don’t trust them more than you would trust Google Translate
  • Nciku
    What I use it for: Same as above; has fewer but in general better sentences
    Pros: Higher quality sentences with much better translations (reliable English in many cases)
    Cons: Lacks examples of uncommon words and sometimes has too few sentences to find the usage I’m after
  • LCMC
    What I use it for: Collocations, mostly
    Pros: Is a real, tokenised corpus, very big
    Cons: Hard to use if you’re not used to corpus research
  • Academia Sinica Balanced Corpus of Modern Chinese
    What I use it for: Most queries about traditional Chinese or Mandarin usage in Taiwan
    Pros: Is a real, tokenised corpus
    Cons: Only covers Taiwan, not big enough at times
  • Google
    What I use it for: Anything I can’t find using the other sources I’ve listed above
    Pros: Mindbogglingly high number of sentences
    Cons: Hard to find what you’re looking for, hard to be sure that what you find is actually a good example

That’s all for now, I think. If you have any suggestions for how to improve this list by replacing any dictionary with one which is strictly better, let me know! Remember, though, this isn’t an attempt to gather as many dictionaries as possible, but rather to list the best dictionaries for specific purposes. I will keep the list updated as I find better alternatives, please help!

Update: I removed Wenlin and added Hanping instead. Wenlin is great, but it’s very outdated and I can’t even use it with what I have available, whereas Hanping is much more likely to help students. I also removed I Cha Cha and added Youdao instead. The latter is roughly a hundred times better than the former and I blame my previous inclusion of I Cha Cha on plain ignorance.

Creating a powerful toolkit: Individual characters

Learning to read and write Chinese is not like learning to read and write most other languages. Chinese doesn’t make use of a simple alphabet to represent all the sounds of the spoken language, but rather many thousands of characters to represent various concepts. Thus, if your goal is to learn Chinese properly, it’s likely that learning to read and write is what will take you the longest time to accomplish. Fortunately, this is also an area where there are lots of hacks that will make the process a lot easier.

Before you read this article, I assume that you have already started building the first part of your toolkit for learning Chinese, i.e. that you know about radicals and character components. If you haven’t, read the first article about the toolkit here on Hacking Chinese.

Articles in this series

  1. Character components
  2. Individual characters (this article)
  3. Characters and words
  4. Learning words really fast

Learning characters with few strokes

Some characters, such as many radicals or some simplified characters, have very few strokes. Sometimes, they are pictures or represent logical concepts and these cases are easy to learn as long as you know what the character means. For instance, remembering that 一 means one and 人 means human is fairly obvious.

If you can’t figure it out just by looking at it (which you rarely can), head over to Zhongwen.com or Yellow Bridge and find the character you’re looking for. As soon as you’ve seen the logic behind the character, it becomes reasonably easy to remember that 下 means under, 上 means over and 大 means big (click the characters to follow links that will explain them). Even non-obvious explanations might help, such as for the character 水 (water). It’s probably impossible to guess the meaning of this character based only on what it looks like, but it’s not that hard to see it once you know the answer. Thus, knowing what a character represents is essential for remembering.

However, there are cases where the etymology is unhelpful, so you often have to come up with a mnemonic of your own to remember the character. This might also happen for some simplified characters which have simply lost their original meaning. It doesn’t matter what kind of trick you use to remember the character; anything goes as long as it helps you remember it. It’s pointless to learn the real etymology of a character if it doesn’t help you remember it!

Learning characters with many components

Most of the characters you will learn are fairly complex; they consist of many different parts that together make up a single character. This is even more true if you study traditional characters, but remember that most characters aren’t simplified at all and most of those that are might still be fairly complex (see this article for more about simplified and traditional Chinese). To learn these, you need to know what the component parts mean and then link them together using memory techniques. Again, you don’t need to care too much about the real origin of the character; as long as you use the real meaning of the component parts, you’re on the right track.

Here are three examples to show you how powerful this method can be:

愁 (chóu) – sorrow, worry

This character consists of three parts: 禾 (grain), 火 (fire) and 心 (heart). The first two are combined into 秋 (autumn). This is in reality a phonetic combination, but it’s easy (at least for me as a Swede) to see how plants in nature turn into fire as autumn approaches. According to the dictionary, the combination “autumn” added to “heart” is also phonetic (秋 and 愁 are pronounced similarly), but again, we don’t really care about that now. Doesn’t feeling like there’s autumn in your heart mean that you’re sorrowful? Approaching winter is also a reason to worry, especially if your harvest has burnt down.

政 (zhèng) – politics, government

This is a character that has a useful mnemonic in it already, so you don’t need to come up with something on your own. The character consists of two component parts, 正 (correct) and 攵 (strike), so who, if not the government, corrects bad behaviour by hitting people? It might be a cynical view of the state, but the image is easy to understand and remember. Since this is what we’re after, this is a good association.

覬 (jì) – covet, desire

The component parts are 山 (mountain), 豆 (bean) and 見 (see). The real origin of the character involves combining “see” with another character that has a similar sound, but whose meaning is completely unrelated. However, adding some humour to learning Chinese, it’s easy to create a new idiom: “the other man’s bean mountain is always taller”. Having come up with this mnemonic, I will never ever forget this character.

How to avoid the “it looks like a man with a hat” trap

For the simple characters I’ve said that anything that helps you remember works. This is not true for complex characters with many parts. If you’ve just started studying Chinese and encounter a character which looks like a man wearing a hat, don’t create a mnemonic based on that. It will work for a while, but what you have to realise is that soon you will have fifty characters which all look like different people in various kinds of hats and the system breaks down completely. Also, you can’t create thousands of these pictures without going insane. The solution is to use the real meaning of the component parts and then make mnemonics based on those! Feel free to go crazy, but do it using a solid foundation.

Be creative, have fun!

When you’ve been creating these kinds of memory aids for yourself for a while, you will get very good at it. Take it easy in the beginning and have fun, try to find as cool mnemonics as you can and share them here! I think that my “the other man’s bean mountain is always taller” is almost unbeatable, but perhaps you’ve found something better for another character?

Knowing how to learn individual characters, you are close to discovering how to learn words really fast, but first we need to look a little bit closer into characters and words.