Hacking Chinese

A better way of learning Mandarin

Chinese input methods: A guide for second language learners

Making Chinese characters appear on the screen of a phone or computer is second nature to experienced users, and most people don’t think about input methods unless they switch to another method or something doesn’t work properly.

For people who know little or no Chinese, however, including beginner students of the language, it’s not at all obvious how Chinese characters are entered on a computer.



Some people imagine humongous keyboards with one button for each character, but that’s obviously not how it works today.

Instead, there are several categories of input methods:

  • Phonetic typing – Type how something is pronounced and the computer will try to guess what characters to use. There are many different methods in this category depending on the transcription system used and the purpose of typing.
  • Non-phonetic typing – Instead of relying on pronunciation, these input methods make use of the components and composition of Chinese characters, building up the character through a sequence of strokes. Modern versions still use standard keyboards, of course.
  • Handwriting – Modern touch-screen devices support handwritten Chinese. You write characters directly on the screen with your finger or a stylus, and the computer gives you its best guess for what you mean.
  • Speech recognition – Voice controlled assistants have become more popular in recent years, and phones are now quite good at transcribing what you say into Chinese characters, at least provided that your pronunciation is clear or your utterance contains enough context.

There’s a lot to say about each of these input methods (and I have written about some of them before, which I will refer to when relevant).

It’s important to keep in mind that these input methods are largely developed for native speakers with no thought given to second language learners.

Benefits for learning vs. time and effort required

In this article, therefore, I want to focus on input methods from the perspective of a student learning Chinese as a second language. This involves balancing two factors:

  1. Impact on learning – The input method you use has an impact on your learning. For example, if you rely only on handwritten input even when writing long emails, your handwriting ability will benefit more than if you type Pinyin and just assume the computer gives you the correct characters. On the other hand, handwriting characters does not require you to recall pronunciation in the same way that typing Pinyin does.
  2. Time and effort required – Writing practice is best done in a communicative context, and in real life, speed and convenience matter at least as much as how much you learn. Most people opt for the easiest method available, even if it has serious downsides for their learning. For example, even though I could write long emails in Chinese by hand, I almost never do so because it takes five times longer than typing.

Let’s have a look at the different categories of input methods and see what they have to offer for language learners!

What are the pros and cons with typing, handwriting and speech-to-text?

Naturally, there are no simple answers here, as the analysis depends on your goals for learning Chinese in the first place. It goes without saying that using handwriting input is a waste of time if you don’t care about being able to write by hand at all. My goal here is to provide the basic information and insights and let you draw your own conclusions about what works best for you.

Phonetic typing

This is by far the most commonly used type of input method for Chinese, both on computers and phones. It requires you to know how to spell the spoken word to be able to type it, which is fine for most people as most people anchor their written language in the spoken language anyway.

There are several different types of phonetic typing methods depending on which phonetic transcription system is used.

Pinyin-based input methods

Pinyin is of course the most commonly used system, and there are dozens of different input methods that rely on it. Keyboards in China typically use a Qwerty layout and the ANSI standard should be familiar to most Americans, but somewhat awkward to us European ISO lovers.

Exactly which keyboard layout is used is only of peripheral interest here as it can often be customised (I type Chinese using Pinyin on an ISO board with a special Swedish version of Dvorak, for example).

If you need help with Pinyin-based input methods, I strongly recommend Pinyin Joe’s Chinese Computing Help Desk, which has saved my sanity several times over the years.

There are also very fast input methods used by professional typists, which are very hard to learn and out of the question for most second language learners. As these methods require a ton of practice to master and have almost no extra benefits for learning the language, I will ignore them in this article. For those who are interested, search for 速录机 for one example (here’s a short documentary in Chinese).

Pros of typing in Pinyin: The biggest pro with using a standard Pinyin-based input method as a student is that it connects the spoken language with the characters. It requires you to think about pronunciation and constantly verify that you got it right.

However, standard methods do not require you to input tones, which is a drawback. Some input methods have this as an option, so check if yours does. This both helps you review tones and shortens the list of word candidates at the same time.
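To see why typing tones shortens the candidate list, here is a minimal sketch in Python. The tiny lexicon and the `candidates` function are made up for illustration, and the tone is written as a trailing digit (for example `shi4`), which is only one possible convention. Real input methods use far larger dictionaries and rank candidates by frequency and context.

```python
# Toy lexicon: a few common characters read "shi", grouped by tone.
# Illustrative only; a real input method knows many more characters.
LEXICON = {
    ("shi", 1): ["师", "诗"],
    ("shi", 2): ["十", "时", "石"],
    ("shi", 3): ["使", "史"],
    ("shi", 4): ["是", "事", "市"],
}

def candidates(typed: str) -> list[str]:
    """Return candidate characters for Pinyin typed with or without a tone digit."""
    if typed and typed[-1].isdigit():
        # Tone given: only characters with exactly that tone remain.
        syllable, tone = typed[:-1], int(typed[-1])
        return list(LEXICON.get((syllable, tone), []))
    # No tone given: every tone of the syllable is a candidate.
    return [ch for (syl, _tone), chars in LEXICON.items() if syl == typed for ch in chars]

print(len(candidates("shi")))  # 10
print(candidates("shi4"))      # ['是', '事', '市']
```

Even in this toy example, adding the tone cuts the list from ten candidates to three, and it forces you to recall the tone before you see the character.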

Another upside with typing Pinyin is that you can keep your typing speed from other languages using the Latin alphabet. You will also be able to use public computers or friends’ computers since this is the standard in China.

Cons of typing in Pinyin: The downside of using this type of input method is that it only allows for very shallow processing of Chinese characters. Depending on how diligently you monitor your output, it might be even worse than reading.

As anyone who has studied Chinese for a while will know, being able to type something does not mean that you are able to write it by hand, so if that’s something you want to be able to do, you need to learn and maintain that skill elsewhere.

In Japanese, there’s even a term for being able to type characters but not write them by hand: ワープロ馬鹿, which translates to “word-processor idiot”.

Zhuyin-based input methods

Zhuyin (or Bopomofo) is the standard transcription system used in Taiwan. It relies on special symbols for initials, medials and finals, and so naturally requires you to learn those symbols to be useful.

Beyond that, it shares most of the pros and cons with other phonetic typing methods in that it connects speaking and writing, but that character processing is very shallow at best.

As I discussed in my series of articles about transcription systems, learning Zhuyin might have some benefits for pronunciation, but it’s not very practical for language learners when it comes to typing, and you can reap the benefits without using it as your main input method for Chinese characters.

In summary, Zhuyin adds the hassle of having to learn a new keyboard layout without having any major advantages when it comes to entering text on phones and computers. You might want to learn it for other reasons, though.

Non-phonetic typing

From a typing perspective, the main disadvantage of phonetic typing is that it is ambiguous.

Typing rare characters is a good example of this, especially if the syllable in question is common: If the shì you want is number 89 on the list of options, you’re better off typing a word it’s contained in and then deleting the extra character. This works, but is clearly inefficient!

Non-phonetic typing instead relies on the components Chinese characters are built out of. There are many different ways of doing this, but the easiest way to think about it is that each key represents several possible components, and by combining several keys, the computer will then output a specific character.

As this process is perfectly predictable, you don’t need to wait to see what the computer spits out before proceeding; you can be sure it’s right provided you hit the right keys.
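As a rough illustration of this predictability, the sketch below uses a handful of codes from Boshiamy (嘸蝦米), a shape-based input method that a reader describes in the comments at the end of this article. The codes are quoted from that comment; the lookup table and the `type_code` helper are hypothetical simplifications, and real systems do have some codes that are shared by several characters.

```python
# Shape-based codes quoted from a reader comment about Boshiamy.
# Simplifying assumption: each code maps to exactly one character.
CODES = {
    "OAO": "哈",
    "BLO": "陪",
    "HBDH": "掰",
    "WI": "了",
    "S": "三",
}

def type_code(keys: str) -> str:
    """Return the character for a key sequence; no candidate list is involved."""
    return CODES[keys.upper()]

print("".join(type_code(k) for k in ["oao", "s"]))  # 哈三
```

The point is that the same keys always give the same character, so there is nothing to choose from and nothing to check on the screen.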

In contrast, what characters show up when you type in Pinyin is based on context, including your personal typing background, which is often a good thing, but can also lead to slower typing as you need to monitor the output more closely.

A well-known example is the Wubi input method (full name: 五笔字型输入法). [Image: a keyboard showing legends for Wubi]

For traditional Chinese, there’s also Cangjie, which is more universal, and Zhengma, but these systems generally share the same pros and cons, so I will discuss them together here. If you have insights to share about any of these, please leave a comment below!

Pros with using non-phonetic typing: The biggest advantage for learners is that you need to know the character to type it, which requires deeper processing and therefore is better for retention.

Another advantage is that typing is not related to pronunciation, so it can be used even if you don’t know how a character is pronounced, or to type in a dialect-neutral (or even language-neutral) fashion.

Cons with using non-phonetic typing: Naturally, the downside is that non-phonetic typing takes a long time to learn, much longer than most students are willing to invest.

I also doubt its merits for long-term learning. In my experience, humans are capable of completely automating even very complex motor sequences to the point where we might not even be able to untangle individual moves in the sequence.

For example, I know many algorithms for solving certain situations on a Rubik’s cube, but some of them I can’t do slowly. I can do the full 12-move sequence (or whatever) quickly in a flow, but I would struggle to write the steps down. Similarly, I suspect that people who really learn non-phonetic typing methods will end up not really thinking about the characters that much.

Still, getting there requires a lot of paying attention to components, which is bound to have a positive impact on your knowledge of them. Also, it would only be the most common characters that get fully automated, and those you probably know anyway.

In any case, I rule these methods out for the vast majority of second language learners because it’s inconceivable that the time spent learning the input method would lead to better learning results than simply investing that effort into learning characters in any other way.

I can still see why some people are drawn to these methods (I feel the urge sometimes myself), but unless you feel a very strong attraction indeed, I think it’s better to stay away.

Handwriting input

While few native speakers under 60 use handwriting input, it is a viable but rather slow method of inputting Chinese characters, mostly on mobile devices (you can connect your phone and use it to write on your computer, too, though).

Strange as it may seem to foreign learners of Mandarin, older native speakers aren’t very comfortable with Pinyin, so handwriting is a convenient alternative. Nowadays, though, voice recognition has become so good that handwriting has become even rarer, but more about this later.

When writing by hand, the software will try to determine what character you want to write and will show you the best candidates for you to choose among. You can even see this best guess updated live while you add strokes, which is pretty cool.

Provided that your stroke order and stroke count are correct, this input method is quite reliable, and you can even join strokes together and still get the right result.

Pros with using handwriting input: The advantage here should be obvious: You are actually writing characters, which requires much deeper processing than anything else mentioned in this article.

This is not necessarily the same as writing on paper, but it’s pretty close, and you also often get confirmation that you’ve written the character correctly and will see a computer-rendered version of it when you’re done.

Naturally, this does not protect you against writing the wrong character. Handwriting input is part of my minimum-effort approach to learning to write Chinese characters. 


Cons with using handwriting input: Handwriting is very slow compared to the other methods, even if you know all the characters. As a learner, you’re likely to run into characters you don’t know how to write, which will slow you down dramatically. Of course, this could be viewed as an opportunity to learn, but as an input method, this is still a big disadvantage.

However, as Kristin pointed out in a comment, on a phone, typing is not very fast to begin with, or at least not as fast as on a full-sized keyboard. That means that with enough practice, handwriting can actually in some cases be faster, which removes the main disadvantage! Naturally, the more you write by hand, the faster you will get, so this input method also becomes more practical the more you use it.

Speech recognition and speech to text

Speech recognition has come very far in recent years, far enough that it’s a viable option for inputting text in most major languages. That doesn’t mean that speech-to-text is necessarily a good way to input characters for learners of Chinese, though.

I have investigated this question before and written two articles about my findings:

  1. The first one asks the question of whether or not speech recognition is good enough to correctly transcribe clear, native speech. The answer is that it does very well, except in some specific situations.
  2. The second article investigates how well speech recognition does with audio produced by second language learners. The answer is mixed.

Another way of looking at the same data is to say that if you say something, and your phone gets it right (displays the characters you intended), you can’t be sure that your pronunciation is okay. However, if you say something, and your phone gets it wrong, you can be pretty sure that your pronunciation is off.

This means that speech recognition can work as a low-threshold check for pronunciation. If your pronunciation is really bad, you’ll know, but it won’t tell you if you’re mediocre or good.
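The asymmetry can be spelled out as a small decision rule. This is only a sketch of the reasoning above: the function name and the wording of the verdicts are invented for illustration, and no real speech recognition software is involved.

```python
def pronunciation_verdict(intended: str, transcribed: str) -> str:
    """One-sided check: a mismatch is informative, a match is not."""
    if transcribed != intended:
        # The phone is lenient, so failing its check suggests a real problem.
        return "probably off"
    # A match only means you cleared a low bar; it says nothing more.
    return "inconclusive"

print(pronunciation_verdict("你好", "你好"))  # inconclusive
print(pronunciation_verdict("你好", "泥耗"))  # probably off
```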

So what does this mean for language learners?

It means that the feedback you get on your pronunciation is not good enough to be useful. Your phone or computer will understand things you’re saying, even if you mangle some tones and your initials and finals are a bit off. That essentially means that it’s just a bit better than talking to yourself in terms of speaking practice, which has its benefits, but also obvious limitations.

Pros of using speech recognition: The main benefit of speaking instead of typing is that you skip Pinyin and go directly to the spoken language. It also includes some minimum amount of feedback in the sense that if you’re completely off, it probably won’t display the characters you had in mind. This requires you to actually monitor your output to spot inconsistencies, though.

Cons of using speech recognition: Even though speech recognition at first looks appealing for language learners, I think the downsides clearly outweigh the upsides. The fact that your phone will recognise what you want to say even if you miss tones, initials or finals means that you could be reinforcing incorrect pronunciation by relying on speech to text. If you use this method, be aware that your phone is much more lenient than any teacher would be, so unless your bar is really low, you can’t trust it for feedback.

Conclusion

At the end of the day, phonetic typing wins because it’s the most convenient and also connects your writing with the spoken language. This is even more true if you enable tones.

The only other option that I think learners should seriously consider is using handwriting input occasionally. You don’t have to use it as your main input method, but make a habit of inputting characters by hand regularly. You don’t have to write the whole text this way, but maybe write the first five sentences, then switch to a faster input method?

The other methods, non-phonetic typing and voice recognition, are only useful in specific contexts, some of which I might have overlooked.

Finally, some questions for you:

  • What input methods do you use?
  • What effects do you think this has had on your learning so far?
  • Which input methods are you interested in checking out? Why?






4 comments

  1. Kristin says:

    I use handwriting almost exclusively, as I actually find it faster than pinyin and more helpful for learning as I speak a lot more than I write, so I need more review of writing than pinyin. I use pinyin in specific situations – on the occasion I’m using my computer rather than my phone, or when I only have one hand free (more common in the last 6 weeks since I had a baby), or, very occasionally, when I just can’t get the character I want with handwriting. I find pinyin actually takes longer as I have to look for the right character and make sure the one that comes up is the one I want. Maybe it’s a matter of practice, and since I use handwriting a lot more I’m better at that. My husband is one of those rare natives under 60 (he’s 31) who also prefers handwriting. He is somewhat of a purist though. He’s been teaching our 5-year-old (who speaks Mandarin by choice despite living in an English speaking country with only one native Chinese parent) to write characters but hasn’t mentioned pinyin yet (this probably makes more sense with a child than with an adult learning a second language, and I suppose pinyin will come up eventually).

    1. Olle Linge says:

      Your comment highlights something I didn’t really go into much in the article, namely that typing on phones is obviously much slower than on computers, especially for some people. I can type at 120 words per minute in English, and while it’s hard to compare, I can easily type Chinese at above 100 characters per minute even with a fairly bad input method. That is clearly not possible when writing by hand (that’s almost two Chinese characters a second).

      However, on a phone, this is quite different. I don’t type very fast at all, so the step to handwriting is actually much smaller, and handwriting could, at least in some cases like you say, even be faster! I’ll add a comment to this effect in the article, as I think it’s something I should have mentioned, but didn’t. Thank you!

  2. Cowbay says:

    After years of learning Taiwanese Mandarin with Pinyin and Bopomofo, I found myself struggling to remember the rarer characters, as well as actively recalling any characters outside the few most common ones, so I started experimenting with Boshiamy (嘸蝦米), a shape-based IME. I’m not very proficient yet, but I already noticed how it forces me to remember character components, and now I’m having a much easier time actively recalling characters from memory. For some characters, it’s because of being able to remember the character itself more easily, and for the others I am able to remember its code and then reconstruct the character based on the code.

    Boshiamy is somewhat similar to Cangjie, but has optional shortcuts rather than having to always type the full sequence. According to some sources on the internet, that allows Boshiamy to have an average code length of 2.5 compared to Cangjie’s 4.5, and for that reason it consistently wins at speed typing competitions. Cangjie differs a lot between operating systems and versions, while Boshiamy seems fairly consistent from what I’ve seen so far. In Taiwan and Hong Kong, many keyboards come with Cangjie symbols printed on them, but that’s not really needed for Boshiamy and you can use any QWERTY keyboard, as most components are more or less logically assigned to a Latin letter.

    The main drawback of Boshiamy is that it’s a proprietary product, even though the codes can be found for free and used with ibus. Also, some codes are loosely based on components’ Mandarin or Taiwanese pronunciations, or their meanings in English, which can be confusing/undesirable for some people, but for me is quite interesting. Compared to Cangjie, there are more overlapping codes and more characters that share the same code, but that doesn’t seem difficult to get used to.

    Some examples:
    哈 (OAO) components look like letters O A O
    陪 (BLO) 阝 and 口 look like B and O, 立 is from “li”
    掰 (HBDH) 手 means Hand, 八 and 刀 are pronounced “ba” and “dao”
    了 (WI) the character looks like W from an angle, I is just a filler representing the last stroke because the code is shorter than three characters
    三 (S) numbers 1-10, as well as some other most common characters like 的, have one letter codes

    The IME’s logic is not hard to learn, the hardest thing is remembering the letter assignments of all the components as they don’t always make obvious sense. The official website offers a lot of practice material, however it’s all in Chinese, so it might not be very accessible to people who don’t have an advanced command of the language. If you do feel up for a challenge, however, I’d wholeheartedly recommend it.

    1. Olle Linge says:

      Thank you for sharing! This is very interesting indeed, but I do find the mix of similarity with English letters and actual components a bit confusing; maybe it makes more sense when you get into it more. It would be interesting to take a deeper dive into different non-phonetic typing systems, but it almost requires personal experience to say much about them from a learner perspective, which isn’t something I really have time to do. That makes comments like yours even more valuable, so thank you again for sharing!
