Hacking Chinese

A better way of learning Mandarin

Phonetic components, part 1: The key to 80% of all Chinese characters

When introducing characters for the first time, most teachers explain that there are six different ways in which characters are composed in Chinese (六书/六書 in Chinese, read more here if you don’t know what I’m talking about). The first category brought up is usually pictographs, which are (or at least were) pictures of objects in the real world.

Sometimes, teachers spend a lot of time explaining how these work, showing how a picture of the sun turned into 日, how the moon turned into 月 and a tree into 木. Then, to show that Chinese characters aren’t that scary, some teachers demonstrate that characters can be combined to form new characters, so that if 木 means tree, 林 means forest and 森 means luxuriant growth.

Regardless of what the teacher does next, this is what sticks in students’ minds. There might be explanations of the other ways characters are formed, but since these are less direct and require you to already understand a bit about characters before you fully grasp what it’s all about, they are either glossed over or not remembered by the students.

This is serious, because while pictographs are pretty and easy to explain, they only make up around 5% of all characters. Phonetic-semantic compounds, on the other hand, make up almost 80% of all characters. Funny that most material online and in textbooks tends to focus on the former and not the latter. Indeed, most textbooks I’ve seen don’t do more than give a few lines defining what phonetic-semantic compounds are.

A typical phonetic-semantic compound is shown in the picture to the right. It consists of one semantic part that relates to the meaning of the character (white, water in this case) and one phonetic part that indicates the pronunciation of the character (red, sheep in this case).

A huge majority of characters belong to one category: phonetic-semantic compounds

After the introduction course, teachers will assume that the students already learnt about phonetic-semantic compounds the first week, so no-one will really make up for it later. This means that there are myriads of intermediate and even advanced learners who haven’t actually understood why phonetic components are crucial.

This article is the first of two about phonetic components of Chinese characters. Apart from introducing phonetic components, this article will show you how knowing about them can help you tremendously with your character learning and your ability to read out loud (and even guess the pronunciation of characters you’ve never seen). This is something native speakers do all the time and most second language learners pick up sooner or later. I’d like to make it sooner rather than later for you.

The second article deals with something much less widely discussed: How you use phonetic components to hack Chinese characters. This is a variant of horizontal character learning, where you focus on a common phonetic component in order to distinguish between visually similar characters that would otherwise be very hard to learn and would keep on trolling you for years. This is avoidable if you understand phonetic components.

What does a phonetic-semantic character look like?

In order to understand these characters, it helps to be aware of how they were created. Spoken language of course predates written language, so when people in ancient China started to write, they already had a developed spoken language they wanted to express using characters. The most obvious pictographs probably weren’t that hard since they are just slightly stylised versions of real-world objects, but it should be obvious to everyone that you can’t have a picture for every single word you want to say. How do you draw an ocean? What about love? Yesterday? An hour?

Of course, these concepts already existed in the spoken language, so what people started doing was combining one character that represented meaning (the semantic component) and one that represented the sound in the spoken language (the phonetic component). Thus, such a character consists of two completely different parts that have no relationship to each other, but which still make up a new character.

To show what I mean, let’s look at an example. 洋 (ocean) – this character consists of water 氵 and sheep 羊. Now, it should be obvious that this is not simply a combination of two related characters to form a third related character (such as 木, 林 and 森). Instead, the semantic component 氵 tells us that the character is related to water and 羊 tells us that the character is pronounced the same way as sheep is, i.e. yáng.

The power of phonetic components

As mentioned above, this kind of construction makes up around 80% of all characters in Chinese. That’s a considerable majority and if you want to learn many characters, you need to understand how they work. Most importantly, knowing about phonetic-semantic compounds gives you clues about the pronunciation of characters. Thus, it’s not true that written and spoken Chinese are completely separate, because in most cases, there is a phonetic component to the character. The pronunciation hinted at by a phonetic component might have changed through the ages, sometimes beyond recognition, but in most cases the link is still clearly discernible.

Here are some examples of phonetic components and characters they appear in, along with their pronunciations in Mandarin, as well as their meanings (which are usually unrelated to the pronunciation). I have only included common sample characters; there are many more.

Phonetic component: 羊, yáng (sheep)

  • 洋, yáng (ocean)
  • 樣, yàng (manner, appearance)
  • 養, yǎng (to support, to raise)
  • 氧, yǎng (oxygen)

Phonetic component: 青, qīng (green/blue)

  • 請, qǐng (please, to ask)
  • 清, qīng (clear)
  • 情, qíng (emotion)
  • 晴, qíng (clear, fine)

As you can see, sometimes the pronunciation isn’t identical. For instance, the characters might have different tones (氧/洋, yǎng/yáng), initials (湯/傷, tāng/shāng) or finals (踉/浪, liàng/làng), or any combination of these, but these are still incredibly valuable clues. Some phonetic components are extremely regular. Have a look at these characters: 碟, 諜, 喋, 牒, 堞, 蝶, 蹀, 鰈. They are all pronounced dié!
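If you like to tinker, the three kinds of mismatch described above can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the readings are typed in by hand as numbered pinyin (yang2 for yáng), the function names `split_syllable` and `compare` are made up for this example, and y and w are treated as initials purely to keep the splitting simple.

```python
# Pinyin initials. Two-letter initials come first so that "sh" wins over "s".
# Assumption: y and w are treated as initials to keep the splitting simple.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(syllable):
    """Split numbered pinyin such as 'yang2' into (initial, final, tone)."""
    tone = int(syllable[-1])
    body = syllable[:-1]
    for initial in INITIALS:
        if body.startswith(initial):
            return initial, body[len(initial):], tone
    return "", body, tone  # syllables without an initial, e.g. 'an1'


def compare(first, second):
    """Return which parts (initial, final, tone) differ between two syllables."""
    names = ("initial", "final", "tone")
    parts_a = split_syllable(first)
    parts_b = split_syllable(second)
    return [name for name, a, b in zip(names, parts_a, parts_b) if a != b]


print(compare("yang2", "yang2"))   # 羊 / 洋: []
print(compare("yang3", "yang2"))   # 氧 / 洋: ['tone']
print(compare("tang1", "shang1"))  # 湯 / 傷: ['initial']
print(compare("liang4", "lang4"))  # 踉 / 浪: ['final']
```

An empty list means the two readings are identical. The fewer entries in the list, the stronger the clue the phonetic component gives you.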

Towards a better understanding of Chinese characters

This is just the beginning. When you understand what phonetic components are, you will see them all over the place. Chinese characters look very confusing at first, but phonetic components make up the most important piece in the puzzle. Read the second article about how to hack Chinese characters with your newly acquired knowledge of phonetic components!

Phonetic components, part 2: Hacking Chinese characters

Update: Regarding phonetic components, I just thought of another pair: 唐 and 庸. This is actually a perfect case, since all common characters with 唐 are pronounced exactly like 唐 (táng): 糖塘搪瑭醣溏螗磄, and all common characters with 庸 are pronounced like 庸 (yōng): 慵镛墉鳙鄘. So, in essence, you just need to create one mnemonic for 唐 and one for 庸 and you’ll never confuse these characters again! I don’t know if this particular pair has caused you any problems, but since this seems to be a perfect case, I thought I’d share it with you.
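To make "perfect case" a bit more concrete, here is a small Python sketch that measures how regular a phonetic component is, using only the characters and readings listed in this article. The `FAMILIES` table and the `regularity` function are hypothetical names I made up for illustration, and the readings are hand-typed numbered pinyin, so treat this as a toy example rather than a proper dataset.

```python
# Readings copied from this article, written as numbered pinyin (tang2 = táng).
# Each entry maps a phonetic component to its own reading and to the
# characters listed for it. This is a hand-made toy table, not a real dataset.
FAMILIES = {
    "唐": ("tang2", {char: "tang2" for char in "糖塘搪瑭醣溏螗磄"}),
    "庸": ("yong1", {char: "yong1" for char in "慵镛墉鳙鄘"}),
    "羊": ("yang2", {"洋": "yang2", "樣": "yang4", "養": "yang3", "氧": "yang3"}),
}


def regularity(component):
    """Share of listed characters read exactly like their phonetic component."""
    reading, members = FAMILIES[component]
    exact = sum(1 for member_reading in members.values() if member_reading == reading)
    return exact / len(members)


for component in FAMILIES:
    print(component, regularity(component))
```

With the characters listed here, 唐 and 庸 both come out at 1.0, while 羊 comes out at 0.25 because three of its four sample characters differ in tone. A low score does not mean the component is useless: as the 羊 examples show, a tone difference still leaves you with the right initial and final.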



  1. Ping He says:

    Good article as always. Underlying all this, I think the key thing to mention is how important studying radicals is (evident in the examples cited in the article). Combine the knowledge of radicals + familiarity with character composition (knowledge that phono-semantic compounds exist) + natural human ability to extrapolate = character acquisition.

    Horizontal learning can be helpful, but I can also see this may be difficult for some as this is basically studying homophones.


    1. Olle Linge says:

      Yes, but if you don’t do horizontal learning, you will forget how to write these characters over and over again. In fact, I think the main problem in most cases is that the learner isn’t even aware that there are several very similar characters that are giving him/her a hard time. This is also where it’s essential to learn characters as well, because even though they are homophones (or close to), they still look different and mean different things. Without that knowledge, it would be madness to try to study homophones like this!

      1. William says:

        I love your article. Your exchange with Ping He and Steven Daniels in the comments is the EXACT conversation I have had with hundreds of teachers and fellow students for years. Yeah, why CAN’T we teach beginner students that 马 is the phonetic component of 妈? ~blank stares~ I think the problem stems from a widespread belief that Chinese characters shouldn’t have phonetic components. In this way of looking at things, any characters that do have an obvious phonetic component are something that was created later on, a corruption of a purer Chinese that once existed. It’s like it’s something embarrassing that we should try to ignore. I’m telling you, I couldn’t easily distinguish between 昨 and 作 until I knew that the function of 乍 in those characters was as a sound component. At that point, I had already passed HSK5 and was about to pass HSK6. (If I tell my teachers and friends that 乍 is a phonetic component of 昨&作, they will mostly just give me blank stares. How could ‘zha’ have anything to do with the sound of ‘zuo’? They give me a look of pity, like these two commenters above.) In my experience, many people consciously/subconsciously use the sound components to guess what sounds they are reading and then connect it with their knowledge of oral Chinese.

        Q1. will Chinese character learning be better if the function and meaning of semantic radicals is explicitly taught?
        YES, but that form of teaching Chinese doesn’t yet exist on Earth; it should include teaching the functional role of the phonetic components as well!
        Q2. can adult second language learners of Chinese pick up the functional role of semantic radicals while learning the meaning of Chinese characters?
        YES. My question is, can you learn the meaning of Chinese characters WITHOUT picking up the functional role of the phonetic components as well as semantic radicals?

  2. Great point! Actually this is complementary to studying radicals, since the radical is normally the semantic part. Characters are typically organized by radical, so as a student it can take a while before you spot these phonetic links.

    Notable exception is Zhongwen.com, which tries to organize per phonetic component. I have it in book form (“Chinese Characters” by Rick Harbaugh) and used it a lot during my studies.

    1. Olle Linge says:

      Yes, exactly, and the reason I wrote this is because most students and teachers focus only on the semantic parts (at best).

  3. Beth says:

    My teacher hasn’t touched on phonetic components at all! I will definitely have to bring it up after my next class. Great article!

  4. Michael says:

    What this actually does not explain at all is how do you figure out what is the phonetic part and what is the semantic part?

    1. Olle Linge says:

      This should be obvious most of the time, you just break down the character and see which part corresponds to the sound and which to the meaning. If there is no obvious connection, it’s either not a phonetic-semantic character or one that has changed too much to be actually useful. You can also use Zhongwen.com or Hanzicraft.com to check.

      1. Tamhas says:

        As a general rule, if a character component takes up all of the left side of a character, all of the top (with a left/right structure character below the radical) or all of the bottom (with a left/right structure character above the radical) then it’s the radical, or semantic component. The remaining part is the phonetic component. And yes, there’s a good chance that 1) you recognise it from elsewhere and 2) it has a similar sound to the character you know, or that it forms a part of.

  5. I actually did a phonetic analysis of around 18,000 characters, looking for phonetic components.

    What I found is that while phonetic components are present in most characters, they aren’t important for early-stage characters.

    In the first 100 commonly studied characters there were maybe 3 that had a phonetic component (本、方、 and something else). Characters with phonetic components don’t really become important until after the first 1000 or so characters. At that point, a learner probably has started to notice some phonetic components by themselves.

    I think this should be a technique for intermediate to advanced learners. If you try introducing it too early, you’ll end up bombarding beginners with characters that aren’t important yet.

    1. Olle Linge says:

      Thanks for pointing this out! I think the reason teachers focus on pictographs and ideograms is not only that they are easier to learn, but also because many basic characters belong to this category. Do you have a breakdown of the number or percentage of phonetic-semantic characters for different sets of characters, like the 1-1000 most common, 1000-2000 most common and so on? That would be really interesting.

      Still, I don’t agree with you that this should be kept until the students reach an intermediate level. I don’t mean to say that it needs to be taught thoroughly for all characters, but just highlighting the principle might help a lot. There are lots of characters that students learn very early that are obvious candidates, like 馬(馬,碼,媽,罵)and 羊(樣,洋,養,癢). Of course, it becomes more and more important the more characters you learn, but I’m not sure that students pay enough attention to realise exactly how important it is.

  6. Super, I am so happy after reading this article.

    Thanks for thinking of this type of article.

  7. Frederik Rasmussen says:

    My question is still, how do I know whether a component is the radical or the phonetic? How do you distinguish?
    (Sorry if it was already mentioned somewhere, I might not have gotten it)
    And thanks!

    1. Olle Linge says:

      It’s noted in most character dictionaries. If you want to do this online, you can use Zhongwen.com which explicitly states which is the phonetic component, or HanziCraft which won’t tell you explicitly, but shows you similarities between the pronunciation of the character and that of the component parts.

      1. Garry Gadsby says:

        I have recently finished the ‘Outlier Chinese Character Masterclass’ and I am an older adult learner that gets confused easily, but this course talks about these very issues of Character components and how to learn and study Chinese characters, which has encouraged me to begin my learning. I can see your website can be a great asset for me as well.

        Thanks so much.


  8. For those who wish to learn characters in any kind of family (whether semantic, phonetic, or other), my publication “Radical plus” on my website should be useful: you can visit all the families where characters share a shape, up to HSK 5 (1850 characters) with all details. Characters up to a total of 3100 also appear, without details (pinyin only).

  9. Tobias Wolf says:

    Great article! I am experimenting with Chinese character mnemonics (the Heisig method and Zhongwen) again in order to learn Japanese. There are two questions that came to me recently and I wanted to ask them here because I think they fit into this comment section. Sorry if these questions were already posted and/or answered.

    1. How do native Chinese people see the characters? From the semantic-phonetic theory, I would think that they see a semantic radical in combination with a phonetic radical (which very often is a radical combination or a combination of altered radicals). So, in a sense, they “only” see two parts, which would explain why the characters can be read so quickly. This leads to my next question.

    2. In order to apply mnemonics to the characters (remembering the meaning of a character, not the reading), wouldn’t it be a good idea to learn all the radical combinations that make up the phonetics, then have a picture/association for each phonetic, and then just build mnemonics? The major difference to the normal approach is that not only would you learn the roughly 200 basic radicals, but also ALL the radical combinations that pop up. That’s a way bigger task, but I think this would enable you to read the characters as quickly as native people do (see question 1).

    Waiting for your thoughts!


  10. manoj says:


  11. boctulus says:

    養 has no phonetic component

  12. boctulus says:

    Maybe I should retract my claim that 羊 is not the phonetic component of 養, but if it is, then it has changed a lot!

    Can it be confirmed? I researched a little before saying it’s not.

    1. Olle Linge says:

      Why do you say that? It hasn’t changed much at all. I’m no expert in actual etymology though, but the first source I found listed it as a semantic-phonetic compound. Now, using Wiktionary is not conclusive, but since the component is clearly in the character and the pronunciation is the same but with a different tone, I’m going to assume it’s right until I’m proven wrong! Even if it isn’t right etymologically, it still works in practice. 🙂

  13. Cat says:

    Does anyone have any tips for studying phonetic characters? Do you learn the meaning of all of them? For some, it makes sense to learn the meaning, e.g. 羊, but with more obscure meanings such as 臿, do you think it’s useful to know that it means to separate grain from husk, or is it ok to just know it as the phonetic component cha1?

    Any thoughts much appreciated!

  14. lizCOLE says:

    Q1. will Chinese character learning be better if the function and meaning of semantic radicals is explicitly taught?
    Q2. can adult second language learners of Chinese pick up the functional role of semantic radicals while learning the meaning of Chinese characters?

  15. Carson says:

    I was wondering…someone told me that the Chinese word for a large water-going craft was composed of the words for “boat,” “eight,” and “people.” Is this true, or are some parts of that word merely phonetic?

    1. Aurelio says:

      They meant 船 chuán: the 舟 is indeed a boat which is to be expected in a nautical term. 㕣 is a rather rare phonetic also found, e.g. in 鉛 qian. The interpretation as eight people 八口(sc. 人) is at best described as ‘fanciful’ (I would have said ‘humbug’), but Christian missionaries love it, because they take it as a sign that the ancient Chinese knew about Noah and his sons + their respective wives reportedly being a family of eight on the Ark and decided to bake this into the Chinese character system. Which is why this folk etymology keeps coming up again 々.

  16. John says:

    this is a good article i learned a lot to think about!

  17. Connor says:

    How should I be able to determine that two characters combined through this method give a particular word? For example, how would I be able to determine that the symbol for water and the symbol for sheep equals ocean?

    1. Olle Linge says:

      You almost never can. This is a memory aid, not a way to decipher characters you’ve never seen. However, you can often guess the pronunciation of a character if you’re good at phonetic components!

  18. Antonina 朵夏 says:

    Great article! I was aware of this semantic-phonetic division, but have never really read about it in such a consistent manner <3 It will also help me a lot to explain this idea to my students. They were very disappointed with Chineasy method – it's beautiful and promises that Chinese is easy, but the logic falls apart quickly if you focus only on pictographs.

  19. Sergey says:

    Here’s the best and most comprehensive table of Mandarin phonetic components I’ve found so far – http://chinese.exponode.com/r2_4.htm
    Approximately 1200 syllabic components. I have randomly checked through the first two parts of the 2013 official list “Tongyong guifan hanzi biao” (3500 and 3000 “frequent” and “common” characters respectively) and almost all are present in this table. It’s traditional but can be easily converted into simplified using online software.
    It’s really weird that there are so few such lists, since they make learning Chinese much easier and more fun.

    1. Olle Linge says:

      That does indeed look quite comprehensive, thank you for sharing! I will have a closer look and see if I can’t compile something more directly aimed at students.

  20. 闪电54 says:

    This is a great article and great information. I do have a question (I apologize if it’s already been asked and answered):

    What’s the best way to start learning these phonetic components? Is it better to just pick them out as you learn new Hanzi? Or would it be more beneficial to study a list of common phonetic components first before fully diving into learning Hanzi? Is there such a list available for simplified characters only?

    Thank you for all your information!

    1. Olle Linge says:

      I’m working on a list of the most important ones and will publish something about that in the relatively near future (this year, hopefully), but I don’t think that’s the right way to do it. If you’ve already been studying for a while and have focused too little on phonetic components, seeing such a list can be very helpful as it might open your eyes to how pervasive phonetic components are in Chinese. For most people, however, learning something like this from a list is not very good. In fact, learning words from lists in general is seldom a good idea. Most students do best by knowing about the existence of phono-semantic characters and roughly how they work, then paying attention to components and their pronunciation when learning characters. The really useful ones will stand out and be easy to see; the ones you have to look up in order to understand (such as 也 being the sound component of 他) are not necessary to learn, although it can help you understand the character and stop thinking about why the ancient Chinese associated the meaning of those characters with “he; him” (which they didn’t, because it’s not a semantic component).
