Phonetic components, part 1: The key to 80% of all Chinese characters

When introducing characters for the first time, most teachers explain that there are six ways in which Chinese characters are formed (六书/六書 in Chinese, read more here if you don’t know what I’m talking about). The first category brought up is usually pictographs, which are (or at least were) pictures of objects in the real world.

Sometimes, teachers spend a lot of time explaining how these work, showing how a picture of the sun turned into 日, how the moon turned into 月 and a tree into 木. Then, to show that Chinese characters aren’t that scary, some teachers demonstrate that characters can be combined to form new characters, so that if 木 means tree, 林 means forest and 森 means luxuriant growth.

Regardless of what the teacher does next, this is what sticks in students’ minds. There might be explanations of the other ways characters are formed, but since these are less direct and require you to already understand a bit about characters before you fully grasp what it’s all about, they are either glossed over or not remembered by the students.

This is serious, because while pictographs are pretty and easy to explain, they only make up around 5% of all characters. Phonetic-semantic compounds, on the other hand, make up almost 80% of all characters. It’s funny that most material online and in textbooks tends to focus on the former and not the latter. Indeed, most textbooks I’ve seen don’t do more than give a few lines defining what phonetic-semantic compounds are.

A typical phonetic-semantic compound is shown in the picture to the right. It consists of one semantic part that relates to the meaning of the character (white, water in this case) and one phonetic part that indicates the pronunciation of the character (red, sheep in this case).

A huge majority of characters belong to one category: phonetic-semantic compounds

After the introduction course, teachers will assume that the students already learnt about phonetic-semantic compounds the first week, so no-one will really make up for it later. This means that there are myriads of intermediate and even advanced learners who haven’t actually understood why phonetic components are crucial.

This article is the first of two about phonetic components of Chinese characters. Apart from introducing phonetic components, this article will show you how knowing about them can help you tremendously with your character learning and your ability to read out loud (and even guess the pronunciation of characters you’ve never seen before). This is something native speakers do all the time and most second language learners pick up sooner or later. I’d like to make it sooner rather than later for you.

The second article deals with something much less widely discussed: How you use phonetic components to hack Chinese characters. This is a variant of horizontal character learning, where you focus on a common phonetic component in order to distinguish between visually similar characters that would otherwise be very hard to learn and would keep on trolling you for years. This is avoidable if you understand phonetic components.

What does a phonetic-semantic character look like?

In order to understand these characters, it helps to be aware of how they were created. Spoken language of course predates written language, so when people in ancient China started to write, they already had a developed spoken language they wanted to express using characters. The most obvious pictographs probably weren’t that hard since they are just slightly stylised versions of real-world objects, but it should be obvious to everyone that you can’t have a picture for every single word you want to say. How do you draw an ocean? What about love? Yesterday? An hour?

Of course, these concepts already existed in the spoken language, so what people started doing was combining one character that represented meaning (the semantic component) and one that represented the sound in the spoken language (the phonetic component). Thus, such a character consists of two completely different parts that have no relationship to each other, but which still make up a new character.

To show what I mean, let’s look at an example. Take 洋 (ocean): this character consists of water 氵 and sheep 羊. Now, it should be obvious that this is not simply a combination of two related characters to form a third related character (such as 木, 林 and 森). Instead, the semantic component 氵 tells us that the character is related to water and 羊 tells us that the character is pronounced the same way as sheep is, i.e. yáng.
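For readers who like to think in code, the structure just described could be modelled roughly as follows. This is only a sketch: the `Compound` class, its field names and the `guess_reading` helper are made up for illustration, and the tiny lookup table only contains readings mentioned in this article.

```python
from dataclasses import dataclass
from typing import Optional

# Readings of two characters that also serve as phonetic components.
# Hypothetical mini-dictionary for illustration, not a real data set.
COMPONENT_READINGS = {"羊": "yáng", "青": "qīng"}


@dataclass(frozen=True)
class Compound:
    """A phonetic-semantic compound split into its two parts."""
    char: str
    semantic: str   # hints at the meaning
    phonetic: str   # hints at the pronunciation


def guess_reading(compound: Compound) -> Optional[str]:
    """Use the reading of the phonetic component as a first guess.

    Returns None if the component is not in the lookup table.
    """
    return COMPONENT_READINGS.get(compound.phonetic)


ocean = Compound(char="洋", semantic="氵", phonetic="羊")
print(ocean.char, guess_reading(ocean))  # 洋 yáng
```

The guess is only a starting point, since the reading of a compound often differs from that of its phonetic component in tone, initial or final.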

The power of phonetic components

As mentioned above, this kind of construction makes up around 80% of all characters in Chinese. That’s a considerable majority and if you want to learn many characters, you need to understand how they work. Most importantly, knowing about phonetic-semantic compounds gives you clues about the pronunciation of characters. Thus, it’s not true that written and spoken Chinese are completely separate, because in most cases, there is a phonetic component to the character. The phonetic component might have mutated through the ages, sometimes beyond recognition, but most are still clearly discernible.

Here are some examples of phonetic components and characters they appear in, along with their pronunciations in Mandarin, as well as their meanings (which are usually unrelated to the pronunciation, of course). I have only included common sample characters; there are many more.

Phonetic component: 羊, yáng (sheep)

  • 洋, yáng (ocean)
  • 樣, yàng (manner, appearance)
  • 養, yǎng (to support, to raise)
  • 氧, yǎng (oxygen)

Phonetic component: 青, qīng (green/blue)

  • 請, qǐng (please, to ask)
  • 清, qīng (clear)
  • 情, qíng (emotion)
  • 晴, qíng (clear, fine)

As you can see, sometimes the pronunciation isn’t identical. For instance, the characters might have different tones (氧/洋, yǎng/yáng), initials (湯/傷, tāng/shāng) or finals (踉/浪, liàng/làng), or any combination of these, but these are still incredibly valuable clues. Some phonetic components are extremely regular. Have a look at these characters: 碟, 諜, 喋, 牒, 堞, 蝶, 蹀, 鰈. They are all pronounced dié!
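To make the three kinds of difference concrete, here is a rough Python sketch that splits a pinyin syllable into initial, final and tone, and reports which parts differ between two readings. The function names are made up for this illustration, and the splitting is deliberately simplified: it treats y and w as initials and does not handle every edge case of pinyin spelling.

```python
import unicodedata

# Combining diacritics used for the four Mandarin tones in pinyin.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}

# Two-letter initials come first so that "sh" is matched before "s".
# Simplification: y and w are treated as initials here.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_syllable(pinyin):
    """Split one tone-marked pinyin syllable into (initial, final, tone)."""
    tone = 5  # 5 stands for the neutral tone (no tone mark found)
    letters = []
    for ch in unicodedata.normalize("NFD", pinyin.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    plain = unicodedata.normalize("NFC", "".join(letters))
    initial = next((i for i in INITIALS if plain.startswith(i)), "")
    return initial, plain[len(initial):], tone


def differences(a, b):
    """List which of initial, final and tone differ between two syllables."""
    names = ("initial", "final", "tone")
    parts = zip(names, split_syllable(a), split_syllable(b))
    return [name for name, x, y in parts if x != y]


print(differences("yǎng", "yáng"))    # ['tone']
print(differences("tāng", "shāng"))   # ['initial']
print(differences("liàng", "làng"))   # ['final']
```

An empty list means the two readings are identical, which is the case for every character in the dié series above.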

Towards a better understanding of Chinese characters

This is just the beginning. When you understand what phonetic components are, you will see them all over the place. Chinese characters look very confusing at first, but phonetic components make up the most important piece in the puzzle. Read the second article about how to hack Chinese characters with our newly-acquired knowledge of phonetic components!

Update: Regarding phonetic components, I just thought of another pair: 唐 and 庸. This is actually a perfect case, since all common characters with 唐 are pronounced exactly like 唐 (táng): 糖塘搪瑭醣溏螗磄, and all common characters with 庸 are pronounced like 庸 (yōng): 慵镛墉鳙鄘. So, in essence, you just need to create one mnemonic for 唐 and one for 庸 and you’ll never confuse these characters again! I don’t know if this particular pair has caused you any problems, but since this seems to be a perfect case, I thought I’d share it with you.
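If you keep your vocabulary in a spreadsheet or flashcard program, you could measure how regular a phonetic component is in a similar way. The sketch below is only an illustration: the `regularity` function is made up, and the readings are simply the ones given in this article, not taken from a dictionary.

```python
import unicodedata


def _norm(text):
    """Normalise pinyin so that visually identical strings compare equal."""
    return unicodedata.normalize("NFC", text)


def regularity(component_reading, readings):
    """Share of characters read exactly like the phonetic component.

    `readings` maps each character to its pinyin reading.
    """
    target = _norm(component_reading)
    matches = sum(1 for r in readings.values() if _norm(r) == target)
    return matches / len(readings)


# Readings as stated in the article.
tang_series = {char: "táng" for char in "糖塘搪瑭醣溏螗磄"}
yong_series = {char: "yōng" for char in "慵镛墉鳙鄘"}
yang_series = {"洋": "yáng", "樣": "yàng", "養": "yǎng", "氧": "yǎng"}

print(regularity("táng", tang_series))  # 1.0
print(regularity("yōng", yong_series))  # 1.0
print(regularity("yáng", yang_series))  # 0.25
```

A score of 1.0 marks a perfectly regular component such as 唐 or 庸, while 羊 scores lower in this sample because three of the four readings differ from yáng in tone.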

21 thoughts on “Phonetic components, part 1: The key to 80% of all Chinese characters”

  1. Good article as always. Underlying all this, I think the key thing to mention is how important studying radicals is (evident in the examples cited in the article). Combine the knowledge of radicals + familiarity with character composition (knowledge that phono-semantic compounds exist) + natural human ability to extrapolate = character acquisition.

    Horizontal learning can be helpful, but I can also see this may be difficult for some as this is basically studying homophones.


    1. Yes, but if you don’t do horizontal learning, you will forget how to write these characters over and over again. In fact, I think the main problem in most cases is that the learner isn’t even aware that there are several very similar characters that are giving him/her a hard time. This is also where it’s essential to learn characters as well, because even though they are homophones (or close to), they still look different and mean different things. Without that knowledge, it would be madness to try to study homophones like this!

  2. Great point! Actually this is complementary to studying radicals, since the radical is normally the semantic part. Characters are typically organized by radical, so as a student it can take a while before you spot these phonetic links.

    Notable exception is, which tries to organize per phonetic component. I have it in book form (“Chinese Characters” by Rick Harbaugh) and used it a lot during my studies.

    1. Yes, exactly, and the reason I wrote this is because most students and teachers focus only on the semantic parts (at best).

  3. My teacher hasn’t touched on phonetic components at all! I will definitely have to bring it up after my next class. Great article!

  4. What this actually does not explain at all is how do you figure out what is the phonetic part and what is the semantic part?

    1. This should be obvious most of the time, you just break down the character and see which part corresponds to the sound and which to the meaning. If there is no obvious connection, it’s either not a phonetic-semantic character or one that has changed too much to be actually useful. You can also use or to check.

  5. I actually did a phonetic analysis of around 18,000 characters, looking for phonetic components.

    What I found is that while phonetic components are present in most characters, they aren’t important for early-stage characters.

    In the first 100 commonly studied characters there were maybe 3 that had a phonetic component (本、方、 and something else). Characters with phonetic components don’t really become important until after the first 1000 or so characters. At that point, a learner probably has started to notice some phonetic components by themselves.

    I think this should be a technique for intermediate to advanced learners. If you try introducing it too early, you’ll end up bombarding beginners with characters that aren’t important yet.

    1. Thanks for pointing this out! I think the reason teachers focus on pictographs and ideograms is not only that they are easier to learn, but also because many basic characters belong to this category. Do you have a breakdown of the number or percentage of phonetic-semantic characters for different sets of characters, like the 1-1000 most common, 1000-2000 most common and so on? That would be really interesting.

      Still, I don’t agree with you that this should be kept until the students reach an intermediate level. I don’t mean to say that it needs to be taught thoroughly for all characters, but just highlighting the principle might help a lot. There are lots of characters that students learn very early that are obvious candidates, like 馬(馬,碼,媽,罵)and 洋(樣,洋,養,癢). Of course, it becomes more and more important the more characters you learn, but I’m not sure that students pay enough attention to realise exactly how important it is.

  6. My question is still, how do I know when a component is the radical or phonetic? How do you distinguish?
    (Sorry if it was already mentioned somewhere, I might have missed it)
    And thanks!

    1. It’s noted in most character dictionaries. If you want to do this online, you can use which explicitly states which is the phonetic component, or HanziCraft which won’t tell you explicitly, but shows you similarities between the pronunciation of the character and that of the component parts.

  7. For those who wish to learn characters in any kind of family (whether semantic, phonetic, or other), my publication “Radical plus” on my website should be useful: you can visit all the families where characters share a shape, up to HSK 5 (1850 characters) with all details. Characters up to a total of 3100 also appear, without details (pinyin only).

  8. Great article! I am experimenting with Chinese character mnemonics (The Heisig method and Zhongwen) again in order to learn Japanese. There are two questions that came to me recently and I wanted to ask them here because I think they fit into this comment section. Sorry if these questions were already posted and/or answered.

    1. How do native Chinese people see the characters? From the semantic-phonetic theory, I would think that they see a semantic radical in combination with a phonetic radical (which very often is a radical combination or a combination of altered radicals). So, in a sense, they “only” see two parts, which would explain why the characters can be read so quickly. This leads to my next question.

    2. In order to apply mnemonics to the characters (remembering the meaning of a character, not the reading), wouldn’t it be a good idea to learn all the radical combinations that make up the phonetics, then have a picture/association for each phonetic, and then just build mnemonics? The major difference from the normal approach is that not only would you learn the roughly 200 basic radicals, but also ALL the radical combinations that pop up. That’s a much bigger task, but I think this would enable you to read the characters as quickly as native people do (see question 1).

    Waiting for your thoughts!

