Let’s talk about radicals in Chinese characters, like 虫 in 蚊, and why they don’t work the way that you think they work.
This discussion will take us outside of China and into parts of the historical “sinographosphere”: 🇰🇷🇰🇵🇯🇵, and especially 🇻🇳.
🧵
We'll start with a little quiz. Here are five characters. Which one doesn’t fit the category that the others are in?
a 𠃣
b 𠀧
c 𡈺
d 𠳒
e 夠
Before you answer, let me warn you that it’s a trick question.
You said (e) 夠, didn’t you?
Even though I warned you it was a trick question?
Well, I can’t actually say you’re wrong. Of these five, 夠 is the only character used in Standard Written Chinese.
Quite possibly it’s the only one of the five that you recognized.
So (e) is a fair answer.
But the answer I was looking for is (d) 𠳒.
Before I explain why, let’s have another go, this time with a clue: The category to which all but one of these characters belong is related to character structure and component function.
a 𡗉
b 夠
c 𦊚
d 𠸗
e 說
The answer is …
.
.
.
.
.
.
.
.
(e) 說.
Did you get it? No?
Okay, I see that I’d better start explaining myself before the crowd gets restless and starts pelting me with rotten tomatoes.
To start the explanation:
The following characters were all created in Vietnam to write Vietnamese words. They are part of the chữ Nôm script:
𡗉𠃣𠀧𦊚𡈺𠸗𠳒
We really shouldn’t call them “Chinese characters”, as they are not part of the mainstream written Chinese tradition.
Instead, I’ll use the term “sinogram” to refer to any graph adapted or derived from Chinese characters or their components, regardless of whether the graph is used to write a Chinese language or some other language.
The umbrella term “sinogram” covers not only mainstream and “dialectal” Chinese characters, but also Japanese kanji, Korean hantcha, and Vietnamese chữ Hán, including kokuji 国字, kukcha 국자, and chữ Nôm 𡨸喃.
But I’m getting ahead of myself. I promised to talk about radicals, and explain my answers to the quizzes above.
To do that, I’m going to have to back up a bit and talk about Chinese phonetic-semantic compound characters, aka xíngshēngzì 形聲字. Then we’ll come back around to the characters in the two exercises, and see how this all applies to an analysis of Vietnamese chữ Nôm graphs.
So settle in, this will take a little while. But it’ll be worth your time.
Phonetic-semantic compound graphs are, by far, the most common type of Chinese character.
If you can read Standard Written Chinese (in Mandarin, Cantonese, or any other mainstream reading tradition pronunciation), or are familiar with Japanese on-yomi or Korean hantcha pronunciations, then you are certainly familiar with this type of character and how it works.
An example: 情. It writes a word* meaning ‘emotion’ and has two parts: (1) the phonetic component 青 which indicates the (approximate) syllabic pronunciation of the word; (2) the semantic component 忄 < 心 ‘heart’ which indicates the meaning of the word.
(*well, a morpheme, but let’s just say “word” for simplicity)
But here’s the thing: everything you’ve been told about phonetic-semantic compounds is a lie. 🤯
Well okay, not everything. The phonetic part is fine, you haven’t been lied to about that. 😌
Here are pronunciations of 情 and 青 in Mandarin, Cantonese, Japanese, Korean. You can easily see that the pronunciation of 青 is a decent approximation of the pronunciation of 情:
So what’s going on here? Why do we say that 忄 and ⺮ indicate the meanings of these characters, when they most plainly do no such thing?
What, then, do they do?
The answer is that they represent meaning *categories*.
The categories have a metonymic or superset relationship to the base meaning of the radical.
The meaning category indicated by 忄 is [mental or emotional state].
The meaning categories indicated by ⺮ are [types of bamboo] and [objects made of bamboo].
We can describe the function of these radicals as “taxonomic”, or classificatory. A character like 情 has a structure that can be called "phonetic-taxonomic".
This method of forming characters is millennia-old and deeply ingrained in the Chinese script. New characters continue to be created on this model.
For example, the metallic element Mendelevium (Md, #101) was first synthesized in 1955 and was named after the Russian chemist Dmitri Mendeleev. The Mandarin name given to it was mén.
In order to write this new word mén, a new phonetic-taxonomic character was created.
For the pronunciation part, the homophone 門/门 was chosen. For the radical, 金 was chosen.
金 itself means ‘gold’, but as a taxonomic component it indicates [metals] and [objects made of metal].
And so the character 鍆/钔 (mén ‘Mendelevium’) was created.
New characters have been created in this way continuously over the history of the script.
Okay, if you’re still with me, we’re ready to get back to the sinograms in the quizzes.
🙌
The Chinese character 夠 (M. gòu ‘enough’), as ordinary as it seems, is actually rather unusual.
Why?
Because it’s not phonetic-taxonomic.
🤯
夠 may superficially look like 狗 (M. gǒu ‘dog’) and 鉤 (M. gōu ‘hook’), but it has a different functional structure. The component 多 is not a radical and does not indicate a meaning category.
In fact, 多 ‘much, many’ is actually a (near-)synonym of gòu ‘enough’. The graph 夠 is "phonetic-synonymic". It’s got a sound part and a meaning part, and the meaning part is very specific to the meaning of the word in a way that the radicals of most Chinese characters are not.
(Note that the Song dynasty medieval dictionary Guǎngyùn 廣韻 defines 夠 as "多" duō ‘many’.)
And this brings us, at last, to the Vietnamese sinograms we started with.
Beginning around the 14th century, Vietnamese speakers began to write the colloquial Vietnamese language using adapted Chinese characters. The script they developed is called chữ Nôm.
These Vietnamese speakers took as a starting point their shared knowledge of Classical Chinese, the written language of Vietnam at that time. Specifically, their raw material was the set of mainstream Chinese characters which represented Chinese words.
Those characters/words each had a meaning and a standard Sino-Vietnamese reading pronunciation (much like Korean hantcha-ŭm 한자음 and Japanese on-yomi 音読み).
While many Vietnamese words were written with existing Chinese characters, in some cases native Vietnamese words were written with innovated sinograms: newly-created graphs made up of a meaning component and a sound component, based on the meanings and sounds of Classical Chinese.
To take a very simple example: If you are a 14th-century Vietnamese speaker who is literate in Classical Chinese, how might you choose to write the Vietnamese word ba ‘three’?
Answer: 𠀧. The component 巴 has standard Sino-Vietnamese reading ba; it indicates the syllabic sound. The component 三 writes a Classical Chinese word meaning ‘three’; it indicates the meaning.
The result, 𠀧, is a phonetic-semantic compound.
But note that it is more like 夠 than it is like 情 or 鍆.
The semantic part, 三, is a synonym of the Vietnamese word ba ‘three’, not a category-indicator. The sinogram 𠀧 is *phonetic-synonymic*, not *phonetic-taxonomic*.
Here are some more chữ Nôm innovated graphs. These six examples are all phonetic-synonymic.
1-3.
𡗉 nhiều ‘many’, from 堯 (SV nhiêu) and 多 (‘many’)
𠃣 ít ‘few’, from 乙 (SV ất) and 少 (‘few’)
𦊚 bốn ‘four’, from 四 (‘four’) and 本 (SV bổn)
4-6.
𦹵 cỏ ‘grass’, from 草 (‘grass’) and 古 (SV cổ)
𠸗 xưa ‘old, ancient’, from 初 (SV sơ) and 古 (‘ancient’)
𡈺 tròn < *klon ‘round’, from 圓 (‘round’) and 侖 (SV luân)
And here’s one that is phonetic-taxonomic, using a Chinese radical to indicate a meaning category.
𠳒 lời ‘spoken word’, from 口 [actions using the mouth] and 𡗶 (trời ‘sky’)
Note that its sound component is itself a chữ Nôm innovated graph.
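If it helps to see the two structures side by side, here is a minimal sketch in Python that records the seven graphs just listed as data and groups them by structure type. The class name `Sinogram`, its field names, and the helper `group_by_structure` are made up for illustration; only the glyphs, readings, glosses, and components come from the list above.

```python
from collections import defaultdict
from dataclasses import dataclass

# Labels for the two structure types discussed in the thread.
SYNONYMIC = "phonetic-synonymic"
TAXONOMIC = "phonetic-taxonomic"


@dataclass(frozen=True)
class Sinogram:
    """One innovated graph (hypothetical record layout, for illustration only)."""
    glyph: str         # the chữ Nôm graph itself
    reading: str       # the Vietnamese word it writes
    gloss: str         # English gloss of that word
    sound_part: str    # component that hints at the pronunciation
    meaning_part: str  # component that hints at the meaning
    structure: str     # SYNONYMIC or TAXONOMIC


# The six phonetic-synonymic examples plus the one phonetic-taxonomic example.
NOM_GRAPHS = [
    Sinogram("𡗉", "nhiều", "many", "堯", "多", SYNONYMIC),
    Sinogram("𠃣", "ít", "few", "乙", "少", SYNONYMIC),
    Sinogram("𦊚", "bốn", "four", "本", "四", SYNONYMIC),
    Sinogram("𦹵", "cỏ", "grass", "古", "草", SYNONYMIC),
    Sinogram("𠸗", "xưa", "old, ancient", "初", "古", SYNONYMIC),
    Sinogram("𡈺", "tròn", "round", "侖", "圓", SYNONYMIC),
    # Here the meaning part 口 marks a category [actions using the mouth],
    # and the sound part 𡗶 is itself an innovated chữ Nôm graph.
    Sinogram("𠳒", "lời", "spoken word", "𡗶", "口", TAXONOMIC),
]


def group_by_structure(graphs):
    """Return a dict mapping each structure type to the glyphs that have it."""
    groups = defaultdict(list)
    for graph in graphs:
        groups[graph.structure].append(graph.glyph)
    return dict(groups)


if __name__ == "__main__":
    for structure, glyphs in group_by_structure(NOM_GRAPHS).items():
        print(structure, " ".join(glyphs))
```

Note that 古 shows up twice in different jobs: as the sound part of 𦹵 and as the meaning part of 𠸗. The same component can play either role, which is why the record keeps the two roles in separate fields.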
We’re now in a position to understand the structural difference between 𦹵 [⿰草古] (Vietnamese cỏ ‘grass’) and 苦 (Mandarin kǔ ‘bitter’), which are superficially so similar.
They are both made up of ‘grass’ and ‘old’. But the former is phonetic-synonymic, while the latter is phonetic-taxonomic. In the former 草 actually means ‘grass’; in the latter, ⺿ means [metonymically associated with a property of herbs and grasses].
Why are phonetic-synonymic graphs so common in Vietnamese sinographic writing, yet so rare in the Chinese script itself?
[Word to the weary, including me: we are only a half-dozen tweets from the end.]
Turns out it’s not hard to suss out the reason. When you move *across* languages, e.g. from Chinese to Vietnamese, it’s easy to find synonyms. You just think of the translation of your Vietnamese word in Chinese. But *within* a single language, synonyms are harder to come by.
If I’m an ancient Chinese speaker who needs to invent a new character to write the word fēng ‘maple tree’, it’s easy enough to find a character to serve as phonetic component: 風 fēng ‘wind’.
But ...
I’m not going to find a synonym for ‘maple tree’ elsewhere in my vocabulary. So my meaning component has to be a category indicator: 木 [tree].
So you see, the characters 𠳒 (Vietnamese lời ‘spoken word’) and 說 (Mandarin shuō ‘to speak’) have a lot in common structurally: they are both phonetic-taxonomic.
And the characters 𡗉 (Vietnamese nhiều ‘many’) and 夠 (Mandarin gòu ‘enough’) have a lot in common structurally: they are both phonetic-synonymic.
And that’s why the answer to quiz #1 is (d) 𠳒. It’s the only phonetic-taxonomic character in the set, i.e. the only one with a “radical”.
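For the programmers in the audience, both quizzes can be replayed as a small Python sketch. The table `STRUCTURE` just restates the classifications given in this thread, and the function `odd_one_out` is a hypothetical helper written for this illustration: it picks the one option whose structure type nobody else in the set shares.

```python
from collections import Counter

SYNONYMIC = "phonetic-synonymic"
TAXONOMIC = "phonetic-taxonomic"

# Structure types as classified in the thread above.
STRUCTURE = {
    "𠃣": SYNONYMIC,  # ít 'few'
    "𠀧": SYNONYMIC,  # ba 'three'
    "𡈺": SYNONYMIC,  # tròn 'round'
    "𡗉": SYNONYMIC,  # nhiều 'many'
    "𦊚": SYNONYMIC,  # bốn 'four'
    "𠸗": SYNONYMIC,  # xưa 'old, ancient'
    "夠": SYNONYMIC,  # Mandarin gòu 'enough'
    "𠳒": TAXONOMIC,  # lời 'spoken word'
    "說": TAXONOMIC,  # Mandarin shuō 'to speak'
}


def odd_one_out(options):
    """Return the single option whose structure type is unique in the set.

    Raises ValueError if there is not exactly one such option.
    """
    counts = Counter(STRUCTURE[glyph] for glyph in options)
    unique_types = [kind for kind, n in counts.items() if n == 1]
    if len(unique_types) != 1:
        raise ValueError("no single odd one out in this set")
    return next(g for g in options if STRUCTURE[g] == unique_types[0])


if __name__ == "__main__":
    quiz_1 = ["𠃣", "𠀧", "𡈺", "𠳒", "夠"]
    quiz_2 = ["𡗉", "夠", "𦊚", "𠸗", "說"]
    print("Quiz 1:", odd_one_out(quiz_1))
    print("Quiz 2:", odd_one_out(quiz_2))
```

The helper only knows about structure type, so it gives the answer I was looking for, (d) and (e), and not the equally fair "only one used in Standard Written Chinese" answer to quiz #1.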
@Tao_Collective@KIRINPUTRA@viroraptor@homosappiest@xiao_collective@catielila@BadLingTakes They aren't commensurate, for several reasons: (1) The textual record is incomplete, much is lost to us. So there might be words attested only in texts that haven't survived. (2) Because writing is employed only in certain socio-cultural contexts and is not a precise
This will be my last follow-up to this earlier thread on Pokémon names. I just want to give a shout-out to some of the researchers and their work on Pokémon names ("Pokémonastics") that I learned about from replies posted to the thread.
Shigeto Kawahara seems to be the dominant figure in the field. He was lead author of this paper that demonstrated, among other things, correlation between the length (in moras) of Pokémon names and the size, weight, and evolution status of the Pokémon.
Within my thread on Pokémon names posted last week, I talked about the English, German, and Japanese names of the three Pokémon pictured here, which make up an evolutionary family.