zev handel
19 Mar, 58 tweets, 12 min read
Let’s talk about radicals in Chinese characters, like 虫 in 蚊, and why they don’t work the way that you think they work.

This discussion will take us outside of China and into parts of the historical “sinographosphere”: 🇰🇷🇰🇵🇯🇵, and especially 🇻🇳.

🧵
We'll start with a little quiz. Here are five characters. Which one doesn’t fit the category that the others are in?

a 𠃣
b 𠀧
c 𡈺
d 𠳒
e 夠

Before you answer, let me warn you that it’s a trick question.
You said (e) 夠, didn’t you?

Even though I warned you it was a trick question?
Well, I can’t actually say you’re wrong. Of these five, 夠 is the only character used in Standard Written Chinese.

Quite possibly it’s the only one of the five that you recognized.

So (e) is a fair answer.
But the answer I was looking for is (d) 𠳒.
Before I explain why, let’s have another go, this time with a clue: The category to which all but one of these characters belongs is related to character structure and component function.

a 𡗉
b 夠
c 𦊚
d 𠸗
e 說
The answer is …

.
.
.
.
.
.
.
.

(e) 說.
Did you get it? No?

Okay, I see that I’d better start explaining myself before the crowd gets restless and starts pelting me with rotten tomatoes.
To start the explanation:

The following characters were all created in Vietnam to write Vietnamese words. They are part of the chữ Nôm script:

𡗉𠃣𠀧𦊚𡈺𠸗𠳒

We really shouldn’t call them “Chinese characters”, as they are not part of the mainstream written Chinese tradition.
Instead, I’ll use the term “sinogram” to refer to any graph adapted or derived from Chinese characters or their components, regardless of whether the graph is used to write a Chinese language or some other language.
The umbrella term “sinogram” covers not only mainstream and “dialectal” Chinese characters, but also Japanese kanji, Korean hantcha, and Vietnamese chữ Hán, including kokuji 国字, kukcha 국자, and chữ Nôm 𡨸喃.
But I’m getting ahead of myself. I promised to talk about radicals, and explain my answers to the quizzes above.
To do that, I’m going to have to back up a bit and talk about Chinese phonetic-semantic compound characters, aka xíngshēngzì 形聲字. Then we’ll come back around to the characters in the two exercises, and see how this all applies to an analysis of Vietnamese chữ Nôm graphs.
So settle in, this will take a little while. But it’ll be worth your time.
Phonetic-semantic compound graphs are, by far, the most common type of Chinese character.
If you can read Standard Written Chinese (in Mandarin, Cantonese, or any other mainstream reading tradition pronunciation), or are familiar with Japanese on-yomi or Korean hantcha pronunciations, then you are certainly familiar with this type of character and how it works.
An example: 情. It writes a word* meaning ‘emotion’ and has two parts: (1) the phonetic component 青 which indicates the (approximate) syllabic pronunciation of the word; (2) the semantic component 忄 < 心 ‘heart’ which indicates the meaning of the word.
(*well, a morpheme, but let’s just say “word” for simplicity)
But here’s the thing: everything you’ve been told about phonetic-semantic compounds is a lie. 🤯
Well okay, not everything. The phonetic part is fine, you haven’t been lied to about that. 😌
Here are pronunciations of 情 and 青 in Mandarin, Cantonese, Japanese, Korean. You can easily see that the pronunciation of 青 is a decent approximation of the pronunciation of 情:

M: 情 qíng, 青 qīng
C: 情 cing4, 青 cing1/ceng1
J: 情 jō/sei, 青 shō/sei
K: 情 chŏng, 青 ch’ŏng
It’s the semantic part, the part often called a “radical”, that isn’t what you've been led to believe.

And you’ll realize why the instant I describe it:
情 doesn’t mean ‘heart’. It means 'emotion'

So 心 is *not* the meaning of 情.

I know, right?
Here are some other characters that have 忄 or 心 ‘heart’ as their semantic component. None of them means ‘heart’.

忍 ‘to be patient’
忘 ‘to forget’
恐 ‘to fear’
念 ‘to think’
悅 ‘to be pleased’
忿 ‘to be angry’
恨 ‘to hate’
怨 ‘to resent’
急 ‘to be anxious’
怒 ‘to be angry’
Here are some characters that have ⺮ < ⽵ ‘bamboo’ as their radical. None of them means ‘bamboo’.

笠 ‘type of hat’
筐 ‘basket, chest’
筷 ‘chopsticks’
箕 ‘winnowing basket’
籃 ‘basket, cage’
簡 ‘slip, chit’
籠 ‘steaming basket’
So what’s going on here? Why do we say that 忄 and ⺮ indicate the meanings of these characters, when they most plainly do no such thing?

What, then, do they do?
The answer is that they represent meaning *categories*.

The categories have a metonymic or superset relationship to the base meaning of the radical.

The meaning category indicated by 忄 is [mental or emotional state].
The meaning categories indicated by ⺮ are [types of bamboo] and [objects made of bamboo].

We can describe the function of these radicals as “taxonomic”, or classificatory. A character like 情 has a structure that can be called "phonetic-taxonomic".
This method of forming characters is millennia-old and deeply ingrained in the Chinese script. New characters continue to be created on this model.

For example, the metallic element Mendelevium (Md, #101) was first synthesized in 1955 and was named after the Russian chemist Dmitri Mendeleev. The Mandarin name given to it was mén.
In order to write this new word mén, a new phonetic-taxonomic character was created.

For the pronunciation part, the homophone 門/门 was chosen. For the radical, 金 was chosen.
金 itself means ‘gold’, but as a taxonomic component it indicates [metals] and [objects made of metal].

And so the character 鍆/钔 (mén ‘Mendelevium’) was created.

New characters have been created in this way continuously over the history of the script.
Okay, if you’re still with me, we’re ready to get back to the sinograms in the quizzes.

🙌
The Chinese character 夠 (M. gòu ‘enough’), as ordinary as it seems, is actually rather unusual.

Why?

Because it’s not phonetic-taxonomic.

🤯
夠 may superficially look like 狗 (M. gǒu ‘dog’) and 鉤 (M. gōu ‘hook’), but it has a different functional structure. The component 多 is not a radical and does not indicate a meaning category.
In fact, 多 ‘much, many’ is actually a (near-)synonym of gòu ‘enough’. The graph 夠 is "phonetic-synonymic". It’s got a sound part and a meaning part, and the meaning part is very specific to the meaning of the word in a way that the radicals of most Chinese characters are not.
(Note that the Song dynasty medieval dictionary Guǎngyùn 廣韻 defines 夠 as "多" duō ‘many’.)
And this brings us, at last, to the Vietnamese sinograms we started with.

Beginning around the 14th century, Vietnamese speakers began to write the colloquial Vietnamese language using adapted Chinese characters. The script they developed is called chữ Nôm.
These Vietnamese speakers took as a starting point their shared knowledge of Classical Chinese, the written language of Vietnam at that time. Specifically, their raw material was the set of mainstream Chinese characters which represented Chinese words.
Those characters/words each had a meaning and a standard Sino-Vietnamese reading pronunciation (much like Korean hantcha-ŭm 한자음 and Japanese on-yomi 音読み).
While many Vietnamese words were written with existing Chinese characters, in some cases native Vietnamese words were written by innovated sinograms: newly-created graphs made up of a meaning component and a sound component, based on the meanings and sounds of Classical Chinese.
To take a very simple example: If you are a 14th-century Vietnamese speaker who is literate in Classical Chinese, how might you choose to write the Vietnamese word ba ‘three’?
Answer: 𠀧. The component 巴 has standard Sino-Vietnamese reading ba; it indicates the syllabic sound. The component 三 writes a Classical Chinese word meaning ‘three’; it indicates the meaning.
The result, 𠀧, is a phonetic-semantic compound.

But note that it is more like 夠 than it is like 情 or 鍆.

The semantic part, 三, is a synonym of the Vietnamese word ba ‘three’, not a category-indicator. The sinogram 𠀧 is *phonetic-synonymic*, not *phonetic-taxonomic*.
Here are some more chữ Nôm innovated graphs. These six examples are all phonetic-synonymic.

1-3.
𡗉 nhiều ‘many’, from 堯 (SV nhiêu) and 多 (‘many’)
𠃣 ít ‘few’, from 乙 (SV ất) and 少 (‘few’)
𦊚 bốn ‘four’, from 四 (‘four’) and 本 (SV bổn)
4-6.
𦹵 cỏ ‘grass’, from 草 (‘grass’) and 古 (SV cổ)
𠸗 xưa ‘old, ancient’, from 初 (SV sơ) and 古 (‘ancient’)
𡈺 tròn < *klon ‘round’, from 圓 (‘round’) and 侖 (SV luân)
And here’s one that is phonetic-taxonomic, using a Chinese radical to indicate a meaning category.

𠳒 lời ‘spoken word’, from 口 [actions using the mouth] and 𡗶 (trời ‘sky’)

Note that its sound component is itself a chữ Nôm innovated graph.
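For readers who think in data structures, here is one rough way to model the distinction. This is only a sketch: the `Compound` class, its field names, and the `by_structure` helper are hypothetical names invented for illustration, and the component analyses are simply the ones given in the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """A phonetic-semantic compound, as analyzed in this thread (hypothetical model)."""
    graph: str          # the sinogram itself
    gloss: str          # meaning of the word it writes
    phonetic: str       # component that hints at the sound
    semantic: str       # component that hints at the meaning
    semantic_role: str  # "synonymic" (means the same thing) or "taxonomic" (names a category)


# Analyses copied from the examples in the thread.
NOM_EXAMPLES = [
    Compound("𠀧", "three", "巴", "三", "synonymic"),
    Compound("𡗉", "many", "堯", "多", "synonymic"),
    Compound("𠃣", "few", "乙", "少", "synonymic"),
    Compound("𦊚", "four", "本", "四", "synonymic"),
    Compound("𦹵", "grass", "古", "草", "synonymic"),
    Compound("𠸗", "old, ancient", "初", "古", "synonymic"),
    Compound("𡈺", "round", "侖", "圓", "synonymic"),
    Compound("𠳒", "spoken word", "𡗶", "口", "taxonomic"),
]


def structure(compound: Compound) -> str:
    """Return the structural label used in the thread, e.g. 'phonetic-taxonomic'."""
    return f"phonetic-{compound.semantic_role}"


def by_structure(compounds: list[Compound]) -> dict[str, list[str]]:
    """Group graphs by their structural label."""
    groups: dict[str, list[str]] = {}
    for compound in compounds:
        groups.setdefault(structure(compound), []).append(compound.graph)
    return groups


if __name__ == "__main__":
    for label, graphs in by_structure(NOM_EXAMPLES).items():
        print(label, " ".join(graphs))
```

The point of keeping `semantic_role` as a separate field is that the two structures look identical on the page (sound part plus meaning part); only the function of the meaning part differs.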
We’re now in a position to understand the structural difference between 𦹵 [⿰草古] (Vietnamese cỏ ‘grass’) and 苦 (Mandarin kǔ ‘bitter’), which are superficially so similar.
They are both made up of ‘grass’ and ‘old’. But the former is phonetic-synonymic, while the latter is phonetic-taxonomic. In the former 草 actually means ‘grass’; in the latter, ⺿ means [metonymically associated with a property of herbs and grasses].
Why are phonetic-synonymic graphs so common in Vietnamese sinographic writing, yet so rare in the Chinese script itself?

[Word to the weary, including me: we are only a half-dozen tweets from the end.]
Turns out it’s not hard to suss out the reason. When you move *across* languages, e.g. from Chinese to Vietnamese, it’s easy to find synonyms. You just think of the translation of your Vietnamese word in Chinese. But *within* a single language, synonyms are harder to come by.
If I’m an ancient Chinese speaker who needs to invent a new character to write the word fēng ‘maple tree’, it’s easy enough to find a character to serve as phonetic component: 風 fēng ‘wind’.

But ...
I’m not going to find a synonym for ‘maple tree’ elsewhere in my vocabulary. So my meaning component has to be a category indicator: 木 [tree].
So you see, the characters 𠳒 (Vietnamese lời ‘spoken word’) and 說 (Mandarin shuō ‘to speak’) have a lot in common structurally: they are both phonetic-taxonomic.
And the characters 𡗉 (Vietnamese nhiều ‘many’) and 夠 (Mandarin gòu ‘enough’) have a lot in common structurally: they are both phonetic-synonymic.
And that’s why the answer to quiz #1 is (d) 𠳒. It’s the only phonetic-taxonomic character in the set, i.e. the only one with a “radical”.

And that’s why the answer to quiz #2 is (e) 說. It too is the only phonetic-taxonomic character in the set.
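If it helps to see the logic of both quizzes spelled out, here is a small sketch. The dictionaries just restate the analyses given in this thread, and `odd_one_out` is a hypothetical helper written for illustration: it picks the one graph whose meaning component plays a different role from all the others.

```python
from collections import Counter

# Role of the meaning component in each quiz graph, per the analysis in this thread.
QUIZ_1 = {
    "𠃣": "synonymic",
    "𠀧": "synonymic",
    "𡈺": "synonymic",
    "𠳒": "taxonomic",
    "夠": "synonymic",
}
QUIZ_2 = {
    "𡗉": "synonymic",
    "夠": "synonymic",
    "𦊚": "synonymic",
    "𠸗": "synonymic",
    "說": "taxonomic",
}


def odd_one_out(quiz: dict[str, str]) -> str:
    """Return the single graph whose role is shared by no other graph in the quiz."""
    counts = Counter(quiz.values())
    unique_roles = [role for role, n in counts.items() if n == 1]
    if len(unique_roles) != 1:
        raise ValueError("quiz does not have exactly one odd graph out")
    return next(graph for graph, role in quiz.items() if role == unique_roles[0])


if __name__ == "__main__":
    print(odd_one_out(QUIZ_1))
    print(odd_one_out(QUIZ_2))
```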

I presume, by now, that you’ve had “enough”. So time to stop.

/end
