syllables consist of an optional initial sound (声母), a final sound (韵母), and a tone
initials: b c ch d f g h j k l m n p q r s sh t x z zh
finals: a ai an ang ao e ei en eng i ia ian iang iao ie in ing iong iu o ong ou u ü ua uai uan üan uang üe ueng ui un ün uo
4 tones and neutral
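a minimal sketch of splitting a numbered pinyin syllable into initial, final and tone, using the initials listed above. the function name `split_syllable` and the use of 5 for the neutral tone are assumptions for illustration; syllables spelled with y/w (yi, wu, yu) are returned without an initial and are not normalized.

```python
INITIALS = "b c ch d f g h j k l m n p q r s sh t x z zh".split()


def split_syllable(syllable: str) -> tuple[str, str, int]:
    """Split numbered pinyin such as 'zhong1' into (initial, final, tone)."""
    tone = 5  # assumption: 5 stands for the neutral tone when no digit is given
    if syllable and syllable[-1].isdigit():
        tone = int(syllable[-1])
        syllable = syllable[:-1]
    # try two-letter initials (zh, ch, sh) before one-letter ones
    for initial in sorted(INITIALS, key=len, reverse=True):
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return initial, syllable[len(initial):], tone
    # no initial: the whole syllable is the final
    return "", syllable, tone
```

for example, `split_syllable("zhong1")` gives `("zh", "ong", 1)` and `split_syllable("an4")` gives `("", "an", 4)`.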
exceptions not reflected in pinyin
a third tone followed by another third tone is pronounced as second tone; for example, 33 -> 23. longer sequences, such as 333, may become 223 or 323, depending on phrasing
不 bu4 is second tone when followed by a fourth tone. 44 -> 24
一 yi1 becomes second tone before a fourth tone, like 不 (14 -> 24), and fourth tone before a first, second or third tone (11/12/13 -> 41/42/43). it keeps first tone when it stands alone
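the three rules above can be sketched as one pass over (character, tone) pairs. this is a simplified illustration, not a full model: `apply_sandhi` is a hypothetical name, runs of third tones are always resolved left to right (so 333 comes out as 223, ignoring phrasing), and cases the notes do not cover (neutral tones, 一 used as a digit or ordinal) are left unchanged.

```python
def apply_sandhi(syllables):
    """syllables: list of (character, citation_tone); returns the spoken tones."""
    spoken = []
    for i, (char, tone) in enumerate(syllables):
        # citation tone of the following syllable, or None at the end
        next_tone = syllables[i + 1][1] if i + 1 < len(syllables) else None
        if tone == 3 and next_tone == 3:
            tone = 2                      # 33 -> 23
        elif char == "不" and next_tone == 4:
            tone = 2                      # 44 -> 24
        elif char == "一" and next_tone is not None:
            if next_tone == 4:
                tone = 2                  # 14 -> 24
            elif next_tone in (1, 2, 3):
                tone = 4                  # 11/12/13 -> 41/42/43
        spoken.append(tone)
    return spoken
```

for example, `apply_sandhi([("你", 3), ("好", 3)])` returns `[2, 3]`, and a lone `[("一", 1)]` stays `[1]`.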
繁体字, traditional characters: in common use in hong kong, macau and taiwan, as well as in south korea and japan to a certain extent
shared component
  same pinyin -> hit
  same syllable
    same tone -> hit
    different tone -> contrast cue
  same syllable prefix -> cue
  same syllable suffix -> cue
  similar phonetic prefix (c/z, s/sh, j/q, zh/ch, k/g, t/d) -> cue
  same tone -> weak cue
  opposite tone (2/4) -> strong contrast cue (if same syllable)
uniqueness (unique pinyin-tone combination) -> hit
same pinyin -> hit
same syllable
  same syllable prefix -> cue
  same syllable suffix -> cue
  same tone -> weak cue
similar phonetic prefix -> cue
same tone -> weak cue
opposite tone (2/4) -> contrast cue (if tone contrast is known)
visual similarity -> rough cue
same field -> weak cue
opposites -> weak cue
rarity (<3 contexts) -> ad-hoc mnemonic
tones are marked with numbers
tones are alternatively marked with diacritics
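a sketch of converting one numbered syllable to its diacritic form. the placement rule used here is the common convention (mark a or e if present, the o in ou, otherwise the last vowel); the name `numbered_to_diacritic` is hypothetical, and the function handles a single syllable, not whole words.

```python
import unicodedata

# combining marks for tones 1-4: macron, acute, caron, grave
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}


def numbered_to_diacritic(syllable: str) -> str:
    """Convert e.g. 'hao3' to 'hǎo'; a neutral or missing tone gets no mark."""
    if not syllable or not syllable[-1].isdigit():
        return syllable
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in TONE_MARKS:
        return base
    if "a" in base:
        pos = base.index("a")
    elif "e" in base:
        pos = base.index("e")
    elif "ou" in base:
        pos = base.index("o")
    else:
        positions = [i for i, ch in enumerate(base) if ch in "aeiouü"]
        if not positions:
            return base
        pos = positions[-1]  # e.g. the u in 'liu', the i in 'dui'
    marked = base[:pos + 1] + TONE_MARKS[tone] + base[pos + 1:]
    # compose the vowel and the combining mark into one character
    return unicodedata.normalize("NFC", marked)
```

for example, `numbered_to_diacritic("dui4")` places the mark on the i, and `numbered_to_diacritic("ma5")` returns plain `ma`.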
frequency lists
frequency list 1 50000 entries
frequency list 2 9933 entries
frequency list 3 10000 entries
名词 noun, 代词 pronoun, 动词 verb, 形容词 adjective, 副词 adverb, 数词 numeral, 量词 measure word, 连词 conjunction, 介词 preposition, 助词 particle, 叹词 interjection, 拟声词 onomatopoeia
主语 subject, 谓语 predicate, 宾语 object, 定语 attributive, 状语 adverbial, 补语 complement, 短语 phrase, 分句 clause, 句子 sentence
reading
listening
beginner
pronounced zeros
patterns
pattern example pinyin value
_ 五 wu3 5
十 十 shi2 10
十_ 十四 shi2si4 14
_十 三十 san1shi2 30
_十_ 五十七 wu3shi2qi1 57
_百 三百 san1bai3 300
_百零_ 二百零二 er4bai3ling2er4 202
_百_十 一百二十 yi1bai3er4shi2 120
_百_十_ 一百三十五 yi1bai3san1shi2wu3 135
_千 一千 yi1qian1 1000
_千零_ 一千零一 yi1qian1ling2yi1 1001
_千_百 一千二百 yi1qian1er4bai3 1200
_千_百_十 一千二百三十 yi1qian1er4bai3san1shi2 1230
_千_百_十_ 一千二百三十五 yi1qian1er4bai3san1shi2wu3 1235
_万 一万 yi1wan4 10000
_万零_ 一万零一 yi1wan4ling2yi1 10001
_万零_十 一万零一十 yi1wan4ling2yi1shi2 10010
_万_千_百 一万二千三百 yi1wan4er4qian1san1bai3 12300
_万_千_百_十_ 一万二千三百四十五 yi1wan4er4qian1san1bai3si4shi2wu3 12345
十万 十万 shi2wan4 100000
十万零_ 十万零一 shi2wan4ling2yi1 100001
十万零_十 十万零一十 shi2wan4ling2yi1shi2 100010
十万_千 十万一千 shi2wan4yi1qian1 101000
十_万_千_百_十_ 十二万三千四百五十六 shi2er4wan4san1qian1si4bai3wu3shi2liu4 123456
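the patterns above, including where 零 is pronounced, can be sketched as a small converter. this is an illustration under assumptions: `to_hanzi` and `_four_digits` are hypothetical names, only 1 to 99,999,999 is supported, and the leading 一 is dropped only when the whole number starts with 一十 (十四, 十万), matching the examples in the list.

```python
DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]


def _four_digits(n: int) -> str:
    """Spell 1..9999, inserting one 零 for each run of skipped inner places."""
    out, zero_pending = [], False
    for power in (3, 2, 1, 0):
        d = n // 10 ** power % 10
        if d == 0:
            zero_pending = bool(out)  # only zeros after a nonzero digit count
            continue
        if zero_pending:
            out.append("零")
            zero_pending = False
        out.append(DIGITS[d] + UNITS[power])
    return "".join(out)


def to_hanzi(n: int) -> str:
    if not 0 < n < 10 ** 8:
        raise ValueError("supported range is 1 to 99,999,999")
    high, low = divmod(n, 10_000)
    if high == 0:
        text = _four_digits(low)
    else:
        text = _four_digits(high) + "万"
        if low:
            if low < 1000:
                text += "零"  # the thousands place is skipped after 万
            text += _four_digits(low)
    # 一十 at the very start is shortened to 十
    return text[1:] if text.startswith("一十") else text
```

for example, `to_hanzi(10010)` returns `一万零一十` and `to_hanzi(123456)` returns `十二万三千四百五十六`.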
max(1, 10 * (unique_chars_length / all_possible_chars_length + median(last_10(unique_chars_frequency_indices)) / all_possible_chars_length))
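a literal, runnable transcription of the formula above. the notes do not say what the value represents, so the function name `score` is a placeholder. two readings are assumptions: `last_10` is taken as the last ten items of the list as given, and an empty list contributes 0 instead of raising.

```python
from statistics import median


def score(unique_chars_length, all_possible_chars_length,
          unique_chars_frequency_indices):
    """Evaluate the formula from the notes; the result is never below 1."""
    last_10 = unique_chars_frequency_indices[-10:]
    # assumption: with no indices yet, the median term is treated as 0
    typical_index = median(last_10) if last_10 else 0
    return max(1, 10 * (unique_chars_length / all_possible_chars_length
                        + typical_index / all_possible_chars_length))
```

with 5000 known characters out of 10000 and the last ten frequency indices all at 5000, `score(5000, 10000, [5000] * 10)` evaluates to `10.0`.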
/[\u{30A0}-\u{30FF}\u{2E80}-\u{2EFF}\u{31C0}-\u{31EF}\u{4E00}-\u{9FFF}\u{3400}-\u{4DBF}\u{20000}-\u{2A6DF}\u{2A700}-\u{2B73F}\u{2B740}-\u{2B81F}\u{2B820}-\u{2CEAF}\u{2CEB0}-\u{2EBEF}\u{30000}-\u{3134F}\u{31350}-\u{323AF}\u{2EBF0}-\u{2EE5F}]/gu
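a sketch of the same character class in python, used to collect the unique characters of a text in order of first appearance. the ranges are copied from the pattern above; `unique_chars` is a hypothetical helper name. note that the first range (U+30A0 to U+30FF) is the katakana block and is kept exactly as in the source.

```python
import re

# (start, end) code points, in the same order as the pattern above
RANGES = [
    (0x30A0, 0x30FF), (0x2E80, 0x2EFF), (0x31C0, 0x31EF),
    (0x4E00, 0x9FFF), (0x3400, 0x4DBF), (0x20000, 0x2A6DF),
    (0x2A700, 0x2B73F), (0x2B740, 0x2B81F), (0x2B820, 0x2CEAF),
    (0x2CEB0, 0x2EBEF), (0x30000, 0x3134F), (0x31350, 0x323AF),
    (0x2EBF0, 0x2EE5F),
]

CJK_CHAR = re.compile(
    "[" + "".join(f"{chr(lo)}-{chr(hi)}" for lo, hi in RANGES) + "]"
)


def unique_chars(text: str) -> list[str]:
    """Return each matching character once, in order of first appearance."""
    # dict.fromkeys keeps insertion order and drops repeats
    return list(dict.fromkeys(CJK_CHAR.findall(text)))
```

for example, `unique_chars("你好, hello 你们好")` returns `["你", "好", "们"]`.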