  • My source collection
  • Terms for one, two, three, etc. in any language must be placed in numeral (not number) categories, or they will not count as lemmas.
  • A loan blend, or hybrid loanword, is a compound of a foreign word and a native word. Use both the com and bor templates for it.
  • Arbitrary PUA (Private Use Area) code points are not allowed in page names. For unencoded Han characters, use IDS (Ideographic Description Sequences) instead. See Category:Terms containing unencoded characters.
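
A minimal sketch of how a proposed page name could be checked for private-use code points, assuming Python's standard unicodedata module. The function name find_private_use and the sample titles are hypothetical and only illustrate the rule; this is not the check the wiki itself performs.

```python
import unicodedata


def find_private_use(title: str) -> list[str]:
    """Return every private-use character in title as a 'U+XXXX' string.

    Private-use code points have the Unicode general category 'Co'.
    """
    return [
        f"U+{ord(ch):04X}"
        for ch in title
        if unicodedata.category(ch) == "Co"
    ]


# Hypothetical titles, for illustration only.
print(find_private_use("a\ue000b"))   # contains one BMP private-use character
print(find_private_use("⿰氵曼"))      # an IDS-style string, no private-use characters
```

An empty list means the title contains no private-use characters. The IDS-style sample here only shows the notation; it happens to describe a character that is already encoded.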



  • Proto-Southwestern Tai & Proto-Tai tone symbols: ᴬ ᴮ ꟲ ᴰˢ ᴰᴸ ¹ ² ³ ⁴. There is no superscript capital S. 😭
  • Some Han characters may have 2 indices. Here is the list (Unicode 15.0.0):
  • 12 code points in the CJK Compatibility Ideographs range
    﨎, 﨏, 﨑, 﨓, 﨔, 﨟, 﨡, 﨣, 﨤, 﨧, 﨨, 﨩
    (U+FA0E, U+FA0F, U+FA11, U+FA13, U+FA14, U+FA1F, U+FA21, U+FA23, U+FA24, U+FA27, U+FA28, and U+FA29)
    lack a canonical Decomposition_Mapping value in UnicodeData.txt, so they are not true CJK Compatibility Ideographs. These twelve characters should be treated as proper CJK Unified Ideographs.
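
The claim about the twelve code points can be checked with a short sketch, assuming Python's standard unicodedata module (whose data tracks whichever Unicode version the interpreter ships, not necessarily 15.0.0). The names TWELVE, lacks_decomposition, and unified_in_compat_block are hypothetical helpers introduced for this illustration.

```python
import unicodedata

# The twelve code points listed above.
TWELVE = [
    0xFA0E, 0xFA0F, 0xFA11, 0xFA13, 0xFA14, 0xFA1F,
    0xFA21, 0xFA23, 0xFA24, 0xFA27, 0xFA28, 0xFA29,
]


def lacks_decomposition(cp: int) -> bool:
    """True if the code point has no Decomposition_Mapping value."""
    return unicodedata.decomposition(chr(cp)) == ""


def unified_in_compat_block() -> list[int]:
    """Scan the CJK Compatibility Ideographs block (U+F900..U+FAFF) for
    assigned letters (category 'Lo') that have no decomposition mapping."""
    return [
        cp
        for cp in range(0xF900, 0xFB00)
        if unicodedata.category(chr(cp)) == "Lo" and lacks_decomposition(cp)
    ]


for cp in TWELVE:
    ch = chr(cp)
    # NFC leaves these characters unchanged, unlike true compatibility ideographs.
    print(f"U+{cp:04X}", ch, lacks_decomposition(cp),
          unicodedata.normalize("NFC", ch) == ch)
```

By contrast, a true compatibility ideograph such as U+F900 carries a decomposition to its unified counterpart, so normalization replaces it. That is why the twelve above are safe to use in page names while the rest of the block is not.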