NanshuBot is operated by Nanshu.

CJK Unified Ideographs completed!

Most of the data comes from the Unihan database.
Quoted from Unihan.txt:

Recipient is granted the right to make copies in any form for internal distribution and to freely use the information supplied in the creation of products supporting Unicode. Unicode, Inc. specifically excludes the right to re-distribute this file directly to third parties or other organizations whether for profit or not.

The 4-corner table was created by Christian Wittern and Urs App (public domain): http://www.ibiblio.org/pub/packages/ccic/software/data/4corner.readme

The Cangjie data was taken from www.chinesecj.com with permission.

If you find problems, especially legal ones, tell Nanshu.

Known Problems

  • The Bot does not mark long vowels in Japanese Kun readings because doing so requires morphological analysis. Compare 講師 kou-shi (ou is a long vowel) and 子牛 ko-ushi (o and u belong to separate morphemes).
  • The Bot detects the type of a character ("simplified", "traditional" or "both") from its variants. If a character has one or more simplified variants, it is "traditional". If one of those simplified variants (like 台 for 臺) is itself also a traditional character, that variant is "both". The same applies to traditional variants. The detection is sometimes wrong.
  • The Bot does not designate the type of a character ("simplified", "traditional" or "both") if it has neither simplified nor traditional variants.
  • The Bot sorts Chinese readings in alphabetical order. They should be rearranged by frequency.
  • Some Pinyin spellings are broken. See Wiktionary talk:Chinese Pinyin index.
  • Some Japanese On and Kun readings are spelt in the old orthography.
    • au -> ō
    • iu -> yū
    • eu -> yō
    • fu -> u
      • afu -> ō
      • ifu -> yū
      • efu -> yō
    • kwan -> kan
    • The compiler was presumably not Japanese: ゐ (wi) is mixed up with る (ru).
  • Inconsistency in the format.
    • Radical (-542F)
    • Korean (-5EFF)
  • 'Stroke number' should be 'stroke count'. (See Talk:维.)
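
The variant-based type detection described in the list above can be sketched roughly as follows. This is only one reading of that description, not the Bot's actual code: the function name `classify` and the two lookup tables are hypothetical, and the sample entries are hand-made for illustration rather than copied from Unihan.

```python
from typing import Dict, Optional, Set

# Hypothetical, hand-made variant tables (NOT taken from Unihan).
# Key: a character. Value: the set of its variants of that kind.
SIMPLIFIED_VARIANTS: Dict[str, Set[str]] = {
    "臺": {"台"},
    "台": {"台"},  # assumed self-reference: 台 is also used as a traditional form
    "國": {"国"},
}
TRADITIONAL_VARIANTS: Dict[str, Set[str]] = {
    "台": {"臺"},
    "国": {"國"},
}


def classify(char: str,
             simplified: Dict[str, Set[str]],
             traditional: Dict[str, Set[str]]) -> Optional[str]:
    """Return "traditional", "simplified", "both", or None (undesignated)."""
    has_simplified = bool(simplified.get(char))    # has simplified variants
    has_traditional = bool(traditional.get(char))  # has traditional variants
    if has_simplified and has_traditional:
        return "both"
    if has_simplified:
        return "traditional"
    if has_traditional:
        return "simplified"
    # Neither kind of variant is known, so no type is designated.
    return None


for ch in "臺台国人":
    print(ch, classify(ch, SIMPLIFIED_VARIANTS, TRADITIONAL_VARIANTS))
```

Because the result depends entirely on the variant tables, any missing or spurious variant entry leads directly to a wrong type, which fits the note above that the detection is sometimes wrong.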