Hello, welcome to Wiktionary, and thank you for your contributions so far.

If you are unfamiliar with wiki editing, take a look at Help:How to edit a page. It is a concise list of technical guidelines to the wiki format we use here: how to, for example, make text boldfaced or create hyperlinks. Feel free to practice in the sandbox. If you would like a slower introduction we have a short tutorial.

These links may help you familiarize yourself with Wiktionary:

  • Entry layout (EL) is a detailed policy documenting how Wiktionary pages should be formatted. All entries should conform to this standard. The easiest way to start off is to copy the contents of an existing page for a similar word, and then adapt it to fit the entry you are creating.
  • Our Criteria for inclusion (CFI) define exactly which words can be added to Wiktionary, though it may be a bit technical and longwinded. The most important part is that Wiktionary only accepts words that have been in somewhat widespread use over the course of at least a year, and citations that demonstrate usage can be asked for when there is doubt.
  • If you already have some experience with editing our sister project Wikipedia, then you may find our guide for Wikipedia users useful.
  • The FAQ aims to answer most of your remaining questions, and there are several help pages that you can browse for more information.
  • A glossary of our technical jargon, and some hints for dealing with the more common communication issues.
  • If you have anything to ask about or suggest, we have several discussion rooms. Feel free to ask any other editors in person if you have any problems or questions, by posting a message on their talk page.

You are encouraged to add a BabelBox to your userpage. This shows which languages you know, so other editors know which languages you'll be working on, and what they can ask you for help with.

I hope you enjoy editing here and being a Wiktionarian! If you have any questions, bring them to the Wiktionary:Information desk, or ask me on my talk page. If you do so, please sign your posts with four tildes: ~~~~ which automatically produces your username and the current date and time.

Again, welcome! — justin(r)leung (t...) | c=› } 12:49, 14 February 2017 (UTC)

|m_note=Only used as a component in a character

Hmm, I'm not sure this is the way to note this. |m_note= is for notes on the Mandarin pronunciation; Only used as a component in a character sounds more like a ====Usage note==== IMO. (@Wyang, Justinrleung, any opinions?) —suzukaze (tc) 05:33, 9 April 2017 (UTC)

I agree that the note should not be in the pronunciation section, but isn't "component variant" sufficient to say that it's only used as a component? Also, Cantonese muk6 seems out of place. It would be component variant of 目 instead. — justin(r)leung (t...) | c=› } 05:37, 9 April 2017 (UTC)

My Reply

This note is included for CJK Unified Ideographs characters that have a kMandarin or kCantonese pronunciation defined in Unicode but not found in any commercially available Mandarin dictionaries or national standards such as BIG5, CNS11643, GB18030. —KevinUp (talk) 06:03, 9 April 2017 (UTC)

But that doesn't mean we put the note in the pronunciation section, since it doesn't have much to do with the pronunciation. — justin(r)leung (t...) | c=› } 06:46, 9 April 2017 (UTC)
The note could be modified to "Pronunciation derived from " or "Pronunciation derived from " for characters such as and KevinUp (talk) 07:11, 9 April 2017 (UTC)

I propose that the pronunciation entry be scrapped altogether for such characters that are not pronounceable on their own and have common names such as 草字頭 for , 絞絲旁 for , 豎心旁 for and 卷字頭 for KevinUp (talk) 07:11, 9 April 2017 (UTC)

A specially created box such as zh-see to indicate that the character is only used as a component may be appropriate for Unicode characters that are not dictionary characters. —KevinUp (talk) 07:30, 9 April 2017 (UTC)
I think we could keep the pronunciation section but have it contain the name of the character, i.e. 艹#Pronunciation would have Mandarin: cao3 zi4 tou2, and the rest of the entry would treat as a Chinese symbol instead of word. —suzukaze (tc) 09:02, 9 April 2017 (UTC)
Good idea. The pronunciation section could include the names of the character. —KevinUp (talk) 10:44, 9 April 2017 (UTC)

Here is a list of symbols that I have come across: 𩙿. These are the ones that are (1) not found in dictionaries and (2) mostly used as character components in ideographic description sequences (⿱⿰⿵) —KevinUp (talk) 10:44, 9 April 2017 (UTC)

Isn't a kwukyeol note, only found in Korean? Johnny Shiz (talk) 14:07, 10 February 2019 (UTC)
Are you two going to add 亻 to the dānrénpáng pinyin page? If not, can we truly consider 'dānrénpáng' to be the 'pronunciation' of 亻? --Geographyinitiative (talk) 23:02, 11 February 2019 (UTC)
No, because these are symbols, not actual lemmas or words that are used in the Chinese language. We could assign names to these symbols, but not pronunciations. KevinUp (talk) 12:24, 12 February 2019 (UTC)
@Johnny Shiz, KevinUp I agree with your stance and have made some preliminary edits to ; let me know what you think. --Geographyinitiative (talk) 12:59, 12 February 2019 (UTC)
Much better. Yes, we shouldn't add dānrénpáng as the pronunciation of , because that is the Pinyin for 單人旁单人旁 (dānrénpáng), not (rén). KevinUp (talk) 13:13, 12 February 2019 (UTC)
By the way, keep in mind that the pronunciations aren't just in Chinese, they're also in Japanese, Korean, and Vietnamese. Johnny Shiz (talk) 20:02, 12 February 2019 (UTC)


Hey- I'm fascinated by the strange characters you are adding. Do you have a font that can display this character normally in your browser while on wiktionary? Right now it's just a box to me unless I search for it in a dictionary. Keep it up! Very cool. Thanks for any help! --Geographyinitiative (talk) 13:53, 22 April 2018 (UTC)

Hi, I'm using the Hanazono font. You can get it from here: http://fonts.jp/hanazono/. It displays CJK characters up to extension F. KevinUp (talk) 05:23, 23 April 2018 (UTC)
@Geographyinitiative: What browser do you use? For me, Extension F characters only display on Microsoft devices. Johnny Shiz (talk) 22:02, 11 February 2019 (UTC)
@Johnny Shiz Opera. Since the time I made that post in April 2018, I have become very familiar with various work-arounds to overcome the problem of not seeing the characters, so I don't even think about trying to display the characters anymore. --Geographyinitiative (talk) 22:27, 11 February 2019 (UTC) (modified)
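For anyone trying to work out why a character shows up as a box, it can help to check which Unicode block it sits in, since font coverage usually follows block boundaries. The sketch below is only an illustration: the block ranges come from the Unicode Standard, while the function name `cjk_block` is made up for this example and is not part of any Wiktionary tooling.

```python
# Illustrative helper: report which CJK ideograph block a character
# belongs to, based on its code point. Ranges follow the Unicode Standard.
CJK_BLOCKS = [
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xF900, 0xFAFF, "CJK Compatibility Ideographs"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
    (0x2A700, 0x2B73F, "CJK Unified Ideographs Extension C"),
    (0x2B740, 0x2B81F, "CJK Unified Ideographs Extension D"),
    (0x2B820, 0x2CEAF, "CJK Unified Ideographs Extension E"),
    (0x2CEB0, 0x2EBEF, "CJK Unified Ideographs Extension F"),
    (0x2F800, 0x2FA1F, "CJK Compatibility Ideographs Supplement"),
]


def cjk_block(char):
    """Return the block name for a single character, or None if not listed."""
    code_point = ord(char)
    for start, end, name in CJK_BLOCKS:
        if start <= code_point <= end:
            return name
    return None
```

A character reported as Extension B or later is the kind that typically needs an extra font such as the Hanazono font mentioned above.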

Unihan regional codes in IDS

Hi, I don't think we need to include regional codes in the IDS if there is only one IDS. We only need them for distinguishing different IDSes. — justin(r)leung (t...) | c=› } 03:16, 23 April 2018 (UTC)

  • Hi, I'm adding it to the translingual section based on the Unicode chart. It shows where the glyph comes from. I added it so our readers can know the source of the glyph without going through the official Unicode charts, which may be hard to download for those with a slow Internet connection and hard to navigate for those who are new to the charts. Since this is the translingual section, the information may also be useful for future editors to add sub-entries for languages that are not yet created or to remove entries for languages that may not be relevant. Also, this information is useful for characters that have various compatibility ideographs because many font manufacturers do not follow the Unicode standard. Here is an example of a proposed entry that might be useful to determine which glyph form belongs to which region. I hope that the Wiktionary programmers can add functionality in future so that when your mouse hovers over an area code it will explain what it means. For now I will not add any IDS if the glyph exists across all six regions (GHTJKV). However, if the glyph is only for GHTK (such as ) or GTJV (such as ) this information may be useful for those who are studying the characters.
    (radical 181 +4, 13 strokes, cangjie input 一山一月金 (MUMBC) or X一山一月 (XMUMB), composition(G,T for U+980B or T for U+2F9FE) or ⿰⿸(K for U+FACB) or ⿰⿸⿺⿱(T for U+2F9FF))
  • Hi, this may be worth checking out: is provided by GHTK but a Vietnamese entry exists and the Korean entry is missing. —⁠This unsigned comment was added by KevinUp (talkcontribs).
I appreciate your concern about the details of Han characters. However, this information does not belong in the IDS unless there is more than one IDS to distinguish, like for or . We already link to the Unihan database, which should have the same info about the sources as the Unicode charts; these should be accessible even with slow Internet. While the regional sources are a good indicator for whether a character is used in the specified regions, they may not reflect actual usage in the regions. Many characters have a G source just because GB wants to include whole blocks for compatibility. On another note, many of the H glyphs don't actually reflect current standards in Hong Kong; the representative glyph is essentially the same as the T source glyphs. I'm fine with the info for , but for and , I'd prefer this to be in a usage note that describes actual usage. It's more complicated than what the regional codes say:
  • The G source has both characters because of what I've described above, but the actual standard would be 说 (説 as its traditional form).
  • In Taiwan, 說 is much more common than 説 in printed material; both are used in handwriting.
  • The current HK standard (HKSCS-2016) now includes , which is its actual educational standard, but 說 is still very common in publications and on computers.
  • While 説 is the current standard in Japan according to Jōyō Kanji, 說 was historically used in Japan as well.
(BTW, don't forget to sign after your comments with four tildes.) — justin(r)leung (t...) | c=› } 19:50, 23 April 2018 (UTC)
Hi. Thanks for your reply. After going through the Unihan database I agree that the same information can be obtained from the external links provided. Also, after reading your detailed explanation of and , I have to agree that the glyph sources may not necessarily reflect current actual usage. I started adding this information because I thought it would become useful for more obscure characters such as those in the E and F extension blocks. I'll stop doing this in future edits, and thanks for the reminder. KevinUp (talk) 17:10, 26 April 2018 (UTC)

Thanks for adding the derived characters

Hey- thanks for adding the derived characters. When I study Chinese vocabulary, I sometimes like to compare the characters with characters that have similar components. I especially enjoy the rare characters you have added. thanks! --Geographyinitiative (talk) 04:51, 18 May 2018 (UTC)

You're very much welcome. The derived characters added to the translingual section may also include characters that were invented outside of China, such as Korean made Hanja 한국제 한자 (han-gukje hanja) 韓國國字, Japanese kokuji 日本國字 and also Vietnamese Nom characters. KevinUp (talk) 09:45, 21 May 2018 (UTC)

Sycophantic praise; etc.

Whenever I look at a page, I can tell if you've been there or not- for instance, the page- with one look, I knew that you must have made edits to that page. The tell-tale signs were all there: Derived characters, Related characters, Glyph origin, Usage notes- and a great work up of the definition! You are the editor I wish I was. I liked your interpretation of the character as a pictograph/phono-semantic compound; 汉字源流词典 says it is an ideogrammic compound/phono-semantic compound [1]. Great stuff, just great! --Geographyinitiative (talk) 14:06, 16 December 2018 (UTC)

Archiving of discussions

Hi. The top of request pages like "Wiktionary:Requests for verification/English" has the following instruction: "At least a week after a request has been closed, if no one has objected to its disposition, the request may be archived to the entry's talk-page". Thus, do please leave closed discussions on the page for at least seven days before archiving them. Thanks. — SGconlaw (talk) 09:13, 4 January 2019 (UTC)

Sorry for the mistake, I'll take better note of this next time. KevinUp (talk) 09:15, 4 January 2019 (UTC)
No worries! This was also pointed out to me by another editor some time back. — SGconlaw (talk) 10:18, 4 January 2019 (UTC)


@KevinUp Hello! I couldn't find a rare character and I would like to ask if you can find it. I didn't see it here: [2], here: [3], here: [4], or here: [5]. Baidu, jiu haishi buzhidao. The unfindable character in question seems to have the same Cantonese pronunciation as the in 簷蛇 (Jyutping: jim4), but the character is written with a 虫 on the left and with a 嚴 on the right: "⿰虫嚴". The character can be seen in 香港粵語詞典 on page 208, where it is used in the word "⿰虫嚴蛇". Any help would be appreciated! Please let me know if you look for it and can't find it. --Geographyinitiative (talk) 09:49, 8 January 2019 (UTC)

If you can't find it or are not interested, then don't worry about it- I added ⿰虫嚴 to the jim4 page. --Geographyinitiative (talk) 11:20, 8 January 2019 (UTC)
@Geographyinitiative: Wow. You seem to have stumbled upon an extremely rare character. For future reference, there's also [6] which lists soon-to-be encoded characters (Extension G, H, etc.), and [7] which lists derived characters under a certain component, and [8], [9], [10] for extremely rare characters used for personal names. I've also looked up 《漢語方言大詞典》 (中華書局, 1999) to check if it's a dialectal character. However, all these turned up negative. After googling "⿰虫嚴" with quotation marks, I found the character quoted here: [11]
I'm not sure if the meaning above is same as that of "⿰虫嚴蛇". An image of ⿰虫嚴 can be found here: [12] Anyway, today I found this: [13], another site to search for rare characters. Here's another site: [14] (缺字系統), but no definitions are provided. KevinUp (talk) 17:13, 8 January 2019 (UTC)
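Since unencoded characters like the one in this thread can only be written as an ideographic description sequence, a small check that a sequence is complete may save some confusion when pasting it into a search box. The sketch below is an assumption-laden illustration: it only knows the twelve original description characters (U+2FF0 to U+2FFB), and the function name `is_well_formed_ids` is invented for this example.

```python
# Arity of each ideographic description character (U+2FF0..U+2FFB).
# U+2FF2 and U+2FF3 take three components; the others take two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2FF2"] = 3
IDC_ARITY["\u2FF3"] = 3


def is_well_formed_ids(ids):
    """Return True if `ids` is exactly one complete description sequence."""
    if not ids:
        return False
    needed = 1  # components still required to complete the sequence
    for ch in ids:
        if needed == 0:
            return False  # extra characters after a complete sequence
        # An operator consumes one slot and opens `arity` new ones;
        # any other character simply fills one slot.
        needed += IDC_ARITY.get(ch, 0) - 1
    return needed == 0
```

With this check, "⿰虫嚴" counts as one complete sequence, while "⿰虫嚴蛇" does not, because the final 蛇 is a second character of the word rather than part of the description.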

Incorrect use of {{inh}}

KevinUp, I noticed you incorrectly used {{inh}} here. {{inh}} is only to be used in unbroken chains of inheritance. The proper template to use there would have been {{der}}. Please see the template page for further instructions and please, if you can, go back and correct any past edits. Thanks. --{{victar|talk}} 08:04, 23 January 2019 (UTC)

@Victar: Sorry, overlooked that one. I modified {{etyl|la|fr}} to {{inh|fr|la|}} based on the previous statement {{inh|fr|ML.|pūblicitātem}}. I found that herbage#French (plus 16 other French entries) and lealdade#Portuguese (plus 14 other Portuguese entries) also have this mistake. Shall I correct these entries as well? Those were done by other editors. KevinUp (talk) 09:08, 23 January 2019 (UTC)
No problem. Yeah, if you see any examples of that in the areas you work, please amend. Also note that {{bor}} should only be used at the start of an etymology, so if you see entries to the contrary, please fix them as well. Thanks! --{{victar|talk}} 16:38, 23 January 2019 (UTC)
@Victar: It seems that up to 430 entries have {{der|en|enm}} instead of {{inh|en|enm}} (Search here). Should these be automatically converted to {{inh|en|enm}}? There's also Category:English terms borrowed from Middle English and Category:English terms borrowed from Old English, which use {{bor|en|enm}} or {{bor|en|ang}}, but most of them appear to be reintroduced terms. KevinUp (talk) 17:12, 23 January 2019 (UTC)
Looking over those search results, definitely many should be converted to {{inh}}, but I wouldn't say automatically. All those particular borrowings appear fine. --{{victar|talk}} 18:21, 23 January 2019 (UTC)
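In line with the point that these should not be converted automatically, one way to go through them is to list each {{der|en|enm|...}} call next to the {{inh}} form it would become and then decide entry by entry. The following is a minimal sketch under that assumption; the function name `list_inh_candidates` and the sample wikitext are made up for illustration, and nothing here edits a page.

```python
import re

# Matches {{der|en|enm|...}} and captures the remaining arguments.
DER_EN_ENM = re.compile(r"\{\{der\|en\|enm\|([^{}]*)\}\}")


def list_inh_candidates(wikitext):
    """Return (original, suggested) pairs for manual review; nothing is changed."""
    pairs = []
    for match in DER_EN_ENM.finditer(wikitext):
        original = match.group(0)
        suggested = "{{inh|en|enm|" + match.group(1) + "}}"
        pairs.append((original, suggested))
    return pairs


# Hypothetical etymology line used only to show the output shape.
sample = "From {{der|en|enm|hous}}, from {{inh|en|ang|hūs}}."
candidates = list_inh_candidates(sample)
```

Each pair still needs a human check that the chain of inheritance really is unbroken before the suggested form is used.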

Using "鎮 / 镇" in zh-div

Hey- as I've been going through the towns of China, I've always thought it was strange that I would only add the traditional form of the name of the administrative division- '鎮'- and not add the simplified form- '镇'. Would there be a way to add both? Would this change even be desirable? Just a thought. (see Category:zh:Towns in China) --Geographyinitiative (talk) 14:33, 17 February 2019 (UTC)

@Justinrleung, Suzukaze-c Would like your input too. I don't know who else would be interested in this topic. --Geographyinitiative (talk) 14:36, 17 February 2019 (UTC)
For example, '镇' does not appear on the 古驛 page- should it? --Geographyinitiative (talk) 14:38, 17 February 2019 (UTC)
Regarding the simplified form, it seems that Suzukaze-c had a similar request in Nov 2016 (See Template talk:zh-div).
As for adding (zhèn) to 古驛古驿 (Gǔyì) or creating a separate entry 古驛鎮古驿镇, this depends on how the town/village/river/geographical entity is cited in local news, publications or historical records. I think we should create entries based on their attestability, particularly how the locals refer to the place, and not based on listings found in statistical tables/government census/etc. My opinion is that if 古驛古驿 (Gǔyì) and 古驛鎮古驿镇 refer to the same place, then 古驛鎮古驿镇 can be redirected to 古驛古驿 (Gǔyì), similar to how 上海市 redirects to 上海 (Shànghǎi). However, if 古驛古驿 (Gǔyì) is not used on its own, then this entry ought to be moved or redirected to 古驛鎮古驿镇 instead. KevinUp (talk) 16:19, 17 February 2019 (UTC)
Thanks for your reply. Regarding the first issue, I left a message on the Template talk:zh-div page. Regarding the second issue, I have maps which list Mainland China locations without using 市 , 县 , or 镇 , and Justin found some great examples that seemingly proved the attestability of the 頭筆 residential community recently- 社区 wasn't 'obligate'. --Geographyinitiative (talk) 18:40, 17 February 2019 (UTC)
I agree with the redirects you mentioned (上海市 redirects to 上海 (Shànghǎi)). I usually don't actively make pages with the 市 , 县 , or 镇 tacked on unless there's an ethnic minority area involved, in which case they don't redirect. --Geographyinitiative (talk) 19:06, 17 February 2019 (UTC) modified


Do you know what the origin of the element se- in this word is? ←₰-→ Lingo Bingo Dingo (talk) 09:08, 18 February 2019 (UTC)

@Lingo Bingo Dingo: I think the element se- may be derived from a regional dialect, perhaps a form of Bazaar Malay. A google search of "sesate Indonesia" reveals that the word exists in the Balinese language and is a symbolic weapon used in some form of ritual represented by a type of food. [15] [16] [17] [18] [19] KevinUp (talk) 21:02, 18 February 2019 (UTC)
Interestingly, although Afrikaans sosatie is currently listed as a descendant of Dutch sesaté, the Wikipedia article for sosatie suggests that sosatie is of Cape Malay origin, from saus (spicy sauce) + sate (skewered meat).
However, the origin of the Malay word sate is disputed. Most native Malay words have corresponding rhymes, but sate does not rhyme with any other word. The Indonesian Wikipedia article for sate suggests that sate is from a Tamil word, and is a type of street food invented in Java island during the early 19th century.
I'm not sure whether the Dutch version of saté or sesaté contains any pork in historical recipes. If it does, then there's a strong Balinese connection, because the majority of Javanese people are Muslim and do not consume pork, unlike the Balinese people. KevinUp (talk) 21:02, 18 February 2019 (UTC)
Linking sosatie as a descendant was my doing, based on the etymology of English sosatie, which alleged a Dutch intermediate step. But that might be wrong, though I wouldn't rely on Wikipedia at all for this either.
I will look into the question about pork. ←₰-→ Lingo Bingo Dingo (talk) 15:58, 19 February 2019 (UTC)
@Lingo Bingo Dingo: I found this while searching for "sesaté sosatie": http://www.etymologiebank.nl/trefwoord/sate The second reference (Dialectwoordenboeken en woordenboeken van variëteiten van het Nederlands) suggests that the term "sateh" is from Javanese sateh, originally Tamil sataj, but I'm not sure about the original spelling in Dutch/Javanese/Tamil.
As for English or Afrikaans sosatie, I think it is likely to be derived from Dutch saus + saté rather than sesaté. KevinUp (talk) 17:05, 19 February 2019 (UTC)
Whether it is strongly linked with pork is a little hard to tell, but seems like it isn't in the earlier results.
Van Wyk (Afrikaans etymological dictionary) gives an Indonesian Dutch or Malay origin from sesate(h), sateh. The oldest word list (ca. 1880) gives sassati, with sosatie appearing in the early 20th century. I don't think the compound is likely at all, because then one would expect stress on the first syllable (the Afrikaans is stressed on the second syllable). The form sausati appears once in a word list from 1899, which seems secondary. Forms like sasaté also appear a few times in Dutch. ←₰-→ Lingo Bingo Dingo (talk) 08:54, 20 February 2019 (UTC)
The link from Algemeen Nederduitsch-Maleisch Woordenboek [20] seems to suggest that sasaté has a Javanese origin. Anyway, as mentioned above, sate isn't a native Malay word, due to lack of corresponding rhymes. I found the Javanese spelling ꦱꦠꦺ (saté) on Javanese Wiktionary. Someone else will have to check whether ꦱꦱꦠꦺ (sasaté) exists or not. KevinUp (talk) 09:33, 20 February 2019 (UTC)
Yes, ascribing it to Malay and stopping there is not useful. I have added another unspecific step to the etymology and changed Malay to Indonesian. Does the Tamil origin seem plausible to you? ←₰-→ Lingo Bingo Dingo (talk) 10:05, 20 February 2019 (UTC)
I think the previous edit where sosatie is stated as "from Dutch sesaté or directly from Malay sate" is good enough. We might be dealing with two different etymologies: (1) Dutch sesaté from a Javanese or Balinese word and (2) Dutch saté from a type of street food, presumably based on Betawi (Jakarta Malay dialect) saté, from Tamil சதை (catai, flesh).
However, the origin of the word is disputed, so I think it would be better to exclude the Javanese/Balinese or Tamil origin and revert to the previous edit. KevinUp (talk) 11:12, 20 February 2019 (UTC)
On an unrelated note, I think we have to be careful about converting Malay to Indonesian since technically Indonesian did not exist before its independence in 1945. Prior to independence, there's Malay, spoken in the vicinity of the Riau-Lingga Sultanate, and also the Betawi language, a dialect/creole of Malay spoken in Jakarta. Modern colloquial Indonesian, although based on Riau-Lingga Malay, is significantly influenced by Betawi, because of the position of Jakarta as its capital.
I think we can use "Malay" for the etymology of Dutch saté, because "Betawi" is a direct descendant of Malay. A historical dictionary for Betawi (Batavia/Jakarta Malay dialect) to Dutch might be useful for us to identify such words. I'm reminded of Dutch toko, which might be from Betawi, rather than Malay or Indonesian. KevinUp (talk) 11:12, 20 February 2019 (UTC)
Is 1945 used as a cutoff point for Indonesian? Isn't there continuity with the variety of Malay used by the colonial administration though? ←₰-→ Lingo Bingo Dingo (talk) 14:59, 22 February 2019 (UTC)
  • The cutoff point for Indonesian can be taken as 1928, when the Youth Pledge (Sumpah Pemuda) was made by young nationalists who proclaimed "bahasa Indonesia" as the language of unity. The chosen language was based on the standardized form defined by the late Ali Haji bin Raja Haji Ahmad (1808-1873), who wrote Kitab Pengetahuan Bahasa, the first monolingual Malay dictionary in the region based on the Malay dialect of Johor-Pahang-Riau-Lingga.
  • As for the language used by the Dutch colonial administration, Malay was designated as the second official language in 1865, but was later removed as an official language in 1932 due to the prominent rise of nationalism. [21] This language is probably the same language standardized by Ali Haji bin Raja Haji Ahmad.
  • I think it is important to identify when a "Malay" word was first attested in the Dutch language. From 1641 to 1825, the Dutch occupied Malacca (present day Malaysia), so words from that time period are "Malay". However, words borrowed during the time of the Dutch East Indies (1800-1948) could also be Javanese, Balinese, Sundanese or some other creole of Malay, such as Betawi (these four languages are spoken on Jawa island).
  • To be safe, Malay words based on Riau-Lingga can be searched here: Kitab Pengetahuan Bahasa (romanized in Indonesian) or Puisi-puisi Raja Ali Haji (romanized in modern Malay). KevinUp (talk) 03:57, 23 February 2019 (UTC)
In the Malay language, when the prefix se- is used before a noun, it usually means "one" or "the whole/the entire". However, I've never heard of the term "sesate" used for the sense "one satay" or "the entire satay". The grammatical form to refer to "one satay" is "secucuk sate" (a skewer of satay) while "entire satay" is "seluruh sate".
As stated in the entry for se-, the "one" sense is from a shortened form of esa while the "whole/entire" sense is a clipping of seluruh. I don't think the element se- in sesaté is derived from Malay or Indonesian though. KevinUp (talk) 21:02, 18 February 2019 (UTC)
Additional comment: It seems that the se- element may be a reduplicated form of sate in the Balinese language. [22] (from [23]) KevinUp (talk) 02:01, 25 February 2019 (UTC)
For this word, I would rather base the etymology on partial reduplication of saté, giving a plural form of saté that would otherwise be written saté-saté. An example of this is the pair rerata and rata-rata in Indonesian/Malay. Xbypass (talk) 16:25, 11 November 2019 (UTC)
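To make the reduplication pattern concrete, here is a rough sketch of how the partial and full forms relate for consonant-initial bases such as rata. It is only an illustration of the pattern described in this thread (first consonant plus e, then the base); the function names are invented, and it says nothing about which etymology of sesaté is actually correct.

```python
def partial_reduplicate(base):
    """First consonant + 'e' + base, e.g. rata -> rerata (sketch only)."""
    vowels = "aeiou"
    first = base[0].lower()
    if first in vowels:
        raise ValueError("this sketch only handles consonant-initial bases")
    return first + "e" + base


def full_reduplicate(base):
    """Full reduplication written with a hyphen, e.g. rata -> rata-rata."""
    return base + "-" + base
```

Applied to sate, the partial pattern yields sesate, which matches the shape of the form being discussed.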


How can I resolve it if I can't see some Han characters? They appear as squares. --Dingyday (talk) 14:30, 18 February 2019 (UTC)

@Dingyday If everything else fails, copy-paste it into the dictionary at ctext.org or zdic.net --Geographyinitiative (talk) 20:37, 18 February 2019 (UTC)
@Dingyday: To view Han characters that cannot be displayed, you have to install a font such as the Hanazono font, which has the best coverage [24]. If you're using a mobile browser, you can copy the "square" character to http://ko.glyphwiki.org/wiki/GlyphWiki:대문 KevinUp (talk) 21:16, 18 February 2019 (UTC)
By the way, almost all hanja are viewable in modern browsers. Only 56 hanja for personal names (인명용한자표) are encoded in the extension set of CJK Unified Ideographs, so these characters will need font support to be viewable. KevinUp (talk) 21:16, 18 February 2019 (UTC)
Once you have the proper font installed, the red boxes will display correctly. KevinUp (talk) 21:16, 18 February 2019 (UTC)

The font I have displays all hanja except two characters: (𬟓, hun) and (𬄕, jip). If I use the Hanazono font, it displays all hanja, but the Hangul is printed with its letters separated, so it is uncomfortable to read. --Dingyday (talk) 14:54, 19 February 2019 (UTC)

@Dingyday: Interesting. May I know the name of the font you are using that can display all Hanjas except two letters? The Hanazono font is more suitable for Japanese systems, which is why the Hangul appears to be separated. I think you can uninstall the Hanazono font, because most of the "square" characters on your system that cannot display are not hanja. They are mostly obsolete characters found only in historical Chinese dictionaries and Vietnamese chữ Nôm. KevinUp (talk) 15:34, 19 February 2019 (UTC)
The font I use is Kaigen Gothic. I love this font because it prints Hangul and Hanja smoothly. --Dingyday (talk) 14:45, 20 February 2019 (UTC)
Thanks! This font really does print both Hangul and Hanja smoothly. KevinUp (talk) 13:32, 21 February 2019 (UTC)


Hi. What do you think about the claim that 李's phonetic is derived from 來? Because there was no source behind this argument, I have put an rfv on this claim. Do you think you can back it up? B2V22BHARAT (talk) 10:56, 9 May 2019 (UTC)

It's solved. @Geographyinitiative taught me where the source came from. B2V22BHARAT (talk) 12:12, 9 May 2019 (UTC)

心, 必

Can you see my edit on 必? Like this? B2V22BHARAT (talk) 13:18, 13 May 2019 (UTC)

@B2V22BHARAT: Okay, I've edited here: Special:Diff/52840118/52840209
  1. {{ko-hanja}} is now the same as {{ko-hanja/new}}, thanks to a recent update by User:Suzukaze-c.
  2. Use {{hanja form of|syllable|definition}} for hanja definitions.
  3. Use ====Compounds==== {{der-top3|Compounds}} for hanja compounds and ====Derived terms==== {{der3|ko}} for derived terms of Hangeul (see example for Hangeul here) - the formatting and templates used are different because hanja uses more server memory and needs a low-memory template.
  4. For hanja compounds, you can sort the compounds manually (based on word length, word order) because {{der-top3}} cannot sort the compounds automatically.
  5. Don't add 심리학 (心理學, simnihak) which is a derived term of 심리 (心理, simni) at the page for (, sim), add 심리학 (心理學, simnihak) as a derived term at the page for 심리 (心理, simni) instead.
  6. No need to add the English definition for each hanja compound in {{ko-l}} unless it's a red link.
Overall, an important concept is to reduce duplication of the same content at different places. KevinUp (talk) 14:01, 13 May 2019 (UTC)

Okay, I'll copy the format. Thanks. B2V22BHARAT (talk) 14:10, 13 May 2019 (UTC)

李, 齒

@KevinUp Why did you add a strange symbol next to the Middle Korean word? There is no such thing in the naver.com dictionary. For example, 치〯 and 니〯. B2V22BHARAT (talk) 07:10, 16 July 2019 (UTC)

These are not strange symbols but tone marks used in Middle Korean. See this article for more information. KevinUp (talk) 07:22, 16 July 2019 (UTC)
I know that it's 방점, but did you have any discussions with other users about adding them? Because in other dictionaries, 방점 is not shown and I think it's unnecessary to show it in wiktionary. B2V22BHARAT (talk) 07:24, 16 July 2019 (UTC)
May I ask why you are doing this all of a sudden? Sincerely, B2V22BHARAT (talk) 07:27, 16 July 2019 (UTC)
If you look at other electronic dictionary websites, such as naver and daum, there is no tone shown next to middle Korean words. So I think it's right to follow this convention. Sincerely, B2V22BHARAT (talk) 07:31, 16 July 2019 (UTC)
The reason why 방점 (bangjeom) is not shown in other dictionaries is because older computing systems (especially those using KS X 1001) were not able to key in these symbols prior to the availability of (U+302E) and (U+302F) in the Unicode block known as CJK Symbols and Punctuation. @Erutuon, Suzukaze-c, Wyang, any ideas on how to handle (U+302E) and (U+302F)? One possibility is to redirect 치〯 to . KevinUp (talk) 07:42, 16 July 2019 (UTC)
May I ask why you are doing this all of a sudden? I think that 방점 (bangjeom) should not interfere in linking 치〯 to . B2V22BHARAT (talk) 07:46, 16 July 2019 (UTC)
The reason I am doing this is because there are words in Middle Korean such as 심〯 () which is not the same as (). Latin has a similar situation with long and short vowels, e.g. {{m|la|satūs}} (satūs) which redirects to [[satus#Latin]] without the long vowel symbol rather than satūs.
Yes, 방점 (bangjeom) should not interfere in linking 치〯 to . One method is to use [[치|치〯]] or to ask some of our expert programmers to help modify the {{m|okm}} template to handle the redirection. KevinUp (talk) 08:15, 16 July 2019 (UTC)
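The piped-link approach can be sketched as a small helper that removes the two tone marks to get the page name and keeps the marked form as the display text. This is only an illustration of the idea, not how the {{m|okm}} template is or would be implemented; the function names `link_target` and `piped_link` are made up for this example.

```python
# U+302E (single dot) and U+302F (double dot) are the Hangul tone marks
# discussed above; mapping them to None deletes them in str.translate.
BANGJEOM = {0x302E: None, 0x302F: None}


def link_target(display):
    """Drop tone marks so the link points at the plain-Hangul page."""
    return display.translate(BANGJEOM)


def piped_link(display):
    """Build a wikilink that shows the tone-marked form but links without it."""
    target = link_target(display)
    if target == display:
        return "[[" + display + "]]"
    return "[[" + target + "|" + display + "]]"
```

For 치〯 this produces [[치|치〯]], which is the manual form suggested above.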
What about these Middle Korean? ,,,, and appear to be omitted in mainstream Korean electronic dictionaries. Are you planning to add them also? Sincerely, B2V22BHARAT (talk) 08:28, 16 July 2019 (UTC)
I deal mostly with entries or information that are related to hanja or Chinese characters, so I don't plan to add these Jamo as I am not familiar with them. 치〯 now links to so the issue is resolved. KevinUp (talk) 08:39, 16 July 2019 (UTC)
@KevinUp 齒 in hangul is 이 in both North and South Korea. Why did you put 니,치 for North Korea? Sincerely, B2V22BHARAT (talk) 09:26, 16 July 2019 (UTC)
I've fixed it. I became confused after reading the previous statement regarding 두음 법칙 (頭音法則, dueum beopchik), which affects native Korean compounds of 이 (i) and not the hanja 齒 (치, chi). KevinUp (talk) 09:35, 16 July 2019 (UTC)
By the way, I'm not familiar with 방점, so if you find any missing 방점 in my edits, please add them. I'm mainly referring to Naver and Daum, and these sites don't show 방점 for Middle Korean. Sincerely, B2V22BHARAT (talk) 10:09, 16 July 2019 (UTC)
No worries, 방점 (bangjeom) can be added at a later stage by referring to scanned images of the original documents. KevinUp (talk) 10:29, 16 July 2019 (UTC)
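
(Editorial illustration.) The redirection discussed in this thread, where 치〯 links to 치, amounts to removing U+302E and U+302F from the link target while keeping them in the displayed text. The following is a minimal Python sketch of that idea only; the function names are hypothetical, and it is not how {{m|okm}} is actually implemented.

```python
# Hangul tone marks (bangjeom) in the CJK Symbols and Punctuation block.
SINGLE_DOT = "\u302E"  # HANGUL SINGLE DOT TONE MARK
DOUBLE_DOT = "\u302F"  # HANGUL DOUBLE DOT TONE MARK


def strip_bangjeom(text: str) -> str:
    """Return the text with both tone marks removed (hypothetical helper)."""
    return text.replace(SINGLE_DOT, "").replace(DOUBLE_DOT, "")


def wikilink(display: str) -> str:
    """Build [[target|display]]; fall back to [[display]] if nothing was stripped."""
    target = strip_bangjeom(display)
    if target != display:
        return f"[[{target}|{display}]]"
    return f"[[{display}]]"


# The display form keeps the tone mark, the link target does not.
print(wikilink("치" + DOUBLE_DOT))
```

The piped link this produces has the same shape as the manual [[치|치〯]] workaround mentioned above.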

Community Insights Survey

RMaung (WMF) 14:34, 9 September 2019 (UTC)

Cangjie Input

Where did you find the data of Cangjie input about 𩛜? I cannot find that data anywhere. --Meoru00 (talk) 13:39, 13 September 2019 (UTC)

The data can be found here: [25]. 11-3C2C in CNS 11643 is mapped to U+296DC with cangjie value OIYKI. KevinUp (talk) 14:16, 13 September 2019 (UTC)
😮 Oh, That's what I want to find. Thanks! --Meoru00 (talk) 14:31, 13 September 2019 (UTC)
Hmm..🤔 How can I match the 'CNS 11643' code with the Unihan data? It's a little bit hard for me. --Meoru00 (talk) 14:39, 13 September 2019 (UTC)
It's in the /MapingTables/Unicode folder. If you're not sure what you're doing, leave it blank. There are errors in your recent edits. Yes, there are many errors in our current entries but don't create more errors for other editors to clean up. If you think something is not right, discuss the issue at the talk page before proceeding. KevinUp (talk) 14:55, 13 September 2019 (UTC)
Ok, I understood. Thanks. --Meoru00 (talk) 00:19, 14 September 2019 (UTC)
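
(Editorial illustration.) Matching a CNS 11643 code to Unicode and then to a Cangjie value, as described in this thread, is a join across two tables keyed by the CNS code. The sketch below uses only the one row quoted above (11-3C2C, U+296DC, OIYKI) as in-memory sample data. The dictionary layout and function name are assumptions, and the real mapping files may be formatted differently.

```python
from typing import Optional

# Sample rows taken from the discussion above. The layout of the real
# CNS 11643 mapping files is not reproduced here (assumption).
CNS_TO_UNICODE = {"11-3C2C": "296DC"}  # CNS code -> Unicode code point (hex)
CNS_TO_CANGJIE = {"11-3C2C": "OIYKI"}  # CNS code -> Cangjie letters


def cangjie_for(char: str) -> Optional[str]:
    """Look up the Cangjie value of a character via its CNS 11643 code."""
    codepoint = f"{ord(char):X}"
    for cns_code, mapped in CNS_TO_UNICODE.items():
        if mapped == codepoint:
            return CNS_TO_CANGJIE.get(cns_code)
    return None  # character not present in the sample tables


# chr(0x296DC) is the character asked about in this thread.
print(cangjie_for(chr(0x296DC)))
```

Returning None for unmapped characters mirrors the advice above to leave the field blank when the data cannot be confirmed.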

Reminder: Community Insights Survey

RMaung (WMF) 19:14, 20 September 2019 (UTC)

Why did you remove my post without my permission?

I want to delete everything that you have written on my talk page. Is it okay to do that? B2V22BHARAT (talk) 20:02, 21 September 2019 (UTC)

This is because I can do well without your help. B2V22BHARAT (talk) 20:04, 21 September 2019 (UTC)

Your error in Yale romanization proves to us that you're not in a position to teach someone. B2V22BHARAT (talk) 20:07, 21 September 2019 (UTC)

If you check the revision history of this page, I mentioned "Thank you but these have already been corrected". Also, my edits here, here and here had nothing to do with the incorrect Yale romanization. KevinUp (talk) 20:20, 21 September 2019 (UTC)

That's called a sample bias. I'm pretty sure that you have made other transliteration errors that Wyang forgot to clean up. Please check and stop putting biased information that's only beneficial to you. This proves that you're not impartial and weak. B2V22BHARAT (talk) 20:23, 21 September 2019 (UTC)

Thank you for pointing this out. Wiktionary is a collaborative project so I have cleaned up the previous transliteration errors done by other users at Category talk:Middle Korean lemmas. KevinUp (talk) 20:27, 21 September 2019 (UTC)

(It's generally better practice to archive talk page discussions rather than removing them outright, even if the issue is dealt with.) — justin(r)leung { (t...) | c=› } 20:45, 21 September 2019 (UTC)
Alright. Thanks for pointing this out. KevinUp (talk) 20:47, 21 September 2019 (UTC)

What's the difference between {{ko-IPA}} and {{ko-IPA|l=y}}?

And also, please don't delete my post without my permission. B2V22BHARAT (talk) 21:00, 21 September 2019 (UTC)

I'm glad you asked. There is a vowel length distinction in Seoul standard Korean that is disappearing among younger speakers. See the article here and the book chapter here. The Standard Korean Language Dictionary has these marked in the dictionary, so you may refer to it. KevinUp (talk) 21:32, 21 September 2019 (UTC)

Middle Korean First attested format

So have you reached a consensus on your proposal at the Beer Parlour regarding Middle Korean entries? Should I follow this style from now on, then? Sincerely, B2V22BHARAT (talk) 15:28, 22 September 2019 (UTC)

In the Beer Parlour discussion, Suzukaze-c and Metaknowledge supported creating actual entries for Middle Korean whereas DTLHS thinks that they should be added as quotations.
The {{inh|lang|ancestor-lang}} format is the standard format used in the etymology section of many languages, so there shouldn't be a problem with it.
Because we currently lack an expert who is familiar with Middle Korean, it will take a while before someone creates proper entries for Middle Korean. (See here for entries that were previously deleted in 2014). In the meantime, feel free to create citation pages for Middle Korean lemmas.
Existing information has been migrated here so that other users can view it. Yes, you may use either this or this format for the etymology of native Korean words. You can comment out the old format so that it remains searchable. KevinUp (talk) 16:10, 22 September 2019 (UTC)
OK. Thanks. B2V22BHARAT (talk) 16:24, 22 September 2019 (UTC)

Can I use this format instead for Middle Korean first attested format?

For example, here https://en.wiktionary.org/w/index.php?title=%EC%8B%A0&oldid=54280404, clicking 신 in etymology 3 does not lead to a separate page that deals with quotations, descendants, etc., unlike English pages. So the new format obviously hides information from users and looks even worse than before to me. Sincerely, B2V22BHARAT (talk) 19:47, 24 September 2019 (UTC)

I don't think 신#Etymology 3 has any descendants, unless that sense has been borrowed into other languages. You can still add Korean quotations for 신#Etymology 3 as long as the quotation is from the 17th century onwards (corresponding to the period where Modern Korean begins) and not written in Literary Chinese. Please don't confuse the Korean language with the Middle Korean language because these are treated as separate languages on Wiktionary, e.g. compare Category:Middle French lemmas and Category:French lemmas or alphabet#Middle French and alphabet#French, which have identical meanings but separate entries. If you want to add quotations from 訓民正音解例 / 훈민정음해례 (1446) you may add them to 신#Middle Korean but not to 신#Etymology 3. And as I've mentioned above, some Middle Korean entries were deleted in 2014 (see Appendix:Middle Korean deleted entries), so if you are interested in adding quotations from 훈민정음해례 you may add them to Citations:신 instead, using a format similar to that found in Citations:잡다 (Note that 重刊老乞大諺解 / 중간노걸대언해 is a 1795 republished version of 노걸대 / 老乞大 that was originally written using 15th century Middle Korean). KevinUp (talk) 20:18, 24 September 2019 (UTC)
Thank you for your explanation. I will look into it. Citations:자다 and Citations:븓잡다 https://ko.dict.naver.com/#/entry/koko/b53b26873d8345c78429c29bbd9b92e1 B2V22BHARAT (talk) 20:29, 24 September 2019 (UTC)

First attested format for Middle Korean after 17th century

If you see this article https://en.wikipedia.org/wiki/History_of_Korean#Middle_Korean, it is stated that Middle Korean corresponds to 10th to 16th centuries. Then, how do I make the first attested format for Middle Korean terms that were first attested after the 17th century? For example, 뾰족하다 (ppyojokhada). ᄲᅩ죡ᄒᆞ다 (Yale: spwocyokhota) was first attested in 박통사언해 in 1677.

Can you make one? B2V22BHARAT (talk) 10:58, 25 September 2019 (UTC)

Yes, I've updated the entry for 뾰족하다 (ppyojokhada). Note that 박통사언해 / 朴通事諺解 (1677) and 중간노걸대언해 / 重刊老乞大諺解 (1795) were both originally written by 최세진(崔世珍) (1468-1542) in Middle Korean but have been modified and republished many times. Some portions of these two books may be written in Middle Korean. Since ᄲᅩ죡ᄒᆞ다 (Yale: spwocyokhota) is not attested elsewhere, it is clearly a Modern Korean word used to explain and annotate the hanja in the text.
As for creating an entry for ᄲᅩ죡ᄒᆞ다 (Yale: spwocyokhota) using the "Korean" header I would advise against it because there is still no ISO 639-3 language code for Modern Korean (근대 한국어) yet. This means that Wiktionary will have to create its own language code to have proper templates and categories. For example, I'm using {{okm-inline}} for the transliteration of ᄲᅩ죡ᄒᆞ다 (Yale: spwocyokhota) but this is actually incorrect. This is a technical issue that has to be solved some time in future. If you're interested you can still create a citations page for Citations:ᄲᅩ죡ᄒᆞ다 rather than creating an actual entry for it. One more thing, I think it is redundant to use {{defdate}} for the etymology of 신#Etymology 3 because the Middle Korean word has the same spelling as its modern form. If there are different spellings across different time periods then {{defdate}} would be more useful.
What are your thoughts on Modern Korean (근대 한국어)? Would you consider it as a separate language that deserves its own section, or should it be merged under a unified "Korean" header? I would prefer it to be separate because of differences in orthography. KevinUp (talk) 14:12, 25 September 2019 (UTC)
I think that Modern Korean is different from Middle Korean and should be distinguished. I'm going to use the same format, then. Thanks for the information. B2V22BHARAT (talk) 14:41, 25 September 2019 (UTC)

Questions about etymologies

First of all, thanks for your update of kongsi.

  1. I was curious what the reasons were for deciding that Dutch kongsi was borrowed from Malay. (A borrowing from Indonesian is, by the way, impossible. The Dutch word first appears in the late 18th century.)
  2. Do you know if Indonesian bakmi is commonly pronounced without a [k]? I wonder whether Dutch bami was borrowed from a lect that used a different phone, perhaps [ʔ].
  3. Finally, this is a very random hunch, but do you know whether Indonesian and Brunei Malay langit-langit was in some way influenced by Dutch gehemelte (or maybe verhemelte)? Dutch ge-...-te often denotes collectives which could explain the reduplication and the meanings "palate" and "roof, ceiling, canopy", while frequently coinciding or represented by related words, are not often associated with "sky" or "heaven", except apparently in Russian, Armenian and Ukrainian.

←₰-→ Lingo Bingo Dingo (talk) 10:25, 1 October 2019 (UTC)

  1. Regarding Dutch kongsi, the original sense from Min Nan 公司 (kong-si) is "clan hall (benevolent organization of overseas Chinese of the same origin)". Because clan halls often restrict membership to exclusive groups of people, e.g. gender restrictions, people with the same surname, people from the same hometown, the sense of the word in Malay has evolved to mean "group of people with a common motive", most notably in the compound word kongsi gelap (secret society, literally dark association) and this is reflected in the Dutch sense "clique, coterie". Because this sense is absent from Min Nan or other Chinese varieties (the modern sense is "company, firm, corporation"), the borrowing of this sense into Dutch via Malay is a plausible one.
  2. bakmi in modern Indonesian is definitely pronounced with a "k". Note that the first syllable of the original word in Min Nan, 肉 (bah), is pronounced as /baʔ/ by Min Nan speakers. The glottal stop /ʔ/, which is commonly found at the end of a word in modern Malay and Indonesian, is usually written as "k". However, the glottal stop would be pronounced as /k/ instead if it occurs in the middle of the word. For example, compare Malay banyak (many) (/ba.ɲaʔ/) and kebanyakan (most) (/kə.ba.ɲa.kan/). Hence, Dutch bami may be a direct borrowing from Min Nan, rather than a borrowing from Malay or Indonesian, because /ʔ/ at the end of the first syllable would have been converted into /k/ in Malay or Indonesian.
  3. No. I don't think Malay langit-langit (also written as lelangit, from langit (sky)) was influenced by Dutch. There are other coincidences, such as Japanese 天井 (tenjo, ceiling), from 天 (ten, sky) + 井 (well). Compare also Korean 입천장 (天障, ipcheonjang, “palate”), from 입 (ip) + 천장 (天障, cheonjang, “ceiling”). The Korean word for palate is intriguing because it is derived from the word for ceiling, which is derived from the Sino-Korean word for sky (天 (천, cheon)). According to Robert Blust's Austronesian Comparative Dictionary, the sense for "palate; canopy" is derived from Proto-Malayo-Polynesian *laŋit laŋit [26]. Among these cognates, Balinese, Karo Batak, Toba Batak and Wolio use the same word, "langit-langit", for the sense "canopy, palate". It is fascinating to see how the three senses "sky, canopy, palate" are similarly associated in different language families. KevinUp (talk) 15:52, 1 October 2019 (UTC)
Thanks for your answers, they're very informative. I'll update the etymology of bami to include the possibility of it having been borrowed from Min Nan. ←₰-→ Lingo Bingo Dingo (talk) 06:53, 2 October 2019 (UTC)
An additional note on point 2: in the case of <k>, /k/ and /ʔ/ must be differentiated, and the statement above is only partially true. The glottal stop is pronounced as /k/ instead if it occurs in the middle of a word and is followed by a vowel, as in the Indonesian/Malay examples banyak and kebanyakan. However, it does not change when it is followed by a consonant, where it becomes unreleased. So Indonesian bakmi is pronounced /baʔmi/ or /bak̚mi/, as in the Min Nan source. Another example of this rule is rakyat, which is pronounced /raʔjat̚/. Xbypass (talk) 16:41, 11 November 2019 (UTC)
Thanks for the clarification. I had overlooked this feature because words with medial <k> followed by a consonant are uncommon and only occur in loanwords. By the way, I noticed that some speakers pronounce maktab as /mak.tab/ similar to the Arabic pronunciation of مَكْتَب(maktab) which lacks a glottal stop. KevinUp (talk) 19:28, 11 November 2019 (UTC)
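
(Editorial illustration.) The rule discussed in this exchange can be summarised as: written <k> is /k/ before a vowel and /ʔ/ elsewhere (word-finally or before a consonant). The following toy Python sketch encodes just that simplified rule. The function name is hypothetical, and it ignores digraphs, the unreleased variant /k̚/, and loanword exceptions such as the maktab pronunciation mentioned above.

```python
from typing import List

VOWELS = set("aeiou")


def realise_k(word: str) -> List[str]:
    """Return the predicted realisation of each written <k> in a word.

    Simplified rule from the discussion: /k/ before a vowel,
    /ʔ/ word-finally or before a consonant.
    """
    result = []
    for i, letter in enumerate(word):
        if letter != "k":
            continue
        next_letter = word[i + 1] if i + 1 < len(word) else ""
        result.append("k" if next_letter in VOWELS and next_letter else "ʔ")
    return result


for example in ("banyak", "kebanyakan", "bakmi", "rakyat"):
    print(example, realise_k(example))
```

The four sample words are the ones cited in the thread; any other input is outside what the discussion establishes.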
Interesting. So the relationship between "sky" and "canopy" is also found in European languages. Thanks for sharing! KevinUp (talk) 21:30, 5 December 2019 (UTC)
Compare also French: voûte palatine (hard palate) and voûte céleste (welkin, vault of heaven, canopy of heaven, celestial vault, sky, heavens) (both from voûte (arch; arched ceiling, vault)). Canonicalization (talk) 22:30, 19 December 2019 (UTC)
Interesting. Prior to this I've only heard of palais dur (hard palate) and palais mou (soft palate) in French. Since French palais is derived from Latin palātium (palace), the pattern I can infer here is that something wide or large, e.g. sky/canopy/vault/palace is linked to the word for palate (roof of mouth) in several languages from different language families. KevinUp (talk) 23:08, 19 December 2019 (UTC)

Reminder: Community Insights Survey

RMaung (WMF) 17:04, 4 October 2019 (UTC)

Tbot entries

There are still some fifty Tbot entries in Indonesian. Could you perhaps check one or two once in a while, update or correct what's necessary and remove the Tbot template if the contents are fine? ←₰-→ Lingo Bingo Dingo (talk) 07:38, 24 October 2019 (UTC)

Yes, I'll look into this. Thanks for informing. KevinUp (talk) 13:42, 24 October 2019 (UTC)
You've made good progress so far! Well done! I have another question about the entries candra and lunta. Apparently they are precategorials, but that term doesn't seem to be valid. Can you help improve the entry? --Vealhurl (talk) 23:37, 2 December 2019 (UTC)
@Vealhurl: Yes, I've improved these entries and converted the "Precategorial" heading to "Particle". KevinUp (talk) 21:24, 5 December 2019 (UTC)

Ordering of derived characters

Hey- I just did a preliminary version of the derived characters section for the and pages, and I had a thought about the best way to order derived characters in the translingual section. I would suggest ordering them by their inclusion in CJK Unified Ideographs (etc.) from earliest to latest. I don't have any idea how this could be done quickly and efficiently, but it does seem more ideal than ordering characters by the stroke order of one side (probably according to Kangxi Zidian's stroke orders in the case of the above two examples).

I had another related thought as well- I think what we are doing by linking the 木 in the Han character section on the page to the page Index:Chinese radical/木 where the characters are ordered by alleged stroke order (by which standard?) is to imply that the stroke order standard used there is "translingual", which it probably is not in many cases. That would seem to be "not good" to me- you can't pretend that there aren't different stroke counts and orders between the different traditions in Asia. Again, I would suggest some kind of "unicode-order". Not to say that the current stroke count ordering is bad- that aspect could be kept and used for each individual language/tradition. But we don't want to give readers the impression that essentially non-translingual stroke counts are translingual (IMO).

Let me know how this hits you. Thanks for your work here. --Geographyinitiative (talk) 03:16, 1 December 2019 (UTC)

If it is more convenient for you to include the derived characters based on Unicode stroke number you may do so, but please do not reorder or rearrange existing derived characters that are sorted based on radical position (left/right/top/bottom) followed by stroke number to stroke number only. The twice-sorted format is much more time-consuming so if you don't want to use it, you can use the single-sort format first and I'll sort it eventually to the double-sort format which is more of a hassle but makes things clearer at a glance.
The character order at Index:Chinese radical/木 is based on the stroke number defined by Unicode. The problem with Unicode's stroke number is a computing issue, because different people would count different stroke numbers for the same character depending on the font they are using, yet Unicode arbitrarily assigns a single stroke number to characters that have different stroke numbers based on different standards. If you are not pleased with the arbitrary stroke number used by Unicode and would like them to choose a fixed standard rather than an arbitrary standard you can write an email or send a proposal to the Unicode Consortium.
Once again, please do not reorder existing derived characters sorted using radical position followed by stroke number to the single sort method using stroke number only. You can use stroke number if it is quicker for you, but I (or someone else in future) will eventually sort it by position + stroke number. Some radicals such as can be counted as three or four strokes, but it is fine to use either one as long as some consistency is maintained within the same list of derived characters. KevinUp (talk) 04:24, 1 December 2019 (UTC)
  • (chiming in...)
"...ordering them by their inclusion in CJK Unified Ideographs (etc.) from earliest to latest."
That sounds very user-unfriendly. How are users supposed to know when a given glyph was included in the Unicode standard? That's an entirely arbitrary sorting criterion, and it's impossible to derive from the character itself.
At least stroke order, with or without radical, is something that users can figure out from the character. Even if a given character has different stroke counts in different traditions, the count won't be off by much, so the user has a good chance of finding things through a little persistence. ‑‑ Eiríkr Útlendi │Tala við mig 01:05, 3 December 2019 (UTC)
I was also confused by the statement "order of inclusion from earliest to latest". After checking the edits at and , I noticed the characters were arranged by stroke number first, followed by Unicode order (main block followed by Extension A/B/C/D/E/F). Although this arrangement is easier and quicker for editors to add, it will take slightly more time for our readers to search for the character they are looking for because the order of inclusion of these characters in Unicode is arbitrary and not arranged in a predictable manner. For example, if someone is looking for ⿰言林 they will still need to browse through much of the list. However, if it's arranged by radical position (top/bottom/left/right) followed by stroke number, it will take less time because they can ignore characters with a different radical position. Anyway, there is no right or wrong way on how to arrange these characters. It also doesn't matter if there are slightly different stroke counts between different regions, because the main purpose of listing derived characters is for users to (quickly) search for uncommon characters with uncommon formations. KevinUp (talk) 22:15, 5 December 2019 (UTC)
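
(Editorial illustration.) The difference between the single-sort format (stroke number only) and the double-sort format (radical position, then stroke number) can be sketched as two sort keys. In the Python example below, the sample characters, their position labels, their stroke counts and the left/right/top/bottom ordering are all illustrative assumptions, not data taken from any Wiktionary entry.

```python
from typing import List, NamedTuple


class Derived(NamedTuple):
    char: str
    position: str  # where 木 sits inside the character (illustrative label)
    strokes: int   # total stroke count under one convention (illustrative)


# Assumed ordering of positions for the double-sort format.
POSITION_ORDER = {"left": 0, "right": 1, "top": 2, "bottom": 3}

SAMPLE = [
    Derived("相", "left", 9),
    Derived("休", "right", 6),
    Derived("呆", "bottom", 7),
    Derived("杉", "left", 7),
    Derived("柴", "bottom", 10),
    Derived("沐", "right", 7),
    Derived("杏", "top", 7),
    Derived("李", "top", 7),
]


def single_sort(items: List[Derived]) -> str:
    """Sort by stroke number only (ties keep their input order)."""
    return "".join(d.char for d in sorted(items, key=lambda d: d.strokes))


def double_sort(items: List[Derived]) -> str:
    """Sort by radical position first, then by stroke number."""
    ordered = sorted(items, key=lambda d: (POSITION_ORDER[d.position], d.strokes))
    return "".join(d.char for d in ordered)


print(single_sort(SAMPLE))
print(double_sort(SAMPLE))
```

With the double sort, a reader looking for a character with 木 on the left only needs to scan the first group, which is the benefit described above.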

Topic cats for taxonomic names

Some editors have added these, but there is no consensus to categorise them as we do actual words for organisms. I generally remove them, as they don't serve a purpose. We ought to come to an agreement on it, but in the mean time, it would be preferable not to add more. —Μετάknowledgediscuss/deeds 17:21, 6 January 2020 (UTC)

@Metaknowledge: Thanks for the explanation. I've been wondering why these are not categorized by topic. In the mean time, I've removed the ones that I've recently added. KevinUp (talk) 06:28, 7 January 2020 (UTC)