Module talk:ja-kanji-readings


Text

@Suzukaze-c, Krun, TAKASUGI Shinji: What should the text on the pages for the kanji categories be? I have the category structure almost correctly handled so far. — Eru·tuon 21:45, 12 June 2017 (UTC)

Personally I don’t like to use a period to separate a kun-reading in category names. Okurigana was quite arbitrary and is still so in compounds. — TAKASUGI Shinji (talk) 23:28, 12 June 2017 (UTC)
Hm, I'm talking about the text on the pages, not the category names. I modified my post to make that clearer. — Eru·tuon 23:33, 12 June 2017 (UTC)
Personally I think the category names are fairly self-explanatory, but I'm not sure if that's acceptable at Wiktionary. —suzukaze (tc) 16:23, 14 June 2017 (UTC)
I think it should have at the very least a short description with links (and romanization of the reading), at the most an explanation of what the reading type and the period are. I guess I will have a go at it and see what I can come up with. — Eru·tuon 18:14, 14 June 2017 (UTC)

Period and hyphen in readings in category names

Some category names contain periods or hyphens: see the categories in , for instance. I don't quite understand what these symbols are used for. If they are not desired, this can be fixed in Module:ja. — Eru·tuon 06:13, 13 June 2017 (UTC)

The period has been placed there to affect the transliteration. It should not appear in the category names, nor be used in matching with the jōyō readings. – Krun (talk) 10:40, 13 June 2017 (UTC)
@Krun: Okay, this edit should remove all periods from kana in category titles. — Eru·tuon 18:05, 13 June 2017 (UTC)
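As a rough illustration of the behaviour described here, the period can be dropped from a reading before the category title is built. This is a minimal Python sketch, not the Lua code in Module:ja (which is not shown in this thread); the function names `strip_periods` and `category_title` are hypothetical, and the title pattern is taken from the category names quoted elsewhere on this page.

```python
def strip_periods(reading: str) -> str:
    """Remove the period, which is only a transliteration hint."""
    return reading.replace(".", "")


def category_title(reading_type: str, reading: str) -> str:
    """Build a category title for an 'on' or 'kun' reading.

    The title pattern is assumed from names such as
    'Category:Japanese kanji with on reading じょう'.
    """
    return f"Category:Japanese kanji with {reading_type} reading {strip_periods(reading)}"


print(category_title("kun", "き.や"))
```

Hyphens are deliberately left alone in this sketch, since whether they belong in category names is still being discussed in this thread.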
@Erutuon: (@Suzukaze-c, TAKASUGI Shinji) The hyphens represent the boundary between the kanji and okurigana (kana that follow the kanji when the reading is used). It’s of course debatable whether we should include the hyphens in category names, but it may have the benefit of disambiguating between distinct readings, e.g. the noun (はる) (haru) and the verb () (haru). I suppose there would have to be some cross-referencing between such categories, though. As I think about it, also considering Takasugi Shinji’s comment above, I think maybe the benefit is too small to warrant the complications. I wonder how many kanji might share the same reading (at maximum)? I guess we’ll only find out once they’ve all been categorized. – Krun (talk) 10:51, 13 June 2017 (UTC)
Um... if it is beneficial to disambiguate はる as a noun and as a verb, what about disambiguating かえる as a godan verb (classical かへる) and as an ichidan verb (classical かふ)? @Krun: For the kanji sharing the same reading, a glance at the 字訓索引 of 広漢和辞典 reveals that kun-readings with 120+ corresponding kanji include
and kun-readings with about 100 include
  • くらい (幺, 幼, 民, 汒, ..., 曭)
  • つつしむ (共, 忯, 劼, 忠, ..., 孎)
  • とる (弋, 手, 吸, 伋, ..., 攬)
  • はかる (寸, 支, 占, 宅, ..., 爨)
  • ひく (𠂆, 勾, 𠬜, 引, ..., 羈)
  • みる (在, 佔, 見, 物, ..., 矚)
(大漢和辞典 might reveal more as it includes more kanji and variant forms.) --Dine2016 (talk) 16:01, 11 September 2017 (UTC)

reading none?

- is this correct? DTLHS (talk) 06:19, 13 June 2017 (UTC)

Nope. I think the |on= parameter should just be omitted there. Someone added "none" as a reading in this edit, when the parameter was originally just empty. — Eru·tuon 06:41, 13 June 2017 (UTC)

Odd transliteration

@Suzukaze-c, Krun: In the entry , the kun reading みっつ is transliterated mitstsu, perhaps because of the hyphen in the source code (みっ-つ). It should be mittsu, right?

っ-ち would probably transliterate in a similar way (chch(i)), but I couldn't find any examples (insource:/っ-ち/). — Eru·tuon 05:39, 17 August 2017 (UTC)

It is wrong. —suzukaze (tc) 06:03, 17 August 2017 (UTC)
Yes, the small っ should always be transliterated as <t> before た, ち, つ, て, と: tta, tchi, ttsu, tte, tto. – Krun (talk) 13:51, 17 August 2017 (UTC)
Okay, I found where the problem was in Module:ja and fixed it. — Eru·tuon 18:16, 17 August 2017 (UTC)
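The rule stated in this thread (small っ becomes <t> before た, ち, つ, て, と, even when a hyphen separates them) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual fix in Module:ja, which is not shown here: the kana table is deliberately tiny, the name `romanize` is hypothetical, and dropping the hyphen before transliterating is an assumption about the intended output.

```python
# Tiny illustrative table; a real transliterator covers all kana.
ROMAJI = {
    "た": "ta", "ち": "chi", "つ": "tsu", "て": "te", "と": "to",
    "み": "mi", "や": "ya",
}
SOKUON = "っ"


def romanize(kana: str) -> str:
    """Romanize a reading, geminating across the okurigana hyphen."""
    kana = kana.replace("-", "")  # assumption: the hyphen is only a boundary marker
    out = []
    geminate = False
    for ch in kana:
        if ch == SOKUON:
            geminate = True  # remember to double the next consonant
            continue
        syllable = ROMAJI[ch]  # raises KeyError for kana outside the table
        if geminate:
            # Before the t-row, including ち (chi) and つ (tsu), っ is written <t>.
            out.append("t" if syllable[0] in "tc" else syllable[0])
            geminate = False
        out.append(syllable)
    return "".join(out)


print(romanize("みっ-つ"))
```

Because the hyphen is removed before the sokuon is resolved, みっ-つ and みっつ romanize identically in this sketch.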

Okinawan

While looking at entries using the old readings format, I discovered that Okinawan entries, such as , currently use this module through {{ja-readings}}. Probably they should be switched to {{ryu-readings}}, and either the module should be adapted for Japanese and Okinawan or a separate module should be created, to fix the tagging and the destination of the link. — Eru·tuon 20:00, 17 August 2017 (UTC)

Sometimes I wonder if our Japanese infrastructure is too Japan-centric. {{ja-r}} and {{ja-def}} can be used for languages like Okinawan as well. —suzukaze (tc) 00:03, 18 August 2017 (UTC)

Small ィゥ in historical readings on Japanese Wiktionary

The Japanese Wiktionary often gives historical readings such as ヂャゥ or ティ (ja:定#発音) in addition to シャウ (ja:証#発音), with a small vowel kana added to an already complete mora. I'm guessing that this means something like dyaw and tey, with a final semivowel, as opposed to syau, with a disyllabic (or bimoraic) sequence. (The modern use of , (, ) to specify the vowel part of the preceding full-sized kana doesn't make sense here.) I (kana) on Wikipedia seems to confirm this.

I gather these aren't historical spellings, as small kana were somewhat recently invented to disambiguate between cases such as kiya and kya or transcribe new sounds in loanwords, and it is the currently established practice not to use small kana in historical readings. But perhaps we could create a way to (in the background) distinguish these cases, more or less as we distinguish historical きや (= kya) and き.や (= kiya). But it might be best to add a symbol indicating that or represents a semivowel, so as not to mess up existing transcriptions, rather than to make all existing cases of ぢやう suddenly change to dyaw (and require a period to be added to undo that: ぢや.う).

Another option is to allow small kana in the input, but have the module display them as large kana: ぢゃぅ → ぢやう (dyaw). That would be pretty neat. — Eru·tuon 21:15, 11 September 2017 (UTC)

(@Krun? —suzukaze (tc) 21:44, 11 September 2017 (UTC))
@Suzukaze-c, Erutuon As I understand it, this is just an alternative spelling scheme for full-size vs. small kana. There is no difference between シャウ and シャゥ. Both represent the same single, bimoraic syllable. In both cases, small kana continue the sound of a full-size kana, and together the full-sized kana and its following small kana form a unit. The difference seems to be whether that unit is a mora or a full on-reading of a kanji. The former is standard in modern Japanese writing, whereas the latter scheme seems to be employed by lexicographers primarily to mark the boundary between readings of the particular kanji in compounds. – Krun (talk) 14:01, 12 September 2017 (UTC)
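The option floated earlier in this thread (accept small kana in the input, but display full-size kana) amounts to a character mapping. The following is a minimal Python sketch of that idea only; it is not part of the module, the name `display_form` is hypothetical, and the set of small kana covered is an assumption (the sokuon っ is intentionally left out).

```python
# Small kana and their full-size counterparts, hiragana then katakana.
SMALL = "ぁぃぅぇぉゃゅょァィゥェォャュョ"
LARGE = "あいうえおやゆよアイウエオヤユヨ"
TO_LARGE = str.maketrans(SMALL, LARGE)


def display_form(reading: str) -> str:
    """Return the reading with every small kana shown as full-size kana."""
    return reading.translate(TO_LARGE)


print(display_form("ぢゃぅ"))
```

Under this scheme the input string would still carry the small-kana information for transliteration, while the displayed form follows the established practice of not using small kana in historical readings.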

recursive Category:Japanese kanji with on reading x

@Erutuon: Category:Japanese kanji with on reading じょう seems to contain itself. Is this desirable? —suzukaze (tc) 07:41, 15 September 2017 (UTC)

Fixed! It was a teeny error in the pattern. — Eru·tuon 08:12, 15 September 2017 (UTC)

Category:Japanese kanji read as ああ-

I wonder if maybe these would be better as Category:Japanese kanji read as ああ (without the hyphen). (Category:Japanese kanji with kun reading ああ- would retain the hyphen.) —suzukaze (tc) 04:21, 21 September 2017 (UTC)

That would make sense to me. I was also thinking that each kun category should check for other categories whose titles are the same except for the presence or position of a hyphen, and link to them. — Eru·tuon 05:01, 21 September 2017 (UTC)
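The cross-linking check suggested just above can be sketched by comparing readings with their hyphens removed. This is a minimal Python illustration, not code from the module; the name `related_readings` and the idea of passing in a plain list of known readings are assumptions made for the example.

```python
def related_readings(reading: str, known_readings: list) -> list:
    """Return readings that differ from `reading` only by hyphen presence or position."""
    key = reading.replace("-", "")
    return [
        other
        for other in known_readings
        if other != reading and other.replace("-", "") == key
    ]


known = ["みっつ", "みっ-つ", "み-っつ", "ああ-", "はか-る"]
print(related_readings("みっ-つ", known))
```

A category page for one variant could then link to the categories for each reading returned by such a lookup.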
It's not just the trailing hyphens that are problematic, but also the inclusion of the kana after the hyphens. This results in the module generating incorrect categories. As another example, [[Category:Japanese_kanji_with_kun_reading_やっ-つ]] should not exist:
  • The kanji reading is や, not やっ -- the gemination is basically an excrescence, and should be handled by using the k1= parameter to indicate sound shifts. I've since added that param at the 八つ entry, but {{ja-kanjitab}} as at has no mechanism for indicating such sound shifts.
  • The -つ is not part of the kanji reading, i.e. how the kanji portion itself is pronounced. No such appended okurigana should be treated as part of the reading. In addition to the yatsu → yattsu excrescence above, we also have verbs and adjectives, which get very complex with conjugated forms. Take 見, for instance. If we categorize this with conjugated endings appended to the kun reading, we have to account for みる, みえる, みせる, at a bare minimum. What about みゆ? What about みす? What about みられる, or みます, or みた? Where do we stop adding conjugated endings? And what about just the verb stem み, used in compounds such as in みいだす? The underlying kun is just み: the conjugated endings are all added to that, but are not part of that.
(Note that here, I'm only talking about categorization. This same query has implications for what we include in the {{ja-readings}} list as a whole, which we should probably also suss out at some point -- and then document at WT:AJA, which is horribly out of date -- but that's a matter for a separate thread.)
  • If we include all inflectional endings as part of the reading, we run the risk of confusing our readers. If we state that the kanji 八 is read as yattsu, then logically 八つ must be read as yattsutsu, since we have the kanji 八 with the reading yattsu and then we have that additional kana つ on the end. Similarly, 八雲 must be read as yattsu kumo or yattsu-gumo, following the same reasoning. Sure, we put in hyphens in the readings to theoretically show that something special is going on, but I don't think that's explained anywhere. Readers have to already know what it means in order to understand what it means. That's not a good approach, from the perspectives of accessibility or usability; I grow increasingly worried that this kind of assumption will frustrate and alienate language learners trying to use Wiktionary.
I hope the above lays out a case for reworking our treatment of kun'yomi reading categories. ‑‑ Eiríkr Útlendi │Tala við mig 16:33, 21 August 2018 (UTC)

Request: Hidden category for kanji with unclassified on readings

Possibly added whenever the "on=" parameter is used, so that someone with an appropriate dictionary can add the specification where need be. 50.217.25.234 22:31, 1 March 2023 (UTC)

I've thought about this some, but come to no conclusions.
The problem with categorizing all plain-on readings as "uncategorized" is that, for those kanji where the kan'on and goon are the same, we just use on. So the readings for these are categorized -- it's just that we don't have any clear means of indicating when kan'on and goon are the same, versus when we just haven't gotten around to checking yet.
@Fish bowl, Alves9, Atitarev, Huhu9001, do you have any ideas on how we could better categorize those characters where kan'on and goon are the same? ‑‑ Eiríkr Útlendi │Tala við mig 17:30, 13 March 2023 (UTC)
So... force the editors to manually input the same value for both |goon= and |kanon=? -- Huhu9001 (talk) 09:12, 14 March 2023 (UTC)
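If that convention were adopted (identical kan'on and goon entered explicitly in both |goon= and |kanon=), any remaining plain |on= value could be treated as unchecked, which is what the original request asks for. A minimal Python sketch of that decision follows; it is not the module's Lua code, and the category name and the function name `tracking_categories` are hypothetical.

```python
# Hypothetical hidden-category name, modelled on the heading of this thread.
HIDDEN_CATEGORY = "Japanese kanji with unclassified on readings"


def tracking_categories(params: dict) -> list:
    """Return the hidden category when a plain, unclassified on= reading is present.

    Assumes editors enter identical kan'on and goon in both goon= and kanon=,
    so a non-empty on= means nobody has classified the reading yet.
    """
    if params.get("on"):
        return [HIDDEN_CATEGORY]
    return []


print(tracking_categories({"on": "じょう"}))
print(tracking_categories({"goon": "じょう", "kanon": "じょう"}))
```

Without that convention, the same check would also flag kanji whose kan'on and goon are known to coincide, which is the problem raised above.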

export plain_link, get_script, labels

@Benwing2 Why were these functions exported? It is abnormal for other modules or templates to use the subroutines of this module, as it is designed solely for t:ja-readings. Unexpected invocation or calling elsewhere can make the code entangled and confusing. -- Huhu9001 (talk) 03:26, 4 April 2023 (UTC)

@Huhu9001 See Module:category tree/poscatboiler/data/lang-specific/jpx. Benwing2 (talk) 03:59, 4 April 2023 (UTC)
@Benwing2: What should I see in that page? I see no require("Module:ja-kanji-readings") there. -- Huhu9001 (talk) 04:03, 4 April 2023 (UTC)
@Huhu9001 Module:category tree/poscatboiler/data/lang-specific/ja, Module:category tree/poscatboiler/data/lang-specific/ryu Benwing2 (talk) 04:07, 4 April 2023 (UTC)