Wiktionary talk:About Japanese

Archives
  1. Archive 1 (Inactive topics as of April 25, 2006)
  2. Archive 2 (Threads from 2004 to 2010)

Updates needed for {{ja-readings}}?

I've been filling in the yomi of a number of kanji entries lately, and I've run into some structural limitations of the {{ja-readings}} template. For single kanji used for verbs, the kun'yomi section in particular can become ridiculously large and visually messy, as can be seen at 結#Japanese. I've been trying to use sane wiki markup so anyone coming after me can more easily see what's going on, like the following:

====Readings====
* {{ja-readings
| on=<!--
-->[[けつ]] (''[[ketsu]]''), <!--
-->[[けい]] (''[[kei]]'')
| kanyoon=<!--
-->[[結]] ([[けち]] ''[[kechi]]'') to win an [[archery]] competition; to claim undecided territory in the [[endgame]] of [[go#Etymology_2|go]], <!--
-->[[結する]] ([[けっする]], ''[[kessuru]]'') to become [[constipated]]; to [[tie up]] or [[conclude]] an [[argument]] or stated position, <!--
-->[[結す]] ([[けっす]], ''[[kessu]]'') alternate for 結する
| kun=<!--
-->[[結ぶ]] ([[むすぶ]] ''[[musubu]]''), <!--
-->[[結び]] ([[むすび]] ''[[musubi]]''), <!--
-->[[結ばる]] ([[むすばる]] ''[[musubaru]]''), <!--
-->[[結ばわる]] ([[むすばわる]] ''[[musubawaru]]''), <!--
-->[[結ぼる]] ([[むすぼる]] ''[[musuboru]]''), <!--
-->[[結ぼうる]] ([[むすぼうる]] ''[[musubōru]]''), <!--
-->[[結ぼれる]] ([[むすぼれる]] ''[[musuboreru]]''), <!--
-->[[結う]] ([[ゆう]] ''[[yuu]]''), <!--
-->[[結い]] ([[ゆい]] ''[[yui]]''), <!--
-->[[結わう]] ([[ゆわう]] ''[[yuwau]]''), <!--
-->[[結わえる]] ([[ゆわえる]] ''[[yuwaeru]]''), <!--
-->[[結える]] ([[いわえる]] ''[[iwaeru]]'') alternate for 結わえる, <!--
-->[[結く]] ([[いわく]] ''[[iwaku]]'') alternate for 結わえる, <!--
-->[[結く]] ([[すく]] ''[[suku]]'') to [[knit]] a [[net]], <!--
-->[[結なす]] ([[かたなす]] ''[[katanasu]]'') to gather or tie together into one bunch, <!--
-->[[結ねる]] ([[かたねる]] ''[[kataneru]]'') to bind together; to open and read out the content of official documents, <!--
-->[[結ぬ]] ([[かたぬ]] ''[[katanu]]'') alternate for 結ねる
| nanori=
}}

The 結#Japanese example is plug ugly, and hard to read, but all of the information there is proper to include as best I can tell, and does indeed belong in the list of kun'yomi. What I'd like is for the {{ja-readings}} template to show readings in a bulleted list, for a cleaner presentation and easier usability.

Instead of this:

Readings

... I'd rather see something like this:

Readings

On:

Kan'yō:

Kun:

Ideally, the template would also allow folks to input multiple readings with each on its own line, as in the 結#Japanese code sample above but minus the crutch of <!-- --> HTML comments --- but that's probably asking too much, given what I've seen of template syntax (yech!).

I'm hoping there's someone reading this page who has the requisite template expertise to implement this change. If I hear nothing in, say, a week or two, I may have a go at making the change myself. :) -- Cheers, Eiríkr Útlendi | Tala við mig 23:12, 1 February 2011 (UTC)
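To make the request above concrete, here is a minimal sketch in Python (not template syntax) of the desired behaviour: one reading per input line, rendered as a bulleted wikitext list. The function name `readings_to_bullets`, the `LABELS` table, and the input layout are assumptions made for illustration only; none of this reflects how {{ja-readings}} is actually implemented.

```python
# Hypothetical sketch: turn one-reading-per-line input into a bulleted
# wikitext list. Labels and input layout are illustrative assumptions.
LABELS = {"on": "On", "kanyoon": "Kan'yō", "kun": "Kun", "nanori": "Nanori"}


def readings_to_bullets(readings):
    """readings maps a key such as 'kun' to a list of wikitext readings."""
    lines = []
    for key, label in LABELS.items():
        items = [r.strip() for r in readings.get(key, []) if r.strip()]
        if not items:
            continue  # skip empty categories, e.g. an unused nanori
        lines.append(f"* '''{label}''':")
        lines.extend(f"** {item}" for item in items)
    return "\n".join(lines)


sample = {
    "on": ["[[けつ]] (''[[ketsu]]'')", "[[けい]] (''[[kei]]'')"],
    "kun": ["[[結ぶ]] ([[むすぶ]] ''[[musubu]]'')"],
    "nanori": [],
}
print(readings_to_bullets(sample))
```

Each reading sits on its own line of input, so no `<!-- -->` comments are needed to keep the source legible.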

There's been no comment for the last half-year, so I'll start looking into changing the template. -- Ta, Eiríkr Útlendi | Tala við mig 16:25, 4 August 2011 (UTC)
Turns out the template is locked. I've posted on Template_talk:ja-readings#Formatting_when_the_list_of_yomi_gets_long in an attempt to get some momentum going. -- Eiríkr Útlendi | Tala við mig 17:16, 4 August 2011 (UTC)
I've replied there. - -sche (discuss) 02:14, 7 September 2011 (UTC)
@Eirikr I'm sure everyone would agree that your proposed format would be a vast improvement to what the ja-readings template gives us. I've read the comments on the template page and it sounds like the change you are proposing to this template will not be forthcoming for various reasons which I can't understand, but is there anything that should prevent us from reformatting these without a template, as you have above? 馬太阿房 (talk) 16:35, 21 April 2017 (UTC)

romanizing -suru verbs

I was wondering what the word is on how to romanize -suru verbs like 勉強する, that is, as benkyō suru or benkyōsuru. According to the supplied example, 監督する, there is a space, as there is for 勉強する. I would assume there should be one since there is a space for -na adjectives as well. On the other hand, when I casually looked at a number of other type-3 verbs, all of them had no spaces. Maybe I missed something but I couldn't find anything that explicitly says if there should be a space or not. I don't have a preference one way or the other, but it seems to me that the dictionary ought to be consistent, so is it safe to assume that the entries without spaces should be edited to include them? thanks! Haplology 16:29, 10 February 2011 (UTC)

A bit late in replying, but I'd put my 2p on including the space. This makes it clear that the core part of the word (the bit in kanji) is distinct. After all, only the する part conjugates, and both the core part and する are indeed distinct words unto themselves. A number of Japanese publications I've seen that use romaji will leave out the space, but I think this is primarily in reflection of the lack of spaces in Japanese writing. Latin-alphabet writing needs spaces for clearer visual parsing, in part as we don't have the nice kanji-vs.-kana visual distinction to rely upon. -- Cheers, Eiríkr Útlendi | Tala við mig 16:22, 4 August 2011 (UTC)
Really old thread that probably doesn't matter anymore but I for one like to attach -suru w/o a space to enforce the idea that 勉強する as a whole is a verb... —suzukaze (tc) 04:24, 28 August 2016 (UTC)

Ateji and rare readings

A question for the group, here --

Is there any consensus on how best to handle nonstandard ateji or otherwise rare pronunciations?

  • Another example is 神#Kanji, which includes the reading たましい. I've only ever seen たましい spelled in kanji as either or (more rarely) , but I could imagine 神 being used instead as an 意読.

So, do we remove such rarities? Do we keep them, but mark them? If so, how? Is there some sort of threshold for frequency of use before we include an 意読 for a particular kanji word?

Any insight appreciated. -- Eiríkr Útlendi | Tala við mig 16:14, 4 August 2011 (UTC)

Lemma forms for keiyōdōshi

Can anyone elucidate the reasoning behind including the な on the end for keiyōdōshi lemma forms? This な is essentially a particle, and is in no way integral to the word, as can be seen by swapping this for に to create the adverbial, or for だ to create the terminal. It would seem to make much more sense to use the root form of a keiyōdōshi, i.e. the form without the な, as the lemma -- as, indeed, do all other dictionaries that I'm aware of. -- Eiríkr Útlendi | Tala við mig 20:19, 22 August 2011 (UTC)

I agree. Haplology 16:34, 23 August 2011 (UTC)

Work needed on Template:ja-na

Please have a look at Template_talk:ja-na#Redesign needed to deal with adjectives that have no kanji and respond as appropriate. I am happy to implement the changes myself, so feel free to give your opinion even if you aren't up on template syntax. -- TIA, Eiríkr Útlendi | Tala við mig 17:36, 25 August 2011 (UTC)

Work Needed

(Copied over from WT:Beer_parlour#WT:About_Japanese)

Following comments in various other threads, it appears that the WT:AJA page needs some work. The issues I'm immediately aware of:

  • Quasi-adjectives (な adjectives): WT:AJA insists on including the な in the headword, which does not appear to be the current consensus.
  • の adjectives: WT:AJA does not include any clear guidelines for these. (Relatedly, {{ja-adj}} doesn't include any way of handling these either.)
  • Suru compound verbs: WT:AJA calls for using the {{ja-suru}} template. However, する is a standalone verb, so including the する conjugation on each and every compound verb page seems excessive.
  • {{ja-kanjitab}}: WT:AJA describes including this under an === Etymology === section if there is one, but including under the main == Japanese == section produces largely identical results, unless there are multiple etymology sections, in which case repeating the kanjitab seems excessive.
  • The Transliteration subpage could also use some work, particularly with regard to spacing and what constitutes a single word in Japanese (i.e., particles should be separate, suru should be separate, etc. etc.).
  • 連体詞: WT:AJA states that this should be given a POS of "prefix", but that is really not what these words are -- a prefix is part of a word, whereas 連体詞 are clearly standalone words. They are less prefixes and more like true adjectives, in that they must precede a noun.
  • Single-kanji entries: WT:AJA has no clear instructions on how to specify okurigana in kun'yomi listings, nor any clear instructions on how to format these to link to verb forms. For instance, shows one way of clarifying okurigana and linking to kanji+okurigana entries, but is a bit visually messy; ja:食#日本語 looks a bit cleaner with the use of hyphens to show the break between the kanji and the okurigana, and this roughly matches the format I've most often seen in dead-tree dictionaries, but the entry doesn't link to any kanji+okurigana entries, just to the hiragana entries; and doesn't show okurigana or link to any kanji+okurigana entries.

This post is really just meant to get the ball rolling. Many of these changes listed above are a departure from what WT:AJA currently says, so I'm hoping to spark a bit of discussion before making any edits. -- TIA, Eiríkr Útlendi | Tala við mig 17:41, 6 September 2011 (UTC)

  1. Regarding your first point: you're proposing to remove な from the address of the page, not just the headword, correct? (You're proposing to move 浅はかな to 浅はか, and to change the headword from 浅はかな (な-na declension, hiragana あさはかな, romaji asahaka na) to 浅はか (な-na declension, hiragana あさはか, romaji asahaka)?) Do any other changes need to be made to quasi-adjective entries? For example, do the declension tables need to be modified? I'm trying to ascertain how difficult it would be to make the change by bot. It seems it would be simple (move the page and, e.g., change "な|rom" and " na}}" to "|rom" and "}}"), and you could write a bot or ask one of our technically-skilled editors to write one for you. The only comments I've seen in discussions of this subject have supported removal of the な, so I would say there's consensus for the change.
  2. Regarding の adjectives: can you give an example of one?
  3. Regarding Suru compound verbs: is there any harm in giving the conjugation? On de. and en.Wikt, we give eg the conjugation of anhalten and zurückhalten, even though it is merely the conjugation of halten + an/zurück. The code to generate the conjugation table appears to use only information that is already elsewhere in the entry, so including the template seems not to require the creator of an entry to look up any more information than (s)he has already had to look up to determine the page title and write the {{ja-verb}} headword line. I would keep the conjugation tables in all of the entries.
    In a later point, you seem to suggest considering suru a separate word. Would you propose deleting the Suru verbs as SOP at that point?
  4. Isn't [[:ja:食#日本語]] an interwiki link to ja.Wikt? What did you mean to write? - -sche (discuss) 01:42, 7 September 2011 (UTC)
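The bot edit described in point 1 amounts to two string replacements plus a page move. A minimal sketch in Python, assuming the old headword line had the shape `{{ja-adj|k|decl=な|hira=…な|rom=… na}}` (that parameter order is an assumption, and `strip_na` / `new_title` are hypothetical helper names, not part of any existing bot):

```python
def strip_na(headword_line):
    """Apply the two replacements suggested above to a headword line."""
    # "な|rom" -> "|rom" drops the trailing な of the kana parameter;
    # " na}}"  -> "}}"   drops the trailing " na" of the romaji parameter.
    return headword_line.replace("な|rom", "|rom").replace(" na}}", "}}")


def new_title(title):
    """Page title without a trailing な (the target of the page move)."""
    return title[:-1] if title.endswith("な") else title


old = "{{ja-adj|k|decl=な|hira=あさはかな|rom=asahaka na}}"
print(new_title("浅はかな"), strip_na(old))
```

Note that `decl=な` is left alone because it is followed by `|hira`, not `|rom`. A real bot would still need to check each entry by hand or by rule, since plain string replacement can misfire on lines laid out differently.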
Hello -sche, I've taken the liberty of changing the bullets in your reply to numbers for easier reference. My correspondingly numbered replies below:
  1. Yes, the main lemma entry should be the form without the な - so 浅はか would be the main page, and 浅はかな would mostly just point to 浅はか, much as any other entry for a conjugated word form mostly just points to the main headword. As far as I can tell, the only changes needed would be to the headwords and related minutiae; it would probably be bot-able. Moving from [quasi]+な to just [quasi] would be the easiest option. I don't think declension tables need any changing at all; in fact, they're partly what got me thinking about the change, since they include the adjectival な forms, but also the adverbial に forms, among others, making a lemma with no following particle the more natural place to put such information. Moreover, all other dictionaries I've ever used do not include the な on the end in any headwords.
    Do you know of any good resource or tutorial pages in the MediaWiki universe here that describe how to make a bot?
  2. Just off the top of my head (entries I've worked on recently), の adjective examples include (とひと) and でぶでぶ. Conjugation would be mostly the same as for な adjectives, but I'd have to go through my references to tell you the exact differences.
  3. No harm in including the する conjugation. There are simply *so many* more of these types of verbs than there are of any one type of verb in German or English that things start to get kind of silly with the repetition, but no, there's no real harm in having it.
    And yes, する is a standalone verb in its own right, which simply means "to do", so by that measure, [noun]+する pages would indeed be SOP. However, it is important to be able to note which nouns can be used in verbal ways. From an aesthetic perspective, it'd be much more graceful to include [noun]+する information right on the [noun] page, and sending the user to the する page for information on how that verb is conjugated. That's perhaps too much to bother with for a bot, though, I'm not sure.
    FWIW, other Japanese dictionaries (either JA-JA or JA<>EN) list just the [noun] entries, and mark within them whether the noun can take する -- there are no [noun]+する headwords in any other dictionary that I've ever seen.
  4. The [[ja:食#日本語]] bit is indeed a link to the Japanese Wiktionary, specifically to the 日本語 (Japanese) heading on the 食 page. That was intended to provide an example of how the JA WT folks are formatting their entries with regard to okurigana - something that we don't have any official policy or plan for.
Hope this helps explain things. -- Cheers, Eiríkr Útlendi | Tala við mig 05:47, 7 September 2011 (UTC)
Thanks for the clarifications!
  1. Wikipedia has w:Wikipedia:Creating a bot. I myself know little about bots.
  2. Editing {{ja-adj}} to handle の adjectives seems to be the simplest of these issues (because the template requires relatively few parameters and displays relatively little information, for example no declined forms). I think the only change that needs to be made is to make the template accept "no" (and の?) as an answer to "decl=", and display "の-no declension"... right? I think you could go ahead and make that improvement to the template; we may still lack a template like {{ja-suru}} that produces the conjugated forms, but because many entries lack conjugation sections, I do not think it is necessary to design a の-conjugation-template before updating the headword-line template.
  3. I'd like to keep the definition-lines currently in the [noun]+する entries, because they do vary in form/meaning at least slightly (失礼する = "to be rude", but 旅行する = "to travel, to make a journey"). I do like the idea of listing such information in the noun entries (indeed, even if the compounds are kept!) — perhaps like this or this?
  4. Oh, sorry; I thought you meant [[:ja:食#日本語]] and {{l|ja|食}} were alternative ways of linking to entries! I misunderstood (and still do not understand, ha) that issue. - -sche (discuss) 07:48, 7 September 2011 (UTC)
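The {{ja-adj}} change suggested in point 2 is essentially a lookup from the `decl=` value to a display label. A minimal sketch in Python, not template code: the table `DECL_LABELS` and the function `declension_label` are hypothetical names, and accepting the romaji spelling "no" follows the suggestion above rather than any confirmed template behaviour.

```python
# Hypothetical mapping from a decl= value to the label shown on the
# headword line. Only the values discussed in this thread are listed.
DECL_LABELS = {
    "な": "な-na declension",
    "の": "の-no declension",
    "no": "の-no declension",  # romaji spelling, as proposed above
}


def declension_label(decl):
    """Return the display label for a decl= value, or None if unknown."""
    return DECL_LABELS.get(decl.strip())


print(declension_label("no"))
```

Returning `None` for unknown values leaves it to the caller to decide whether to show nothing or flag the entry for cleanup.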
Hello once more, before I go to sleep --
I see on your user page that you speak German, but perhaps you also read Japanese? I really don't know whether I should write out the romaji as well, but I certainly don't want to be rude (失礼する) if you might need romaji. :) -- Eiríkr Útlendi | Tala við mig 08:10, 7 September 2011 (UTC)
And about the ja wikt and en wikt bits, that was just about contrasting how the en.wikt entry for 食 looks for the on'yomi and kun'yomi versus how the ja.wikt entry looks. The ja entry clearly delineates where the kanji pronunciation ends and the okurigana begin, whereas the en entry doesn't -- which is a bit of a failing. -- Cheers, er, Tschüß, Eiríkr Útlendi | Tala við mig 08:10, 7 September 2011 (UTC)
I agree with the consensus on these changes to WT:AJA. There are a couple of other questions I want to add:
  • Is there some way we can indicate that an adverb takes the particle -と (-to)? It is so common that perhaps it ought to be in the headword template, but I don't think there's a field for it in ja-pos.
  • I don't have a preference either way but it would be nice if AJA were clear about how to format counters, specifically if they take a hyphen, like -匹, or if they have none. It says "e.g., -本", which looks like 本 plus a hyphen at first glance, but the link itself has no hyphen. At least it should be rewritten for clarity.
  • Speaking of bots, could we make a bot to add or fix hidx? It's completely mechanical and uncontroversial, but hard for newbies to pick up and easy for anyone to forget. I've noticed that there's a lot of variation in how it's used.
Thanks Haplogy 13:35, 9 September 2011 (UTC)
Hello,
Thanks to Eiríkr for pointing me to this discussion. If no one objects, I'd like to get the ball rolling on changing the な-type adjectives with this change to WT:AJA#Quasi-adjectives:
== Quasi-adjectives ==
The main entry for quasi-adjectives should be in the 'plain' or 'root' form:

 === Adjective ===
 {{ja-adj|k|decl=な|hira=(kana)|rom=(romaji)}}

E.g. 平安 (heian) has a level 3 section like this:
 === Adjective ===
 {{ja-adj|k|decl=な|hira=へいあん|rom=heian}}

平安 (hiragana へいあん, romaji heian)

This should be followed by the definition(s), and then the declension table using template {{ja-na}}.

Note that the “plain form” in this case is also a noun. This should not be a problem; just as bet is both a verb and a noun, 平安 is both a noun and an adjective.
Does this look good? (Sorry the formatting is awkward, I wanted it all to be in that grey box thing.) -MichaelLau 01:58, 10 September 2011 (UTC)
Looks good to me. Haplogy 05:19, 11 September 2011 (UTC)
I'm fine with this mostly, too, except for one sticking point -- many (most?) 形容動詞 are not nouns at all, such as 静か or でぶでぶ, and cannot be the subject of a sentence. I think 平安 is actually the exception here. With this in mind, I'd rework that last para as follows:
Note that the “plain form” in this case is also a noun for certain words. This should not be a problem; just as bet is both a verb and a noun, 平安 is both a noun and an adjective.
And then there's also the various ways in which they conjugate - some take な and some take の to become adjectives, some take に and others take と to become adverbs - which we need to build into the template (the な・に format is already built in). A few oddballs appear to do both in one way or another, such as 常・恒, for which I can find examples of use as an adjective with both な and の.
Food for thought, anyway. I'm glad this discussion is happening. What would folks say to one of us creating a copy of the current version of WT:AJA, maybe by creating a new page at WT:About_Japanese/Draft or somewhere similar and just copying the content of WT:AJA over, and then we can start collaboratively editing the draft version? -- Eiríkr Útlendi | Tala við mig 07:24, 11 September 2011 (UTC)
I made the /draft page. I think there are probably a lot of ways to change this to make it easier to navigate also. For instance, people should know whether they are interested in contributing to classical Japanese, so all those sections on classical Japanese can be extracted and made their own page or section without cluttering up the page for everyone else. -MichaelLau 14:26, 12 September 2011 (UTC)
I made what I think is the minor change of changing [[lemma]]: to {{ja-def|lemma}} to the links under section 3.1 Non-lemma forms. Haplogy 14:46, 12 September 2011 (UTC)
Brilliant, thank you Michael and Haplology! I'm creating the Wiktionary_talk:About_Japanese/Draft page to discuss edits to the draft. -- Cheers, Eiríkr Útlendi | Tala við mig 16:53, 12 September 2011 (UTC)

link to separate characters in the headword

Wiktionary:Feedback#.E7.AB.AF.E6.9C.AB could be implemented by adding a head= parameter to the Japanese templates, and then setting head=. We should keep the box, because it displays the kanji in a large, legible font, but is there a reason not to also link them in the headword? (Oh, maybe blue/red font is harder to read, especially if one character is blue and the other is red.) - -sche (discuss) 19:41, 10 October 2011 (UTC)

I think it would be more appropriate for multipart terms (see my change to ロシア連邦), not for each character (kanji or kana), but that's just IMHO. I changed ジェシカ because it looked ugly to me. We do have a kanji box, so adding the same in the header would be redundant, again, IMHO. --Anatoli 23:23, 10 October 2011 (UTC)

~々

How should we format words made up of two identical kanji, like 次々? "~々" is the only one you can find outside of dictionaries, hence probably more likely to be searched for. On the other hand, "次次" is the real word, in a sense, and all the dictionaries that I can find list these words as "次次" rather than "次々". I'm leaning toward making "次次" the lemma entry, and listing "次々" as an alternative form using {{alternative form of}}. The link at tsugitsugi would point to 次次. What does everyone else think? Haplology 05:27, 11 December 2011 (UTC)

I prefer 次々, as people search for the most common spelling, not the "correct" one. I wouldn’t write 次次, because it just looks wrong. (次次回 jijikai is okay though I prefer 次々回.) If we follow paper dictionaries, we should have つぎつぎ as main entry, but it is not the case here on Wiktionary. — TAKASUGI Shinji (talk) 00:42, 12 December 2011 (UTC)
I'm with Takasugi-san here that the lemma should be under the most common rendering, 次々 in this case. That said, I think we should also have a 次次 entry, pointing back to 次々, in the interests of completeness and in case anyone does look up the doubled form. -- Eiríkr Útlendi | Tala við mig 17:18, 7 March 2012 (UTC)
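If both spellings get entries, the doubled form can be derived mechanically from the 々 form. A minimal sketch in Python, assuming each 々 repeats only the single character before it; `expand_iteration_marks` is a hypothetical helper name, not an existing tool.

```python
ITERATION_MARK = "々"


def expand_iteration_marks(term):
    """Replace each 々 with the character immediately preceding it."""
    out = []
    for ch in term:
        if ch == ITERATION_MARK and out:
            out.append(out[-1])  # repeat the previous character
        else:
            out.append(ch)
    return "".join(out)


print(expand_iteration_marks("次々"), expand_iteration_marks("次々回"))
```

This does not handle spellings where a run of 々 repeats a multi-character unit, so such terms would need to be treated by hand.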

Fullwidth alphabet letters and digits, and halfwidth katakana

Atitarev and I had a discussion about fullwidth digits in User talk:Atitarev#Fullwidth digits. As I have explained there, fullwidth digits, namely ０, １, ２, ３, ４, ５, ６, ７, ８, and ９, are considered obsolete by the Unicode standard, and I don’t think we should use them for main entries. What do you think? — TAKASUGI Shinji (talk) 10:20, 16 March 2012 (UTC)

I certainly don't see much utility in having these, since they're only a typographical mechanism for displaying the Arabic numerals in double-byte fonts. I would just recommend deleting them, except I know we have other WT pages for single characters. Maybe this is something to bring up in the WT:Beer parlour? -- Eiríkr Útlendi | Tala við mig 14:56, 16 March 2012 (UTC)
The ten character pages I listed above are all right, because they explain Unicode information. What Atitarev and I talked about is which is good for the main entry of 十日 written with Arabic numerals, the halfwidth 10日 or the fullwidth １０日. I think we should use the former naturally. — TAKASUGI Shinji (talk) 15:25, 16 March 2012 (UTC)
Ah, I'm with you now. I agree that half-width (i.e. single-byte) numerals should be used instead of full-width (i.e. double-byte). -- Eiríkr Útlendi | Tala við mig 19:27, 16 March 2012 (UTC)
If display is a concern, as Anatoli suggests, it is possible to put text like "10日" in a template that uses an appropriately monospace font, just as Hebrew text is put into a template so that it can be displayed in an appropriately legible font. - -sche (discuss) 18:37, 16 March 2012 (UTC)
Talk:CD is related to this issue. A discussion archived on that page reached the decision that Japanese words which are spelt in Latin script should be spelt in "regular" Latin letters, not in fullwidth ones, hence the Japanese word for a compact disc is CD, not ＣＤ. - -sche (discuss) 00:58, 28 July 2013 (UTC)
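For anyone wanting to check entry titles against this convention, Unicode compatibility normalization (NFKC) maps fullwidth digits and Latin letters to their ordinary forms, and halfwidth katakana to ordinary katakana. A minimal sketch in Python using the standard library's `unicodedata.normalize`; the wrapper name `to_entry_form` is an assumption for illustration.

```python
import unicodedata


def to_entry_form(title):
    """Fold fullwidth ASCII and halfwidth katakana via NFKC."""
    return unicodedata.normalize("NFKC", title)


for title in ["１０日", "ＣＤ", "ｶﾀｶﾅ"]:
    print(title, "->", to_entry_form(title))
```

NFKC also folds other compatibility characters (circled digits, squared unit symbols, and so on), so a cleanup tool might prefer a narrower mapping limited to the fullwidth and halfwidth blocks.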

On ja-kanjitab

Please see & comment: Template talk:ja-kanjitab#Links to Translingual. Thanks! --Μετάknowledgediscuss/deeds 12:44, 6 September 2012 (UTC)

Alternative readings header

There are several entries in the category Category:Entries with non-standard headers with the header "Alternative readings". Should the header be changed, or is that header OK (in which case, remove the cleanup template and inform Liliana). - -sche (discuss) 21:49, 29 September 2012 (UTC)

See also [1], ひょく, びょう, byō. - -sche (discuss) 01:13, 2 October 2012 (UTC)

Romaji entries

Does anyone object to changing romaji entries as per Wiktionary:Beer_parlour/2013/February#Stripping_extra_info_from_Japanese_romaji?

==Japanese==

===Romanization===
{{ja-romaji|hira=りゅうと|kata=リュート}}

# {{ja-def|隆と}} stylishly
# {{ja-def|リュート}} a lute {{gloss|the musical instrument}}

--Anatoli (обсудить/вклад) 04:16, 27 February 2013 (UTC)

Canned usage note for katakana in science

How would everyone feel about a template like {{kata-bio}} that goes something like this?

As with all names of plants and animals, the katakana form of this term is always preferred in scientific contexts.

Feel free to reword this.

I think a note like this would be appropriate for any entry which is the name of a plant or animal. Since it's the same message each time, it would be nice to have it written in the best possible way, both for substance and for style. As for substance, I'm not sure if medical doctors treat katakana the same way--please add details if you know. There is a cluster of entries from long ago which have a similar usage note which was evidently copy-pasted between them (and its style could have been improved in my opinion.) --Haplology (talk) 03:19, 27 March 2013 (UTC)

  • A late reply: we do have {{U:ja:biology}}. This generates content like the following:

As with many terms that name organisms, this term is often spelled in katakana, especially in biological contexts.

... or ...

As with many terms that name organisms, this term is often spelled in katakana, especially in biological contexts, as サンプル.

HTH! ‑‑ Eiríkr Útlendi │ Tala við mig 19:58, 6 March 2015 (UTC)

Ordering etym sections in multi-etym JA entries

Hello anyone watching this page. I've recently found myself working on more JA entries with multiple etym sections, giving rise to the question of how to order the different sections.

  • For entries with both kun'yomi and on'yomi, which should come first?
⇒ My sense is that maybe on'yomi should come first, since these are often listed first in JA kanji dictionaries. On the other hand, on'yomi are essentially borrowings from Chinese (for the most part), so perhaps the kun'yomi should come first as the native Japanese derivations?
  • Among the on'yomi etyms, which should come first?
⇒ My thought here is chronological. If we have goon, that comes first, then kan'on, then tōon, then sōon.
  • Among the kun'yomi etyms, which should come first?
⇒ Here, I'm less certain.
  • One instinct is to also list these chronologically, starting from the oldest forms. See the kun'yomi etyms in this version of the 仮名 entry for one example. The oldest reading karina is listed first, then the derived kanna reading, then the derived kana reading.
  • But then again, perhaps we should start with the most common reading?
    And if we start with the most common, do we list the rest in order of most-used?
    Or do we list the rest chronologically?

I'm interested in any constructive feedback. Our current state is basically willy-nilly, which is starting to bother me. A more standard policy would be preferable. ‑‑ Eiríkr Útlendi │ Tala við mig 20:08, 6 March 2015 (UTC)

@Atitarev, Haplology, TAKASUGI Shinji, Wyang, エリック・キィ, Tsukuyone, Nibiko, Umbreon126, Kc kennylau Ping!

(Ping didn't work). In my opinion, most common senses and readings should come first, regardless of yomi. I personally hate the chronological order, which causes problems with translations, for example @truth. --Anatoli T. (обсудить/вклад) 05:47, 13 March 2015 (UTC)
(Ping absolutely didn't work) I agree with Anatoli. —suzukaze (tc) 04:28, 28 August 2016 (UTC)
@Eirikr: you probably know now, but anyway this is important: you must write your signature in order for ping to work (no signature). — TAKASUGI Shinji (talk) 05:04, 11 March 2018 (UTC)

{{kanji}}

has been non-existent for just under 10 years now. I think this page needs an overhaul. —suzukaze (tc) 10:28, 22 January 2017 (UTC)

User who merits blocking / nuking on sight

See Talk:御御籤#RFC discussion: April 2014.

Context-dependent sort key

@Atitarev, Eirikr, Haplology, Suzukaze-c: I proposed a new sort key function in meta:2017 Community Wishlist Survey/Wiktionary#Context-dependent sort key, with no reaction yet. It will be easier to apply different sort keys for Japanese and Chinese entries in the same page. I’m not sure how English Wiktionary handles conflicting sort keys, though. What do you guys think? — TAKASUGI Shinji (talk) 09:46, 18 November 2017 (UTC)

I think it is a good idea. I think it could also cut down on redundant module calls (@Erutuon once mentioned that it was silly how Module:vi-sortkey is executed multiple times for a single entry, IIRC). —suzukaze (tc) 09:50, 18 November 2017 (UTC)
Some time ago, when I was considering posting a request on Phabricator for my sorting idea, I encountered a request that was already up and someone was working on it. I'll have to dig it up. It was similar to this ("Allow collation to be specified per category"). Aha, it's here ("Support collation by a certain locale (sorting order of characters)"). Has a lot of technical discussion, which I skimmed at the time.
Among the ideas I recall was having multiple sortkeys connected to languages on each page, and each category has some magic word that determines which language's sortkey it should use, or which language's sort order. I think there was something about using functions created by Unicode to create language-specific sortkeys or to implement sort orders in some fashion. I might be misremembering stuff. The task is now closed, whatever that means.
Anyway, I think the idea of tying categories to specific languages in the server and using multiple sortkeys or server-implemented sort orders sounds like it would solve the problem of CJKV on a single page. You could use a {{DEFAULTSORT:}} type magic word, but it would specify sorting for a specific language's categories, or leave it to the server. I wonder if they're working on that. — Eru·tuon 10:27, 18 November 2017 (UTC)
@TAKASUGI Shinji: I think it's a good idea. Please note that Chinese entries are no longer sorted by pinyin (which is only relevant to Mandarin, anyway) but by radicals. User:Wyang may tell you more about how sorting is done in Module:zh and Module:zh/data/sortkey. I believe language-specific sorting can be done inside Wiktionary by headword modules, which is already the case for Chinese and various other languages, which require sorting different from default but I'm not a Lua guru. --Anatoli T. (обсудить/вклад) 10:58, 18 November 2017 (UTC)
Chinese sorting is now done automatically by Module:zh-sortkey. Japanese could definitely benefit from a sorting cleanup - it's silly to enter the same sortkey multiple times. If the SECTIONSORT functionality (which gives a sortkey for subsequent text until the next SECTIONSORT is encountered) were to be actualised, it may be technically feasible for the SECTIONSORT keys to be generated automatically, every time the {{ja-pron}} template is called, thus leaving no trace of the sortkey in the entry code at all and making the sorting even more intelligent. Wyang (talk) 12:25, 18 November 2017 (UTC)
I read the post finally. I don't think sortkeys should be tied to sections. For instance, the Chinese section often contains categories for many Chinese varieties in addition to (written) Chinese. Each variety needs a different sortkey, and usually there are multiple categories for a single Chinese variety: Mandarin lemmas and Mandarin nouns in , for instance. A SECTIONSORT would only be able to give one sortkey. So, for example, if SECTIONSORT were the radical-stroke sortkey used in Chinese categories, the Chinese categories would not need manual sortkeys, but the categories for Mandarin, Wu, Cantonese, Min Nan, and so on would.
What would work well for Chinese sections is to tie sortkeys and categories to language codes. That is, each language in a page gets a sortkey, and the category looks for the sortkey of a particular language and uses it. (Sortkeys could still be specified manually for an individual category, of course.) Then, the Mandarin categories could use pinyin sortkeys, Chinese categories use radical-stroke sortkeys, and so on, and each of these could be specified only once on the page. This would require two magic words: a language-specific DEFAULTSORT-type magic word in the entry, and a magic word in the category page (which could be added automatically by Module:category tree) specifying which language's sortkey the category should use, if it is present. So to make up names for the magic words, the entry could contain {{LANGSORT:cmn|pinyin}} and {{LANGSORT:zh|radical stroke}}, and then the Mandarin categories could contain {{SORTLANG:cmn}} and the Chinese categories {{SORTLANG:zh}} to tell the server to use the cmn or the zh sortkey respectively. — Eru·tuon 22:54, 18 November 2017 (UTC)
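To make the proposal above concrete, here is a minimal sketch of the lookup in Python. It is an illustration only: LANGSORT and SORTLANG are the made-up magic words from the post, and the function name, the placeholder keys, and the fallback order are assumptions, not anything MediaWiki implements.

```python
from typing import Dict, Optional


def category_sortkey(page_sortkeys: Dict[str, str],
                     category_lang: Optional[str],
                     manual_key: Optional[str] = None,
                     default_key: str = "") -> str:
    """Pick the sortkey a category would use for one page (hypothetical)."""
    # A sortkey given manually on the category link always wins.
    if manual_key is not None:
        return manual_key
    # Otherwise use the per-language key declared on the page (LANGSORT),
    # if the category declared a language (SORTLANG) and the page has one.
    if category_lang is not None and category_lang in page_sortkeys:
        return page_sortkeys[category_lang]
    # Categories without a language, or pages without a matching key,
    # fall back to the ordinary default sortkey.
    return default_key


# Placeholder keys; the real values would be a pinyin string and a
# radical-stroke string produced by a module.
page = {"cmn": "PINYIN-KEY", "zh": "RADICAL-STROKE-KEY"}
mandarin_nouns = category_sortkey(page, "cmn", default_key="PAGENAME")
chinese_lemmas = category_sortkey(page, "zh", default_key="PAGENAME")
unrelated_cat = category_sortkey(page, None, default_key="PAGENAME")
```

Each key is specified once per page, and every category of the same language picks it up, which is the point of the proposal.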
The Chinese situation is kind of abnormal though. I think that Chinese+radical could be the section default, and Mandarin/Cantonese, etc. could have classic manual sortkeys in the [[Category:_______ lemmas|_______]] style. —suzukaze (tc) 23:05, 18 November 2017 (UTC)
Well, yes. That's what {{zh-pron}} currently does. My more theoretical concern is that really sortkeys pertain to categories, not to sections. Sections don't have sortkeys; they have a language to which most categories used in the section pertain. But more practically, there are also categories that are not specific to the language and perhaps shouldn't use a language-specific sortkey (Tea room). — Eru·tuon 00:21, 19 November 2017 (UTC)
Many years ago, I posted somewhere at a Meta site regarding the unusability of sortkey functionality for Japanese entries, inasmuch as 1) a single Japanese entry spelling often needs multiple sortings based on different phonetic realizations, and 2) at the time (and I suspect still) the Mediawiki software handled multiple sortkeys for a single category specification on a single page by ignoring all but the last sortkey. My post (ah, here it is) got no reply at all. That was 5.5 years ago. ご参考まで. ‑‑ Eiríkr Útlendi │Tala við mig 04:16, 20 November 2017 (UTC)
Thank you. Meta is not a popular place for discussion. As you point out, it is good if multiple sort keys are possible for Japanese (ex. fr:Catégorie:Homographes non homophones en japonais). Is there any language that needs multiple sort keys other than Japanese? Some Chinese characters can be categorized in more than one radical, depending on dictionaries. — TAKASUGI Shinji (talk) 00:12, 22 November 2017 (UTC)

Middle Japanese

Which language code do I give to words marked as "Middle Japanese" in literature? Crom daba (talk) 18:49, 15 December 2017 (UTC)

@Crom daba: Unfortunately, no such language code exists, at least in the ISO standard. There's OJP for Old Japanese, usually regarded as ending around 800 CE or so, and then there's JA for everything after that -- which is an awfully big bucket. I'm open to the creation of such a code. Middle Japanese, sometimes a.k.a. Classical Japanese, is different enough from the modern language to warrant different treatment here -- differing usage, conjugation patterns, etc. That said, any such initiative should probably get hashed out in the Beer Parlor first. :) ‑‑ Eiríkr Útlendi │Tala við mig 00:14, 16 December 2017 (UTC)
I just want to know the proper way (the Wiktionary way) to mention the Japanese words given here (if you could find it in its original script I would be very thankful). I'm not suggesting that we introduce Middle Japanese; we don't have a need for it. Crom daba (talk) 00:26, 16 December 2017 (UTC)
ふるき. —suzukaze (tc) 00:33, 16 December 2017 (UTC)

Conjugation table

I always find those Japanese conjugation tables prescriptive and dated. For example, in a situation where English speakers would use a bare imperative in a friendly manner, Japanese speakers will use a non-polite requestive (して, 食べて, etc.) or a non-polite “advisory” imperative (しな, 食べな, etc.) but you don’t see them anywhere. Similarly, ら before n very often becomes ん, as in 分かんない, but you see only 分からない.

I have created a list of forms we should have in a conjugation table: Wiktionary talk:About Japanese/Conjugation. What do you think? — TAKASUGI Shinji (talk) 12:58, 10 March 2018 (UTC)

I completely support this. —suzukaze (tc) 05:08, 11 March 2018 (UTC)
Thanks, Suzukaze-c. @Atitarev, Eirikr, Haplology, Wyang, エリック・キィ: what do you think of modernizing Japanese conjugation tables? The list of forms must reflect community opinions as much as possible. — TAKASUGI Shinji (talk) 15:23, 13 March 2018 (UTC)
@TAKASUGI Shinji I generally support your idea, but we can put a qualifier denoting dated, colloquial, etc. in each cell. I have felt a gap between a model Japanese and what I, as one of the native Japanese speakers, actually speak, hear and see today!--エリック・キィ (talk) 15:52, 13 March 2018 (UTC)
@TAKASUGI Shinji I agree with Eryk: I generally support this idea, with some adjustments -- more contextual data as above, and some terminology tweaks. For instance, the 連用形 in the sample tables is listed as the "Conjunctive", whereas I learned the 連用形 as the "continuative", and the て-form as the "conjunctive". I also don't understand how this will look in the end -- is the sample page intended for inclusion as-is on the WT:AJA page? Or will the relevant rows be extracted and recombined in a conjugation table specific to each verb form (such as, all the rows for 書く will be recombined into a single table, and that will go on the 書く page)? ‑‑ Eiríkr Útlendi │Tala við mig 18:26, 13 March 2018 (UTC)
Wow, great job, Shinji. ぬ and ねえ forms are included. Is it worth including ず forms as well? Anatoli T. (обсудить/вклад) 20:50, 13 March 2018 (UTC)
@Eirikr: That was just an error, and I fixed it. Everyone can modify Wiktionary talk:About Japanese/Conjugation freely. I’d like to show the relevant forms in each verb entry and the entire table in Wiktionary:About Japanese and Appendix:Japanese verbs. Layout is to be discussed.
@Atitarev ず is really archaic as a sentence-final form, but ずに is still common in literary Japanese. We can show both or only the latter. — TAKASUGI Shinji (talk) 00:00, 14 March 2018 (UTC)
I think both show up often enough to merit inclusion. I also bump into -ざる endings in fixed phrases, such as -ざるを得ない, that kind of thing. ‑‑ Eiríkr Útlendi │Tala við mig 00:11, 14 March 2018 (UTC)
Support. Wyang (talk) 01:34, 14 March 2018 (UTC)
Yes, support and include both ず and ずに forms with proper labels. Is ねえ really "vulgar" or sloppy/dialectal or something else? :) --Anatoli T. (обсудить/вклад) 02:06, 14 March 2018 (UTC)
There are many conservative people who don’t like the adjective-final /eː/ especially if the speaker is a woman. [2] — TAKASUGI Shinji (talk) 23:39, 14 March 2018 (UTC)
I agree. It's interesting that some mispronunciations (might be a wrong word here) in Japanese make words sound impolite or even vulgar. てめぇ (temē) must sound much worse than てまえ (temae). --Anatoli T. (обсудить/вклад) 23:56, 14 March 2018 (UTC)
(I really, really like this table. I need to express my support a second time. —Suzukaze-c 09:26, 25 January 2019 (UTC))

@TAKASUGI Shinji, Atitarev, Suzukaze-c What about giving both 学校文法 and 日本語教育文法 info, like this?

Traditional description (Japanese school grammar)
Conjugation type | 語幹 Stem | 語尾 Ending
五段活用 Five-grade conjugation | はし (hashi) | る (ru)

???
Conjugation type | 語幹 Stem | 語尾 Ending
グループⅠ(子音語幹動詞・ウ動詞) Group I (consonant stem, -u verb) | /hasir-/ (hashir-) | /-u/ (-u)

--Dine2016 (talk) 15:15, 29 September 2019 (UTC)

+1; I have also tried to do this in previous attempts to make a new conjugation template (if I understand correctly). —Suzukaze-c 17:33, 29 September 2019 (UTC)
That’s good but we don’t need Japanese words like 五段活用 and グループ Ⅰ. — TAKASUGI Shinji (talk) 23:00, 29 September 2019 (UTC)
Agreed that we probably don't need the 五段活用 and グループ I Japanese labels in the table -- this is intended for an English-reading audience, after all. :) Also, my impression is that the "group" nomenclature for talking about Japanese verb conjugation patterns is mostly used in English-language materials, so having that in Japanese seems a bit odd.
I'm reminded that Ainu uses special smaller kana to represent final consonants. Using these, for example, hashir- could be spelled as ハシㇼ- in kana. @Shinji, I'm curious if you (or anyone else) have encountered such kana spellings for Japanese? ‑‑ Eiríkr Útlendi │Tala við mig 19:55, 4 October 2019 (UTC)
The small kana are specifically for the Ainu language. — TAKASUGI Shinji (talk) 08:24, 5 October 2019 (UTC)

字音語素

Some monolingual Japanese dictionaries include the so-called 字音語素 or 字音語の造語成分 along with words. The current practice on the wiki seems to list the definitions and compounds of single kanji in the "Kanji" section, but there seems to be currently no standard way to indicate which pronunciations apply to which definitions for cases like 悪 (アク・オ) or 楽 (ラク・ガク). Also, when there is only one etymology, the practice of putting "Kanji" and "Alternative forms", "Pronunciation", "Noun" headers on the same L3 level seems a little odd to me. Any ideas on how 字音語素 should be presented? --Dine2016 (talk) 04:03, 27 April 2018 (UTC)

It's been solved - use the Affix POS. --Dine2016 (talk) 16:42, 7 October 2018 (UTC)

Japanese grammar terminology

Do you think we need to unify the terminology of Japanese grammar? For example, 未然形 is translated as “irrealis or incomplete form” in running text but “imperfective” in the verb conjugation table. And 五段活用 is “godan” in the verb headword-line but “type 1” in the category name (though it's “Group I” in textbooks like Minna no Nihongo). Personally, I prefer terms like “consonant-stem”, “vowel-stem”, “infinitive” or “gerund” over traditional grammar (aka Hashimoto grammar or school grammar) terms like “five-grade”, “monograde”, “continuative form” or “te form”, but anything that can be settled on is OK.

I suggest creating a template called {{ja-term}} and using it for grammatical terms. For example, {{ja-term|infinitive}} could display “infinitive”.

(Notifying Eirikr, Wyang, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Fumiko Take, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4): --Dine2016 (talk) 16:42, 7 October 2018 (UTC)

(I'm glad that I have two hands.) On the one hand, I absolutely agree that some consistent set of labels is desirable. On the other hand, I don't think there is a consistent and widely adopted set of labels, either across Japanese sources or English-medium Japanese grammars. I guess my bottom line is that I would be happy to have a set of labels for use in Wiktionary. Cnilep (talk) 23:41, 7 October 2018 (UTC)
  • From what I've seen, the larger part of the inconsistency is due to so many different writers using their own set of labels. This is likely due to the way that Japanese grammar does not conform well to English-language labels. For instance, I quite dislike either label gerund or infinitive, presumably mentioned above in reference to the 連用形 (ren'yōkei), as both English-language labels point to grammatical constructs that don't quite exist in Japanese, while also failing to express the function of the actual form in Japanese. The label "gerund" in English would fit for utterances such as "I like walking", whereas the same statement in Japanese -- 歩くのが好き -- doesn't use the ren'yōkei at all, but rather the dictionary or plain form (or what have you). Meanwhile, although the label "infinitive" would fit a statement such as "I like to walk", which then also broadly fits the Japanese 歩くのが好き and uses the dictionary or plain form in both languages, the Japanese again doesn't use the ren'yōkei.
There's also the problem of historical context. Modern grammars describing classical and older Japanese generally call the -e- verb stems the 已然形 (izenkei), often glossed as realis in contrast to the irrealis or 未然形 (mizenkei). However, the modern language uses this form differently, leading to a change in labels even in Japanese, where the -e- verb stems are instead called the 仮定形 (kateikei).
I mostly agree with Cnilep [except that I have seldom run into divergent labels in Japanese-language grammars, aside from historical terms such as 既然言 (kizengen) or 将然言 (shōzengen)]. We should coordinate on a standardized set of labels, and also make sure to build in some way for users accustomed to other common labels to find out how our labels correspond to theirs. ‑‑ Eiríkr Útlendi │Tala við mig 19:33, 8 October 2018 (UTC)
@Eirikr: Thanks for your reply. I currently have copies of three Japanese grammars by western linguists – A Reference Grammar of Japanese by Samuel E. Martin, A History of the Japanese Language by Bjarke Frellesvig, and A Descriptive Grammar of Early Old Japanese Prose by John R. Bentley – all of which use the labels “infinitive” and “gerund”. I have no objections to calling them “continuative” and “conjunctive”, though. What do you think of the other names of modern verb forms in Frellesvig (2010)?
Finite
Nonpast kaku
Past kaita
Volitional kakoo
Imperative kake
(Past conjectural) (kaitaroo)
Non-finite
Infinitive kaki
Gerund kaite
Conditional-1 kaitara
Representative kaitari
Conditional-2 kaitewa
Provisional kakeba
Concessive kaitemo
Note: I agree that the conditional-2 (kaitewa) and the concessive (kaitemo) should be removed and treated as kaite + wa/mo, which is also how Martin (1975) treats them. By the way, what about renaming the conditional(-1) and the provisional as something like “-tara conditional” and “-ba conditional”?
One reason I don't like the traditional description of Japanese grammar is its paradigm of verbs, which is unsuitable for Modern Japanese. It unnecessarily keeps the distinction between 終止形 and 連体形, while failing to point out the present/future tense they have acquired due to the rise of past -ta (from stative -tar-). The variant of 未然形 which ends in オ段 is also a back-formation, an artifact of writing, rather than a true stem. Another disadvantage of traditional grammar is the segmentation of verbs like 読む and 食べる into よ・む and た・べる; it is better to posit stems such as yom- and tabe-, respectively. (Can 二段 verbs like 尋ぬ be described as varying between tadune- and tadunu-?) For this reason, I still prefer “linguistic” terms such as “(regular) consonant-stem” over “five-grade”, which is based on a clumsy analysis of verb forms hindered by the moraic orthography. Another reason I don't like traditional Japanese grammar is its classification of 付属語. For example, it lumps から・と and て・ば both as 接続助詞, while romanization will show their difference: yonda kara and yomu to, vs yonde and yomeba. --Dine2016 (talk) 16:11, 9 October 2018 (UTC)
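To illustrate the stem-based analysis argued for above, here is a rough Python sketch that derives the forms in the Frellesvig-style list (kaku, kaita, kakoo, ...) from a romanized stem such as kak-, yom- or tabe-. Everything in it is an assumption made for illustration: the function name, the phonemic spelling (so hanas- gives hanasita, not hanashita), and the onbin table. It makes no attempt at irregular verbs such as iku, suru or kuru.

```python
VOWELS = "aiueo"

# Onbin outcomes for consonant stem + -ta/-te, keyed by the stem-final
# consonant: (what replaces that consonant, first consonant of the suffix).
ONBIN = {
    "k": ("i", "t"),   # kak-   -> kaita
    "g": ("i", "d"),   # oyog-  -> oyoida
    "s": ("si", "t"),  # hanas- -> hanasita
    "t": ("t", "t"),   # mat-   -> matta
    "r": ("t", "t"),   # hasir- -> hasitta
    "w": ("t", "t"),   # kaw-   -> katta
    "n": ("n", "d"),   # sin-   -> sinda
    "b": ("n", "d"),   # tob-   -> tonda
    "m": ("n", "d"),   # yom-   -> yonda
}


def conjugate(stem: str) -> dict:
    """Return a dict of regular forms for a vowel- or consonant-final stem."""
    final = stem[-1]
    if final in VOWELS:
        # Vowel stem (tabe-): endings attach directly, no onbin.
        t_base, t = stem, "t"
        forms = {
            "nonpast": stem + "ru",
            "volitional": stem + "yoo",
            "imperative": stem + "ro",
            "infinitive": stem,
            "provisional": stem + "reba",
        }
    else:
        if final not in ONBIN:
            raise ValueError(f"unsupported stem-final consonant: {final!r}")
        repl, t = ONBIN[final]
        t_base = stem[:-1] + repl
        # Stem-final w survives only before a, so drop it before u/o/e/i.
        base = stem[:-1] if final == "w" else stem
        forms = {
            "nonpast": base + "u",
            "volitional": base + "oo",
            "imperative": base + "e",
            "infinitive": base + "i",
            "provisional": base + "eba",
        }
    forms.update({
        "past": t_base + t + "a",
        "gerund": t_base + t + "e",
        "conditional": t_base + t + "ara",
        "representative": t_base + t + "ari",
    })
    return forms


kak = conjugate("kak")
tabe = conjugate("tabe")
```

With the stem as the only input, the traditional split into よ・む or た・べる never comes up; the moraic spelling would be a separate rendering step.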

@Eirikr I found this paper: 日本語教育の文法体系と寺村秀夫 : 活用の場合, which states that:

またバーナード・ブロックの活用表は次の如くである。 [Translation: Bernard Bloch's conjugation table is as follows.]
[...]
Hypothetical (Provisional) (kak-eba, mi-reba, etc.)
Hypothetical (Conditional) (kaitara, mi-tara, etc.)
Participial (Infinitive) (kak-i, mi-ø, etc.)
Participial (Gerund) (kaite, mi-te, etc.)
Participial (Alternative) (kaitari, mi-tari, etc.)

It seems that the terms "infinitive" and "gerund" are indeed in use, and the distinction between "provisional" and "conditional" is established, in English scholarship. On the other hand, the same paper states that

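なお、寺村自身が編纂にかかわった Basic Japanese(大阪外国語大学、1967)においては、Conjunctive form(連用形)が活用形として導入されており、[...] [Translation: Note that in Basic Japanese (Osaka University of Foreign Studies, 1967), which Teramura himself helped compile, the conjunctive form (連用形) is introduced as a conjugated form, ...]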

This is also the terminology used in Bentley (2001), in which the six "forms" in traditional grammar are listed as imperfect, conjunctive, conclusive, attributive, evidential, and imperative. (From a western point of view, it's incorrect to call them "forms": the mizenkei is a stem, and the ren'yōkei is either an inflected form or a stem depending on how it is used. Also 書かない should be considered one word instead of two, as shown by romanization.) --Dine2016 (talk) 03:35, 12 October 2018 (UTC)

What do you mean by evidential? In linguistics evidential is a totally different thing (SIL). The six traditional “forms” (mizen, ren’yō, shūshi, rentai, katei/izen, meirei) can’t explain modern Japanese morphology well and it is misleading to treat mizen and katei as real forms. Traditional Japanese grammarians tried to explain Ancient Japanese and Modern Japanese in a unified frame because of diglossia at that time, but now let’s just make it clear that they are two different languages. — TAKASUGI Shinji (talk) 22:53, 12 October 2018 (UTC)
@TAKASUGI Shinji: Thanks for your reply. “Evidential” is the label for the izenkei in Bentley (2001):

The traditional label for the evidential literally means ‘already thus’ (izenkei), or what can be translated as the perfective (the state of completion). This label is misleading, and since this conjugation usually implies evidence of a condition, a provision, or a concession (Martin 1988:229, 556-7, 785), I have chosen the label ‘evidential’.

A History of the Japanese Language by Bjarke Frellesvig uses the label “exclamatory” for the same form, and A Reference Grammar of Japanese by Samuel E. Martin uses “literary concessive” for the same form optionally suffixed with -do.
By the way, I totally agree that the traditional analysis of Japanese grammar (aka Hashimoto grammar or school grammar) is unsuitable for Modern Japanese. That is why I suggest a clean break from traditional grammar terms like “irrealis”, “continuative” and “conjunctive”. --Dine2016 (talk) 00:48, 13 October 2018 (UTC)
In a discussion above (#Conjugation table) I have listed almost all the forms in Modern Japanese in Wiktionary talk:About Japanese/Conjugation. They need to be properly named. — TAKASUGI Shinji (talk) 01:13, 13 October 2018 (UTC)

Old Japanese

Please join in the discussion at Wiktionary talk:About Old Japanese. —Μετάknowledgediscuss/deeds 05:27, 6 February 2019 (UTC)

Japanese linking template

Discussion moved from User talk:Suzukaze-c.

Hi. I'm considering making a Japanese counterpart of {{zh-l}}. Which format do you think would be better?

太陽 / たいよう (taiyō, “sun”) and われ / / (ware, “I; me”)

or

太陽たいよう (taiyō, “sun”) and われ (ware, “I; me”)

or

太陽 (たいよう, taiyō, “sun”) and われ (, , ware, “I; me”)

--Dine2016 (talk) 12:37, 8 February 2019 (UTC)

I personally like the third one. It's also similar to {{ko-l}} and {{vi-l}}. —Suzukaze-c 16:01, 8 February 2019 (UTC)
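For reference, the third format is easy to state as a rule: the term, then one parenthesis holding the readings, the romanization and the quoted gloss, separated by commas. A small Python sketch of that rule follows; the function and parameter names are hypothetical and this is not how {{ja-l}} or {{zh-l}} is implemented.

```python
from typing import Optional, Sequence


def format_link(term: str,
                readings: Sequence[str] = (),
                romaji: Optional[str] = None,
                gloss: Optional[str] = None) -> str:
    """Render 'term (reading, ..., romaji, “gloss”)' in the third format."""
    parts = list(readings)
    if romaji:
        parts.append(romaji)
    if gloss:
        # The gloss goes last, in curly quotes, as in the sample above.
        parts.append(f"“{gloss}”")
    # With nothing to annotate, show the bare term without parentheses.
    return f"{term} ({', '.join(parts)})" if parts else term


sun = format_link("太陽", ["たいよう"], "taiyō", "sun")
bare = format_link("チェンジ")
```

Passing several items in `readings` covers the われ case, where more than one alternative spelling sits inside the parenthesis.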
Thanks. Given that Eirikr has been inactive for a while, what additional parameters do you think will be helpful besides |tr=, |gloss= and |lit= and |note=? I have once seen the elaborate format (ao, historically awo) in etymology sections, which suggests the new format あお (, ao, historically あを, awo), but I'm not sure whether such a format is desirable. Looking at the English etymologies on the wiki, it seems that Old English, Middle English and modern English are never lumped together. A compound formed in Old English is treated as “From Middle English ab, from Old English αβ, from α + β. Surface analysis A + B.” and never “From A (historically α) + B (historically β)” or anachronistically “From A + B”. Given that Old, Middle and modern Japanese have different orthography, I think Japanese should be handled in the same way. --Dine2016 (talk) 04:06, 9 February 2019 (UTC)
I agree with your ideas. —Suzukaze-c 03:04, 13 February 2019 (UTC)
Same here. I agree that orthography for Middle and Old Japanese has to be separated. Take a look at Category:Middle Vietnamese lemmas. By the way, good job on the new {{ja-see-kango}}. KevinUp (talk) 12:25, 13 February 2019 (UTC)
@Dine2016 I prefer the third one as well. So will this be implemented to {{ja-l}} or another separate template? User:Poketalker has been using {{m|ja|漢字|tr=kanji}} for some time so maybe we can have {{ja-m|漢字|かんじ|[[gloss]]}} instead. Not sure if this is going to break {{ja-l}} because so many combinations are possible for that template. Maybe the gloss can be entered using |gloss= in {{ja-l}} ? KevinUp (talk) 08:01, 13 February 2019 (UTC)
@KevinUp: Thanks for the reply. My suggestion is to extend {{ja-l}} in a way that does not break existing usages, and make a {{ja-lx}} which works like {{zh-l}}. The former template only formats its arguments and generates nothing else, while the latter template can support auto-completion such as {{ja-lx|太陽}} → 太陽 (たいよう, taiyō). The latter would be very tricky to implement, so I suggest doing the former first. --Dine2016 (talk) 08:48, 13 February 2019 (UTC)
@Dine2016: Sounds good to me. Some possible combinations for the new format that I could think of (to verify its output):
{{ja-l|太陽|たいよう|[[sun]]}} → 太陽 (たいよう)
{{ja-l|太陽|たいよう|taiyō|[[sun]]}} → 太陽 (たいよう, taiyō)
{{ja-l|太陽|たいよう|gloss=[[sun]]}} → 太陽 (たいよう)
{{ja-l|太陽|たいよう|tr=taiyō|[[sun]]}} → 太陽 (たいよう, sun)
{{ja-l|太陽|たいよう|tr=taiyō|gloss=[[sun]]}} → 太陽 (たいよう)
{{ja-l|太陽|たいよう|tr=taiyō||[[sun]]}} → 太陽 (たいよう)
Legacy usage for comparison:
{{ja-l|太陽|たいよう}} → 太陽 (たいよう)
{{ja-l|太陽|taiyō}} → 太陽 (taiyō)
Perhaps this conversation ought to be moved to Template talk:ja-l. KevinUp (talk) 12:25, 13 February 2019 (UTC)
@Dine2016, KevinUp: I wrote Module:User:Suzukaze-c/jpx-links instead of doing other important things —Suzukaze-c 05:14, 21 February 2019 (UTC)
Good job on the template. For the CSS part, I hope we can make the Japanese script in running text a little bigger (but not too big like the headwords), and use Meiryo instead of MS PGothic on Windows, as the former is optimized for ClearType while the latter embeds bitmaps and does not look good. I remember there is a way to reduce the vertical space of Meiryo, which is used on some Vocaloid-related wikis on Wikia.
Unfortunately, I've lost interest in Japanese once I realized there was no way to eliminate all duplication of information in the source code of Japanese entries. The two “final bosses” which made it impossible, I think, would be: (1) the repetition of the reading in headword templates and (2) the repetition of the inflection type in the headword template and the inflection table. Chinese was able to eliminate the major repetitions because Unified Chinese moved the romanizations to the pronunciation template and there was no inflection. Japanese was not so lucky (although we can follow the French Wiktionary's handling of inflection to eliminate the second problem). --Dine2016 (talk) 11:02, 21 February 2019 (UTC)
re: CSS: See also User:Suzukaze-c/sandbox#2, I guess. Maybe I should ask for interface administrator rights? Perhaps we could remove the inline CSS from {{ja-r}} and {{ja-usex}} as well. —Suzukaze-c 17:44, 21 February 2019 (UTC)
re: vocaloid wikia: probably me lmao which one
re: repetition: I have wondered if allowing global variables (currently not allowed) would help with this sort of problem. —Suzukaze-c 17:28, 21 February 2019 (UTC)
re: vocaloid wikia: ah, sorry, it was an ugly hack. --Dine2016 (talk) 07:55, 26 February 2019 (UTC)
yeah but it's my ugly hack ;) —Suzukaze-c 08:07, 26 February 2019 (UTC)
ah sorry again (*/ω\*) I seldom hang around the English-speaking ACG community
I wonder if Unified Japanese could justify omitting (1) the romanizations in headword templates and (2) the inflection tables. --Dine2016 (talk) 09:51, 26 February 2019 (UTC)
it's totally an ugly hack, i'm just teasing you (´ε` )
Your idea reminds me that Module:th-headword reads the content of an entry's own page. Maybe that could be a source of inspiration.
Also, what do you think of User:Suzukaze-c/p/ja#Japanese {entry format reform} as my idea of "unified Japanese"? (There's certainly a lot of redundancy regarding definitions and such, but I don't think things would be any better if we split ja.) —Suzukaze-c 04:32, 27 February 2019 (UTC)
@Suzukaze-c I just discovered an example of “same kanji, same modern kana, different historical kana”: 法律 (ほうりつ < はふりつ, hōritsu < fafuritu, ほうりつ < ほふりつ, hōritsu < fofuritu). Hope it's useful. --Dine2016 (talk) 16:39, 4 March 2019 (UTC)
@Suzukaze-c Does your template support the alternative format {{jpx-m|変化:へんか}} in addition to {{jpx-m|変化|へんか}}? If this is supported, we can
  • code templates like {{ja-syn}} and {{ja-synonym}} in the simplest way ({{jpx-m|{{{1}}}}}), and
  • use them like {{ja-syn|変わる:かわる|変化:へんか|チェンジ}} and {{ja-synonym|変化:へんか|[[change]]}}
Furthermore, if automatic fetching of the reading is implemented so that {{ja-m|変化}} yields 変化 (へんか, henka), we can further
to get the same results. --Dine2016 (talk) 05:39, 6 March 2019 (UTC)
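The colon format asked about above amounts to splitting each argument once on ":" and treating a missing colon as "no reading given". A minimal Python sketch, with a hypothetical function name (the actual module is Lua and may do this differently):

```python
from typing import List, Optional, Tuple


def split_term(arg: str) -> Tuple[str, Optional[str]]:
    """Split 'kanji:kana' into (kanji, kana); kana is None without a colon."""
    term, sep, reading = arg.partition(":")
    return (term, reading if sep else None)


def split_terms(args: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Apply split_term to every argument of a synonym-style template call."""
    return [split_term(a) for a in args]


synonyms = split_terms(["変わる:かわる", "変化:へんか", "チェンジ"])
```

Because each term and its reading travel in one argument, a wrapper template can pass its arguments through unchanged, which is what makes the one-line {{ja-syn}} definition possible.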

re: unified Japanese

I think Etymologies 1–3 could be grouped like

Etymology 1
Pronunciation
Verb

ける (transitive) // no need of romaji or conjugation type as it changes over time, one of the advantages of unified Japanese :)

Conjugation
Inflected forms of ける [godan] in Modern Japanese
Inflection Hiragana Romanization
Stems
Basic stem ker-
a- stem (未然形*) けら kera-
onbin stem (音便形) けっ keQ-
e- stem (仮定形*) けれ kere-
Finite forms
Nonpast (終止形*/連体形*/基本形/ル形) ける keru
Past (過去形/タ形) けった ketta
Volitional (意向形/推量形) けろう kerō
Imperative (命令形*) けれ kere
Non-finite forms
Infinitive (連用形*) けり keri
Gerund (て形) けって kette
Conditional けったら kettara
Representative けったり kettari
Provisional (ば形/条件形) ければ kereba
Key constructions
Passive (受身) けられる (kerareru, stem ker-are-, ichidan conjugation)
The grammatical framework largely follows Frellesvig (2010).
* Conjugated forms recognized in school grammar (学校文法)
Inflected forms of ける [shimo ichidan] in Late Middle Japanese
Inflection Phonemic
Stems
Basic stem (未然形*/連用形*) ke-
e- stem (已然形*) kere-
Finite forms
Nonpast (終止形*/連体形*) keru
Past keta
Intentional kyoozuru
Volitional kyoo
Past conjectural ketarɔɔ
Imperative (命令形*) kei ~ keyo
Non-finite forms
Infinitive ke
Gerund kete
Conditional keba (~ ketewa)
Provisional kereba
Concessive keredomo ~ ketemo
Past conditional ketara(ba)
Past provisional ketareba
Past concessive ketaredomo
Intentional provisional kyoozureba
Intentional concessive kyoozuredomo
Key constructions
The grammatical framework largely follows Frellesvig (2010).
* Conjugated forms recognized in school grammar (学校文法)
Inflected forms of ける [shimo ichidan] in Early Middle Japanese

--Dine2016 (talk) 09:47, 27 February 2019 (UTC)

Intriguing. I like how adding a new conjugation table for older stages is reminiscent of our (very convenient) current approach for Chinese (adding pronunciation to {{zh-pron}}). Would we still use romaji for historical forms in etymology sections? —Suzukaze-c 06:05, 1 March 2019 (UTC)
Um… if you're citing a word in a specific stage of the language, a transcription appropriate to the stage can be added. For example, the topic marker of Old Japanese is は (pa), the one of Early Middle Japanese is は (fa), and the ones of Late Middle Japanese and Modern Japanese are は (wa). On the other hand, if the stage is unknown, you can just use the kana spelling and refer to the topic marker as は, which is similar to how you cite Chinese characters rather than words using {{zh-l|*...}}. In the latter situation, you sometimes need to choose between old and new orthography (あを vs あお), or classical and modern forms (あり vs ある), but printed dictionaries have the same problem.
For synonym sections, maybe we can group the words by stage, effectively constituting a historical thesaurus like the 三省堂 現代語古語類語辞典 and the Historical Thesaurus of English? --Dine2016 (talk) 10:46, 1 March 2019 (UTC)
Building on Dine2016's idea, I made a blueprint of the new template.
In this draft, phonetic Japanese text is spelled in katakana, following the academic custom in Japanese phonology; downsteps as well as upsteps are marked for some dialects and older varieties; the difference between the nasal and stop consonants of the ガ行 is treated as a dialectal difference, with /ŋ/ indicated by the 半濁点 (゜) on the letter; and the Nippo Jisho is cited for examples of romanised Middle Japanese spellings. Feel free to use this as a tentative plan.--荒巻モロゾフ (talk) 10:38, 10 April 2019 (UTC)
@荒巻モロゾフ: Very nice. How do you think data should be input? (what would the wikitext/template syntax look like?) —Suzukaze-c 05:16, 12 April 2019 (UTC)
It's desirable to input IPA directly, because the diversity of Japanese dialectal phonology is too great for it to be generated automatically. However, for the phonetic katakana with accent annotation, it's hard to write the HTML directly. From this paper [3], I picked up a symbolic notation that the template could convert.
Symbol | Legend | Example
[ | Upstep between the moras | ア[ア
[[ | Rising in the next mora | [[ア
] | Downstep between the moras | ア]ア
]] | Falling in the previous mora | ア]]
! | Loose descent between the moras without downstep | ア!ア
% | Loose ascent between the moras without upstep | ア%ア
Example:
/ꜜhàrû ꜜnàtté núkúnátté kíꜜtàkàrà ꜜkjǒːꜜwà ꜜùmí íkóká/
]ハ%ル]] ]ナッ%テ %ヌクナッテ %キ]タカラ ]キョ%ー]ワ ]ウ%ミ %イコカ
 ナッ ヌクナッテ タカラ キョワ  イコカ
(はる)なって(ぬく)なって()たから今日(きょう)(うみ)()こか。
Haru natte nuku natte kita kara kyō wa umi iko ka.
As it has become spring and is getting warmer, let's go to the sea today.
Could you make a conversion program from this?--荒巻モロゾフ (talk) 10:21, 23 April 2019 (UTC)
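As a possible starting point for such a conversion program, here is a Python sketch of the first step only: splitting an annotated string into accent marks and katakana runs according to the legend above. The token names and the function are assumptions for illustration; turning the tokens into HTML with over- and underlines is left out, since the intended rendering is not specified here.

```python
import re
from typing import List, Tuple

# Mark names taken from the legend; two-character marks are listed first
# so that "[[" and "]]" are not read as two single brackets.
MARKS = {
    "[[": "rising in the next mora",
    "]]": "falling in the previous mora",
    "[": "upstep",
    "]": "downstep",
    "!": "loose descent",
    "%": "loose ascent",
}

TOKEN_RE = re.compile(r"\[\[|\]\]|\[|\]|!|%|[^\[\]!%]+")


def tokenize(annotated: str) -> List[Tuple[str, str]]:
    """Split e.g. 'ア]ア' into [('text', 'ア'), ('downstep', ']'), ('text', 'ア')]."""
    tokens = []
    for piece in TOKEN_RE.findall(annotated):
        kind = MARKS.get(piece, "text")
        tokens.append((kind, piece))
    return tokens


first_word = tokenize("]ハ%ル]]")
```

A renderer could then walk the token list, keeping track of the current pitch level and emitting markup whenever a mark changes it.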

Sino-Japanese etymologies

There're currently several ways to express Sino-Japanese etymologies:

This may be unified to the general form(s), served via a single template:

--115.27.198.88 22:01, 8 April 2019 (UTC)

Some thoughts.
  • Most on'yomi terms in Japanese are old borrowings from Middle Chinese, hence the ltc language code.
  • Some terms were borrowed from modern Chinese.
→ Ideally, these should explicitly state which variety they came from -- Mandarin (cmn), Cantonese (yue), etc.
That said, some terms only borrowed the spelling from Chinese and use the Japanese (though Chinese-derived) on'yomi. These might account for some of the zh instances.
  • Some entries haven't been updated in many years, which is where most (perhaps all?) of the vague zhx comes from.
Personally, I think it's useful to include the Middle Chinese reading to show where things started from, and then also show the phonological changes after the term arrived in Japanese. For instance, 長#Etymology_1 shows how this started as Middle Chinese /ʈɨɐŋX/, becoming in turn Japanese /tjau/ → /t͡ɕjau/ → /t͡ɕɔː/ → /t͡ɕoː/. This kind of phonological development is part of the history of the term and can be quite interesting, showing how the sounds of Chinese and Japanese have diverged over time.
There are also not a few Japanese terms that originated in Middle Chinese with one sense, and then got repurposed during the Meiji period with a different or altered sense. Such terms include 世界, 社会, 自由, etc. I'm not sure if that class of terms would fit into the proposed template? ‑‑ Eiríkr Útlendi │Tala við mig 00:43, 9 April 2019 (UTC)
@Eirikr: Yes, thanks, please take a look at 再見, which has two etymologies, both Chinese: Middle Chinese and modern Mandarin. The modern Mandarin might need citations :) Thanks to User:Justinrleung for improving. --Anatoli T. (обсудить/вклад) 01:15, 9 April 2019 (UTC)
Note this section only concerns words using regular on’yomi, not including things like 再見#Etymology_2.--115.27.198.88 12:46, 9 April 2019 (UTC)

The template {{etyl}} is being phased out. I have no preference as regards the discussion above. Cnilep (talk) 06:02, 23 May 2020 (UTC)

Why is there a pronoun header?

What's the point in having it? They act fully like any other noun or am I missing something? Korn [kʰũːɘ̃n] (talk) 08:59, 9 April 2019 (UTC)

Meh. Japanese sources also list 代名詞. As another example of distinctions made by native speakers, there's nothing particular about よう's grammatical functioning to set it apart from a 形容動詞, but Japanese sources consistently list it as a 助動詞. Having the "pronoun" label is also arguably useful for cross-language comparisons; without it, someone's bound to claim that Japanese "doesn't have pronouns", which is silly. ‑‑ Eiríkr Útlendi │Tala við mig 20:58, 11 April 2019 (UTC)

Romanization question

How should the following terms be romanized:

  • / 手の平 (てのひら) - te no hira or tenohira?
  • 言の葉 (ことのは) - koto no ha or kotonoha?

Marlin Setia1 (talk) 11:24, 11 April 2019 (UTC)

My preference and personal practice is with spaces. By comparison, one doesn't write flatofone'shand. ‑‑ Eiríkr Útlendi │Tala við mig 20:59, 11 April 2019 (UTC)

"Neoclassical pronunciations"

There is a modern pronunciation of Classical Japanese where 買ふ is pronounced kō, different from Early Middle Japanese kafu, and different from Modern Japanese (the spoken language) where it's kau.

What is that modern pronunciation called? --Backinstadiums (talk) 13:38, 11 July 2019 (UTC)

As an example, in the Jewel Voice Broadcast (audio), 惟ふに (3:00) is pronounced omō ni, and 失ふ (3:54) is pronounced ushinō. --Dine2016 (talk) 15:04, 11 July 2019 (UTC)
If we (the EN Wiktionary community) are to come up with a label, I think "neoclassical" makes sense and clearly captures what's going on.
That said, I'm sure we're not the first to discuss this phenomenon, and someone has probably come up with a label for this elsewhere. I'm not familiar with any such verbiage, however, so we must either hope that someone else chimes in who does know, or do some more research to find out for ourselves. ‑‑ Eiríkr Útlendi │Tala við mig 16:11, 11 July 2019 (UTC)

@Eirikr: According to Prof. Victor Mair,

From my colleague Linda Chance, who is a specialist on Classical Chinese: The technical term for this is ハ行転呼音・はぎょうてんこおん.

It refers to the fact that from sometime in the Heian period the "ha" line changed to the same pronunciation as the "wa" line, but the "ha" line spellings continued in use. (Interesting examples--if you write these in modern Japanese with 'u' for 'fu,' 惟うに is still pronounced omō ni, but 失う becomes "ushinau" (except in some dialects.) This "modern pronunciation" is potentially centuries old. We read classical texts this way because we can't retrieve that original early Heian pronunciation. --Backinstadiums (talk) 14:17, 12 July 2019 (UTC)

@Backinstadiums, that refers to a specific phonological development earlier in the language's history. Unfortunately, that is not the Japanese term for "neoclassical pronunciation", which includes phenomena such as where modern /au/ and 1603 /ɔː/ is pronounced instead as /oː/. ‑‑ Eiríkr Útlendi │Tala við mig 04:44, 14 July 2019 (UTC)
@Eirikr: Please send an email to Prof Mair explaining it: vmair@sas.upenn.edu --Backinstadiums (talk) 11:15, 14 July 2019 (UTC)
@Backinstadiums:, why? I'm honestly curious why you think I should. Also, who is Professor Mair? And why would Linda Chance, a specialist in Classical Chinese, be considered an expert on the Japanese-language terms used to describe the neoclassical Japanese pronunciation in evidence in the mid-1900s Jewel Voice Broadcast?
(Honestly not intending insult or aggression. Your response just confuses me.) ‑‑ Eiríkr Útlendi │Tala við mig 04:38, 15 July 2019 (UTC)
@Eirikr: According to David Lurie:

The technical term for these changes is tenko-on 転呼音, but they are not applied consistently in words where the first mora ends in 'a.' I don't know if there is a specific term for those exceptions. --Backinstadiums (talk) 23:04, 15 July 2019 (UTC)

@Backinstadiums: who is David Lurie? I wouldn't expect a South African photographer to have much to say about Japanese linguistic terminology...?
I note that the term 転呼音 (tenko-on) refers generally to the phenomenon of sound shift, or more specifically to the sounds that have so shifted. This Japanese term could apply just as well to describe how English don't you becomes doncha in informal speech in certain dialects. While "sound shift" or tenko-on is a useful way of describing the change from Old Japanese readings through to modern Japanese, it doesn't capture the specific "neoclassical pronunciation" sense at issue at the start of this thread. ‑‑ Eiríkr Útlendi │Tala við mig 23:33, 15 July 2019 (UTC)
In Standard Japanese, verbs don't cause vowel fusion in the inflection. Examples like () () are allowed in some dialects.--荒巻モロゾフ (talk) 23:05, 14 July 2019 (UTC)
Actually, it is only the -au and -ou verbs which do not cause vowel fusion in the dictionary form. 食う can be alternatively pronounced クー, 言う has stems alternating between い~ and ゆ~, and 酔う has the stem changed from え~ to よ~ in all forms. --Dine2016 (talk) 14:09, 15 August 2019 (UTC)

Another example of this "neoclassical pronunciation" is fossilized words like 逢瀬 (おうせ, ōse) and 逢魔が時 (おうまがとき, ōmagatoki) as well as names beginning with (おう) (ō). Jisho.org search --Dine2016 (talk) 13:57, 15 August 2019 (UTC)

@Eirikr: Interestingly, this video (0:26) shows that 給う was pronounced タモー even in an otherwise 口語 text. --Dine2016 (talk) 04:03, 20 October 2019 (UTC)

Add HSK level of the Hanzi characters

The Japanese section shows the 常用漢字 level of the Kanji, so I'd like to propose adding the HSK level of the hanzi too. Yet, I have encountered some objection that levels from other tests would also have to be added, to which I replied that in any case, most tests classify the same characters in the same levels, so only a group of characters would have two different levels at most.

Why do kanji only show the levels of 常用漢字? Secondly, where should I propose adding HSK levels for hanzi? --Backinstadiums (talk) 01:15, 17 July 2019 (UTC)

@Backinstadiums: I'm not familiar with what "HSK" means, but since you're using the term hanzi, I infer that you're talking about Chinese and the Hanyu Shuiping Kaoshi, so presumably you should strike up a thread at Wiktionary talk:About Chinese. ‑‑ Eiríkr Útlendi │Tala við mig 20:28, 17 July 2019 (UTC)
@Eirikr: Why do kanji only show the levels of 常用漢字? --Backinstadiums (talk) 23:15, 17 July 2019 (UTC)
@Backinstadiums: I'm not certain, but if I had to guess, I'd say that it's because the 常用 levels tell us something about expected literacy in Japanese, as defined by the Japanese government. This also tells us something about how likely we are to encounter a given word or reading, since I believe that major publications like newspapers generally limit their main content to just the 常用漢字. I'm not sure what other levels would be useful or appropriate to show. I suppose an argument could be made that the 常用 levels are more "encyclopedic" information, but then again, it's information about the word itself, which would seem to be relevant to Wiktionary content. ‑‑ Eiríkr Útlendi │Tala við mig 00:13, 18 July 2019 (UTC)

Wiktionary:About Japanese#Translations into Japanese

I wonder who has "deprecated" inclusion of kana (haven't checked the history but I disagree). I have always provided unlinked kana in translations at the English Wiktionary. It has been my practice for many years. --Anatoli T. (обсудить/вклад) 01:32, 23 August 2019 (UTC)

Drawing attention: (Notifying Eirikr, TAKASUGI Shinji, Nibiko, Suzukaze-c, Dine2016, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!):
Also, I think kana and most definitely rōmaji should be unlinked in translations: 環境 (かんきょう, kankyō) with only the kanji linked, not 環境 (かんきょう, kankyō) with the kana and rōmaji linked as well.
I should also point out that language-specific templates are always more advanced than generic. We should strive to use kana to automatically romanise Japanese, just how {{ja-r}}, {{ja-x}} are implemented, e.g. 環境(かんきょう) (kankyō) (with or without furigana). The generic {{t}} or {{t+}} can't do that. It's all the more important to include kana. --Anatoli T. (обсудить/вклад) 05:07, 23 August 2019 (UTC)
I prefer 環境(かんきょう) (kankyō) or just 環境 (kankyō) to the redundant 環境 (かんきょう, kankyō), but non-native speakers might have different opinions. — TAKASUGI Shinji (talk) 08:00, 23 August 2019 (UTC)
@Anatoli, that would be me who last updated the page. As I noted in the edit comment, "+long-overdue rewrite to match current best practice". I hadn't seen any recent edits by you that added such formatted links, so I apologize for missing your modus operandi. The links I've seen added by other experienced JA editors have used the format described in the page, that is, {{t|ja|環境|tr=kankyō}}. Similarly, I haven't seen any experienced editors adding any other links, such as using {{m}} or {{l}}, that include kana in the tr= parameter. My recollections of past discussions about this issue were all of a general theme to no longer include kana in the tr= parameter; such discussions gave rise to the current {{ja-r}} template, for instance.
In response to your points:
  • For transliterated text as links, I wholly agree that this is sub-optimal, and I tried to write the update to indicate that transliterations should not be links. If that was unclear, we should edit the page again to clarify this point.
  • For translation tables using {{t}} or {{t+}}, etc., I disagree that these should include kana, for several reasons.
For starters, this is the EN Wikt, and readers are not expected to read anything other than English, which is written using the Latin alphabet. We don't include non-Latin-alphabet phonetic guides for any other language's listings in translation tables: links consist of the linked text to the term itself, plus optionally a Latin-alphabet phonetic guide. I believe that Japanese listings in translation tables should be consistent and follow this same format.
Additionally, kana only provide a phonetic guide, so including both kana and romaji in the tr= parameter is redundant. IFF we are including kana as part of {{ja-r}}, I have no strong objection: 1) this doesn't display the kana visually within the same parenthetical transliteration guide, 2) we generate the romaji from the kana, so we're not duplicating information in the wikicode, and (more importantly, in my view) 3) if used properly, {{ja-r}} indicates which kana are used for which kanji, which is useful information. However, the {{t}} or {{t+}} templates, as you also note, cannot (currently) generate ruby text this way, so the only way to add kana is 1) redundantly, alongside and essentially duplicating the information in the romaji, and 2) without any ability to tie the kana to their specific kanji.
If a reader of a translation table wishes to see more information about a Japanese term (or indeed any term), they have only to click the link in order to see the full entry, including the kana. Translation tables are intended to be tight, compact, and succinct. Adding kana to translation table listings instead makes the tables bigger, harder to read, and visually messy, and it makes the Japanese listings inconsistent with all the other entries. I view these as negatives, and for me, this makes kana extraneous, unnecessary information in this context.
You state that, "it's all the more important to include kana" -- presumably you mean in the translation table listings themselves? I don't understand your reasons for making this statement. Could you explain in more detail?
Do you perhaps mean that we should update the module for {{t}} and {{t+}} to use a similar ruby feature as {{ja-r}}? If so, I again have no real objection, provided that the module can be tweaked to account for unwritten kana that don't belong to any kanji -- such as the possessive の (no) or つ (tsu) that appears in the readings of certain compounds. ‑‑ Eiríkr Útlendi │Tala við mig 17:38, 23 August 2019 (UTC)
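As a rough illustration of the point above that romaji can be generated from kana rather than typed twice, here is a minimal Python sketch. It is not the code behind {{ja-r}}; the `KANA` table and the `romanize` function are hypothetical names invented for this example, the table covers only the kana needed for かんきょう, and the blanket "ou" to "ō" rule is a simplifying assumption that would be wrong across morpheme boundaries.

```python
# Hypothetical, deliberately tiny kana table; a real one would cover all kana.
KANA = {"か": "ka", "ん": "n", "き": "ki", "う": "u", "きょ": "kyo"}

def romanize(kana):
    """Convert a kana string to romaji using the toy table above."""
    out, i = [], 0
    while i < len(kana):
        pair = kana[i:i + 2]
        # Prefer two-character digraphs such as きょ over single kana.
        if len(pair) == 2 and pair in KANA:
            out.append(KANA[pair])
            i += 2
        else:
            out.append(KANA[kana[i]])  # raises KeyError for kana not in the table
            i += 1
    # Simplifying assumption: every "ou" sequence is a long vowel.
    return "".join(out).replace("ou", "ō")

print(romanize("かんきょう"))
```

With a single kana input such as かんきょう, the wikicode would carry the reading once and the Latin-alphabet form would be derived from it.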

two-level hierarchy of alternative spellings

Hi everyone. I have just created {{ja-gv}}, a soft-redirection template designed specifically for kyūjitai forms. Please take a look at 天道蟲, てんとう蟲 and for examples.

Note that this template is used to soft-redirect kyūjitai forms to shinjitai forms, unlike {{ja-see}} which is used to soft-redirect within the shinjitai world. For example, in the entry てんとう蟲, the editor only needs to provide the shinjitai form as {{ja-gv|てんとう虫}}. It is up to the template to find the lemma form in the shinjitai world by fixing double redirects. There are sometimes multiple lemma forms as shown by the example of , and {{ja-gv}} can handle that. Another difference is that {{ja-gv}} automatically copies {{ja-kanjitab}}s from the shinjitai entry because the shinjitai and the kyūjitai spellings will have the same pattern of readings. {{ja-see}} can't handle that as different spellings within the shinjitai world (てんとう虫 and 天道虫) have different reading patterns.
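The two-level lookup described above can be sketched in Python as follows. This is only an illustration of the idea, not the template's actual implementation: the two dictionaries, the `find_lemmas` name, and the choice of 天道虫 as the lemma are all assumptions made for the example.

```python
# Hypothetical data. Level 1: kyūjitai spelling -> shinjitai spelling.
KYUJITAI_TO_SHINJITAI = {"てんとう蟲": "てんとう虫", "天道蟲": "天道虫"}
# Level 2: soft redirects within the shinjitai world.
# A spelling with no targets is treated as a lemma (assumed here: 天道虫).
SHINJITAI_REDIRECTS = {"てんとう虫": ["天道虫"], "天道虫": []}

def find_lemmas(kyujitai_form):
    """Follow kyūjitai -> shinjitai -> lemma, collecting every lemma reached."""
    start = KYUJITAI_TO_SHINJITAI[kyujitai_form]
    lemmas, seen, stack = [], set(), [start]
    while stack:
        form = stack.pop()
        if form in seen:
            continue  # guard against redirect loops
        seen.add(form)
        targets = SHINJITAI_REDIRECTS.get(form, [])
        if targets:
            stack.extend(targets)  # double redirect: keep following
        else:
            lemmas.append(form)    # no further redirect: this is a lemma
    return lemmas

print(find_lemmas("てんとう蟲"))
```

Because the function returns a list, a spelling that redirects to several lemma forms is handled without special-casing.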

What do you think about this two-level hierarchy solution?

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 09:06, 18 September 2019 (UTC)

Shinjitai standard

Which shinjitai standard should we use for lemma entries? Currently we use 躯, but Daijirin uses 軀. --Dine2016 (talk) 04:57, 30 January 2020 (UTC)

@Dine2016, which edition of Daijirin? My local copy uses , as does my local copy of the KDJ. My dead-tree copy of the SMK5 has instead. The NHK Hatsuon dictionary only has the kana form むくろ (mukuro), with no kanji listed, and compounds using this kanji are likewise missing.
Meanwhile, I see at Kotobank that their Daijirin entry lists multiple kanji spellings, with the shinjitai 躯 as the first (and presumably preferred?) spelling.
By way of comparison, I note too that the JA Wiktionary for 躯 explicitly lists 軀 as kyūjitai, and their むくろ entry lists as a kanji spelling, but not .
I note that my SMK5 was published in 2000. Is it possible that the shinjitai character has become more accepted, or has been officially endorsed, in the past 20 years? ‑‑ Eiríkr Útlendi │Tala við mig 18:06, 30 January 2020 (UTC)
Sorry, I meant the weblio version of Daijirin. --Dine2016 (talk) 03:33, 31 January 2020 (UTC)

Proposal for a lossless transcription of Old Japanese using hiragana + hentaigana as found in Unicode

Sketch found here --Backinstadiums (talk) 11:44, 22 December 2019 (UTC)

@Backinstadiums: It's interesting, but defective -- where are the voiced obstruents? The proposal confusingly uses dakuten for (to) to produce (do), but fails to do so with (pe2), rendering this in romaji as nbe in a way that is definitely not lossless, but rather instead problematically additional.
I also don't agree with their explicit addition of -n- before voiced obstruents. While this appears to be the general consensus for how these voiced consonants evolved, I don't think this is a correct reconstruction, nor is it clear notation, not least due to the potential for confusion with later moraic (n).
As a minor point, some of their glyph choices strike me as a bit odd -- for ⟨mi2⟩, for instance, it seems very odd for them to choose the quite-complicated U+1B0CA 𛃊 rather than the comparatively simpler U+1B0CF 𛃏. Or for ⟨no1⟩, U+1B09A 𛂚 seems an odd choice given the existence of simpler U+1B09D 𛂝. And while ⟨me2⟩ is lacking a hentaigana, is a slightly more complicated character than , and is also (subjectively, marginally) more difficult to write due to the direction of the strokes, and more difficult to type.
⇒ In functional terms, if we could adapt the proposal to 1) add dakuten to all voiced obstruents, and 2) get rid of that confusing extra -n- preceding voiced obstruents in romanization, I think we'd have something that we could use. Ideally, we'd also 3) swap out some of the glyphs for simpler alternatives.
That said, I think this is a non-starter until we have much wider availability of fonts that support these codepoints. If our users only get tofubake (c.f. this StackExchange page), we're not helping anyone. ‑‑ Eiríkr Útlendi │Tala við mig 17:24, 23 December 2019 (UTC)

Diphthongs in current colloquial Japanese

Why isn't any of the following mentioned in the respective entries?

In colloquial Japanese, many diphthongs disappeared. So, words like でかい (DEKAI) and やばい (YABAI) became でけえ (DEKEE) and やべえ (YABEE), and words like わるい (WARUI) and さむい (SAMUI) became わりい (WARII) and さみい (SAMII). With these changes, words like はやい (HAYAI) became (HAYEE), and words like かゆい (KAYUI) became (KAYII)

--Backinstadiums (talk) 10:57, 16 January 2020 (UTC)

Because that's still considered slang, possibly even dialectal. Even in colloquial speech, it's non-standard and primarily limited to certain demographics and social contexts. I don't know where you got that quote, but it's not quite correct. ‑‑ Eiríkr Útlendi │Tala við mig 17:28, 16 January 2020 (UTC)
On top of that, I don't think most Japanese people would recognize /yi/ as a sound in their language, nor would they recognize the odd hentaigana in that KAYII image of yours. ‑‑ Eiríkr Útlendi │Tala við mig 17:32, 16 January 2020 (UTC)
Yi or wu has never existed in the Japanese language. Listing the colloquial conjugated forms is not a bad idea, though. — TAKASUGI Shinji (talk) 02:21, 18 January 2020 (UTC)
The quote is from this proposal sent to Unicode by a certain "Abraham Gross". —Suzukaze-c 08:02, 18 January 2020 (UTC)
I had a look. Ugh. Poorly written. I dispute his wording that "many diphthongs disappeared". This phenomenon represents a specific kind of social signalling, whereas he presents it as a past historical sound shift that is complete. He also misleadingly states, "Having these characters [YI and YE] available for writing would be invaluable as a way to represent these sounds in Japanese, for transcribing into Japanese, for digitizing old books, and for Japanese scholars." But as noted above, /ji/ is not a sound that Japanese speakers use or can pronounce, so the utility of the proposed hiragana YI is ... questionable at best. His phonological argument for the existence of /wu/ is tenuous at best.
The evidence of use he presents is all from the 1890s, when various groups were trying to standardize and "fill in" the otherwise empty items in the so-called 五十音 chart. This strikes me as unnecessary completionism driven more by aesthetics than utility. Notably, none of these glyphs proposed in the 1890s caught on -- they're just not that useful when they represent sounds that are either 1) unused in the modern language, such as /je/, or 2) effectively impossible for native speakers to pronounce or distinguish, such as /ji/ and /wu/.
Meh.
Anyway, back on the topic of monophthongization, I would support adding these to the WT:AJA page, so long as the social context is clearly explained. ‑‑ Eiríkr Útlendi │Tala við mig 18:51, 21 January 2020 (UTC)
(@Eirikr: Apparently tentatively accepted. Yuck. —Suzukaze-c 02:04, 13 March 2020 (UTC))

It seems that this contraction only affects -i adjective and a single word "みたい". Can anyone think of another non-"-i adjective" that contracts this way? It is kind of strange with "みたい" being a sole exception. -- Huhu9001 (talk) 08:47, 26 January 2020 (UTC)

  • @Huhu9001, that's a very good point. I cannot think of many other examples; the only ones that come to mind are words that end in /-ai/ in "standard" polite speech -- pretty much all inflecting as -i adjectives, with the exception of -na adjective みたい (mitai).
Come to think of it, I have seen rare instances of 世界 (sekai) flattening to /sekeː/.
I think the key is that the resulting monophthongized pronunciation must still be unambiguous enough to impart the correct meaning. For example, -na adjective 嫌い (kirai, hated, extremely disliked) cannot monophthongize without becoming homophonous with 綺麗 (kirei, pretty, attractive; clean, clean-cut). The utterance /ano ko ɡa kireː/ can generally only be parsed as あの子が綺麗 (ano ko ga kirei, that girl is pretty), rather than as a monophthongized version of あの子が嫌い (ano ko ga kirai, I hate that girl). Admittedly, the pitch accents are different, with 綺麗 (/kíꜜrèː/) vs. 嫌い (/kìráí/ → /kìréː/), so theoretically the two would still be distinguishable. However, I cannot find any instances of monophthongized 嫌い (kirai), which I suspect is due to the potential for confusion and the nearly opposite connotations of the two words.
I also cannot find instances of medial monophthongization.
However, I grant that my fruitless searches might be due more to my weak Google-fu and unfamiliarity with non-standard Japanese. ‑‑ Eiríkr Útlendi │Tala við mig 18:39, 30 January 2020 (UTC)
「『ちげぇ』よ」? @Eirikr, Huhu9001: —Suzukaze-c 08:12, 21 February 2020 (UTC)
@Suzukaze-c: I believe ちげぇ is from ちがう, not ちがい. "ちがいよ!" (??) "ちがうよ!" (✓) -- Huhu9001 (talk) 09:31, 21 February 2020 (UTC)
Yes, I agree with you. I misread the conversation while needing sleep. —Suzukaze-c
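If colloquial forms of -i adjectives are ever listed, the pattern in the examples quoted in this thread (でかい → でけえ, さむい → さみい) could be sketched roughly as below. This is a hypothetical helper operating on romaji, not existing module code, and it assumes the caller has already checked that the word is an -i adjective, since (as noted above) -na adjectives such as 嫌い do not seem to contract this way. The output is slang and would need to be labeled as such.

```python
def colloquial_form(romaji):
    """Return the slang monophthongized form of an -i adjective, or None.

    Assumption: the input is the romaji dictionary form of an -i adjective.
    Only the two patterns quoted in the thread are covered.
    """
    if romaji.endswith("ai"):
        return romaji[:-2] + "ee"   # dekai -> dekee, yabai -> yabee
    if romaji.endswith("ui"):
        return romaji[:-2] + "ii"   # warui -> warii, samui -> samii
    return None                     # no contraction known for other endings

for word in ("dekai", "yabai", "warui", "samui"):
    print(word, "->", colloquial_form(word))
```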

Headword format

Discussion moved from User talk:Suzukaze-c#bot request.

It's hard for me to run a bot since I access Wiktionary via a proxy. Do you have any interest in reforming the Japanese entry layout? As a first step, simply changing the headword format to match other inflecting languages should be enough.

使(つか)う (tsukau) tr godan (stem 使(つか)い (tsukai), past 使(つか)った (tsukatta))
食(た)べる (taberu) tr ichidan (stem 食(た)べ (tabe), past 食(た)べた (tabeta))
高(たか)い (takai) -i (adverbial 高(たか)く (takaku))
有名(ゆうめい) (yūmei) -na (adnominal 有名(ゆうめい)な (yūmei na), adverbial 有名(ゆうめい)に (yūmei ni))

The main work involving bots is moving the kyūjitai and historical kana to some other place. Everything else can stay in the headword templates. While this new format doesn't reduce the complexity of source code of entries, it is a great improvement for the reader. --Dine2016 (talk) 10:55, 14 January 2020 (UTC)

I would be glad to help out with AutoWikiBrowser.
I think it is not a bad idea. I like the format you demonstrate here. —Suzukaze-c 06:17, 15 January 2020 (UTC)
Everyone, what do you think of the headword format above? I think this is what we should have had from the beginning (see the documentation of Module:headword).
The first step in my plan is to modify Module:ja-headword to move kana and romaji to the left. Using furigana in Japanese headwords is analogous to adding the vowels to Arabic headwords or acutes to Russian headwords. But my main concern is that it may look ugly without manual % hints, e.g. 紫式部(むらさきしきぶ). Other edge cases like 出づ(いず) and ()(のあわ) only affect a small number of entries and can be manually fixed.
(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 15:18, 15 January 2020 (UTC)
I do think that the conjugation tables in current use are unnecessarily intimidating. I think the format shown above is nicely simplified, and the forms included (stem, past tense, adverbial) are probably most useful to include. (For -na adjectives, perhaps even adnominal & adverbial are unnecessary, but I don't really object to including them.) I also like that the conjugation classes are named. Perhaps those could link to something like an appendix for readers who want all the gory details? Cnilep (talk) 01:25, 16 January 2020 (UTC)
I generally agree. But we need to first determine where the kyūjitai and historical kana are going. -- Huhu9001 (talk) 08:03, 16 January 2020 (UTC)
  • General thoughts:
    • Agreed that we need to decide where to put existing data that doesn't fit in the new proposed format -- kyūjitai and historical kana. Presumably kyūjitai should go in {{ja-kanjitab}}, but I'm not sure where historical kana would go if not the headword. FWIW, monolingual JA dictionaries include historical kana in their entry headlines.
    • Agreed that the proposed subset of verb forms for the headline are likely the most immediately useful: 終止形・連用形・過去形. Notably, the Nippo Jisho / Vocabulario da Lingoa de Iapam presents these same forms on its verb-entry headlines.
    • @Cnilep, not sure what you mean by "the conjugation tables in current use are unnecessarily intimidating". I'm certainly open to the idea of reworking them, and I've long thought the current state presents a seemingly arbitrary subset of forms, but I don't think the conjugation tables should be removed altogether.
      • I guess the main thing I had in mind is that if one wants, say, the past form of 遊ぶ and that information is only available in the conjugation table, one needs to look through 14 other forms before coming upon 遊んだ. I agree that the tables are useful, and I appreciate the fact that they are collapsed by default. I see proposed changes as a useful addition, not as a replacement. Cnilep (talk) 00:13, 17 January 2020 (UTC)
    • Also, if we include adnominal forms for -i adjectives (which we kinda have to, since that's also the terminal / dictionary form), it makes sense to include that for -na adjectives as well. Principle of least surprise, not confusing users, etc.
    • @Dine2016, your ruby for 物の哀れ are a bit off -- was that meant to show the current incorrect output of {{ja-r}} when not using % to manually specify kana string breaks?
    • I don't agree with hiding the link to Wiktionary:Japanese transliteration in the ・. This is poor usability -- I had no idea this link was there until I looked at the wikisource, and even then, the ・ is so small that it's not easy to click on when using a mobile device, and challenging even using a mouse, especially for any of our mobility-impaired users.
    @Dine2016, what is your use case for including this link? If it's to explain our approach to romanization, then perhaps all of the romaji strings should have this link? That seems a more obvious and hard-to-miss presentation to our readers.
    • Separately, it is confusing to reverse which text is italics and which is not. I would prefer for all romaji to be in italics, and descriptive text to be non-italicized. This aligns with the output of various other templates like {{m}}, {{ja-r}}, and {{ja-readings}}, and would be more visually consistent. Example of what this might look like for the entry headlines:
    使(つか)う (tsukau) tr godan (stem 使(つか)い (tsukai), past 使(つか)った (tsukatta))
    食(た)べる (taberu) tr ichidan (stem 食(た)べ (tabe), past 食(た)べた (tabeta))
    高(たか)い (takai) -i (adverbial 高(たか)く (takaku))
    有名(ゆうめい) (yūmei) -na (adnominal 有名(ゆうめい)な (yūmei na), adverbial 有名(ゆうめい)に (yūmei ni))
    Presumably we could implement the above using careful tweaks of our CSS.
    • I notice that the な (na) and に (ni) for 有名 are linked to the entries. That is good. I suggest that these links be made more specific, to deliver the user directly to the relevant etymology / sense.
    For that matter, I notice now that we don't have any adverbial sense for に (ni). We'll need to add that.
Again, on the whole, I support this idea of changing the entry headlines for Japanese. The above is intended as feedback to ensure that our results are optimal. ‑‑ Eiríkr Útlendi │Tala við mig 18:43, 16 January 2020 (UTC)
@Eirikr: The bullet (🙃) and italics are part of the standard Wiktionary Template:head... {{l}} also does not italicize romanizations, and {{ja-readings}} did not use italics until the recent discussion. —Suzukaze-c 00:25, 17 January 2020 (UTC)
@Suzukaze-c: Hmm, thank you. The interpuncts may be part of the default {{head}}, but they are new to the JA entries since (I believe) Dine2016's edits earlier today. I don't think I like that particular part of this layout -- very poor discoverability. I agree with your linked comment, that definitely needs a rethink and redesign. And regarding {{l}}, I've never agreed with the decision to differentiate italicization behavior between {{m}} and {{l}}. Cheers, ‑‑ Eiríkr Útlendi │Tala við mig 01:25, 17 January 2020 (UTC)
@Eirikr: Not sure what you mean by "if we include adnominal forms for -i adjectives ..." The new format I proposed and implemented shows adnominal forms for -na adjectives (which is dictionary form + na), but not -i adjectives, since they are the same as the dictionary form. As for the off-placed kana, that's exactly the point -- the headword templates must fail rather than choose one candidate when there are several ways to match kanji and kana. --Dine2016 (talk) 08:36, 17 January 2020 (UTC)
@Dine2016: I think he meant he wants 物(もの)の哀(あわ)れ instead of 物の哀れ(もののあわれ). -- Huhu9001 (talk) 11:42, 17 January 2020 (UTC)
  • @Dine2016, Huhu9001, re: {{ja-r}}, it should never put ruby on kana (unless maybe the ruby provided is completely different from the kana? That happens sometimes in manga...). And the template doesn't currently put ruby on kana, so that's a non-issue. The problem is simply that the template currently mis-parses certain strings unless the editor manually inserts the % markers to delineate the breaks. ‑‑ Eiríkr Útlendi │Tala við mig 17:50, 17 January 2020 (UTC)
  • @Dine2016, re: adnominal forms, yes, we get that by default for -i adjectives since the adnominal and terminal / dictionary forms are identical. I was mostly responding to Cnilep's comment about possibly removing the adnominal from the headline for -na adjectives. ‑‑ Eiríkr Útlendi │Tala við mig 17:50, 17 January 2020 (UTC)
Is there any good way to make kanji-kana matching more precise? I'm thinking about using a 漢字音訓表. --Dine2016 (talk) 13:40, 17 January 2020 (UTC)
  • @Dine2016, so long as the table is fully populated. A lot of lesser-used readings are omitted from many dictionaries, which often focus just on modern usage. We try to cover the whole language, even historically, so our entries are likely to include many more readings. ‑‑ Eiríkr Útlendi │Tala við mig 17:50, 17 January 2020 (UTC)
Maybe we can make the code automatically insert "%" to the kanji-form using a default rule when it detects there are some "%" in the kana-form but none in the kanji-form. -- Huhu9001 (talk) 14:19, 17 January 2020 (UTC)
  • @Huhu9001, I can imagine many failure modes for that. For instance, what if the editor only added a few % markers, just enough to get the desired behavior from the current (or some past) state of {{ja-r}}? What about jukujikun, which sometimes appear in compounds with other kanji with regular on or kun? I'm not sure that automatically inserting % in the right places is a realistic goal. ‑‑ Eiríkr Útlendi │Tala við mig 17:50, 17 January 2020 (UTC)
Well, perhaps a much simpler idea is to add a new parameter for Japanese headers, with which you can add "%" or spaces directly to the kanji-form: {{ja-noun|もの の あわれ|kanji=物 の 哀れ}} -- Huhu9001 (talk) 18:24, 17 January 2020 (UTC)
Oh, we've already got this parameter. It is head=. We can just give this function to it. -- Huhu9001 (talk) 18:27, 17 January 2020 (UTC)
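To make the idea of explicit break markers concrete, here is a minimal Python sketch of pairing kanji-form and kana-form segments that have been split with % or spaces. The function name `pair_segments` is hypothetical and this is not how {{ja-r}} or the headword module is actually written; it handles only ASCII spaces and %, and it deliberately raises an error rather than guessing when the two forms do not line up.

```python
import re

def pair_segments(kanji_form, kana_form):
    """Pair kanji-form and kana-form segments split on '%' or ASCII spaces."""
    kanji_parts = [p for p in re.split(r"[% ]+", kanji_form) if p]
    kana_parts = [p for p in re.split(r"[% ]+", kana_form) if p]
    if len(kanji_parts) != len(kana_parts):
        # Fail loudly instead of picking one of several possible matchings.
        raise ValueError("segment counts differ; refusing to guess")
    return list(zip(kanji_parts, kana_parts))

print(pair_segments("物 の 哀れ", "もの の あわれ"))
```

With the markers present in both forms, the ruby for 物 and 哀れ can be attached segment by segment, and the particle の is left as plain kana.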
  • @Huhu9001, Dine2016, not sure what's going on, but over on the こし entry, {{ja-noun|コシ}} is generating awfulness -- instead of presenting コシ as an alternative spelling, it's showing this:
こし or こし(コシ) (koshi)
Definitely not expected or desired output, and highly likely to confuse users. As a general approach, kana should never have kana ruby, unless the ruby string is provided as a manga-style way to provide a completely different gloss for the main term. Adding katakana ruby over hiragana with the same phonetic values is just ... extremely confusing. ‑‑ Eiríkr Útlendi │Tala við mig 00:32, 18 January 2020 (UTC)
I've made a mess. @Eirikr: How is it now? -- Huhu9001 (talk) 00:39, 18 January 2020 (UTC)
@Huhu9001, much better, thank you! ‑‑ Eiríkr Útlendi │Tala við mig 01:04, 18 January 2020 (UTC)
@Huhu9001 instead of having editors manually insert %, what about having the template fetch the kanji-kana matching from {{ja-kanjitab}}s on the same page? --Dine2016 (talk) 03:37, 18 January 2020 (UTC)
It may be a good idea. -- Huhu9001 (talk) 05:34, 18 January 2020 (UTC)
  • Checked the layout for the 帰る entry today out of curiosity and saw further changes.
I'd like to suggest that the verb conjugation type notation include both the 教育 labels ("ichidan conjugation, godan conjugation") and the common English-language labels ("type II, type I"). We must remember that our readership is 1) users reading in English, where English-language terminology should at least be offered, and 2) probably learners of Japanese, so labels and other meta-information should provide appropriate context for learners. I'm a fan of using both labels if possible, as the "type XX" notation is extremely common in English-language materials for Japanese learners, whereas the XXdan notation is really the only notation used in Japanese-language materials, which Japanese learners will eventually start using (so long as they continue in their studies, of course).
(Also, should we move this thread to somewhere more specific for this discussion? At the bare minimum, for posterity?)
Cheers, ‑‑ Eiríkr Útlendi │Tala við mig 17:08, 21 January 2020 (UTC)
In addition, a link through to w:Japanese verbs or some similar explanatory page would likely be a good idea. ‑‑ Eiríkr Útlendi │Tala við mig 17:11, 21 January 2020 (UTC)
moved ✓ —Suzukaze-c 17:53, 21 January 2020 (UTC)
@Eirikr: Godan and ichidan are 学校文法/国文法 labels. The 日本語教育文法 equivalents are Group I and Group II. --Dine2016 (talk) 09:39, 22 January 2020 (UTC)
@Dine2016, no real argument there. My only quibble is that I'm more accustomed to seeing "type I / II" than "group I / II", and the WP page at Japanese verb conjugation similarly uses the "type" terminology.
Separately, I think I remember you suggested putting shinjitai and kyūjitai information in {{ja-kanjitab}} rather than the POS headline. I believe it would be much more appropriate in {{ja-kanjitab}}, and putting it there would also mean we wouldn't need to duplicate this for each POS headline. ‑‑ Eiríkr Útlendi │Tala við mig 00:44, 23 January 2020 (UTC)

@Everyone: do we need new formats for -no adjectives (and -na/no adjectives) as well as adverbs that are optionally or mandatorily followed by to? --Dine2016 (talk) 03:12, 23 January 2020 (UTC)

  • @Dine2016:: broadly, yes. Not sure yet how best to handle -no adjectives, since these are 1) a relatively new class of word, historically speaking, and 2) they appear to be either a shift in usage of nouns, or a shift in particles for -na adjectives. For the (to) adverbs, we should have some means of clarifying for readers whether the particle is required or optional. ‑‑ Eiríkr Útlendi │Tala við mig 16:52, 23 January 2020 (UTC)
I do not know enough and decline to give an opinion. —Suzukaze-c 23:25, 23 January 2020 (UTC)

Recently I noticed there is a Template:ja-altread. It may be related to this topic. -- Huhu9001 (talk) 09:32, 23 January 2020 (UTC)

  • @Huhu9001: In almost all cases, {{ja-altread}} is misused: in almost all cases, the separate readings should be split out into separate etymology sections. See examples of this at , where the Chinese-derived ta and native-Japanese hoka readings require splitting out; 候#Etymology_4 where again each of the three readings here have their own phonological derivations and other specifics; 妹#Etymology_2 and 弟#Etymology_2 where the "alternative" reading is actually dialect, and where Kagoshima is arguably distinct enough to be considered a separate language (we need a bigger discussion of how to handle 方言 before diving in to this degree); where the "alternative" reading appears to be a fusion + vowel shift deriving from お + ゆび and is obsolete, where our current layout using {{ja-altread}} misleadingly suggests that this is a regular everyday reading for this kanji; 札#Etymology_2, where again we misleadingly include two archaic, borderline-obsolete readings and present them as equivalent to modern usage; 私#Etymology_1, where a dialectal form is included under a kanji spelling, but without textual evidence for that kanji spelling; 面#Etymology_2_2, where the reading omo is incorrectly given as simply an "alternative" for omote, despite omote in fact deriving from (omo, face; front) + (te, in reference to direction); etc. etc.
There are occasional instances where this template has been used in ways that seem more correct and justified, such as at 御#Etymology_3 or (and even here, the entry needs more explanation, such as the explanatory notes about the readings at 御#Etymology_3). However, in almost all cases, {{ja-altread}} has resulted in incomplete and even sloppy lexicography. I currently count 236 entries linking to this template. I expect that most of these will require reworking. ‑‑ Eiríkr Útlendi │Tala við mig 16:52, 23 January 2020 (UTC)
I agree with Eirikr. —Suzukaze-c 02:57, 24 January 2020 (UTC)
@Eirikr: Since 御#Etymology_3 is an obsolete word, it appears that romaji would reflect the neoclassical pronunciation. We don't know much about this kind of pronunciation except:
  • Historical kana are translated to modern pronunciation according to the rules.
  • /Vu/ monophthongization rules can be optionally applied between verb 語幹 and 語尾. For example, 会ふ can be read either アウ or オー, and 変ふ (classical form of 変える) can be read either カウ or コー.
  • The volitional "auxiliary verb" む is pronounced ン (not sure if this is optional or mandatory).
However, it is unclear whether む in other words that developed into ん is pronounced ン. If this is mandatory, then we shouldn't have the ōmu- romaji (which would be a kind of spelling pronunciation, like konnichi ha). Bjarke Frellesvig's A History of the Japanese Language mentioned this "neoclassical" pronunciation in section 1.1.6, where it is called "NJ reading", and gave an example where 神主 (Old Japanese: kamunusi) is read kannusi in "NJ reading". But as that book focuses on the spoken language, it does not study this kind of reading. --Dine2016 (talk) 06:29, 25 January 2020 (UTC)
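The first two rules listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the input is katakana in historical spelling, and only the /au/ case of /Vu/ monophthongization is handled (the other vowel combinations and the む rule are left out).

```python
# Non-initial ha-row kana take their modern values (e.g. フ is read ウ).
HA_ROW = {"ハ": "ワ", "ヒ": "イ", "フ": "ウ", "ヘ": "エ", "ホ": "オ"}

# a-column kana mapped to o-column kana, used for the /au/ -> long /o/ merger.
A_TO_O = dict(zip("アカガサザタダナハバパマヤラワ", "オコゴソゾトドノホボポモヨロオ"))


def modernize(historical):
    """Apply the ha-row rule to every kana except the first."""
    return historical[:1] + "".join(HA_ROW.get(ch, ch) for ch in historical[1:])


def monophthongize(kana):
    """Rewrite each (a-column kana + ウ) pair as (o-column kana + ー)."""
    out = []
    for ch in kana:
        if ch == "ウ" and out and out[-1] in A_TO_O:
            out[-1] = A_TO_O[out[-1]]
            out.append("ー")
        else:
            out.append(ch)
    return "".join(out)


def readings(historical):
    """Return both readings, since the merger is optional for verbs."""
    plain = modernize(historical)
    return [plain, monophthongize(plain)]


print(readings("アフ"))  # 会ふ
print(readings("カフ"))  # 変ふ
```

Returning both forms mirrors the point that the monophthongized reading is optional between the verb stem and ending, so neither can be treated as the single correct output.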

Where to put historical kana

hiragana (modern) こい
hiragana (historical) こひ
kanji (shinjitai)
kanji (kyūjitai)
kun’yomi

@Suzukaze-c, Eirikr, Huhu9001 It's easy to agree that kyujitai should be moved to {{ja-kanjitab}}. As for where historical kana would go, one option is to move them to the pronunciation section (see below) and another is to move them to a new version of {{ja-kanjitab}} intended as a morpheme template (see right).


  • (Tokyo) IPA(key): /kóꜜì/, [kó̞ì]
  • (Kyoto, Osaka) IPA(key): /kóꜜì/, [kó̞ì]
  • (Kagoshima) IPA(key): /kòí/, [kò̞í]
  • (Kyoto) コヒ IPA(key): /kòɸì/
  • (Central) 故飛, 故非, 故悲 /*kǒpᶤì/
  • (Central, Eastern) 胡非, 孤悲 /*kòpᶤì/
  • (Central, Eastern) 古飛, 古非 /*kópᶤì/
  • (Eastern) 古比 /*kópʲí/

(Example adapted from User:荒巻モロゾフ/draft. Modern terms would have both modern and historical kana but only modern romaji.)

I prefer to put it in the pronunciation section because:

  1. The kana spellings are important information so it should be displayed in the main "flow" of the page. The morpheme templates on the right of the page are easily ignored because today's computers have wider screens, so they should be reserved for supplementary information (alternative kanji spellings, reading patterns, etc.)
  2. The kana spellings reflect modern and historical pronunciation, so it is more logical to put them in the pronunciation section. It doesn't hurt to duplicate them in the morpheme templates where they are distributed to the kanji or morphemes, but doing this to modern kana is sufficient.
  3. Templates on the same page are processed separately and cannot know the arguments supplied to each other. If the kana spellings and pronunciation are placed in different templates, there is no way for them to generate romanizations that depend on both the modern kana spelling and the pronunciation, such as Nihon-shiki and Waapuro Hepburn (Hepburn with long sounds expanded according to kana spelling, common on manga and anime websites).

What do you think? --Dine2016 (talk) 08:47, 25 January 2020 (UTC)
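To illustrate point 3 in the list above, here is a minimal Python sketch of why some romanizations need both the kana spelling and the pronunciation. Everything in it is an assumption for illustration: the tiny kana table, the function name, and the idea of passing the pronunciation as a set of positions where a kana merely lengthens the preceding vowel.

```python
MACRON = {"a": "ā", "i": "ī", "u": "ū", "e": "ē", "o": "ō"}

# Deliberately tiny table; a real module would cover the full syllabary.
TABLE = {"が": "ga", "こ": "ko", "う": "u", "し": "shi"}


def romanize(kana, lengtheners=frozenset(), waapuro=False):
    """Romanize `kana`; `lengtheners` holds the indices of kana that only
    lengthen the previous vowel (this is the pronunciation information)."""
    out = ""
    geminate = False
    for i, ch in enumerate(kana):
        if ch == "っ":  # sokuon: double the next consonant
            geminate = True
            continue
        if i in lengtheners and not waapuro and out:
            out = out[:-1] + MACRON[out[-1]]  # Hepburn: macron on the vowel
            continue
        syllable = TABLE[ch]
        if geminate:
            syllable = syllable[0] + syllable
            geminate = False
        out += syllable
    return out


print(romanize("がっこう", {3}))                # Hepburn
print(romanize("がっこう", {3}, waapuro=True))  # Waapuro Hepburn
print(romanize("こうし"))                       # 子牛: the う is a separate vowel
print(romanize("こうし", {1}))                  # 格子: the う lengthens こ
```

The last two calls take the same kana string but give different Hepburn output, which is the case where a template that sees only the kana, or only the pronunciation, cannot produce the right result.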

  • I would like a new template for historical kana. I think the historical kana, and the historical pronunciations associated with them, are just "supplementary information".
  • (Off-topic, I also want |alt= to be separated from t:ja-kanjitab, as they are not necessarily "kanji".)
  • Communication between templates is technically possible by making one page transclude itself. However I don't know whether there will be unpredicted consequences once this kind of self-transclusion is put into large-scale practice. -- Huhu9001 (talk) 09:10, 25 January 2020 (UTC)
  • Self-transclusion may work if there is only one etymology section. But when there are multiple, you don't know which etymology section contains the "current word" to fetch arguments to other templates. Therefore it is still desirable to merge them. If one template has the reading pattern of the lemma spelling, one has alternative kanji spellings, and one has historical kana, they can't communicate and their relative order in the HTML output is fixed. It's better to merge them into a single morpheme template which will have greater freedom in the presentation of the information.
  • As Eirikr said above, we're going to cover all stages from Old Japanese to Modern Japanese in the ==Japanese== section, so it doesn't make sense to separate modern kana/pronunciation from historical kana/pronunciation and treat the latter as supplementary information. They can be presented side by side. --Dine2016 (talk) 10:26, 25 January 2020 (UTC)
So does that mean all OJP L2 will be eliminated soon? -- Huhu9001 (talk) 14:35, 25 January 2020 (UTC)
No, how to cover premodern Japanese is not yet agreed upon. That's another topic. (I want to have ==Old Japanese== cover OJP from a different perspective than ==Japanese==.) --Dine2016 (talk) 15:34, 25 January 2020 (UTC)

By the way, I suggest avoiding the usage of class="mw-collapsible" because it does not work on the mobile site. -- Huhu9001 (talk) 08:50, 26 January 2020 (UTC)

Single-kanji + suru verbs

I ran across @Huhu9001's recent change at 会する. The ruby works correctly now, but the romanization links to kai suru, where the practice for other suru verbs has been to link to the two portions separately. While 会す is effectively inseparable, with potential form 会せる, it appears that 会する is parsed by (at least some) modern speakers as separable, with potential form 会できる. Alternatively, if we are to present this as an inseparable verb like 愛する, the romanization should not have any spaces.

Do others have any different views on this kind of verb? ‑‑ Eiríkr Útlendi │Tala við mig 06:47, 26 January 2020 (UTC)

I have no idea on this. -- Huhu9001 (talk) 08:42, 26 January 2020 (UTC)

Inflectional suffixes

@Dine2016, re: the recent changes like this one, might I suggest a simple template? This would allow for shorter wikitext and consistent wording, and an easy way to change the wording if such is ever needed. ‑‑ Eiríkr Útlendi │Tala við mig 21:09, 26 January 2020 (UTC)

@Eirikr: Of course. A template can also easily generate categories as well. If we create a template, I suggest we further divide inflectional suffixes (from a morphological analysis) into inflecting ones and uninflecting ones, because this is another place where 学校文法 and 形態論 disagree: う and た are uninflecting, but they are traditionally classified as 助動詞 because う is historically derived from む, and た has the 活用形 たろ(う) and たら which are distinct suffixes from た from a synchronic analysis. --Dine2016 (talk) 23:53, 26 January 2020 (UTC)

@Eirikr I noticed that you added an anchor to the よう page. This is another proof that MediaWiki is unsuitable for dictionary entries. We should have been using page titles like ja/よう, suff. like the Oxford English Dictionary, which will allow unambiguous identification of lexical items. But now we lump several lexical items (i.e. entries in printed dictionaries) on one page, what to do? My solution is that we begin each Etymology section with a template that identifies the current lexical item like {{ja-spellings}}, which will generate anchors like 蛙#ja-かえる and 蛙#ja-かわず. If there are no alternative spellings, the template can accept an extra identifier to generate anchors like よう#ja-volitional. This will allow one to link to a specific lexical item unambiguously, even if etymology numbers are reordered or if other languages (e.g. Chinese) interfere.
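The anchor scheme proposed above could look roughly like the following Python sketch. The function names and the `kana` / `identifier` parameters are hypothetical; the only things taken from the proposal are the `ja-` prefix and the example targets.

```python
def lexical_anchor(kana=None, identifier=None, lang="ja"):
    """Build an anchor that names a lexical item rather than its position.

    `kana` is the reading that distinguishes the item on the page; `identifier`
    is the extra label used when there are no alternative spellings.
    """
    key = kana or identifier
    if not key:
        raise ValueError("need either a kana reading or an identifier")
    return f"{lang}-{key}"


def link_target(page, **kwargs):
    """Return 'page#anchor' for linking to one lexical item on a page."""
    return f"{page}#{lexical_anchor(**kwargs)}"


print(link_target("蛙", kana="かえる"))
print(link_target("蛙", kana="かわず"))
print(link_target("よう", identifier="volitional"))
```

Because the anchor is derived from the item itself, it would stay valid if etymology sections were renumbered or another language section were added to the page.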

You may note that the standard entry layout (WT:EL) requires headword templates as the backbone of entries. This is because in languages like English, a change in POS usually results in different lexical items (e.g. en/change, v. and en/change, n. are different entries in the OALD). Languages like Japanese are quite different: as long as the kanji and kana are the same, printed dictionaries usually consider them to be the same lexical item, however the POS changes. This is another reason why we should work around the standard entry layout as much as possible. The Chinese editors who implemented Unified Chinese did a good job in shifting the backbone of entries from headword templates to {{zh-forms}} and {{zh-pron}}, and I think the Japanese editors should follow their example. --Dine2016 (talk) 06:10, 28 January 2020 (UTC)

@Dine2016: Yes, I've long considered the current Wiktionary data structure to be ... horrible, from a data management perspective. The MW back-end allows us to do things like have a lemma spelling page, with each language as sub-pages under that, and optionally each etym and/or POS as sub-pages under the language -- with transclusion bundling all that into a consolidated lemma spelling page showing all languages for that grapheme. Users could follow the grapheme, or follow the language-specific sub-page. Frankly, I don't care if kappa#Finnish changes, but I do care if kappa#Japanese changes. Etc.
However, for various reasons (some of which frankly don't make sense to me), other editors have been vehemently opposed to any such approach. Despite the fact that some of our forum pages effectively already do this -- such as Wiktionary:Beer parlor splitting things out by years and months for at least semi-sane management, and consolidating at Wiktionary:Beer parlor for ease of reading.
I even knocked up a sample of what this might look like years ago at User:Eirikr/Sandbox3/ni. Now, with Lua, I suspect that something even better might be possible -- on the technical level, anyway.
Re: your proposal for auto-generating anchors, I'm all for it.
Incidentally, one reason for why I put the anchor at よう above the etym header is usability -- so that the etym header itself would be visible when a user arrived at that anchor. I've long thought it very confusing when I land on a page and I can't tell what heading I'm looking at.
Cheers, ‑‑ Eiríkr Útlendi │Tala við mig 18:27, 28 January 2020 (UTC)
If you agree that the current Wiktionary data structure is horrible, I think we should make the source of entries as logical as possible (that is, favor logical markup over presentational markup), even at the cost of the final rendering. Here the anchor belongs to Etymology 3, so it should be under Etymology 3, not at the end of Etymology 2.
In fact, I've encountered a similar problem when writing {{ja-see}}: single-kanji entries without Etymology sections may have the following structure:
==Japanese==

===Kanji===
...

{{ja-kanjitab|...}}
===Noun===
...
In this case the {{ja-kanjitab}} belongs to the word, not the ===Kanji=== section, so the code should not discard it when discarding the ===Kanji=== section. My solution was to write extra code to watch out for this case, but this is nevertheless dirty. --Dine2016 (talk) 05:29, 29 January 2020 (UTC)
  • @Dine2016, for single-kanji entries, if there is any POS content at all, there should be an ===Etymology=== header, even if there is no etymology actually provided at the current time. I'm glad you coded a workaround for {{ja-see}}. As you note, that kind of entry structure is defective, and given time, we should eventually fix those. It's much easier to fix them if we can find them, but I'm not sure of an easy way to find such entries; do you have any good ideas?
Re: wikicode structure, I prioritize usability for readers above usability for editors, hence where I put the anchor <span>. For the example at hand of よう, ideally there would be a way of generating an anchor in the etym header itself that includes editor-specifiable semantic information, not just a mutable index number. Lacking that, we are stuck with hackish workarounds. Oh well. ‑‑ Eiríkr Útlendi │Tala við mig 18:02, 29 January 2020 (UTC)
@Eirikr: "I prioritize usability for readers above usability for editors": Our standard layout which expresses hierarchy using headers is horrible for readers as well. For example, please take a look at the POS contents of 岸, then take a look at the ja-see-kango of が. Which layout is more "ichimokuryōzen"?
If we want to serve readers better, we should keep the source of entries reasonably parseable, so that others can develop better layouts for Wiktionary. Do you prefer to use EDICT via its official interface WWWJDIC, or via the third-party Jisho.org? Similarly, Wiktionary readers will certainly use a third-party interface to read Wiktionary data, and we should prepare for that.
"do you have any good ideas": (1) find a template that is used on all Japanese entries, such as the headword templates, (2) inject code into mod:ja-headword to make the headword templates transclude the entire page and analyze the Japanese section, (3) track or categorize the entries you want to find and (4) leave everything else to(エクスプロイト) the Wikimedia servers. (I haven't tried this idea, though.) --Dine2016 (talk) 03:24, 30 January 2020 (UTC)
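As a different route to the same goal, the defective structure could also be detected offline, for example over page text taken from a database dump, instead of through self-transclusion inside mod:ja-headword. The Python sketch below is only that alternative, not the module-based idea described above; the function names and the short POS list are assumptions.

```python
import re

# A wikitext header line such as "===Noun===": the same run of "=" on both sides.
HEADER = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# Illustrative subset of part-of-speech header names.
POS = {"Noun", "Verb", "Adjective", "Adverb", "Proper noun", "Particle",
       "Suffix", "Prefix"}


def japanese_headers(wikitext):
    """Return (level, title) pairs for headers inside the ==Japanese== section."""
    inside = False
    found = []
    for match in HEADER.finditer(wikitext):
        level, title = len(match.group(1)), match.group(2)
        if level == 2:
            inside = title == "Japanese"
        elif inside:
            found.append((level, title))
    return found


def pos_without_etymology(wikitext):
    """True if the Japanese section has a POS header but no Etymology header."""
    titles = [title for _, title in japanese_headers(wikitext)]
    has_pos = any(title in POS for title in titles)
    has_etymology = any(title.startswith("Etymology") for title in titles)
    return has_pos and not has_etymology


sample = "==Japanese==\n\n===Kanji===\n...\n\n{{ja-kanjitab|...}}\n===Noun===\n...\n"
print(pos_without_etymology(sample))
```

Pages flagged this way could then be listed for manual repair, which avoids putting extra load on every headword template call.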
@Dine2016, re: the header-heavy layout, agreed that it's sub-optimal. With a few minor quibbles, I quite prefer some version of the mock-ups you've pulled together in the past. Collapsing nouns and する (suru) verbs to a single header, for instance, is pretty much a no-brainer, but one that contravenes the dicta of WT:ELE. I ran into broader-community resistance to moving away from WT:ELE when we were developing the layout and templates for romanized Japanese entries -- technically speaking, it makes no sense to me to require anything more than the ==Japanese== header and a single template that generates the ===Romanization=== header and the sense line linking to the kana spelling -- and that's what we had for a brief while -- but other editors (notably, those not working with Japanese) pushed quite hard for keeping separate templates for {{ja-romaji}} and {{ja-romanization of}}. I still don't understand the reasoning behind this. Anyway, that experience made me a bit resigned to the current structure, and I've focused my energies more on getting the information in here. It may be time to revisit the layout in a more sustained and systematic fashion, especially now with Lua and all the capabilities we've gained since the last go-round (at least, the last one I participated in directly).
Re: finding entries based on structure, I kinda figured that might be the way to go. I don't have the Lua chops to pursue that, however. :-/   Maybe some day, but responsibilities IRL leave me little time to pursue that, and I confess that I find the spaghetti-ness of our module infrastructure makes it a bit hard to trace what goes where without putting in more effort than I can comfortably afford. Something for another day, perhaps. ‑‑ Eiríkr Útlendi │Tala við mig 17:52, 30 January 2020 (UTC)

make the pronunciation section required and more prominent

After reformatting the headword lines, the next thing I want to do is give {{ja-pron}} a gray background.

As shown in the previous section, headers do a poor job of showing hierarchy. When browsing a multiple-etymology entry, one can easily find oneself lost in an ocean of headers which are all aligned to the left and don't differ much in font size, and have to look forward for headword lines to determine which word is being described. By contrast, Chinese entries have a more regular rhythm because the pronunciation section is central and has a gray background, so one can easily tell the next word from the previous.

Japanese          Chinese
Etymology 1       Etymology 1
Pronunciation     Pronunciation
Noun              Noun
Etymology 2       Etymology 2
Pronunciation     Pronunciation
Noun              Noun
Etymology 3       Etymology 3
                  Pronunciation
Noun              Noun

It is easy to edit {{ja-pron}} to adopt the Chinese format, but I'd like to take the opportunity to propose a more regular entry layout as well. For example, etymology sections that lack a pronunciation section should be supplied one, and Osaka accents and usage notes outside {{ja-pron}} should be either incorporated or have the bullets removed. Most importantly, we can move the historical kana from the headword lines to {{ja-pron}}, whose primary role would be to identify the current word, by modern pronunciation or by historical spelling (which still embodies the modern pronunciation if you follow the rules).

What do you think of this approach?

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 07:14, 31 January 2020 (UTC)

@Dine2016: Whatever you're doing, you're doing well. Thanks for the effort and sorry for not always participating. I'd like any alternative kana (including the historical) and transliterations to move to the pronunciation section, which was already discussed, I think. I'd also like to propose discouraging and even banning multi-word rōmaji, delinking any rōmaji that have a space in them wherever they appear. Why do we need entries such as seishin bunretsu byō, and why should the entry 精神分裂病(せいしんぶんれつびょう) (seishin bunretsu byō) expose it (have a hyperlink)?--Anatoli T. (обсудить/вклад) 07:24, 31 January 2020 (UTC)
@Metaknowledge, KevinUp -- Huhu9001 (talk) 08:36, 31 January 2020 (UTC)
If I can be honest, I'm not a huge fan of the grey box around {{zh-pron}}. It definitely aids navigation, but I have personally concluded that allowing the text to flow freely is superior. (Perhaps we could highlight Etymology headers with CSS for all languages?) —Suzukaze-c 09:29, 31 January 2020 (UTC)
I like these ideas, and I also support Anatoli's suggestions. I deeply appreciate the efforts to make the Japanese sections less of a mess, although if we're going to be completely subsuming Old Japanese and including those pronunciations in the Japanese L2, I think we also need to consider things like giving the (usually multitudinous) attested OJ spellings. —Μετάknowledgediscuss/deeds 17:43, 31 January 2020 (UTC)
  • @Dine2016, broad agreement from me. I'm not 100% sure I like the shading, but I'm fully supportive of making the pronunciation more prominent. This is much more of an issue for JA terms than for EN, given the possible wide profusion of unrelated pronunciations for a given headword spelling, such as clearly observable at Japanese with its 9 etymologies and 8 distinct pronunciations. ‑‑ Eiríkr Útlendi │Tala við mig 22:02, 31 January 2020 (UTC)
  • @Atitarev, I agree that spelling information should be handled differently. However, spellings -- kanji and kana both -- are not really "pronunciation" information, so I disagree with putting spellings there. I also don't understand your opposition to romanizations that include whitespace -- if the JA term is clearly a multi-word term, I don't understand why the romanization would not similarly clearly indicate word barriers, in conformance with Latin-alphabet formatting practices across many languages of putting a whitespace between words. Or perhaps I'm misunderstanding, and you're instead just opposed to the existence of multi-word romanized entries? ‑‑ Eiríkr Útlendi │Tala við mig 22:02, 31 January 2020 (UTC)
    • @Eirikr: All sorts of transliterations and IPA are nicely implemented in the Chinese model, which I really like.
    • Yes, I oppose multi-word romanisation entries, they are no longer needed (not really) as a disambiguation tool (also applies to Mandarin pinyin entries). The disambiguation and a large number of homophones is the main reason why we have romanisation entries in the first place. —Anatoli T. (обсудить/вклад) 23:10, 31 January 2020 (UTC)
@Eirikr: Or do you mean that modern and historical kana should be in the morpheme templates, but all sorts of romaji (and phonetic kana) can be in the pronunciation section? --Dine2016 (talk) 07:39, 3 February 2020 (UTC)
  • @Dine2016: My main point is that spelling and pronunciation are orthogonal. They're related, but separate phenomena, and I think we need to treat them separately. Your mock-up looked quite good to me. I might suggest tweaks, like showing our standard modified Hepburn by default, and only showing the others if the user opts to expand, but on the whole I like it -- it cleanly separates the written forms (kanji, kana, romanizations) from the spoken forms (the pronunciation).
I hope that helps clarify my position. ‑‑ Eiríkr Útlendi │Tala við mig 20:32, 3 February 2020 (UTC)
@Eirikr: As Atitarev (talkcontribs) pointed out, romanizations are not spellings, and other languages put transcriptions and transliterations in the pronunciation section as well even when they depend on the spelling. Mixing Japanese and Latin script in the morpheme/forms box is bad layout; moving the Latin script to {{ja-pron}} and placing it alongside the IPA is much cleaner. --Dine2016 (talk) 06:13, 5 February 2020 (UTC)
@Dine2016: There are also some other languages that do not put translit. in the pronunciation section. θεός#Greek; إله#Arabic; देव#Hindi. -- Huhu9001 (talk) 10:22, 5 February 2020 (UTC)
@Huhu9001: Ah, yes, I should have said "some other languages" instead. On the other hand, Greek, Arabic, and Hindi entries offer only one kind of romanization, so the romanization is simply put in the headword line. When there are multiple kinds of romanizations displayed on the page, is there any place for them if not the pronunciation section? --Dine2016 (talk) 10:27, 5 February 2020 (UTC)
At its core, Japanese is a bit different from most other languages: a single written form in Japanese may have multiple unrelated spoken forms, each with its own meanings and derivations. The spoken and written forms are largely independent, to a degree we don't see even in English with its otherwise wide potential disparity between the spelling and the pronunciation. And the spoken form of a Japanese term may have multiple unrelated written forms as well, again each with its own meanings and derivations. This allows for a truly broad, flexible, and powerful degree of expression -- and it also presents some (I believe) unique lexicographical challenges. Any given Japanese term as we record it here effectively represents a node in a two-dimensional array or matrix, where one dimension or axis is the written form, and the other is the spoken form. Some terms are simple, with only one spoken and one written form, but others can be quite complex. Which axis we focus on reveals different relations.
Regardless of the status of romanizations as "spellings", official, or unofficial, romanizations are clearly written forms, not spoken forms. Pronunciation is, by definition, about the spoken language. As such, putting details about written forms in a section that is specifically about spoken forms seems like incorrect organization of data. If we are to separate out other written-form details into its own section -- which I think is a good idea, and which I think is mostly achieved in your mock-up, which collects written forms into a discrete section -- then I think we should put romanizations there as well -- as, indeed, your mock up does. I'm a bit confused now that you appear to be arguing to instead put romanizations in the pronunciation section? ‑‑ Eiríkr Útlendi │Tala við mig 18:49, 5 February 2020 (UTC)
(Well, hyphenation isn't always related to pronunciation either... —Suzukaze-c 20:47, 5 February 2020 (UTC))
@Suzukaze-c: No argument there. How is this relevant? ‑‑ Eiríkr Útlendi │Tala við mig 21:13, 5 February 2020 (UTC)
@Eirikr: Other languages put spelling information in the pronunciation section as well. Even English has hyphenation info in the pronunciation section. Japanese is not really different from English. The reason it appears to be more complex is because some editors consider a word to be its lemma spelling, instead of the sum of all possible kanji and kana. So they write the entry in a way that depends on the lemma spelling. For example, the lexical item oru "to be, to exist" was originally lemmatized at 居る, so the original editor considered it to be 居る, and added |yomi=k and talked about its iru reading. This clearly went wrong when it was moved to おる. If we instead considered 居る to be a representation of the word { spoken form: oru, written form: [居る, おる], meaning: 'to exist, to be' }, then we wouldn't have had |yomi=k or talked about its iru reading, because these are properties of one of the spellings of the word, not the word itself. In light of this, Japanese 居る is not really different from English maneuver, and we can put spelling information in the pronunciation section. --Dine2016 (talk) 02:59, 6 February 2020 (UTC)
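The record described above can be written down as a small data structure. This Python sketch is hypothetical (the class names, fields and `relemmatize` method are invented for illustration); it only shows that properties such as |yomi=k belong to one spelling, so changing which spelling is the lemma leaves the word itself untouched.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Spelling:
    text: str
    yomi: Optional[str] = None  # reading type: a property of this spelling only


@dataclass
class Word:
    spoken: str
    meaning: str
    spellings: List[Spelling] = field(default_factory=list)

    def relemmatize(self, text):
        """Move the spelling `text` to the front; nothing else changes."""
        self.spellings.sort(key=lambda s: s.text != text)  # stable sort


oru = Word("oru", "to exist, to be",
           [Spelling("居る", yomi="k"), Spelling("おる")])
oru.relemmatize("おる")
print([s.text for s in oru.spellings], oru.spoken)
```

After the move the spoken form and meaning are unchanged, and the yomi value is still attached to 居る rather than to the word as a whole.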
@Dine2016: There is still a distinction between speech and writing information even if you treat the word as a json. @Eirikr: Perhaps try to use a single template like {{ja-info}} in wikitext, which creates two tables, one for pronunciation, and one for spelling? -- Huhu9001 (talk) 07:41, 6 February 2020 (UTC)
My point is that there is no harm in putting both spelling and pronunciation info in the pronunciation section. Other languages are doing it as well. English. Chinese. Korean. Burmese. Thai. --Dine2016 (talk) 08:56, 6 February 2020 (UTC)
Personally I don't like the t:zh-pron style. Everything is jammed in a small box. Homophones should have been important information of Chinese words, but now it is collapsed and buried in it. -- Huhu9001 (talk) 09:38, 6 February 2020 (UTC)
(just a note: Wyang is responsible for zh, ko, my, th, kh, bo. my used to have extra romanizations in the headword until {{my-IPA}} was written —Suzukaze-c 15:59, 6 February 2020 (UTC))
And back then, Burmese headword lines were absurdly long and hard to parse by eye. I think Wyang was right on this issue in general, because I find those entries much easier to use now. —Μετάknowledgediscuss/deeds 16:17, 6 February 2020 (UTC)
No value judgements; just a reminder that a moderately common practice was implemented by one person in multiple places. —Suzukaze-c 20:35, 6 February 2020 (UTC)
I withdraw what I said
(columns: whole word | "learn; study" morpheme | "school" morpheme)
hiragana
  Modern hiragana: がっこう | がく→がっ | こう
  historical hiragana: がくかう, がつかう | がく→がつ | かう
kanji
  shinjitai: 学校
  kyūjitai: 學校
rōmaji
  Hepburn romanization: gakkō | gaku > gak
  Waapuro Hepburn: gakkou | gaku > gak | kou
  Kunrei-shiki / Nihon-shiki: gakkô | gaku > gak

I think my Chinese background has made me biased against romaji. It seems that many westerners start learning Japanese with romaji, and mastering the kana script is a great challenge to them (unlike Chinese learners who start out with kana and kanji+furigana directly). For them, romaji is the Japanese script at an early stage, even if it is not an official Japanese script. Maybe we should incorporate the romanizations into the morpheme template, to enable such learners to identify their morphemes without bothering to learn kana.

@Eirikr, Atitarev, Huhu9001, Suzukaze-c (Please correct me if I'm wrong about western learners.) --Dine2016 (talk) 12:15, 8 February 2020 (UTC)

@Dine2016: What parameters do we need for this table? -- Huhu9001 (talk) 12:35, 8 February 2020 (UTC)
@Huhu9001: This is just a proof of concept. I don't know how to code it either. --Dine2016 (talk) 13:49, 8 February 2020 (UTC)
I was worrying about whether this table needs a lot of information to create, rendering the wikitext overly complicated. We have some entries like 立てば芍薬座れば牡丹歩く姿は百合の花. -- Huhu9001 (talk) 14:12, 8 February 2020 (UTC)
Also, since you are advocating a morpheme template, rather than a kanji one, how will it present words like 行(き)止(ま)り? -- Huhu9001 (talk) 14:12, 8 February 2020 (UTC)
@Dine2016: I maintain that rōmaji is not the Japanese writing system (it's kanji, hiragana and katakana), even if it's used for transcriptions, also by native speakers. The comparison with Hanyu pinyin is not 100% but very close. --Anatoli T. (обсудить/вклад) 23:30, 9 February 2020 (UTC)
And yet, it exists. :)
@Anatoli, if you're advocating for the wholesale removal of romaji from Japanese entries, or even just advocating against the inclusion of alternative romanization schemes (other than the modified Hepburn we've used here for some time), I cannot agree. If instead you are advocating for the removal of romaji-only entries, I can support that, as above. That said, your position here is a bit unclear?
English-speaking (or at least English-reading) learners of Japanese will almost always be taught romaji before any other writing. Given the wide use of romaji, I think we would do our users a grave disservice to exclude romaji from our entries. I have also seen enough confusion over the years with regard to different romanization systems that I believe it would be useful to include the different spellings from different romanization schemes within a single entry, at the bare minimum to help with discoverability and searching. ‑‑ Eiríkr Útlendi │Tala við mig 16:36, 10 February 2020 (UTC)
@Eirikr: You misunderstood. I'm not suggesting that we remove romaji from entries. I'm only suggesting that we delink multiword romaji. We shouldn't have entries for whole phrases in romaji; the individual components will suffice for disambiguation. The entry お誕生日おめでとうございます has a red link to o-tanjōbi omedetō gozaimasu and we have otanjōbi omedetō gozaimasu, which can be broken up into component romaji for linking. --Anatoli T. (обсудить/вклад) 18:57, 10 February 2020 (UTC)
@Anatoli, thank you. As clarified, I second this proposal. ‑‑ Eiríkr Útlendi │Tala við mig 19:48, 10 February 2020 (UTC)
@Eirikr: Question: Where should Portuguese spellings (e.g. 親切 → xinxet) from the Christian materials be placed? I think it's bad if "xinxet" is placed in the forms/morpheme template while its reconstruction is placed in the pronunciation section. At least a copy of "xinxet" should be in the pronunciation section for reference. --Dine2016 (talk) 10:12, 11 February 2020 (UTC)
@Dine2016: Interesting question.
Note that these are not just from religious materials -- see also the Nippo Jisho, particularly the one on Google Books.
There are important differences between the Portuguese references and modern Japanese, not just in orthography. The Portuguese texts record a continued distinction of /ɔː/ (from older /au/) versus /oː/ (from /oo/ or /ou/). The presence of ⟨xe⟩ in the Portuguese also points towards /ɕe/ where we have modern /se/, indicating a greater degree of palatalization in the past. Things like the ⟨xinxet⟩ example you include also point towards a kind of final consonant without a following vowel, something that is absolutely ruled out for everything but /ɴ/ and /s/ in modern "standard" Japanese.
Of the several challenges presented by these historical texts, one in particular is not knowing very well which dialect of Japanese they recorded. We know that modern dialects can have quite a range of differences in phonology and construction. Did the Portuguese record the "standard" Japanese of the time, as used by the central government? Or was this more reflective of the local dialect of the regions in which the Portuguese were active, mainly in the southwest around Nagasaki? I'm not sure.
Back to your question of "where", I'd suggest perhaps somewhere close to the historical kana, considering that the Portuguese sources are a kind of historical romaji. It would probably also be a good idea to link through from the middle-Japanese Portuguese romaji through to an appropriate section at WT:AJA, which would explain the romanization scheme and the historical context.
Since these generally record the Middle Japanese pronunciation of (probably) the Kyūshū region of around 1600, I'm not sure they're immediately germane to the modern language, so they probably shouldn't be shown by default, and much like the historical kana and the alternative non-Hepburn romanizations, they should probably only be shown after the user deliberately expands something. ‑‑ Eiríkr Útlendi │Tala við mig 18:14, 11 February 2020 (UTC)

Proto-Japonic terms derived from Indo-European languages

Why does such a provocative, yet disappointingly empty category such as this exist? カモイ (talk) 07:11, 2 March 2020 (UTC)

@カモイ: Unknown. One option to learn more would be to check the Category page's History tab to see who created it, and contact them directly. ‑‑ Eiríkr Útlendi │Tala við mig 22:12, 2 March 2020 (UTC)
I see, thanks, I will do that. カモイ (talk) 23:18, 2 March 2020 (UTC)
@カモイ, FWIW, I doubt any such terms have been identified. There are Japanese terms that derive from PIE, but only as borrowings, things like 蜜 (mitsu, honey) from Chinese from Tocharian, cognate with English mead; or 瓦 (kawara, roof tile), from Sanskrit कपाल (kapāla, skull; any flat bone; cover, covering), cognate with English head. But again, these are definitely borrowings, as opposed to native terms with roots that stretch clear back to any presumed relationship with PIE. ‑‑ Eiríkr Útlendi │Tala við mig 22:26, 3 March 2020 (UTC)

Kyūjitai normalized to shinjitai in Unicode and {{ja-kanji forms}}

62 kyūjitai are normalized to shinjitai in Unicode. MediaWiki also performs this normalization for kyūjitai literally included in source text, but not for HTML entities:

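  • A literal kyūjitai character (e.g. U+FA52) is automatically normalized to its shinjitai counterpart (禍, U+798D) and stored only in the latter form in the source text of the given page.
  • &#xFA52; is displayed as the kyūjitai form (U+FA52).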
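
The behavior in the two points above can be reproduced outside MediaWiki. The following is only a minimal sketch, assuming that the normalization MediaWiki applies to source text is Unicode NFC (the helper name `normalize_nfc` is made up for illustration). It uses Python's standard `unicodedata` and `html` modules to show that the literal compatibility ideograph U+FA52 collapses to U+798D, while the same code point written as an HTML entity decodes to the kyūjitai form.

```python
import html
import unicodedata

# Code points are written as escapes so that this snippet itself
# survives normalization when it is copied around.
KYUJITAI = "\uFA52"   # CJK COMPATIBILITY IDEOGRAPH-FA52 (kyūjitai form)
SHINJITAI = "\u798D"  # 禍 (unified ideograph, shinjitai form)


def normalize_nfc(text: str) -> str:
    """Apply Unicode NFC normalization (assumed here to match MediaWiki)."""
    return unicodedata.normalize("NFC", text)


# A literal kyūjitai character is replaced by the shinjitai character.
normalized = normalize_nfc(KYUJITAI)

# An HTML entity is plain ASCII in the source text, so normalization
# leaves it alone; only decoding it yields the kyūjitai character.
entity_source = normalize_nfc("&#xFA52;")
entity_decoded = html.unescape(entity_source)

print(f"literal after NFC: U+{ord(normalized):04X}")
print(f"entity decoded:    U+{ord(entity_decoded):04X}")
```

Because the entity consists of ASCII characters only, it passes through normalization unchanged, which is why it can still display the kyūjitai form.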

Template:ja-kanji forms says that k=y should be used, which renders the character in a Korean font, but this is not working.

E.g. the page 禍 currently contains:

  • {{ja-kanji forms|k=y}} is displayed as a table with a slightly larger shinjitai 禍 falsely labelled as kyūjitai.
  • {{ja-kanji|grade=c|rs=示09|kyu=&#64082;}} is correctly displayed as:
(common “Jōyō” kanji, shinjitai kanji, kyūjitai form 禍)
  • {{ko-hanja|hangeul=화|eumhun=|rv=hwa|mr=hwa|y=hwa}} is correctly displayed as:
禍 • (hwa) (hangeul 화, revised hwa, McCune–Reischauer hwa, Yale hwa)

I think that HTML entities would be a better solution than using a Korean font, because the characters remain in their intended graphical form after being copy-pasted elsewhere.

  • {{ja-kanji forms|禍|&#xFA52;}} does not work: it generates a single & as the kyūjitai.
  • {{ja-kanji forms|禍|&#64082;}} does not work: it generates the shinjitai as the kyūjitai.
  • However, both {{ja-kanji forms|禍|[[&#xFA52;]]}} and {{ja-kanji forms|禍|[[&#64082;]]}} appear to work: they generate the correct kyūjitai without a link. (The shinjitai in the first argument can optionally be enclosed in [[ ]] with no change in behavior.)
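
For anyone wanting to prepare such template calls in bulk, here is a small sketch in Python. The function names `entity` and `kanji_forms_call` are hypothetical helpers, not part of any existing tool, and the output follows only the bracketed-entity form reported as working in the list above.

```python
def entity(char: str, hexadecimal: bool = True) -> str:
    """Return the HTML numeric entity for a single character."""
    code_point = ord(char)
    if hexadecimal:
        return f"&#x{code_point:X};"
    return f"&#{code_point};"


def kanji_forms_call(shinjitai: str, kyujitai: str) -> str:
    """Build a {{ja-kanji forms}} call with the kyūjitai as a bracketed entity.

    The [[...]] wrapping mirrors the form that appeared to work above;
    whether it keeps working depends on the template's implementation.
    """
    return "{{ja-kanji forms|" + shinjitai + "|[[" + entity(kyujitai) + "]]}}"


# Escapes are used so the kyūjitai code point is not normalized away.
wikitext = kanji_forms_call("\u798D", "\uFA52")
print(wikitext)
```

The hexadecimal and decimal notations refer to the same code point, so either could be emitted by switching the `hexadecimal` flag.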

Can the recommended usage of this template be changed to use HTML entities for these kyūjitai? (Personally I prefer hexadecimal notation.) Arfrever (talk) 17:44, 17 June 2020 (UTC)

Arabic numeral alternative forms

Is there a standard for whether or how to create an entry for alternative forms of terms using Arabic numerals rather than kanji? For example, an entry for 6ヶ月 that redirects to 六ヶ月? It looks like there are currently separate entries for 6月 and 六月 but I don't think that's the right approach here. Noktulo (talk) 17:59, 5 July 2020 (UTC)

I also thought about maybe using the ja-see template but this note made me think it might not be the right answer either: "Please use this template for alternative spellings in the Japanese script only. Alternative sound forms (e.g. 追っ払う) and alternative spellings in other scripts (e.g. H#Japanese) currently still use the old approach." Noktulo (talk) 18:06, 5 July 2020 (UTC)
