Theknightwho

huh?


You wrote, "you put this in the wrong section, and isn't the first I've seen. Please go through and correct these."

I have no idea what you're talking about. It looks like you moved it to the wrong section. kwami (talk) 23:24, 6 January 2024 (UTC)Reply

@Kwamikagami The template for Ewe letters should go in the Ewe section, not at the bottom of the page. Theknightwho (talk) 23:26, 6 January 2024 (UTC)Reply

Ah, I see. Yes, that was an oversight when I added a section after Ewe. kwami (talk) 23:46, 6 January 2024 (UTC)Reply

@Kwamikagami I’ve noticed it on quite a lot of these single-character entries you (re)organised, which is why I mentioned it. Theknightwho (talk) 00:18, 7 January 2024 (UTC)Reply

Sorry about that. When they appeared at the bottom of a page, I must've dismissed them as the kind of generic info that normally belongs there, not noticing that they were language specific. kwami (talk) 00:21, 7 January 2024 (UTC)Reply

@Kwamikagami It's okay. They display the alphabet for the language, though it gets a bit silly when there are a bunch of languages that only define it as a letter (as opposed to (e.g.) a which has lots of other senses as well). Theknightwho (talk) 00:27, 7 January 2024 (UTC)Reply

Wiktionary:Grease pit/2022/September


It took about eight hundred null edits, but CAT:E is now down to just this one- which is due to your deleting a sandbox module. Chuck Entz (talk) 04:52, 8 January 2024 (UTC)Reply

@Chuck Entz Thanks - my bad. I've commented it out. Given sandbox modules are inherently unstable, I imagine this was always going to break at some point. Theknightwho (talk) 21:57, 10 January 2024 (UTC)Reply

RC:Proto-Celtic/sexskā


Why are you reverting RC:Proto-Celtic/sexskā? This is the standard header usage as stipulated in WT:ORDER. -- Sokkjō 18:11, 11 January 2024 (UTC)Reply

@Sokkjo Because you made the opposite revert just before, and I wasn't aware that "alternative forms" and "alternative reconstructions" are supposed to be treated differently, which just seems like an oversight. Theknightwho (talk) 18:44, 11 January 2024 (UTC)Reply

Removing transliteration from reference templates


Hi,

I disagree with you removing transliteration from reference templates without any discussion. You may be comfortable with Cyrillic, but others are not. It's been a very long-standing practice for most languages written in scripts other than Latin. If it's to be done, it needs to be agreed publicly and done for many other languages as well. @Benwing2, @Vahagn Petrosyan. Anatoli T. (обсудить/вклад) 02:16, 19 January 2024 (UTC)

I agree with Anatoli here. Benwing2 (talk) 02:20, 19 January 2024 (UTC)Reply

@Atitarev @Benwing2 I'm not sure what the transliteration is adding here: we give a translation (which is necessary), but not giving the original title simply makes it difficult to look up the reference work. Theknightwho (talk) 02:22, 19 January 2024 (UTC)Reply

If you have questions about reasoning, you can always ask in WT:BP. I haven't established this practice, I'm simply following it and I see why it's used. Anatoli T. (обсудить/вклад) 02:27, 19 January 2024 (UTC)

@Atitarev Alright - I'll revert and start a BP discussion, because I think obscuring the original title is a really bad idea. Theknightwho (talk) 02:28, 19 January 2024 (UTC)Reply

Thanks. In some cases, when the title is not too long, both the original script and the transliteration can be used (IMO!). As in {{R:ur:Rekhta}} and {{R:ur:UDB}}:

Would it even be possible for you to remember these references if they were written only in Urdu? ریخْتَہ لُغَت (rexta luġat) and اُردُو لُغَت (urdū luġat). No-one has reverted my edits there yet. Anatoli T. (обсудить/вклад) 02:35, 19 January 2024 (UTC)

@Atitarev So my preference would be that we should use transliteration + original script if there's no available translation (e.g. it's a proper noun or some weird coinage), but otherwise we should put original title + translation. I wouldn't remember the Urdu reference (since I can't read Urdu), but I definitely would remember the translation. Theknightwho (talk) 02:39, 19 January 2024 (UTC)Reply

In the case of {{R:ur:UDB}} it's just "UDB". "Urdu lughat" is the name the dictionary is known by (=urdū luġat), which simply means "Urdu dictionary".

"Orfoepičeskij slovarʹ russkovo jazyka" can be memorised by users having no exposure to Russian and used to refer to the work when talking to native speakers. Native speakers may have no knowledge of [Orthoepic Dictionary of the Russian Language] or may think of some other work. There are pros and cons; the decision was to use the translit. Browse through some of Category:Armenian reference templates and you will probably get an idea. Anatoli T. (обсудить/вклад) 02:48, 19 January 2024 (UTC)

@Atitarev Alright. Whether or not we use transliterations, I'll raise a BP discussion about including the original title, since I think that should always be included. Theknightwho (talk) 02:53, 19 January 2024 (UTC)

@Theknightwho: I personally have no objections to including the original title. The question that may arise is whether to include the author, institution, etc. (original text, transliteration). There are some complex templates. Anatoli T. (обсудить/вклад) 03:27, 19 January 2024 (UTC)

@Atitarev Yeah, it could get very messy, and we don't want ridiculously long references that are hard to use. Theknightwho (talk) 03:39, 19 January 2024 (UTC)Reply

Good idea


Hey, do you think I need to do this ([1]) anywhere I've got some romanization of Mandarin or Cantonese or whatever? I will start integrating it into my normal work if you think so. What do you think of this: [2]? I'm afraid to expend a lot of energy on this, because I feel that ultimately it will be automatically generated. Just let me know. --Geographyinitiative (talk) 17:00, 30 January 2024 (UTC)Reply

@Geographyinitiative It's not a big deal, but it's how we display WG in Chinese entries, so I thought it made sense to make them match. I only did it since I was using the same etymology for other WG-derived forms of fanqie (of which there are several, since English-speakers tend to mangle the apostrophes and dashes). Theknightwho (talk) 17:03, 30 January 2024 (UTC)Reply

Ordering


"I am consistently baffled by your ordering of etymologies in entries. The top 3 etymologies are all inherited from Middle English, so I see no obvious reason to put them in reverse order of notability." Well, let me unbaffle you. They're not in "reverse order of notability" – I'm not sure what that even means – but in chronological order, the same way every historical dictionary orders their senses. "Bill" the weapon is a very old word that appears in, for example, Beowulf, which is why I had it first. A bird's bill also goes back to Old English. The others come later, from other sources. I see you have changed it, but I'm not sure what your plan is. "Notability"? Is that just a subjective judgement or are you quantifying it somehow? Ƿidsiþ 06:30, 2 February 2024 (UTC)Reply

@Widsith I understand why you've done that, but all three were inherited into English from Middle English, so the chronological argument doesn't hold since we treat the two as separate languages. This is of course arbitrary to some extent, but it does reduce its importance, since to an average English speaker all three have been an established part of the language for the whole period that they are able to understand.

Notability is determined by use. No, I haven't quantified it here, but I am reasonably certain that the order in which I've placed them is the order in which they are most used.

We've had this discussion before, with brake, and I've spotted quite a few others. For example, at pink the colour was at etymology 4, sense 3, below regional and obsolete terms. That makes sense if you are only concerned with etymology and chronological order, but there are other concerns to balance against that, too. Theknightwho (talk) 15:37, 9 February 2024 (UTC)Reply

@Widsith In this case I agree with User:Theknightwho. It is normal to put rarer etymologies after more common ones. You will see for example that under bay and flag, the etymologies associated with the obvious meanings are first irrespective of whether the terms are inherited from Old English or borrowed from French. Benwing2 (talk) 02:37, 10 February 2024 (UTC)Reply

This is not a method that I find useful or sustainable in a dictionary, but I understand that everyone has their own preferences. I just find it annoying to have things moved around as though Wiktionary had a policy on this matter, which as far as I know it doesn't. I'm also not at all sure how people plan to decide on the most ‘notable’ sense of words like set or run. Ƿidsiþ 16:58, 11 February 2024 (UTC)Reply

@Widsith For what it's worth, I would much rather this was done by user preference, i.e. have some (possibly subjective) ordering as the default, but display chronologically if the user's preferences are set that way and the info is there, since I appreciate that the current approach is an unhappy compromise between incompatible needs. It would take some heavy-duty work to achieve it, though. Theknightwho (talk) 17:04, 11 February 2024 (UTC)

@Widsith I think it's fine to order meanings within an Etymology section chronologically or according to some logical order, e.g. literal before figurative, as long as there is appropriate verbiage, like originally .... The issue above has to do with different Etymology sections, and I'm not sure there are multiple such sections for run or set. Benwing2 (talk) 20:40, 11 February 2024 (UTC)Reply

@Theknightwho, yes we've talked about the desirability of implementing this before, but as far as I know no one has yet come up with a solution that doesn't involve putting the editing work behind some daunting coding. @Benwing2, I agree that ordering within an Etymology section is the more important issue. Ƿidsiþ 08:21, 19 February 2024 (UTC)Reply

Solombala English


Hi! Do you know how to add a new language code for Solombala English? It is a very poorly attested pidgin language, so I'm not sure if it's worth adding, and I don't know how to do it. There are 20 words known in this language in total; one of them is misprinted, and another is very piquant in the original source, a fact ignored by the later translators. Tollef Salemann (talk) 17:22, 9 February 2024 (UTC)

@Tollef Salemann Hiya - it's best to request this at WT:RFM, to get consensus. Theknightwho (talk) 17:26, 9 February 2024 (UTC)Reply

Thanks! Tollef Salemann (talk) 17:35, 9 February 2024 (UTC)Reply

pagename breakage in Module:headword


Hi. After I revamped Module:headword and added support for overriding the pagename using `data.pagename`, you made a bunch of changes to this module that broke this support. The pagename in `data.pagename` is now partly ignored in favor of a pagename taken from Module:headword/data, which always uses the actual pagename. What is the reason for this? Is it to do with unsupported titles? If so we need to restore this support as it's important for testing purposes. Benwing2 (talk) 02:34, 10 February 2024 (UTC)Reply

@Benwing2 My bad - yes, it's to do with unsupported titles. The pagename in Module:headword/data is a normalised form of the page title which is supposed to be used in place of mw.title.getCurrentTitle().subpageText in all circumstances. Theknightwho (talk) 02:49, 10 February 2024 (UTC)Reply
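The fallback order being asked for can be sketched roughly as follows. This is only an illustration in Python, not the Lua of the module itself; `resolve_pagename`, `normalise_title` and the shape of `data` are hypothetical names, and the normalisation shown is a placeholder rather than what Module:headword/data actually does.

```python
def normalise_title(raw_title: str) -> str:
    """Placeholder normalisation (assumption): tidy underscores and whitespace.

    The real normalisation for unsupported titles is more involved.
    """
    return raw_title.replace("_", " ").strip()


def resolve_pagename(data: dict, current_title: str) -> str:
    """Prefer an explicit override in data["pagename"], else use the normalised title."""
    override = data.get("pagename")
    if override:
        return override
    return normalise_title(current_title)
```

With this ordering, a test harness can pass `{"pagename": "foo"}` and get `foo` regardless of the page it runs on, while ordinary calls still receive the normalised title.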

Yo, why'd you do this revert?


https://en.wiktionary.org/w/index.php?title=Talk:hshs&diff=prev&oldid=78117814&title=Talk%3Ahshs&diffonly=1, If I'm the person who made the diss then I should be able to delete it, right? Problem solved..? Heyandwhoa (talk) 00:18, 16 February 2024 (UTC)Reply

@Heyandwhoa It was a thread involving multiple people, so it's best to keep it. Theknightwho (talk) 00:20, 16 February 2024 (UTC)Reply

@Theknightwho But technically the problem's already solved? Heyandwhoa (talk) 00:22, 16 February 2024 (UTC)Reply

@Heyandwhoa Sure, but we still don't delete the discussion. Theknightwho (talk) 00:24, 16 February 2024 (UTC)Reply

@Theknightwho Then what's the reason for keeping it? (There's page history?) Heyandwhoa (talk) 00:26, 16 February 2024 (UTC)

@Heyandwhoa Because it's much easier to find if we ever need to refer to it. We tend to prefer transparency when it comes to discussions, and I don't see any reason to delete this. Theknightwho (talk) 00:28, 16 February 2024 (UTC)Reply

Hmm, okay then. Heyandwhoa (talk) 00:40, 16 February 2024 (UTC)Reply

In case you haven't noticed


WT:LOL is in CAT:E, and your edits to Module:families/data and Module:languages/data/exceptional are the only recent changes in the transclusion list for the section with the error. Chuck Entz (talk) 23:19, 16 February 2024 (UTC)Reply

@Chuck Entz Thanks - it was this bug, which was down to me as well. Theknightwho (talk) 23:36, 16 February 2024 (UTC)Reply

While you're at it, could you take a look at Module:translations/multi-nowiki/documentation? I realize it's the wee hours in your time zone, so whenever you have the time. Thanks! Chuck Entz (talk) 02:17, 17 February 2024 (UTC)Reply

@Chuck Entz Thanks - that’s caused by recent changes to the template parser. It’s complex, but template validity is determined at two different points:

The first checks the whole page and finds any template blocks ({{...}}), so any that don’t even register as a template block fail at this stage: e.g. {{ [[ }} fails outright, since the parser sees the }} as part of the “wikilink” opened by [[, even though the wikilink will (later) be deemed invalid. Those were already being caught.
The second checks the template name is a valid title. This was harder to check for, since we want to exclude invalid titles, but not parser functions like {{#if:}}, even though “#if:” is not a valid title. Another issue is that the first argument for a parser function (argument “0”) comes after the colon, not after the first pipe.

On top of this, there are transclusion modifiers like {{subst:}}, {{safesubst:}} etc. I haven’t added support for these yet, partly because you can’t even save a subst call (though you can use clever transclusion to make one), and also because I thought it wasn’t needed yet. However, the documentation for Module:translations/multi-nowiki has some kind of example involving subst, which is now failing: {{subst:#invoke:languages/templates}}.

What happens is that the template parser doesn’t know subst is special, so it thinks a template call starting with that should be treated as a conventional call to a template with a name starting subst: (which is in fact valid). Template calls work a bit like links, so everything after a # gets treated as a fragment, which is ignored, so the template parser thinks this is a call to the template subst:. Multitrans-nowiki then strips off the subst:, giving the empty string, which eventually leads to an error since that isn’t a valid template name.

I’m reluctant to spend too long on this since there’s no clear need for multitrans-nowiki in the near future, so if there’s no quick fix I’ll just comment the documentation’s use of it out for now. Theknightwho (talk) 17:50, 18 February 2024 (UTC)
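The failure chain described above can be reproduced with a toy model. This is a rough Python sketch under stated assumptions, not the actual template parser or Module:translations/multi-nowiki; the function names are hypothetical, and the validity check is reduced to "non-empty" purely for illustration.

```python
def naive_template_name(call_body: str) -> str:
    """Treat the call like a link target: drop arguments after '|' and any '#fragment'."""
    name = call_body.split("|", 1)[0]
    name = name.split("#", 1)[0]
    return name.strip()


def strip_subst(name: str) -> str:
    """Remove a leading 'subst:' modifier, if present."""
    prefix = "subst:"
    return name[len(prefix):] if name.startswith(prefix) else name


def is_valid_template_name(name: str) -> bool:
    """Toy check (assumption): only the empty string is rejected here."""
    return name != ""


body = "subst:#invoke:languages/templates"
parsed = naive_template_name(body)   # the '#...' part is discarded as a fragment
stripped = strip_subst(parsed)       # nothing is left once 'subst:' is removed
```

A parser that recognised `subst:` as a modifier before looking for a fragment would instead see `#invoke:` and could route the call to the parser-function branch.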

[Untitled]


Stop censoring users talking to people on user talk pages. It's against wiki's policies

I can delete what I like from my own talkpage, but since you're complaining about censorship on Wikipedia, all I can actually see is you repeatedly trying to censor criticism of some anti-vaccine group in Canada. Theknightwho (talk) 04:27, 19 February 2024 (UTC)Reply

Thank you for just proving my point. You're using my activity on a vaccine post where I'm trying to work AGAINST censoring the other side of the issue... to influence your decision on my editing a word "sapristi" which has nothing at all to do with vaccines. See the bias? See the evil? See why users online hate wikipedia editors? So go ahead and delete my comments to prove my point. Go ahead and prove that you can't take criticism and you just shut down opinions and editors that you don't agree with. Have a nice day.

Firstly, I only looked at your editing history after you randomly accused me of censoring you on Wikipedia. Secondly, you removing content because you don't like it is not "work[ing] AGAINST censoring the other side" - it's just censorship and shutting down opinions that you don't agree with. Thirdly, your contribution at sapristi was removed because it wasn't in line with WT:Entry layout. You were invited to add the same content in the proper manner, but instead you've chosen to have a childish tantrum about it instead. Grow up; it's pretty clear to any well-adjusted person that you're spending way too much time online. Theknightwho (talk) 04:40, 19 February 2024 (UTC)Reply

No, what's obvious to anyone reading this transaction is that Wikipedia is a censorious body. You editors say: "everyone is welcome" and then you say: "you just didn't speak to us the correct way... you need to put your comments in a talk page first." Then when the user goes to edit a talk page, the content is deleted by yet another wikipedia editor who says: "the talk page content you provide is hateful or not wanted" even though this 3rd editor has nothing to do with the original issue. Ever heard of 'mob rule'? That's wikipedia. Stop insulting users that are trying to fix your censorship with comments like "spend less time online". Why don't you do the same and stop censoring everyone that disagrees with you. It's the 3rd time in 3 days that I've been called a 5 year old by a wikipedia editor. Perhaps you've heard of projection?

So let me get this straight: you're being censored because you were prevented from removing criticism of an anti-vaccine group. Did I get that right? Theknightwho (talk) 06:58, 19 February 2024 (UTC)Reply

"It's the 3rd time in 3 days that I've been called a 5 year old by a wikipedia editor." Maybe listen then. Equinox ◑ 11:51, 19 February 2024 (UTC)Reply

ᠬᠥᠭᠵᠢᠮ


Knight, this is very strange. At the editing stage ᠬᠥᠭᠵᠢᠮ rendered correctly. But after editing was finished, on the completed page, it rendered as two teeth (no dots to indicate voicing). I don't know what the problem is, but it might be the computer (operating system) I'm using -- that is a Macbook. At any rate, I won't try to "fix" this kind of issue again.

Bathrobe (talk) 11:45, 19 February 2024 (UTC)Reply

@Bathrobe That makes sense. The whole state of Mongolian encoding is a mess, so I imagine there are many entries which are genuinely wrong. Theknightwho (talk) 11:53, 19 February 2024 (UTC)Reply

This is actually the first one I've encountered so far. I'll look out for others.

There might be a problem with Wiktionary's rendering system. After all, the fact that it renders correctly at the editing stage, and even renders correctly on this page (although not in the heading to this section) suggests to me that the issue lies with Wiktionary. Bathrobe (talk) 18:38, 19 February 2024 (UTC)Reply

@Bathrobe The heading renders okay to me - I suspect it’s a browser or font issue. Theknightwho (talk) 18:51, 19 February 2024 (UTC)Reply

Another example I've found is өгөх, that is ᠥᠭᠬᠦ, which renders wrongly at the page on 'give'. Bathrobe (talk) 19:40, 19 February 2024 (UTC)Reply

(mock-)Persian σάτρα


Hey, I was wondering if you had some suggestions on how to deal with σάτρα, which according to the dictionaries is either a Persian or mock-Persian word, but is currently included as an Ancient Greek word? It was, to my knowledge, never actually borrowed into Greek, since it's a hapax in a comedy line. The quote there (from a comedy) is technically also non-Greek, though there is disagreement as to whether it can be interpreted as a real Persian quote, or if it is just meant to sound vaguely like Persian to an Ancient Greek audience. Thanks! AntiquatedMan (talk) 07:57, 28 February 2024 (UTC)

@AntiquatedMan Hiya - something similar recently came up on the Beer Parlour in this thread, actually, but @Nicodene is much better placed to help you than I am, so I'll let them respond. Theknightwho (talk) 08:03, 28 February 2024 (UTC)Reply

Well, it's certainly not Greek. I would ask our Indo-Iranian editors what to do with it. Nicodene (talk) 08:51, 28 February 2024 (UTC)Reply

Huh, def not the Old Persian word for gold. Will have to look into it. --{{victar|talk}} 09:06, 29 February 2024 (UTC)Reply

changelog messages


Just want to say, I notice you've gotten a lot better about changelog messages. You can see it for example in Module:languages/data/3/k, where all your changes made in the last 3 months as well as most of them after March 2023 or so have changelog messages. Thank you; this is very helpful both when trying to understand why a particular change was made and when scanning the history to get the gist of what was changed over a period of time. Benwing2 (talk) 04:16, 29 February 2024 (UTC)Reply

Thanks. Theknightwho (talk) 23:21, 29 February 2024 (UTC)Reply

'terms with audio links'


123 pages e.g. 後來 are generating a category named 'terms with audio links' with the language name missing. The correct language seems to be Teochew, but I'm not sure why this is getting left out. Benwing2 (talk) 06:26, 4 March 2024 (UTC)Reply

@Benwing2 Thanks, and fixed. It was down to an old hack in Module:zh-pron caused by trying to determine the parent of an etymology-only language by taking the first 3 letters of the code, which never got updated when Teochew got converted to a full language several years ago. Until now, I think they were being categorised as Min Nan audio. Theknightwho (talk) 11:33, 4 March 2024 (UTC)Reply
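As an illustration of how a prefix-based hack like this can go stale, here is a minimal Python sketch. The language codes and names (`aaa`, `bbb-var`, "Parentish", "Variantish") are invented for the example and are not real Wiktionary codes, and the functions are hypothetical rather than taken from Module:zh-pron. It shows one way a category can end up with no language name, not necessarily the exact path the module took.

```python
# Hypothetical language table: one parent language and one variety that has
# since been promoted to a full language with its own code.
LANGUAGES = {"aaa": "Parentish", "bbb-var": "Variantish"}


def name_by_prefix(code: str) -> str:
    """Old hack: assume the parent is whatever the first 3 letters of the code name."""
    return LANGUAGES.get(code[:3], "")


def name_by_lookup(code: str) -> str:
    """Look the code up directly first, and only then fall back to the prefix."""
    return LANGUAGES.get(code) or LANGUAGES.get(code[:3], "")


def audio_category(language_name: str) -> str:
    """Build the category name; an empty language name leaves it bare."""
    return f"{language_name} terms with audio links".strip()
```

With these assumptions, `audio_category(name_by_prefix("bbb-var"))` produces the bare category, while `audio_category(name_by_lookup("bbb-var"))` includes the language name.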

Shit


What kind of "shit" are you talking about? If you don't have the temperament to play with other kids, you should find your own sandbox. kwami (talk) 03:40, 5 March 2024 (UTC)Reply

@Kwamikagami You can clearly see what change I made to ◌̀, where you were so careless in your editing that you left the Ancient Greek section looking like this for months. Either you didn’t notice or didn’t care, but time and again I have found you leaving big messes like this on single-character pages, and you just don’t seem to give a shit about being more careful with your editing. Instead, you focus on making massive, sweeping changes without consensus, and treat anyone who objects with contempt (as can be seen in your message here, where you care more about the word “shit” than the entry you messed up for months). You will eventually find yourself blocked if you carry on. Pinging @Benwing2, @Vininn126 and @AG202 who have all expressed serious concerns about this before. Theknightwho (talk) 10:19, 5 March 2024 (UTC)Reply

I mean, I don't know why they've been allowed to keep doing this. AG202 (talk) 13:13, 5 March 2024 (UTC)Reply

@AG202 At the very least, I don't think they should be autopatrolled. Theknightwho (talk) 13:15, 5 March 2024 (UTC)Reply

I agree with removing the autopatroller status. Do you mind starting a thread in the BP about this? Benwing2 (talk) 03:55, 6 March 2024 (UTC)Reply

Agreed. Vininn126 (talk) 08:26, 6 March 2024 (UTC)Reply

I don't know how to use all the templates. It's not obvious what {{{10}}} is supposed to mean, for example. So I do the best I can. Something is better than nothing, unless you prefer aesthetics to content.

Perhaps you might consider making the templates functional to people who don't know their backstory? WK is supposed to be for anyone to edit, not just for the people who create the interface.

WP suffers a similar problem with declining participation because it's become less accessible to newbies. Yes, I know I'm not a newbie here, but I am when it comes to stuff like this. If an error is made with a template, the template should explain what the error is -- "X is missing", "Y is not permitted", etc. Not gibberish like "{{{10}}}". What kind of an error message is that? If you create templates that generate gibberish, then you're going to be stuck cleaning up after people who don't understand the gibberish. kwami (talk) 05:18, 6 March 2024 (UTC)Reply

@Kwamikagami This is a volunteer-led project and things aren't perfect, including the documentation. You have been around Wikipedia for quite a while now and should know this. When you run into an issue, ask for help. I haven't seen you do this, or admit when you've made mistakes, or care about whether your changes are sloppy or break things. Instead you continue to make major changes without consensus and leave things in a messed-up state, as Theknightwho points out. I have tried to make it clear to you that Wiktionary is much more consensus-driven than Wikipedia and so you can't just push your changes without consensus and expect to win out in a war of attrition, as people often do at Wikipedia. Doing that will lead to you being blocked. But you continue to do the same thing; sooner or later this will result in you getting blocked long-term. Benwing2 (talk) 05:45, 6 March 2024 (UTC)

But this wasn't a major change, and there was no war of attrition. And I make mistakes all the time. When have I ever denied that? Yes, I should've asked for help in this case, but we're talking about one edit here, from almost a year ago. I don't know that I even noticed it at the time, but all Theknightwho had to do was say I'd left a mess and should clean it up. They didn't have to "clear up this kind of shit." If I make a mistake, just tell me. kwami (talk) 08:39, 6 March 2024 (UTC)Reply

People have and this is the exact behavior you have displayed then, too. You're not taking responsibility for what happened. Vininn126 (talk) 08:41, 6 March 2024 (UTC)Reply

@Kwamikagami has displayed this precise evasive behaviour going back years, and it clearly shows a contempt for other users. I really don’t see how they can be trusted to edit anymore, as the amount of work they create for others is enormous. The clean-up job will take months. This isn’t just one edit, and they know it’s not just about one edit, and pretending otherwise is just insulting. Theknightwho (talk) 11:18, 6 March 2024 (UTC)Reply

I've said this before, and I'll say it again. We need to stop coddling users that have shown time and time again that they will continue to behave in a disruptive manner. It makes other users not want to participate if they see that problematic users have free rein to mess things up, leaving others to clean up the mess. I don't care how long a user has been editing here, if anything, that gives more reason for stricter enforcement. This discussion shouldn't even be needed; they should've been blocked indefinitely when it was made clear many times that nothing will change. Y'all have the power to do so, so use it. Talking about it over and over again gets us nowhere. AG202 (talk) 16:48, 6 March 2024 (UTC)Reply

If there are no plans to block, please don't ping me again in these discussions, as they only frustrate me even more. AG202 (talk) 16:50, 6 March 2024 (UTC)Reply

No, I don't "know it’s not just about one edit." You've given one edit from a year ago that I don't even remember. But evidently you have seen others, so could you point some out? If it's going to "take months" to clean up my mess, you must have seen a lot of them. As I said, I'm happy to fix my errors if I know about them. I can't fix errors I don't know about. kwami (talk) 04:49, 8 March 2024 (UTC)Reply

"you already know that we don't do hard redirects in mainspace"


That's false. We've done that for years. But then, I expect such things from you any more, like claiming bad faith when I try to accommodate conflicting demands. kwami (talk) 04:34, 11 March 2024 (UTC)Reply

Really, do we @Benwing2? That's news to me, @Kwamikagami. Theknightwho (talk) 04:38, 11 March 2024 (UTC)Reply

Not sure what this is in reference to. We do generally avoid mainspace hard redirects, preferring soft redirects. Some people have argued that hard redirects are acceptable when they involve characters that are unique to a given language, on the principle that no other language could possibly have the term in question in it. Sometimes hard redirects outside of this case exist as well, but IMO they should not, and often come about because editors either don't know the rules about hard redirects or are too lazy to care. Benwing2 (talk) 04:52, 11 March 2024 (UTC)Reply

@Benwing2 In this particular case, it was redirecting ҧ‎ to ԥ, which would be like hard redirecting variant Chinese characters. Even though the literal meaning is the same, they're still different, because you can't just interchange them in every context. Theknightwho (talk) 05:28, 11 March 2024 (UTC)Reply

Agreed, in this case (and in general all cases with single characters) there should not be any hard redirects. Benwing2 (talk) 05:32, 11 March 2024 (UTC)Reply

Breath of Fresh Air


10/10 great work, heroic, excellent-- great changes on the names of these languages. --Geographyinitiative (talk) 00:20, 14 March 2024 (UTC)Reply

@Geographyinitiative Thanks! Theknightwho (talk) 00:24, 14 March 2024 (UTC)Reply

Template talk:ja-pron#Add DJR4


I do not have a right to edit Module:ja-pron, so could you please add ['DJR4'] = 'R:Daijirin4', in ref_template_name_data instead of me? Lugriaルグリア [会話／貢献] 05:43, 15 March 2024 (UTC)Reply

@Lugria

Done Theknightwho (talk) 17:29, 17 March 2024 (UTC)Reply

Thank you. Lugriaルグリア [会話／貢献] 00:49, 18 March 2024 (UTC)Reply

Invalid params in call to Template:cite-book: tr=; nocat=; termlang=; subst=; journal=


Technically I have not called parameters. But with the catch-my-attention gadget I now see suchlike in every of many templates which are both reference and citation templates, e.g. T:R:sem-eth:Littmann, including in the mainspace wherever they are included, interestingly here only for the parts choosing cite-book and quote-book, not cite-journal and quote-journal, so there is probably just an omission copypasto in some Module check. Some people do bot runs to remove alleged invalid parameters, but these are conditionally invalid and would need to be passed first to be invalid. Fay Freak (talk) 07:25, 17 March 2024 (UTC)Reply

@Fay Freak I think @JeffDoozan may be able to help - I'm not very familiar with these templates. Theknightwho (talk) 20:01, 17 March 2024 (UTC)Reply

@Fay Freak I added the attention seeking warnings to cite-book and am actively cleaning up uses of cite-book with bad params with the goal of making it more compatible with quote-book. I've adjusted cite-book to handle/ignore most of the parameters that quote-book handles, which removed the warnings on T:R:sem-eth:Littmann. JeffDoozan (talk) 20:38, 17 March 2024 (UTC)Reply

@JeffDoozan: {{R:cu:ESJS}} started throwing an invisible module error at about the same time you did this, and I suspect these changes are somehow involved. I don't really understand what's going on, but tinkering with html comments has narrowed it down to {{interval}} in the |entryurl= code throwing an error when |2= for the main template is missing, and |entryurl= not being displayed when |1= is missing. I have no clue why the module error didn't show up until now, since neither {{R:cu:ESJS}} nor {{interval}}/Module:interval have been edited recently and I didn't see anything about your edits that should have affected the |entryurl= parameter in {{cite book}} as used in this template. I'm obviously missing something. Chuck Entz (talk) 22:15, 17 March 2024 (UTC)Reply

Orthography invented for Wikt?

I deleted the Ubykh entry from ӄ, thinking it was a hoax, but after seeing other contributions by the author and your (and others') comments on their talk page, reverted myself. But the source does not support the orthography, which appears to have been invented for Wiktionary. It's one thing to give words in an invented orthography, as how else would we represent unwritten languages, but another to claim that a letter is in the alphabet when the language has no alphabet. So, what's happening? kwami (talk) 02:05, 18 March 2024 (UTC)Reply

@Thadh can comment on this better than I can. However, the Ubykh entry at ӄ appears to be a one-letter word. Deleting it and then complaining about made-up letter entries is precisely the kind of impulsive behaviour that got your autoconfirmed status stripped. Theknightwho (talk) 03:02, 18 March 2024 (UTC)Reply

Ah, yes, that went right over my head. I do wonder why we would present words in invented non-Latin alphabets on Wk-en, but it's not the error I thought it was. kwami (talk) 03:07, 18 March 2024 (UTC)Reply

@Kwamikagami: I was not working on this language, so you're better off asking the ones who did, but in case a language is not documented in anything but a linguistic transcription system, I do believe that we should normalise it to something useful. That is no different from normalising languages with inconsistent orthographies, and also something any other dictionary would do as well. Considering Ubykh was only spoken on the territory of Russia, Cyrillic is the orthography most likely used by its people, and so even if this is an "invented" orthography (I don't know about that, it could be based on some already existing one), it would be along the lines of what I would propose as well. Thadh (talk) 08:28, 18 March 2024 (UTC)Reply

There are discussions here on changing our orthography and deciding which Cyrillic letters should be used for which sounds. "How about X?"

Ubykh was only spoken in the territory of Turkey when recorded. The language was unwritten, and apparently unwritten when it was spoken in Russia. If there was any writing in Russia, it was presumably in Arabic script, not Cyrillic. AFAICT, Cyrillic alphabets were not created for W Caucasian languages until the 1930s (it was Latin in the 1920s), and the Ubykh fled en masse to the Ottoman Empire in the 1860s. kwami (talk) 17:45, 18 March 2024 (UTC)Reply

Or rather, it appears to be a one-letter morpheme. AFAIK verbal roots do not occur without affixes; that is, ӄ cannot be a word. I could be wrong about that, but the source does give the lemma as q-. By analogy with English -ly and other bound forms, should this be hyphenated as well? kwami (talk) 03:45, 18 March 2024 (UTC)Reply

"Use your brain"

I don't appreciate your tone in the edit's comment https://en.wiktionary.org/w/index.php?title=%E7%9B%B4&diff=prev&oldid=78507420

Thanks ~~ 219.91.32.60 03:14, 18 March 2024 (UTC)Reply

Module timeouts at 噫

This involves {{quote-book}}, {{ruby}} and the template parser in the Hanja section of Korean Etymology 1. Commenting out everything but the quotes doesn't fix it, but commenting out different combinations of the quotes and different combinations of {{ruby}} gives wildly different results- either everything takes less than a second or each instance of {{ruby}} takes over a second. It's some kind of interaction, but I haven't had time to track it down. Chuck Entz (talk) 15:14, 20 March 2024 (UTC)Reply

@Chuck Entz Thanks - I've narrowed it down, and it's actually Module:template parser after I did a major rewrite yesterday in order to speed it up; on the pages I tested (including a and 人) it's about 50% faster than before. However, there's definitely some kind of bug which the code in the Korean section of that page is triggering; I'm not sure if it's a true infinite loop or some kind of catastrophic backtracking, because certain truncated snippets of the page code take way longer than expected, even though they do return before timing out. For comparison, the old version could parse this page hundreds of times without timing out (though still a lot fewer times than I expected, given the length, suggesting there's something weird in the code causing a lot more backtracking than usual).

The reason the page is throwing errors right at the start is because I recently (before this latest update) added full heading parsing capabilities as well, because that needs to be totally accurate for things like automatic self-links, Category:Entries with incorrect language header by language etc. to work properly. Templates find out what section they're in by getting a section count from the native parser, and then calling some function which counts all the headers it sees on the page, then cross-comparing. However, doing a naive parse for anything that looks like a heading can result in false positives (and, in very weird situations, false negatives); for example, headers nested inside templates don't iterate the native parser's section count, since the count is done before templates are expanded, which does sometimes occur with pages using {{multitrans}}. The new function in Module:template parser does the count properly, since the native parser also happens to parse headings and templates at the same time. (Incidentally, this is why section edit buttons don't appear next to headings inside templates.)

To avoid massive repetition of work, the section counts for every L2 are calculated by Module:headword/data and stored for other modules to access, which means that the template parser is now being called on basically every page. Thankfully, whatever code on 噫 is triggering this bug must be quite rare. Theknightwho (talk) 16:27, 20 March 2024 (UTC)Reply
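
The false-positive problem described above can be illustrated with a minimal sketch. It is written in Python rather than the Lua used by Module:template parser, the names `naive_heading_count` and `top_level_heading_count` are hypothetical, and it only tracks `{{`/`}}` nesting line by line, which is far cruder than what the real parser does.

```python
import re

# A line that looks like a wikitext heading: ==Title== up to ======Title======
HEADING = re.compile(r"^(={1,6})[^=].*?\1[ \t]*$")

def naive_heading_count(text):
    """Count every line that looks like a heading, wherever it appears."""
    return sum(1 for line in text.split("\n") if HEADING.match(line))

def top_level_heading_count(text):
    """Count only headings that are not nested inside a template call."""
    depth = 0  # current {{ ... }} nesting depth
    count = 0
    for line in text.split("\n"):
        if depth == 0 and HEADING.match(line):
            count += 1
        depth += line.count("{{") - line.count("}}")
        depth = max(depth, 0)  # tolerate stray closing braces
    return count

# Hypothetical page: the Translations heading sits inside a template argument.
page = "\n".join([
    "==English==",
    "{{multitrans|data=",
    "====Translations====",
    "}}",
    "==French==",
])

print(naive_heading_count(page), top_level_heading_count(page))
```

On this input the naive count sees three headings while the nesting-aware count sees two, which is the kind of mismatch that would throw off a cross-comparison against the native parser's section count.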

@Chuck Entz I've fixed the bug, which related to instances of [[[ appearing on the page, which occurs in some of the rubytext code; each time it occurred, the problem would get exponentially worse. I'm not sure what was causing the old version to be so slow, but the new version is about 10 times faster (which is the expected time for a page of that length). Theknightwho (talk) 17:33, 20 March 2024 (UTC)Reply

"you couldn't even be bothered to correct the lowercase form"

<Ꜹ> is the usual capital form of <ꜷ>, so it's not as simple as blindly following Unicode. kwami (talk) 20:02, 21 March 2024 (UTC)Reply

template parser error

Hi, please take a look at CAT:E; there are 38 errors currently and many of them refer to an undefined __pairs() function. Benwing2 (talk) 03:27, 22 March 2024 (UTC)Reply

@Benwing2 Thanks - fixed. To reduce traversal time, the new version of the Wikitext:new() method will try to avoid generating wikitext nodes where they're not necessary, since they're only needed as containers for other nodes: e.g. the raw input "foo{{bar}}baz" becomes the wikitext node { "foo", {{bar}}, "baz" }, where {{bar}} is a template node. It does this in two ways: firstly, by checking that a container is actually needed (e.g. if the input stack layer is { "foo" } or { {{foo}} }, it simply outputs whatever's at index 1), and then it calls pcall(table.concat, layer), which converts inputs like { "foo", "bar", "baz" } to "foobarbaz".

Wikitext:new() is also called on the final output from the parse, but this was causing problems with plain inputs since they were getting converted to strings, which the iterator functions weren't expecting. To get around this, I've added a force_node parameter to Wikitext:new() that causes the output to be wrapped in a Wikitext node if it wouldn't otherwise have been a node (e.g. { "foo", "bar", "baz" } becomes { "foobarbaz" }, but { {{foo}} } still becomes {{foo}}).

As a side point, I'm going to see if it's faster to compress inputs like { "foo", "bar", {{baz}} } to { "foobar", {{baz}} }, but I suspect it'll be a false economy. Theknightwho (talk) 15:57, 22 March 2024 (UTC)Reply
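
The collapsing rules described in this reply can be sketched roughly as follows. This is Python, not the Lua of Module:template parser; `Template` and `new_wikitext` are hypothetical stand-ins, and the all-strings check plays the role that `pcall(table.concat, layer)` plays in the description above.

```python
class Template:
    """Stand-in for a template node such as {{bar}}."""
    def __init__(self, name):
        self.name = name

def new_wikitext(layer, force_node=False):
    """Return the cheapest representation of a stack layer.

    Strings are joined; a lone node is returned as-is; a container
    (modelled here as a list) is only built when really needed.
    """
    if all(isinstance(item, str) for item in layer):
        joined = "".join(layer)
        # With force_node, plain text is still wrapped in a container.
        return [joined] if force_node else joined
    if len(layer) == 1:
        # A single non-string child is already a node: no container needed.
        return layer[0]
    return list(layer)

bar = Template("bar")
mixed = new_wikitext(["foo", bar, "baz"])              # stays a container
plain = new_wikitext(["foo", "bar", "baz"])            # becomes one string
forced = new_wikitext(["foo", "bar", "baz"], force_node=True)
single = new_wikitext([bar], force_node=True)          # still just the node
```

The point of `force_node` in this sketch is that the final output of a parse is always something the iterator functions can walk, while intermediate layers stay as cheap as possible.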

Lua errors

Errors appear in {{l|*|tr=}}. Something like Lua error in Module:headword/data at line 1134: table index is nil
Seemingly caused by recent edits in Module:headword. 17lcxdudu (talk) 07:34, 26 March 2024 (UTC)Reply

Seems to be solved now. 17lcxdudu (talk) 10:33, 26 March 2024 (UTC)Reply

Template parser errors

There are some errors now in Wiktionary space reading e.g. * Lua error in Module:template_parser at line 54: 'for' limit must be a number. Benwing2 (talk) 22:35, 28 March 2024 (UTC)Reply

@Benwing2 Thanks - I've reverted the change which caused this, since it was just a minor time-saving measure anyway. Theknightwho (talk) 22:43, 28 March 2024 (UTC)Reply

ancestor vs. parent terminology

Now that we're renaming "etymology-only language" to "language variety" (and similarly, "etymology-only family" will become "family variety"), I think it's time to revisit the terms "parent" and "ancestor". These terms are confusing to me because normally in computer science, an "ancestor" is simply an nth level parent where n >= 1, i.e. it could be a parent, parent's parent, etc. With the current definitions of parent and ancestor, however, there's no clear way to refer to the nth-level parent or to distinguish first-level ancestor from nth-level ancestor. I propose renaming these terms:

parent -> containment parent
ancestor -> inheritance parent

Correspondingly, ancestor becomes either containment ancestor or inheritance ancestor. IMO these terms clarify the conceptual difference between the current "parent" (expresses a containment relationship, i.e. "Y is X's parent" means "X is-a Y") and "ancestor" (expresses an inheritance relationship, i.e. "Y is X's ancestor" means "X inherits terms from Y", either directly or through a chain, depending on whether a first-level or nth-level "ancestor" is intended). I frequently get confused about them given the current names (and I imagine I'm not the only one). Thoughts? Another way to think about it is that "containment" is a spatial relationship while "inheritance" is a temporal relationship. Benwing2 (talk) 08:44, 5 April 2024 (UTC)Reply

For my part, I would find "containment parent" and "inheritance parent" more confusing than "parent" and "ancestor". Maybe "containment parent" is a phrase that makes sense to programmers?—though when I google it, the results even on sites about code are like { containment: "parent" }, collocations of the two words rather than use of "containment parent"—but I would love to know if it makes sense to anyone else. Would it work to just rename Module:etymology languages/data's "parent" field to something like "IsAVarietyOf", so it's clearly different from "ancestor"? (Would I be right to infer that "parent" is the more problematic of the two names? Latin really is an "ancestor" of French, but it's perhaps confusing to call Latin, the L2 that includes New Latin, the "parent" of Old Latin; Latin is just the thing Old Latin is a variety of.) - -sche (discuss) 06:00, 7 April 2024 (UTC)Reply

@-sche Yeah I have made up the term "containment parent" as an attempt to convey that it's (a) in a containment ("is-a") relationship, and (b) a direct parent (vs. an indirect or any-level parent). I would somewhat agree that "parent" is maybe the more problematic, but "ancestor" is also problematic in that there's then no way to distinguish an nth-level "parent" or "ancestor" from a single-level one. But I think if we can ditch the term "parent" it will be a good first start. I think "is-a-variety-of" (in whatever casing style), or just "is-a", will work in place of "parent", and maybe we use "direct" (or "single-level") vs. "any-level" to distinguish what in computer science terminology are normally termed "parent" and "ancestor" vis-a-vis trees. Then hasParent becomes isAVarietyOf (at any level, unless a flag direct/singleLevel is passed in), and similarly hasChild becomes hasVariety, etc. Benwing2 (talk) 06:21, 7 April 2024 (UTC)Reply

Hmm, what are the situations where we need a way to refer to the 1st vs 2nd vs 4th level ancestor, that "containment..." vs "inheritance..." would be more helpful with than just calling them "1st level ancestor", "2nd level ancestor", etc (or "ancestor-lvl-1" or whatever other name/casing)? - -sche (discuss) 16:34, 7 April 2024 (UTC)Reply

@-sche Maybe my last response wasn't phrased well, as I'm proposing using your terminology of essentially "is-a-variety-of" and "ancestor" rather than "containment"/"inheritance". Benwing2 (talk) 19:18, 7 April 2024 (UTC)Reply

@-sche What do you think of replacing "parent" with "container" rather than "isAVarietyOf" or similar? Then we have "containers" and "ancestors" which hopefully is clear enough. Benwing2 (talk) 09:13, 11 April 2024 (UTC)Reply

It still seems a little odd to call e.g. Latin a "container", but it's not being used in anything reader-facing, yes? so I guess it's fine, it's intelligible what it means. - -sche (discuss) 21:12, 11 April 2024 (UTC)Reply

@-sche Yeah I've sort of thought better of this. Maybe "isAVarietyOf" is fine. User:Theknightwho what do you think? Benwing2 (talk) 23:48, 11 April 2024 (UTC)Reply
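
For readers following the naming discussion, here is a rough sketch of the two relationships and of the proposed direct/any-level flag. It is Python rather than the Lua of the language modules, and the tables `VARIETY_OF` and `ANCESTOR`, their contents, and the function names are purely illustrative assumptions, not the real data in Module:etymology languages/data.

```python
# Illustrative data only. "is a variety of" is the containment relationship
# (currently called "parent"); "ancestor" is the inheritance relationship.
VARIETY_OF = {"Old Latin": "Latin", "New Latin": "Latin"}
ANCESTOR = {"French": "Middle French", "Middle French": "Old French",
            "Old French": "Latin"}

def in_chain(table, start, target, direct=False):
    """True if target is reachable from start by following table.

    With direct=True only the first step is considered.
    """
    current = table.get(start)
    if direct:
        return current == target
    seen = set()
    while current is not None and current not in seen:
        if current == target:
            return True
        seen.add(current)  # guard against accidental cycles in the data
        current = table.get(current)
    return False

def is_a_variety_of(name, other, direct=False):
    return in_chain(VARIETY_OF, name, other, direct)

def has_ancestor(name, other, direct=False):
    return in_chain(ANCESTOR, name, other, direct)
```

With this shape, `has_ancestor("French", "Latin")` holds at any level but not with `direct=True`, while `is_a_variety_of("Old Latin", "Latin", direct=True)` holds at the first level, so one flag covers the first-level versus nth-level distinction for both relationships.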

Template parser fails in CAT:E

I've only been doing spot-checks so far, but it looks like all the invocations of Module:zlw-lch-headword and Module:pt-headword outside of mainspace are throwing errors, though most of them haven't hit the CAT:E page itself yet. The usual "string expected, got nil" stuff. Chuck Entz (talk) 18:05, 6 April 2024 (UTC)Reply

@Chuck Entz The error here lies somewhere in Module:headword/page, since it's feeding nil into the template parser. @Benwing2 may know better than me why that's happening, as he recently split out whole-page processing from Module:headword/data. I'll see what I can do, though. Theknightwho (talk) 18:15, 6 April 2024 (UTC)Reply

@Chuck Entz I added a better error in Module:headword/page, but this really needs fixing in a bunch of headword modules (almost all written by me) because I've been sloppy in handling the distinction between subpage names, pagenames without namespace prefixes, and full pagenames with namespace prefixes. The reason this doesn't occur on mainspace pages is that generally all three of the preceding are the same for them. An additional complication arises from unsupported titles. I put in a preliminary fix for the Lechitic, Portuguese and French headword modules but there are many more, and the fixes I put in for those three modules need to be redone. I will see if within the next day or so I can come up with the proper solution and then implement this for at least the 10 most major headword modules. Yet another reason why we really need a headword module library/framework, so when we discover issues like this we don't have to repeat the fix across 100 modules, each slightly different (you are in a maze of twisty little passages, all alike ...). But such a beast will be tricky to design because there are so many variations in the way existing headword modules work (some of which can and should be harmonized but some of which are due to language-specific differences). Benwing2 (talk) 05:14, 7 April 2024 (UTC)Reply
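
The three kinds of page name mentioned above can be told apart with a small sketch. It is Python, not the Lua used by the headword modules; `split_title` and the `NAMESPACES` subset are hypothetical, it assumes subpages are only split off outside mainspace, and it ignores unsupported titles entirely.

```python
# Hypothetical subset of namespace prefixes, for illustration only.
NAMESPACES = {"Reconstruction", "Wiktionary", "Template", "Module",
              "User", "Appendix"}

def split_title(full_pagename):
    """Return the full pagename, the pagename without namespace prefix,
    and the subpage name for a given title."""
    prefix, sep, rest = full_pagename.partition(":")
    if sep and prefix in NAMESPACES:
        namespace, pagename = prefix, rest
    else:
        namespace, pagename = "", full_pagename
    # Assumption: only non-main namespaces have subpages.
    subpage = pagename.rsplit("/", 1)[-1] if namespace else pagename
    return {"full": full_pagename, "pagename": pagename, "subpage": subpage}

recon = split_title("Reconstruction:Proto-Slavic/para")
main = split_title("para")
```

For the mainspace title all three values coincide, which matches the observation that the sloppiness only shows up outside mainspace; for the Reconstruction title they are three different strings.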

bug in Module:string

Your recent change to Module:string has caused some issues, e.g. in Federweißer, where {{alt-de-ch}} is throwing an error that it didn't use to. Benwing2 (talk) 23:47, 11 April 2024 (UTC)Reply

FYI, this diff [3] caused the error. Benwing2 (talk) 03:16, 12 April 2024 (UTC)Reply

@Benwing2 My bad - I'd put replacement_escape(pattern) instead of replacement_escape(replace)... Theknightwho (talk) 08:22, 12 April 2024 (UTC)Reply
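
Why the two arguments need different escaping can be shown with a Python analogue. This is not the Lua code of Module:string: the function names simply mirror the ones mentioned above, and in Python's `re.sub` the special character in a replacement is the backslash, whereas in Lua's `gsub` it is `%`.

```python
import re

def pattern_escape(text):
    """Escape text so it matches literally when used as a pattern."""
    return re.escape(text)

def replacement_escape(text):
    """Escape text so it is inserted literally when used as a replacement."""
    return text.replace("\\", "\\\\")

def plain_replace(source, pattern, replace):
    """Literal find-and-replace built on the regex engine."""
    return re.sub(pattern_escape(pattern), replacement_escape(replace), source)

# The replacement is the two characters backslash + n, meant literally.
literal = plain_replace("a+b", "+", "\\n")

# Without escaping the replacement, the engine reads it as a newline.
unescaped = re.sub(pattern_escape("+"), "\\n", "a+b")
```

Each escape function has to be applied to its own argument: passing the pattern through the replacement escaper leaves the real replacement string unprotected.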

recent breakages

One of your recent changes has broken {{#invoke:User:MewBot|getLanguageData}}. I get this error from my Python script:

Lua error in Module:string_utilities at [[Module:string_utilities#L-621|line 621]]: bad argument #1 to '_find' (string expected, got table)

. This is coming from getScriptCodes in Module:languages. Can you take a look? Benwing2 (talk) 01:26, 21 April 2024 (UTC)Reply

Also Module:data consistency check has broken in that it doesn't know about Hants. Benwing2 (talk) 01:29, 21 April 2024 (UTC)Reply

@Benwing2 Seems to be something to do with the other special code "All" - I'll have a look. Theknightwho (talk) 01:38, 21 April 2024 (UTC)Reply

@Benwing2 Should be fixed: Module:languages/data/all had special handling of "All" so that it returned a table with every script code, but I changed Module:languages so that it only expects a string from the raw data. I didn't notice it because Module:languages/data/all isn't used in mainspace. Theknightwho (talk) 01:47, 21 April 2024 (UTC)Reply

Thanks! Benwing2 (talk) 01:49, 21 April 2024 (UTC)Reply

Just FYI, it broke again with Lua error in Module:scripts at [[Module:scripts#L-252|line 252]]: bad argument #1 to 'pairs' (table expected, got nil). Benwing2 (talk) 06:13, 22 April 2024 (UTC)Reply

This time coming from {{#invoke:User:MewBot|getScriptData}}. Benwing2 (talk) 06:16, 22 April 2024 (UTC)Reply

@Benwing2 Thanks for fixing this.

This reminds me of a change I'd like to make to language/script/family objects, which is to replace some of the methods with literal data in cases where the result is simply a string, as it adds unnecessary overhead that starts to become noticeable in loops (e.g. lang:getCode() is about 7 times slower than simply accessing lang._code, and I don't see any ways to optimise it). Even accessing via a metamethod is about twice as slow, which increases to 25 times as slow when comparing require to mw.loadData. Obviously this only starts to make a difference across hundreds of thousands/millions of calls, but you reach that pretty quickly with anything that iterates over large amounts of data.

I assume the logic of the current setup is to ensure that the data remains immutable by hiding where it's really stored, but we don't seem to have any issues with title or frame objects, even though you could trash a frame object by changing frame.args or whatever. Theknightwho (talk) 13:13, 22 April 2024 (UTC)Reply

Thanks for the help

Thanks for cleaning up my new entries. I appreciate the help learning the local conventions. BillHPike (talk) 02:41, 21 April 2024 (UTC)Reply

@BillHPike No problem - let me know if you need help with anything. Theknightwho (talk) 02:46, 21 April 2024 (UTC)Reply

affix breakage

OK, I fixed the issue with Module:scripts but there's breakage in Module:affix due to your recent changes. See for example кемитме, where the use of an embedded link is causing the page to get added to Category:Kyrgyz terms prefixed with Unsupported titles/кемитүү`vert`кемит-. You might want to add a check in the category-formatting code for unsupported titles showing up in categories, because this is essentially never correct; if you throw an error in that case, you'll discover bugs related to insufficiently handled embedded links more quickly. Benwing2 (talk) 06:26, 22 April 2024 (UTC)Reply

@Benwing2 Thanks - this was actually caused by my changes to export.get_fragment in Module:links a few days ago: I separated out most of the work into a local function, with the externally-callable function now only doing a pre-screen for embedded links before it calls the local one (since that check isn't necessary when calling get_fragment locally). As part of that, I made the check a bit smarter so that it removes redundant links, but I must have forgotten about piped links because "[[кемитүү|кемит-]]" was being converted to "кемитүү|кемит-", so it was as though the input was {{affix|ky|кемитүү{{!}}кемит-|-ма|alt2=-ме|t1=to add}}. Theknightwho (talk) 13:41, 22 April 2024 (UTC)Reply
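
The piped-link slip can be reproduced with a short sketch. It is Python rather than the Lua of Module:links, `naive_strip` and `strip_links` are hypothetical names, and keeping the display text is only one plausible way to treat a piped link; the actual fix may handle it differently.

```python
import re

# [[target]] or [[target|display]]
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

def naive_strip(text):
    """Drop the brackets only: this is how the stray pipe leaks through."""
    return text.replace("[[", "").replace("]]", "")

def strip_links(text):
    """Replace each link with the text a reader would actually see."""
    def repl(match):
        target, display = match.group(1), match.group(2)
        return display if display is not None else target
    return LINK.sub(repl, text)

piped = "[[кемитүү|кемит-]]"
leaked = naive_strip(piped)    # "кемитүү|кемит-"
cleaned = strip_links(piped)   # "кемит-"
```

The leaked form is exactly the string that ended up in the category name above, which is why a check for a literal pipe (or for unsupported-title markers) in category names would surface this class of bug quickly.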

your snarky comments

Why don't you make an effort so the rest of us have a guideline to follow? I did make an effort there, but couldn't figure out how to format it; rather than formatting it badly, I left it unformatted. Or do you really prefer corrupted entries?

Where is our MOS on how to format this stuff? kwami (talk) 12:13, 22 April 2024 (UTC)Reply

For me, per J3133's demand, I moved the quote to its own section (since it doesn't illustrate 'me' as a subject). No idea what the definition should be, though, since it's a fixed phrase, so I tagged it as definition needed. kwami (talk) 12:20, 22 April 2024 (UTC)Reply

@Kwamikagami I would have more sympathy if this was a problem that a lot of long-term users experience, but it isn't; you keep leaving problems that other users have to clear up over and over, and it is extremely tiresome to deal with because I see no indication that it's ever going to improve. Even if you were still doing it, I would mind a lot less if you showed some appreciation for why it's a problem.

If you don't know how to do something, you know about the WT:Information desk, WT:Beer parlour and WT:Grease pit, where questions are regularly answered every day. You could also add {{attention|langcode}}, which is invisible but categorises the entry in (e.g.) Category:Requests for attention concerning Translingual, though YMMV on it being dealt with, as some languages are patrolled a lot more often than others. If it's a larger issue, there is also {{rfc|langcode|optional reason}} (Requests for Cleanup), which should go under the L2 header (or sometimes the relevant etymology heading if it's a long entry), which adds a banner requesting clean-up and categorises the entry in (e.g.) Category:Requests for cleanup in Translingual entries. There's also {{rfc-sense|langcode}}, which is used when you only want to tag a specific sense: it adds the category, but it uses inline formatting instead of the banner. You could also list it at WT:Requests for cleanup, but it's probably not a good idea to do that if the problem is something you've just added yourself. Theknightwho (talk) 12:41, 22 April 2024 (UTC)Reply

Wiktionary:Todo/Lists/Template language code does not match header (sorted by language)#Assiniboine (asb) Chuck Entz (talk) 13:34, 22 April 2024 (UTC)Reply

I'll clean that up. kwami (talk) 13:41, 22 April 2024 (UTC)Reply

But also Wiktionary:Christmas Competition 2017. Chuck Entz (talk) 13:37, 22 April 2024 (UTC)Reply

@Chuck Entz I'll deal with this. It's something to do with my recent changes to :findBestScript() in Module:languages; the only change I made that could have affected outputs was that it now only removes ASCII punctuation before doing the count, instead of everything categorised as punctuation by Unicode, since some of them are script-specific. It's not clear how that could be causing issues here, but it's got to be something to do with ‘ in "c‘k‘seir", which is non-ASCII punctuation. Theknightwho (talk) 13:48, 22 April 2024 (UTC)Reply

Turns out this was a completely unrelated issue caused by a change to Armenian transliteration. Now fixed. Theknightwho (talk) 14:08, 22 April 2024 (UTC)Reply

In this case, you complained before anyone on those boards would've had a chance to respond.

It's ridiculous to have to go to those boards over and over again, or to rely on you to tell me which template to use for every little thing. Where are the basic rules for how to construct an entry? Where's the outline to follow? I don't mind formatting things properly, my problem is having to hunt through an unorganized mishmash. Sometimes I can find what I'm looking for, often I can't. You complain about me leaving a mess, but all you offer are one-off instructions. What do I do next time, leave a mess and drop a note on your talk page so you can tell me how to clean it up? That's not a solution. kwami (talk) 13:47, 22 April 2024 (UTC)Reply

BTW, telling me once isn't enough. I'll forget. Or do the wrong thing because that's what someone else told me to do. There needs to be someplace a normal user can go to figure things out on their own without bothering others. kwami (talk) 13:53, 22 April 2024 (UTC)Reply

Quit whinging. Provide me guidelines to follow, and I'll follow them. If there are no guidelines, except scattered around where only the people who wrote them know to find them, then I'm not going to have much luck getting them right. kwami (talk) 10:13, 23 April 2024 (UTC)Reply

@Kwamikagami I have already told you where to go. Stop demanding things from me.

I also don’t know what prompted this extra comment, but right now you seem to be blaming me for your years (and it is years) of lazy editing in which you have blatantly ignored the norms of the site because you can’t be bothered to learn them, and bleating about how it’s impossible to learn is just your trademark abdication of responsibility for everything you do, since no-one else seems to suffer from this problem. Many, many other editors have expressed the same opinion I have of your edits, so stop acting like you’re a victim and maybe do some reflection on why everyone seems to think you’re a problem-editor. Theknightwho (talk) 11:26, 23 April 2024 (UTC)Reply

It was in response to yet another of your complaints.

If you've told me where a guideline is, I apologize. I have no memory of that. Could you tell me again? I'll put a link on my user page. Also, it would be useful if you linked to it from the Help page, so that people could find it without demanding it of you.

Expressing frustration is not "acting like a victim". I'm expressing frustration at your constant criticisms without providing any remedy for rectifying what you criticize. kwami (talk) 11:39, 23 April 2024 (UTC)Reply

In that particular case, we'd just had a problem with mismatching from placing templates for specific languages under 'translingual'. But now I'm supposed to do that. How am I supposed to know when to do something that's had a history of causing problems and shouldn't be done? Guidelines. We need guidelines. The people who decide on them should write them up so the rest of us know what to do. kwami (talk) 10:20, 23 April 2024 (UTC)Reply

strange transclusions

Reconstruction:Proto-Slavic/para is transcluding {{tl-adj}} and other strange things. I think what must be happening is the code at Module:headword/page is transcluding para rather than Reconstruction:Proto-Slavic/para, which contains Tagalog and other entries. Do you have any idea what's going on? I will also take a look. I factored out Module:headword/page from Module:headword/data but I don't completely understand the code in it. Benwing2 (talk) 22:59, 22 April 2024 (UTC)Reply

@Benwing2 I think it's because of {{desctree|zlw-opl|para|id=steam|t=steam}}. When the template parser iterates over templates, it normalises the returned template name by grabbing the title object and resolving redirects etc. It avoids having to worry about arbitrary (non-ASCII) whitespace or what counts as a valid title, and so on. If a given template turns out to be invalid, it's skipped. Theknightwho (talk) 23:07, 22 April 2024 (UTC)Reply

I see. So I guess it's harmless but there's no way to avoid the templates showing up as transclusions? Benwing2 (talk) 23:20, 22 April 2024 (UTC)Reply

@Benwing2 The best way is probably for desctree to be more precise about what it puts through the template parser. The get_section function in Module:utilities is probably the best way to do it, as you can tell it you want the Descendants section under Foo L2. Theknightwho (talk) 23:25, 22 April 2024 (UTC)Reply

I see, thanks, something to work on :) ... Benwing2 (talk) 23:30, 22 April 2024 (UTC)Reply

I

you said my transcription wasn't phonemic what do you mean by that? HistorienCanadien (talk) 20:03, 2 May 2024 (UTC)Reply

@HistorienCanadien Phonemic transcriptions should only show features that are phonemically contrastive in a language: /äi̞̯/ isn't correct as a phonemic transcription, because it implies it could be contrastive with /äi̯/, /äi/ or /ai/, but these distinctions don't exist in Canadian English. Even as a phonetic transcription, [äi̞̯] is way too narrow. Theknightwho (talk) 20:10, 2 May 2024 (UTC)Reply

why way too narrow? HistorienCanadien (talk) 20:19, 2 May 2024 (UTC)Reply

@HistorienCanadien [äi̯] is plausible, but there's no way that everyone who speaks Canadian English says [äi̞̯]. It's way too fine a distinction. Theknightwho (talk) 20:23, 2 May 2024 (UTC)Reply

i was editing General American not Canadian. unless another General American speaker says their pronunciation is different i feel that my phonetic transcription should stay HistorienCanadien (talk) 20:48, 2 May 2024 (UTC)Reply

@HistorienCanadien It's way too narrow for General American as well. Your position is not the default until proven otherwise - that's not how it works. Theknightwho (talk) 20:55, 2 May 2024 (UTC)Reply

@HistorienCanadien: Hi, GenAm speaker here. [äi̞̯] is too narrow for our current practices. As for /ʌ/ in GenAm, there's currently not a consensus, and there've been many discussions about it. I support changing it to the schwa, but there are competing arguments. I'll point you to the vote that's yet to start, which has links to previous discussions: Wiktionary:Votes/2023-12/Represent the GenAm NURSE and STRUT vowels as schwa. AG202 (talk) 02:43, 3 May 2024 (UTC)Reply

old lombard

idk why old lombard wasn't working, as i put (Old Lombard) in articles like Jesu, zinqui, and giaçça. That Northern Irish Historian (talk) 04:11, 10 May 2024 (UTC)Reply

◌͛

Thanks for catching that. It should probably be under Translingual. Texts on Lithuanian dialectology are primarily but not exclusively written in Lithuanian. It would also be fine IMO to list it under Lithuanian, but if so we should move the other Lithuanian-dialectology conventions to Lithuanian headers as well. kwami (talk) 21:11, 10 May 2024 (UTC)Reply

Edit summary on don't tread on me

FWIW, some of the things you removed at that entry weren't added by me. And was that long and that big a swipe really necessary? Purplebackpack89 21:18, 25 May 2024 (UTC)Reply

@Purplebackpack89 It was constructive criticism, so please take it on board. Theknightwho (talk) 21:21, 25 May 2024 (UTC)Reply

Please stop following me around. Purplebackpack89 21:25, 25 May 2024 (UTC)Reply

@Purplebackpack89 You commented on my talkpage haha. If you're referring to me removing your invalid vote, that's because you had just posted about it on the WT:Beer parlour (i.e. the most public place on Wiktionary), and it's completely fair to remove procedurally invalid votes. I suggest you stop trying to pick fights like this. Theknightwho (talk) 21:29, 25 May 2024 (UTC)Reply

ME the one picking fights? You tinkered with my edits in three different places in the span of only a few minutes, which is only explicable if you're intentionally checking out my edits. Leave me alone. If my edits need modification, somebody else can do it. Purplebackpack89 21:33, 25 May 2024 (UTC)Reply

@Purplebackpack89: You have legally licensed your contributions according to the free wiki licence and anybody may edit and improve them. Equinox ◑ 21:35, 25 May 2024 (UTC)Reply

That shouldn't be interpreted as a license to harass. Other Wikimedia projects have an anti-harassment policy for a reason...

But I suppose you think this is the Wild West where you and Knight think you're entitled to harass me, call me a moron, and try to ride me out of town on a rail.

Well, that's not going to work. I'm staying and I'm going to try to enact policies that clean up behavior. Purplebackpack89 21:45, 25 May 2024 (UTC)Reply

@Purplebackpack89 Two of the edits were correcting things you did that were procedurally invalid, and one offered you constructive criticism about how to make better edits in the future. Please grow thicker skin. What you're doing right now is pretty clear-cut misuse of the anti-harassment policy. Theknightwho (talk) 21:50, 25 May 2024 (UTC)Reply

It was procedurally invalid for me...

...to close a discussion that was itself procedurally invalid? I closed an RFD that should never have been started because the OP wasn't asking for anything to be deleted.

Also, on what planet do you think that saying things like "grow thicker skin" is going to make me MORE disposed to listening to you? Purplebackpack89 21:53, 25 May 2024 (UTC)Reply

@Purplebackpack89 It's very clear, at this point, that there is no combination of words I could say that you would actually listen to. I will continue to change bad edits as I encounter them, and I will leave alone good ones; I don't really care who made them. Theknightwho (talk) 21:56, 25 May 2024 (UTC)Reply

If you're editing PAGES, that's fine. If you're snooping through an individual EDITOR'S contributions on multiple WikiProjects (which you've more or less said you're doing), that's harassment Purplebackpack89 22:04, 25 May 2024 (UTC)Reply

@Purplebackpack89: Absolutely untrue. Editor contributions are publicly shared on purpose. Seeing one bad edit by a user may be a red flag that others also need attention. Equinox ◑ 22:08, 25 May 2024 (UTC)Reply

@Purplebackpack89 All three were on, or had been linked to from, highly prominent public forums in recent threads, so it's not particularly surprising that I came across them. I only checked your contributions after I noticed you were picking fights with multiple editors (not least because I have Equinox and SGConlaw's talkpages on my watchlist). It's very clear that the only thing you're really interested in here is feeling like you're winning the argument, or that you're the victim, or some similar self-focused aim that's purely about your ego, but that's not something any of the rest of us have any obligation to care about. Feel free to keep complaining on my talkpage, but I won't be responding any further. Theknightwho (talk) 22:10, 25 May 2024 (UTC)Reply

@Purplebackpack89 That isn't how it works - you don't get to declare yourself immune from scrutiny, particularly when you decided to start making waves on the most highly-trafficked pages. I see you have a long history of this kind of behaviour here and on Wikipedia, too. Theknightwho (talk) 21:36, 25 May 2024 (UTC)Reply

A search for Purplebackpack and "hounding" finds many examples of his false hyperbolic claims of "harassment": [4]. Equinox ◑ 22:10, 25 May 2024 (UTC)Reply

You did that search WHY? I stand by each of those claims 110%. Also, that's another personal attack... Purplebackpack89 22:21, 25 May 2024 (UTC)Reply

hot dog too?

Knight, if you wanted to give the impression that you're NOT stalking me, you're failing. And that one isn't a high-profile page or your previous excuse. You're clearly stalking my edits Purplebackpack89 10:48, 26 May 2024 (UTC)Reply

Furthermore, some people think of a hot dog as a sandwich and some do not. It's not necessary to include the word "sandwich" in the definition. I see that your edit had to be modified by another editor only a few minutes after you made it. Purplebackpack89 10:51, 26 May 2024 (UTC)Reply

@Purplebackpack89 The only harassment I see here is from you having a meltdown any time someone tries to correct you: “entree” was unhelpful at best. Theknightwho (talk) 11:15, 26 May 2024 (UTC)Reply

Sandwich is controversial and had to be removed. Hence why your edit was modified only six minutes later. Purplebackpack89 11:21, 26 May 2024 (UTC)Reply

@Purplebackpack89 To something else - again, this just reinforces the fact the only thing you actually care about is feeling like you’re right. Theknightwho (talk) 11:23, 26 May 2024 (UTC)Reply

Untrue. I care about Wiktionary having as many good, uncontroversial definitions as possible. Purplebackpack89 11:29, 26 May 2024 (UTC)Reply

Stalking/harassment, again

You really don't care about perception, do you? You're clearly stalking/harassing me at this point
Some of those deletions are...questionable. For example, it is acceptable to redirect a conjugation of a verb to an entry of an infinitive or another conjugation of a verb

Purplebackpack89 15:47, 27 May 2024 (UTC)Reply

Reviewing someone's contributions is not stalking or harassment. Theknightwho (talk) 15:49, 27 May 2024 (UTC)Reply

In this case, it is. PERIOD. If the contributions were that egregiously wrong, somebody else can deal with them. Again, you don't seem to care how I feel or how you're perceived. Purplebackpack89 15:59, 27 May 2024 (UTC)Reply

I review lots of users' contributions - it's nothing particular about you. Theknightwho (talk) 16:01, 27 May 2024 (UTC)Reply

The "duck test" would suggest it IS something particular about me. Please also address concern #2 Purplebackpack89 16:07, 27 May 2024 (UTC)Reply

Try reviewing my contributions, and see how many of them relate to you. Theknightwho (talk) 16:22, 27 May 2024 (UTC)Reply

@Purplebackpack89: Let me point out that you are accusing TKW of bad faith and of lying about it. That's basically a personal attack and should be your last resort, not the first thing you say when someone reverts or deletes your edits. You've been doing this kind of thing constantly for as long as I can remember. It annoys people and seriously damages your credibility. Chuck Entz (talk) 15:19, 29 May 2024 (UTC)Reply

Maybe the language was a little too severe, but @Chuck Entz I stand by my assertion that Knight was overdoing it and didn't seem to have any regard for my feelings. And remember I'm not the only editor who has problems with Knight. This project has a problem with "the guardians" not being nice to other editors and Knight is part of that problem. Purplebackpack89 16:29, 29 May 2024 (UTC)Reply

@Chuck Entz At BEST, Theknightwho either doesn't know or doesn't care about de-escalation. The beer parlour thread has proven it...several editors have given him suggestions on how to do it and he's ignored them. Theknightwho could've easily avoided any appearance of harassment if he avoided editing pages I created for a while. Instead, he literally did the exact opposite. Purplebackpack89 13:11, 2 June 2024 (UTC)Reply

Sorry to butt in here. I'm going to look through Purple's contributions. Denazz (talk) 22:42, 27 May 2024 (UTC)Reply
Yep, there is lots of crap made by Purply. Thanks for being a vigilant editor, TKW! Denazz (talk) 22:43, 27 May 2024 (UTC)Reply

WhatLinksHere Weirdness

I've been working with the Template:Tracking subpages in Special:WantedTemplates and have managed to clear a number of them using the API-sandbox purge method in between other stuff. I haven't been able to completely clear the latest one: Special:WhatLinksHere/Template:tracking/inflection_of/tag/m still has 10 entries (e.g., wiener breath) that I can't seem to affect at all. Apparently they're links rather than transclusions, but I can't find any reference to the pseudo-template in the entries themselves- not in the wikitext, not in the transclusion list below the edit window, not in the HTML- not anywhere. And why these 10 entries, but not others?

A side issue: going through the transclusion list at wiener breath I noticed Module:languages/data/3/m, which shouldn't be there at all- "en" is the only language code in the wikitext. Special:WhatLinksHere/Module:languages/data/3/m currently has 8,190,877 entries- it looks like a whole lot more than all the entries with references to languages having three-letter codes that start with "m". Chuck Entz (talk) 20:50, 31 May 2024 (UTC)Reply

Hmm - I suspect something is defaulting to mul somewhere, which would explain Module:languages/data/3/m. I'll have a look. Theknightwho (talk) 20:55, 31 May 2024 (UTC)Reply

@Chuck Entz I think it might be something weird server-side, as I managed to fix contruded by removing {{infl of}}, and it hasn't reappeared since I restored the page back to its original state. Hopefully they'll filter out naturally soon. Theknightwho (talk) 21:20, 31 May 2024 (UTC)Reply
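
For anyone repeating the purge step by hand, here is a minimal sketch of how such a request could be built. It assumes the standard MediaWiki `action=purge` API with its `forcelinkupdate` option and the usual en.wiktionary endpoint; the helper name `build_purge_request` is made up for illustration, and the exact settings used in the thread above are not known. It only builds the request and does not send it (a real purge has to be sent as a POST).

```python
from urllib.parse import urlencode

# Assumed endpoint; not stated anywhere in the thread.
API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def build_purge_request(titles):
    """Return (url, POST body) for a purge that also refreshes the links tables."""
    params = {
        "action": "purge",
        "titles": "|".join(titles),   # the API takes pipe-separated titles
        "forcelinkupdate": "1",       # ask MediaWiki to rebuild link/transclusion rows
        "format": "json",
    }
    return API_ENDPOINT, urlencode(params)


url, body = build_purge_request(["wiener breath", "contruded"])
print(url)
print(body)
```

Whether this clears stale WhatLinksHere rows depends on what is happening server-side, so it may not help with the ten entries mentioned above.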

Enabling "oftext" parameter for `{{doublet}}`

If you look at the corresponding module there's a bit of code (line 116) that checks the frame for a parameter called oftext and uses that, if it exists, in place of the word "of", so that for example if oftext="is" the template will generate an output that starts with "Doublet is..." instead of "Doublet of... ". However it appears that one cannot currently use this parameter with the template because the oftext parameter is not actually declared as part of the frame (circa lines 68-99). I am unable to edit the module due to page protection settings; can you change it so that this parameter works? Thank you. Brusquedandelion (talk) 20:50, 2 June 2024 (UTC)Reply

@Brusquedandelion So my understanding is that oftext= is a parameter for the frame, not for the parent frame. I'm not sure how familiar you are with modules and templates (so forgive me if I'm over-explaining here), but the {{doublet}} template calls the module with {{#invoke:}}. The frame arguments are those which are passed directly to {{#invoke:}} from inside the template itself (e.g. one of them is cat=doublets, which ensures that the page is always added to the "[langname] doublets" category). On the other hand, the parent frame arguments are those which are passed to the {{doublet}} template from the entry (i.e. the template's actual parameters).

The reason it's set up like this is because the output is generated using the misc_variant_multiple_terms function in the module, which is generic, and the frame arguments are used to customise it for each template (though the only other one we seem to have at the moment is {{piecewise doublet}}); the oftext= parameter has been included to facilitate that kind of customisation.

That being said, it wouldn't be difficult to add support for custom text to {{doublet}}, but before I add it it would be helpful if you could let me know what you're trying to do, as it might be that there's already a better way to do it. Theknightwho (talk) 14:20, 3 June 2024 (UTC)Reply
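
The frame/parent-frame split described above can be modelled roughly as follows. This is a Python sketch of the idea only, not the module's Lua code: `render_variant`, the `text` frame argument and the `terms` key are hypothetical names, and the real function's behaviour may differ.

```python
def render_variant(frame_args, parent_args):
    """Model how the two argument sources might combine (not the real module)."""
    label = frame_args["text"]                # hypothetical frame argument
    oftext = frame_args.get("oftext", "of")   # read from the frame only
    if not parent_args.get("nocap"):
        label = label[0].upper() + label[1:]
    terms = parent_args.get("terms", [])
    if not terms:
        return label                          # bare label when no terms are given
    return f"{label} {oftext} {' and '.join(terms)}"


# Arguments fixed inside the template's {{#invoke:}} call (frame).
template_call = {"text": "doublet", "cat": "doublets"}

# Arguments typed by the editor in the entry (parent frame).
print(render_variant(template_call, {"terms": ["X"]}))
print(render_variant(template_call, {"terms": ["X"], "oftext": "is"}))
print(render_variant({**template_call, "oftext": "is"}, {"terms": ["X"]}))
```

In this model an oftext supplied from the entry has no effect, because only the frame is consulted for it; only changing the template's own call changes the wording.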

I'm not super familiar with templates/modules so I appreciate the explanation, though all that basically accords with my understanding. I went to modify the template itself in order to pass along an oftext argument but from glancing at the module code I got the impression that wouldn't work. Am I wrong?

At any rate, what I am trying to do is suppress the "of" altogether, so that, in combination with the nocaps parameter, I get something that says "doublet X", as that is what would make sense in the running text I am writing for an entry (specifically, the collocation "Compare inherited doublet X").

Thanks again for your help. Brusquedandelion (talk) 14:38, 3 June 2024 (UTC)Reply

@Brusquedandelion I'd do it like this (using an English example):

Compare inherited {{doublet|en|nocap=1}} {{m|en|term}}.

This gives you: "Compare inherited doublet term." This ensures you still get the doublet link + category, without tying you to a specific wording. Theknightwho (talk) 14:46, 3 June 2024 (UTC)Reply

Etymology of derogatory terms

Hello, I wanted to address the two pages you reverted about the etymology of red man and redskin. I'm at a restaurant right now, so there's not much I can do. How can I attest that these terms were indeed from the history of skinning and scalping? I tried to add a resource to back up my claim. I am Native American, and my friend and I care to make people aware about the history. What would be good resources to add? I can get onto my computer when I get home or have another person help with defining the etymology. Thank you Flame, not lame (talk) 00:19, 11 June 2024 (UTC)Reply

@Flame, not lame I did read the link you provided ([5]), but it doesn't talk about the terms redskin or red man, or why red was historically used to describe Native Americans. I'm particularly conscious that derogatory terms often generate folk etymologies that aren't actually true, so it would be good to see a source that provides a factual basis for the claim. @Chuck Entz may have some thoughts on this. Theknightwho (talk) 00:27, 11 June 2024 (UTC)Reply

(@Flame) As far as I have found (although even after my edits to the entry just now, it has been several years since I made a thorough review of information about this subject), the skinning/scalping theory has not been accepted by scholars as plausible, in part because the other explanation (that it's a reference to skin colour) is supported by the documentary evidence that both Europeans and Natives long referred to Natives' skin colour as red and Europeans' skin as white (and Africans' skin as black). This in no way affects the extent to which the terms are now offensive. It's just that (as far as I've seen) there's not historical evidence that scalping has anything to do with why either Natives or Europeans referred to Natives as redskins, red men or red. If there are scholars advancing the argument that scalping/skinning is the etymological origin of the terms, with evidence, it would be useful to bring that to bear so it can be discussed and weighed against the scholarship pointing to the skin-color etymology. - -sche (discuss) 01:51, 11 June 2024 (UTC)Reply

There's so much nonsense out there about scalping already- why make up more of it? I vaguely remember something about scalping having originated during wars between the French and the English as a way to verify that the Indians were really killing the people they were being paid to kill- but it's been decades since I've read anything on the subject. Besides, I'm more interested in cultures in the western US, which have never had anything to do with such things. Chuck Entz (talk) 04:54, 11 June 2024 (UTC)Reply

questions about aliases of parameters with special properties

Since you've done a lot of work on Module:parameters, I figure you might know the answer to this. If |bar= is an alias of |foo=, and |foo= has special properties such as list = true, separate_no_index = true, type = "script", etc., do those special properties have to be duplicated in the alias spec (in this case the spec for |bar=), or does the module pick them up automatically from the aliasee (in this case |foo=)? It's hard to tell from reading the code. If they are supposed to be duplicated in the alias spec, what happens if they're not? I do see a comment in the code about non-list aliases of list params, which suggests that at least list = true needs to be duplicated on both params, but I don't know if that holds everywhere. Definitely this should be documented in the documentation for Module:parameters, as it's been a constant source of confusion for me. Benwing2 (talk) 05:02, 14 June 2024 (UTC)Reply

@Benwing2 I can't remember exactly without looking in detail, but I think all the special properties other than list get picked up automatically (i.e. they don't need to be specified), but it wasn't possible to enable that for lists since there are cases where the alias only applies to the first item in the list, meaning that if you want the alias to apply to the whole list you have to specify it for that as well.

That being said, nothing will go wrong if you over-specify anything for the alias (though there might be undefined behaviour if you give the main param and alias contradictory properties). Theknightwho (talk) 20:06, 14 June 2024 (UTC)Reply
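
As a rough illustration of the behaviour recalled above (every property except list being picked up from the aliasee), here is a small Python model. It is a sketch of that recollection only, not of what Module:parameters actually does; `resolve_alias_spec` and the `alias_of` key are names chosen for the example.

```python
def resolve_alias_spec(specs, name):
    """Return the effective spec for `name` under the recalled inheritance rule."""
    spec = specs[name]
    target = spec.get("alias_of")
    if target is None:
        return dict(spec)
    # Everything except `list` is inherited from the aliasee...
    effective = {k: v for k, v in specs[target].items() if k != "list"}
    # ...and anything stated on the alias itself is applied on top.
    effective.update({k: v for k, v in spec.items() if k != "alias_of"})
    return effective


specs = {
    "foo": {"list": True, "type": "script"},
    "bar": {"alias_of": "foo"},                # alias for the first item only
    "baz": {"alias_of": "foo", "list": True},  # alias for the whole list
}
print(resolve_alias_spec(specs, "bar"))
print(resolve_alias_spec(specs, "baz"))
```

Under this model, repeating a property on the alias is harmless, which matches the point about over-specifying; contradictory values would simply be won by the alias here, whereas the real module's behaviour in that case is described above as undefined.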

Thanks! Benwing2 (talk) 20:39, 14 June 2024 (UTC)Reply

usage notes

Hi. I think the usage notes at 🔚 and 🔙 should be deleted. Color of emojis is font dependent, not OS dependent (and that's rather trivial anyway); also, with translingual coverage, there are not "other" languages because no language is defined to begin with.

(I reverted myself on this because of the possibility that deleting wording, even if it's nonsense, might constitute reason for a block.)

Thanks, kwami (talk) 07:18, 15 June 2024 (UTC)Reply

No objection here, so I went ahead and removed. kwami (talk) 07:05, 17 June 2024 (UTC)Reply

@Kwamikagami Sorry, I missed this due to the thread below being posted as well. I agree with the change. Theknightwho (talk) 19:05, 18 June 2024 (UTC)Reply

Why removed templates?

Just saw you removed lite templates and want to ask why these templates should be removed/replaced? Thank you. -- Miwako Sato (talk) 09:48, 15 June 2024 (UTC)Reply

@Miwako Sato Because they're no longer needed, as their original purpose (to save resources) is no longer applicable. Using them in Thai entries causes transliterations not to work, and the correct script formatting is not applied. Theknightwho (talk) 13:41, 15 June 2024 (UTC)Reply

Add NKD2

Hi, since Module:ja-pron is fully-protected, can you add {{R:Nihon Kokugo Daijiten 2 Online}} as NKD2 to the reference? --TongcyDai (talk) 07:12, 19 June 2024 (UTC)Reply

@TongcyDai Done. Theknightwho (talk) 07:15, 19 June 2024 (UTC)Reply

Thank you! --TongcyDai (talk) 07:18, 19 June 2024 (UTC)Reply

Luftgeschaeft's Pronunciation

Hi, please explain why you deleted the pronunciation. If you felt that the IPA was somewhat wrong, please update it and explain why. Shokomoshiko (talk) 19:41, 19 June 2024 (UTC)Reply

Asaf's Pronunciation

Hi again, you also deleted the pronunciation there. Please explain yourself. IDK if you're targeting me because another community member specifically liked my addition. Shokomoshiko (talk) 19:48, 19 June 2024 (UTC)Reply

defaulting `etym_lang = true` in Module:parameters?

I'm doing a bit more work on Module:parameters and it occurs to me that it might either make sense to default `etym_lang = true` or maybe create another type that accepts languages and etym languages; maybe even make "language" do this and create another type "full language" that means "only full languages". Either way, we could (more or less programmatically) find all the cases that currently use 'type = "language"', convert the ones that don't explicitly allow etym languages to use `etym_lang = false` or `type = "full language"` (whatever we adopt) and then change the defaults. Then we audit the ones that don't accept etym languages and see how to make them accept them. I think the end goal is for all templates to accept etym languages and do the right thing with them (which in practice means converting to full languages when generating category names containing the language; not sure what else needs to be done). Thoughts? Benwing2 (talk) 08:38, 21 June 2024 (UTC)Reply

@Benwing2 I agree with the change to have etym-only langs be allowed by default, but I don't know whether we'd ever want headword templates to accept them, since that's basically what defines the difference between full langs and etym-only langs (but I'm open to being persuaded otherwise). Theknightwho (talk) 08:46, 21 June 2024 (UTC)Reply

As it happens, {{head}} already accepts etym languages; I added that support with the thought that e.g. Dari-only or Classical-Persian-only terms might use the appropriate etym variant code to get the Dari or Classical Persian translit. But I agree that this might not make sense. What do you think as for whether full-language-only templates should use `etym_lang = false` or `type = "full language"`? I'm leaning towards the latter but could be persuaded otherwise. Benwing2 (talk) 22:34, 21 June 2024 (UTC)Reply

@Benwing2 Definitely the latter, since Lua's nil is falsy, so it fits the framework better.

In terms of etym-codes with {{head}}, I think there's scope for doing that more widely if we integrate {{tlb}} into it and automate labels if an etym-lang code is used. Is there any reason why Classical Persian can't be split out as its own L2 at this point? It seems to have separate handling for pretty much everything by now. Theknightwho (talk) 22:43, 21 June 2024 (UTC)Reply
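
To spell out the "nil is falsy" point, here is a small Python analogue, with a missing dictionary key standing in for Lua's nil. The type names "language" and "full language" are the ones proposed above; the two function names are invented for the comparison, and neither is taken from Module:parameters.

```python
def accepts_etym_only_by_flag(spec):
    # Flag design: once "allowed" is the default, a missing key has to count as
    # true, so only an explicit False opts out. A plain truthiness test on the
    # flag would give the wrong answer for specs that never mention it.
    return spec.get("etym_lang") is not False


def accepts_etym_only_by_type(spec):
    # Type design: the opt-out has its own type name, so no flag with a
    # true default is needed at all.
    return spec["type"] == "language"


print(accepts_etym_only_by_flag({"type": "language"}))
print(accepts_etym_only_by_flag({"type": "language", "etym_lang": False}))
print(accepts_etym_only_by_type({"type": "language"}))
print(accepts_etym_only_by_type({"type": "full language"}))
```

The second design keeps every boolean option "off unless stated", which is the sense in which it fits the existing framework better.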

@Theknightwho You'd have to ask User:Babr about how similar written modern Persian and Classical Persian are. I note however that Dari has similar translit handling to Classical Persian but it's not clear to me it would be right to split out Dari since Dari and Iranian Persian are almost completely mutually intelligible and share the same written form.

BTW on the topic of extending {{head}}: I've been thinking of creating a new version of {{head}}, maybe called {{h+}} or something (or alternatively, repurpose {{h}} for this, since {{h}} is currently just an alias for {{head}} and used on only about 500 pages). It would have more modern-style args handling, e.g. there would be a single argument for each inflection with a separator between the name of the inflection and its value, and further inflection properties specified via inline modifiers rather than the awkward |f3tr= and such. Genders would be specified using a single |g= argument with comma-separated genders rather than |g=, |g2=, etc. Maybe we could even consider combining the POS and |head= arguments as well, I dunno. Benwing2 (talk) 23:06, 21 June 2024 (UTC)Reply

@Benwing2 I like this idea, but if we're going to rebuild the template I'd prefer if we did something a bit more radical, by creating a framework for integrating language-specific modules. In other words, languages would all (very roughly) follow the same format in their headword template layouts, but custom parameters, functionality and display would be available. In most cases, I'd like it so that the only change to the end user would be from {{xx-noun}} to {{h|xx|noun}}.

The two major advantages are:

This would ideally end the incentive for people to create pointless, underpowered headword templates, because I think most of it is basically just "they've got one, we should have one too".
It would be an opportunity to create a framework for common features useful for certain language families or certain (sets of) scripts, so that smaller languages would more easily be able to take advantage of features developed for larger cousins without copy and pasting (i.e. forking). We've seen a bit of this already, with your work on Romance languages and the retrofitting of some Japanese modules to be language-neutral for the Ryukyuan languages, but it's a problem that'll keep recurring so long as most features are siloed off into language-specific modules.

Theknightwho (talk) 14:38, 22 June 2024 (UTC)Reply

Yup, I think this is a good idea. I started writing a library a few months ago to help with creating language-specific headword modules, similar in spirit to the more-recently created Module:parameter utilities. The intention at the time was to make it easier to create lang-specific headword templates in a more standardized fashion but I think having the support built into a new {{h}} would be even better. Benwing2 (talk) 21:51, 22 June 2024 (UTC)Reply

@Theknightwho @Benwing2 this has come up before and there is pretty much unanimous consensus that splitting Persian would be more of a hassle for editors of all varieties than beneficial. If we split Persian the vast majority of entries would be the exact same spelling, with the exact same definitions and the exact same etymologies, with the only discernible difference being that they use different transliterations in the headword.

Additionally, there would be a lot of logistical issues with how we would split modern varieties of Persian. The differences between modern varieties pretty much boil down to their reflexed pronunciations from Classical Persian, and while Iranian Persian had pretty extensive phonemic mergers, Tajik and Dari by-and-large did not (they did have phonemic mergers but not to the same extent). A split would be more plausible if those phonemic mergers were indicated in writing, but they aren't.

The solution, that @Saranamd and I have discussed and suggested before, is that because the reflexed pronunciations are typically very simple to predict, we could enter the classical translit and produce multiple transliterations. For example on the entry for 'lion' we would currently enter {{fa-noun|tr=šir}}, but because of phonemic mergers in IP, there is no way to discern any information relevant to other varieties from that. Whereas, if it was possible to enter the classical transliterations, we could enter something like {{fa-noun|tr=šēr}} and generate something like: Template:User:Sameerhameedy/Sandbox, from only filling out one parameter. Or, as another example, Template:User:Sameerhameedy/Sandbox by only entering {{fa-noun|tr=bōra}}. fa-IPA already generates multiple transliterations from the classical transliteration, so such a template would only need to fetch such romanizations, not generate them itself. — BABR (talk) 01:47, 22 June 2024 (UTC)Reply
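
A toy sketch of the "enter one classical transliteration, derive the rest" idea follows. It is a deliberate simplification: it models only two vowel mergers for Iranian Persian (ē to i and ō to u, the first of which matches the šēr / šir example above), ignores every other difference between varieties, and is unrelated to how fa-IPA or any existing module works. The names are invented for the example.

```python
import unicodedata

# Assumption: only these two mergers are modelled; real rules are more involved.
CLASSICAL_TO_IRANIAN = {"ē": "i", "ō": "u"}


def derive_transliterations(classical):
    """Return the classical form plus a naively derived Iranian Persian form."""
    # Normalise so that precomposed and combining-macron inputs behave alike.
    classical = unicodedata.normalize("NFC", classical)
    iranian = "".join(CLASSICAL_TO_IRANIAN.get(ch, ch) for ch in classical)
    return {"classical": classical, "iranian": iranian}


print(derive_transliterations("šēr"))
```

Because the derivation only ever loses distinctions, it can run from the classical form to the Iranian one but not the other way round, which is the reason given above for storing the classical transliteration.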

@Babr I am generally in favor of such a solution although it would take some thinking to make sure it works transparently everywhere. Benwing2 (talk) 01:59, 22 June 2024 (UTC)Reply

@Benwing2, @Theknightwho, So, returning to this, I did make Module:fa-translit which can output two transliterations. I am not sure if this is a good solution but it was the only one I could think of that did not require rewriting hundreds of modules. However, if we consider using this, there are a few points that would need to be addressed:

There is no way to override the Iranian romanization when it is incorrect.
Ideally diacritics would be hidden when using the language codes fa or prs (though Saranamd requested that diacritics be removed for all varieties, which I am fine with TBH).
- For fa: this is because diacritics are used very differently in modern Iranian Persian vs how they were used in Classical Persian texts centuries ago, so showing diacritics could be confusing to readers.
- For prs: this is because there is no standardized diacritical notation; The classical notation is used for simplicity, but in reality, there are many different ways to vocalize the same text in Dari.
not sure what separator should be used between transliterations ; / , or ,? There doesn't seem to be an agreed upon standard for this but FWIW Chinese uses ／ to separate traditional and simplified.
only somewhat related, but it would be nice to have a template similar to {{zh-q}} that would automatically mark what transliteration is being used based on the lang code. I don't have the coding ability necessary to do such a thing right now but, as adding quotations is currently quite a hassle, I hope creating such a module is considered in the future.

Anyways, if either one of you has the time, please let me know your thoughts.

— BABR・talk 19:55, 20 July 2024 (UTC)Reply

Template:kangxi-ws

Check the transclusions of this template. This completely breaks the links. –MJL ‐Talk‐^☖ 18:25, 22 June 2024 (UTC)Reply

@MJL Alright - I'll do a proper fix. Theknightwho (talk) 18:28, 22 June 2024 (UTC)Reply

Great; thank you.

–MJL ‐Talk‐^☖ 18:32, 22 June 2024 (UTC)Reply

@MJL You may have to hang on a bit - I'm going to integrate {{kangxi-ws}} and {{wikisource}} into Module:interproject, but it might not be today, depending on how much time I have. Theknightwho (talk) 19:00, 22 June 2024 (UTC)Reply

Odd module error at 軍

Module:wuu-pron is throwing an error with the message Lua error in Module wuu-pron at line 131: Invalid final: "iuwn". While it's true that @Musetta6729 removed that final from Module:wuu-pron/data earlier in the day, there's no trace of it in the entry, and I can't follow the code well enough to figure out where it comes in. Complicating things has been work by @Benwing2 on very basic parameter-related modules that might have interacted with something (there's also what bears all the marks of a massively-transcluded bug that probably lasted only a few ohnoseconds before it was fixed, but that has no bearing on this). Any ideas?

bad deletion of fight city hall

Hello, I believe your deletion of fight city hall was in error as it is an acceptable use of a redirect. As such, I am going to recreate it. Please do not delete it again without discussion. Purplebackpack89 02:37, 27 June 2024 (UTC)Reply

Furthermore, I believe some of your other redirect deletions were also in error, and were motivated by thoughts on creator, not content, so they will also be coming back, and, again, if you believe they should be deleted, start a discussion. Purplebackpack89 02:49, 27 June 2024 (UTC)Reply

@Purplebackpack89 All this tells me is that you're personally offended that I deleted something you created. Please stop creating redirects in mainspace. Theknightwho (talk) 02:51, 27 June 2024 (UTC)Reply

Please stop deleting redirects in mainspace. If you believe consensus favors their deletion, RfD them instead of deleting them. Purplebackpack89 02:57, 27 June 2024 (UTC)Reply

@Purplebackpack89 It's impossible to have any kind of discussion with you about anything. Theknightwho (talk) 02:59, 27 June 2024 (UTC)Reply

Ha! Look in the mirror! YOU are literally the one avoiding discussion by speedy-deleting instead of RfDing. If you believe they should be deleted, what's the goddamn harm in starting an RfD? Purplebackpack89 03:01, 27 June 2024 (UTC)Reply

@Purplebackpack89 We don't do redirects like that in mainspace, because this is a verb. Theknightwho (talk) 02:50, 27 June 2024 (UTC)Reply

There is nothing in policy preventing those redirects. You either just don't like them, or you don't like me. You WOULDN'T have deleted them if they hadn't been created by another editor. Purplebackpack89 02:53, 27 June 2024 (UTC)Reply

@Purplebackpack89 Yes I would. You just try to make every argument personal as a way to poison the well, and have done since you started contributing. You're not that special. Theknightwho (talk) 02:54, 27 June 2024 (UTC)Reply

Ugh, @Benwing2: I hate to ping you, but can we please get another admin monitoring this conflict? And to both parties, this clearly isn't going to go anywhere. And to Purplebackpack89, from my understanding, Theknightwho is correct in his assessment about redirects. AG202 (talk) 02:58, 27 June 2024 (UTC)Reply

Purplebackpack89 needs a long block, and if I need to I will happily show the substantial evidence that this is how they've behaved towards numerous contributors the entire time they've been contributing to Wiktionary. It's combative, manipulative and completely unacceptable. Theknightwho (talk) 03:01, 27 June 2024 (UTC)Reply

By the same logic, Theknightwho needs an equally-long block, and needs his tools taken away. There was a discussion of this only a few weeks ago, and no consensus to block emerged. And @AG202, I would disagree with that assessment, and what's the harm anyway in Theknightwho RfDing instead of speedy deleting. Again, I ask AG202, TheKnightwho, Benwing, anyone: what's the harm in just RfDing instead of speedy deletion? Why won't you answer this question? Purplebackpack89 03:04, 27 June 2024 (UTC)Reply

You made this personal; not me. Theknightwho (talk) 03:12, 27 June 2024 (UTC)Reply

Cap. If these redirects really needed deletion, somebody else would have done it. It didn't have to be you. Some of the redirects you deleted had existed for months or even years and you're the only one who took issue. Purplebackpack89 03:21, 27 June 2024 (UTC)Reply

@Purplebackpack89 This is another tactic you've tried many times before with other admins in the past: Metaknowledge, Ungoliant, Mglovesfun etc. etc. You make things personal any time anyone disagrees with you, so that you can claim you're being victimised. Theknightwho (talk) 03:27, 27 June 2024 (UTC)Reply

Well, in some of those cases, it was true. Purplebackpack89 03:33, 27 June 2024 (UTC)Reply

Also, you've ignored the fundamental question: if the redirects were bad, why didn't somebody else delete em? Purplebackpack89 03:34, 27 June 2024 (UTC)Reply

Because they probably didn't notice them. None of them had any views in the month prior to them being deleted. Theknightwho (talk) 03:45, 27 June 2024 (UTC)Reply

@AG202 Thanks for the ping. Here are my thoughts:

As a general rule we disprefer hard redirects in favor of soft ones using {{alt form}} or the like.
It seems to me it's OK to have fight City Hall (note caps) and maybe fight city hall as a soft redirect to you can't fight City Hall, because they occur in expressions like "it's impossible to fight City Hall" and "sometimes one can fight City Hall". OTOH, busted my neck (which I see was also deleted by User:Theknightwho) is not acceptable even as a soft redirect because it uses the past tense, and verbs are lemmatized in the infinitive.
User:Purplebackpack89, please assume good faith on the part of User:Theknightwho. It's not helpful to claim he's personally targeting you or that he should refrain from reviewing your changes just because it's you.
As a general rule, edit warring is a no-no, and may well result in blocking. What that means is that if you believe a change to your contribution was made erroneously, you should not redo your contribution (which would be edit warring), but contact the person who made the change on their talk page, and if that doesn't seem to be working, bring it up in the Beer parlour.

Benwing2 (talk) 03:34, 27 June 2024 (UTC)Reply

OTOH, this could've also been avoided if Knight had RfDed the redirects instead of speedily-deleting. Isn't speedy-deleting mostly supposed to be used for just vandalism anyway? Purplebackpack89 03:45, 27 June 2024 (UTC)Reply

@Benwing2 There is some level of harassment going on here: PB89 has now written on multiple users' talkpages about this particular incident ([6] [7]) and originally wrote a BP thread (since removed) ([8]). This kind of thing happened last time, and it's quite frustrating that this keeps getting written off as tit-for-tat, when they've been doing this kind of thing to multiple contributors for many years. Theknightwho (talk) 03:49, 27 June 2024 (UTC)Reply

busted my neck

FYI, the most recent creation of busted my neck is a conjugation/inflection, not a redirect, so your rationale is in error. Also, half of what you said was an attack. If you want to RfD the inflection, please start a different RfD. 04:00, 27 June 2024 (UTC)

@Benwing2 since you're being tagged in everything, please note that Theknightwho made an inaccurate RfD. Purplebackpack89 04:06, 27 June 2024 (UTC)Reply

@Purplebackpack89 It's a soft redirect, which is what Ben refers to above, too. Do not remove nominations at RFD by other contributors as you did here ([9]), even if you disagree with them. Theknightwho (talk) 04:07, 27 June 2024 (UTC)Reply

OK, I'll instead close it as speedy keep. Purplebackpack89 04:09, 27 June 2024 (UTC)Reply

Could an uninvolved admin please handle this? Deleting a thread at RFD and intentionally misunderstanding what a soft-redirect is to wrongly speedy keep something is block-worthy behaviour, in my view. @Vininn126 @Thadh @Chuck Entz Theknightwho (talk) 04:12, 27 June 2024 (UTC)Reply

Could an uninvolved admin please handle TheKnightwho? He should've backed away from this awhile back but he keeps just ramping it up, ramping it up, ramping it up. Any block of me should be met with a block of equal length of Theknightwho. Purplebackpack89 04:15, 27 June 2024 (UTC)Reply

You asked me to RFD them instead of speedy deleting them, so I did. You recreated busted my neck as a soft redirect after Benwing2 explicitly told you that it wouldn't be acceptable as one, so I pointed out that you were being disruptive. Which you are.

There's no winning with you - whatever anyone does, you'll always claim to be a victim if you don't get exactly what you want. Theknightwho (talk) 04:24, 27 June 2024 (UTC)Reply

@Purplebackpack89 Do not do that. You know (or should know) about WP:Point. Just because you don't like an RFD doesn't give you license to delete the RFD, speedy-close an RFD as kept or otherwise act disruptively. In this case I agree with User:Theknightwho that your behavior is getting to the point of being block-worthy. Benwing2 (talk) 04:15, 27 June 2024 (UTC)Reply

@Benwing2 So Theknightwho is allowed to create an RfD that's an attack of me and I'm not allowed any recourse? The RfD itself was disruptive. We are in a content dispute and both of us are equally guilty here. Furthermore, what policy prevents a speedy close such as that? NONE. Purplebackpack89 04:17, 27 June 2024 (UTC)Reply

@Purplebackpack89 I don't understand why creating an RFD is an attack. You actually suggested yourself above that User:Theknightwho should create an RFD instead of speedy-deleting a hard redirect. As I mentioned above, you need to assume good faith, which you don't seem to be doing in the case of Theknightwho. Benwing2 (talk) 04:25, 27 June 2024 (UTC)Reply

Did you read the nomination? Knight uses the RfD to attack me and make claims that I am being disruptive. Purplebackpack89 04:28, 27 June 2024 (UTC)Reply

@Purplebackpack89 TKW expressed his personal opinion ("in my view") that your actions are disruptive. IMO that's not an attack (not to mention that your actions since then *have* been disruptive). If he did it repeatedly and without good reason it might well be a personal attack but not in this single instance. I have asked you repeatedly to assume good faith, and you have repeatedly ignored this request. Assume Good Faith is a cornerstone principle that is absolutely necessary for harmonious interactions in a wiki; without it, things rapidly degenerate into flame wars. Benwing2 (talk) 04:39, 27 June 2024 (UTC)Reply

Purple, in my view the way you handled the situation was not appropriate. Frankly, since you are a Wikipedia editor and closing/deleting RFDs you are involved in is a block-able offense over there, I think you should have known that it wasn't appropriate. That being said, please look at WT:Redirections#Acceptable uses; there are very few situations where redirections are considered appropriate on WT. If you feel that this was an appropriate situation you can mention that on the RFD, but please stop edit warring. — BABR・talk 04:29, 27 June 2024 (UTC)Reply

@Benwing2 Just FYI, they (effectively) edit-warred me reinstating the RFD by immediately trying to close it as a speedy keep ([10]), and after I reversed that ([11]) they then removed the RFD notice at the entry anyway ([12]). They objectively are being extremely disruptive: it's a statement of fact, not an attack. Theknightwho (talk) 04:20, 27 June 2024 (UTC)Reply

Japanese errors

Hi. Are you going to fix the Japanese yomi-related errors? They keep increasing. Benwing2 (talk) 05:08, 28 June 2024 (UTC)Reply

@Benwing2 Yes, but the actual entries need fixing. Theknightwho (talk) 05:12, 28 June 2024 (UTC)Reply

Sure, I get that. It's just that these sorts of errors have been sitting there since I logged on c. 12 hours ago. Benwing2 (talk) 05:16, 28 June 2024 (UTC)Reply

Hey also, Special:WantedCategories just refreshed and there are quite a lot of new categories of the form Category:Japanese terms spelled with 紗 read as さ. I checked one of the terms in this category and it looks legit; was there a mistake in the category generation before that was causing a lot of these categories to be missed? BTW I have a script that tries to autopopulate the correct reading type for categories of this nature. If all the new categories of this form in Special:WantedCategories are legit, I'll go ahead and run the script. Benwing2 (talk) 05:45, 28 June 2024 (UTC)Reply

@Benwing2 So I've basically been sorting out some of the messiness in Module:kanjitab, as I wanted to sync up Module:category tree/poscatboiler/data/lang-specific/jpx with it in terms of how yomi (readings) get handled, but in the process I did a heavy rewrite of that part of the module. The main things are:

The new errors are due to an extra safety check on jukujikun, which are (by definition) compound readings. Quite a few instances of {{ja-kanjitab}} have had the relevant characters marked as jukujikun, but the reading has been apportioned between them instead of being shared as an indivisible unit, so it now throws an error if you say that a single kanji reading is jukujikun, and explains how to solve it.
Previously, only jōyō kanji "with exceptions" had categorisation for individual readings, but the exception list had grown so massive that I just removed it altogether. Given some of the largest uncreated categories have >100 members, I think this was justified. Most seem to be kyūjitai, but that's by-the-by.
I've extracted the yomi data into Module:kanjitab/data, and it's shared between that and the jpx category tree module, which enables full synchronisation. The last thing to do is to synchronise the input yomi parameter for {{auto cat}} for categories like Category:Japanese terms spelled with 紗 read as さ, because it makes sense for that to accept the same inputs as kanjitab (e.g. {{auto cat|k}} would be treated as kun'yomi), and for it to throw an error if an invalid yomi type is entered. This also means we can get rid of the disjoint between Category:Japanese terms spelled with kanji with on readings (terms with at least one on'yomi kanji) and Category:Japanese terms read with on'yomi (terms that only have on'yomi kanji), by moving the former to Category:Japanese terms spelled with kanji with on'yomi readings, as the only reason it uses "on" is because there are hundreds of instances of {{auto cat|on}} treating "on" as literal.
If we ever have need to add new yomi types in the future, adding them to the data module will add them to kanjitab and the category tree at the same time.

Given Module:kanjitab/data has quite a few fields that are only relevant to the category tree, it might be better at some other location, but I'm not sure where yet. Theknightwho (talk) 06:11, 28 June 2024 (UTC)Reply
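For anyone following along, here is a minimal Python sketch of the kind of safety check described in the first point above. It is not the Lua in Module:kanjitab; the function name and the `(kana, kanji_count, yomi)` tuple layout are assumptions made purely for illustration.

```python
def check_jukujikun(readings):
    """Reject jukujikun readings that cover only a single kanji.

    `readings` is a hypothetical list of (kana, kanji_count, yomi) tuples
    standing in for parsed {{ja-kanjitab}} arguments.
    """
    for kana, kanji_count, yomi in readings:
        if yomi == "jukujikun" and kanji_count < 2:
            raise ValueError(
                f"jukujikun reading {kana!r} is assigned to a single kanji; "
                "give the whole reading once and mark how many kanji it spans"
            )


# A reading shared across two kanji as one unit passes the check.
check_jukujikun([("くわ", 1, "kun"), ("たばこ", 2, "jukujikun")])
```

The failing case is the one where the same reading has been apportioned kanji by kanji, e.g. `[("たば", 1, "jukujikun"), ("こ", 1, "jukujikun")]` (an invented split), which raises `ValueError`.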

OK thanks for the detailed info. What are your plans for the remaining work and do you need my help? As I mentioned, I have a script that I've been running periodically to populate the reading type in the {{auto cat}} call for the 'Japanese terms spelled with FOO read as BAR' categories, since {{auto cat}} will throw an error otherwise. The script is here: [13] and it looks both in {{ja-readings}} and {{ja-kanjitab}}. I presume that script will need updating (maybe?), since you are planning on changing the reading abbreviations. Let me know when you want me to run that script and how to fix it up to account for the new abbreviations. In terms of making the actual abbreviation change in the {{auto cat}} call, that would require another bot script but it should be extremely easy. Benwing2 (talk) 06:22, 28 June 2024 (UTC)Reply

@Benwing2 I'm just working out the final snags on the {{auto cat}} parameter, and then I'll start the clean-up work in entries.

In terms of the reading abbreviations, we might need to change them for some of the less common ones, since kanjitab's set of inputs is a bit eclectic. I'll document the kanjitab data module, as that should make it straightforward for you to update your script, but the short version is that each yomi's data value exists in the table under every alias key, since that means it can be treated as a lookup table. Theknightwho (talk) 06:35, 28 June 2024 (UTC)Reply
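As a rough illustration of the "every alias key points at the same data value" layout, here is a Python sketch. The record fields and alias lists are assumptions (loosely based on codes mentioned in this thread), not the actual contents of Module:kanjitab/data.

```python
# Hypothetical yomi records; the real fields live in Module:kanjitab/data
# and are not reproduced here.
YOMI_TYPES = [
    {"canonical": "kun'yomi", "aliases": ["kun", "k"]},
    {"canonical": "kan'yōon", "aliases": ["kan", "kanyo", "kanyoon"]},
    {"canonical": "kan'on", "aliases": ["kanon"]},
]


def build_lookup(yomi_types):
    """Store the same record under every alias so any input resolves in one lookup."""
    lookup = {}
    for record in yomi_types:
        for key in record["aliases"]:
            if key in lookup:
                raise ValueError(f"duplicate alias: {key}")
            lookup[key] = record
    return lookup


LOOKUP = build_lookup(YOMI_TYPES)
```

With this shape a script only needs `LOOKUP[code]["canonical"]`, and two aliases of the same type resolve to the very same record object.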

OK thanks! I'll wait to run my script until you've indicated how to update it. As long as it gets done within 3 days (which is the update frequency for Special:WantedCategories) it should be fine. Benwing2 (talk) 06:38, 28 June 2024 (UTC)Reply

@Benwing2 I've just checked, and it breaks any that have apostrophes and macrons (since kanjitab never supported those as raw inputs). Given that affects most of them, I'll set all the canonical names as aliases which should avoid the issue. I think it would be sensible to sort out the kanjitab yomi specs at some point, as they're arbitrary and unintuitive: kun'yomi is kun or k, kan'yōon is kan, which means kan'on (10x as common) has to be kanon, and so on. Theknightwho (talk) 06:47, 28 June 2024 (UTC)Reply

Sorry, what do you mean by "it breaks any that have apostrophes and macrons"? Benwing2 (talk) 06:51, 28 June 2024 (UTC)Reply

@Benwing2 So {{auto cat}} currently treats the inputs as literals, which means you have to put {{auto cat|tōon}} or whatever. Since kanjitab's aliases were all lowercase ASCII, fully synchronising auto cat and kanjitab causes any inputs like that one to throw an error, since it doesn't see them in the list of valid yomi inputs. Out of the 8 valid yomi types for those kinds of categories, 6 have an apostrophe and/or macron, so I've changed Module:kanjitab/data to add the canonical names as an alias for each type, which fixes the problem. Theknightwho (talk) 07:05, 28 June 2024 (UTC)Reply
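The breakage and the fix can be sketched in Python as follows. This is only an illustration: the alias lists are a small assumed subset and the helper names are hypothetical, not anything in the modules themselves.

```python
# Illustrative subset; these alias lists are assumptions, not the module's data.
ASCII_ALIASES = {
    "tōon": ["toon"],
    "kan'on": ["kanon"],
    "goon": ["goon"],
}


def valid_inputs(aliases_by_canonical, include_canonical):
    """Map every accepted input string to its canonical yomi name."""
    accepted = {}
    for canonical, aliases in aliases_by_canonical.items():
        keys = list(aliases)
        if include_canonical:
            keys.append(canonical)  # the fix: the canonical name is itself an alias
        for key in keys:
            accepted[key] = canonical
    return accepted


def resolve(yomi, accepted):
    """Return the canonical name, or raise for an input that is not a valid yomi type."""
    try:
        return accepted[yomi]
    except KeyError:
        raise ValueError(f"invalid yomi type: {yomi!r}") from None


before = valid_inputs(ASCII_ALIASES, include_canonical=False)
after = valid_inputs(ASCII_ALIASES, include_canonical=True)
```

Under these assumptions `resolve("tōon", before)` raises, which mirrors the error on existing categories, while `resolve("tōon", after)` and `resolve("toon", after)` both succeed.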

Thanks! Benwing2 (talk) 07:52, 28 June 2024 (UTC)Reply

Do you mind if I run my script to infer the correct reading types? I think the only things needing changing, if anything, are the following settings at the top:

allowed_reading_types = ["goon", "kanon", "toon", "soon", "kanyoon", "on", "kun", "nanori"]

canonicalize_reading_types = {
  "kanon": "kan'on",
  "toon": "tōon",
  "soon": "sōon",
  "kanyoon": "kan'yōon",
}

along with this table farther down:

yomi_to_canonical_reading_type = {
  "o": "on",
  "on": "on",
  "kanon": "kanon",
  "goon": "goon",
  "soon": "soon",
  "toon": "toon",
  "kan": "kanyoon",
  "kanyo": "kanyoon",
  "kanyoon": "kanyoon",
  "k": "kun",
  "kun": "kun",
  "juku": "jukujikun",
  "jukuji": "jukujikun",
  "jukujikun": "jukujikun",
  "n": "nanori",
  "nanori": "nanori",
  "ok": "jubakoyomi",
  "j": "jubakoyomi",
  "ko": "yutoyomi",
  "y": "yutoyomi",
  "irr": "irregular",
  "irreg": "irregular",
  "irregular": "irregular",
}

Note that the canonicalize_reading_types map operates on the output of yomi_to_canonical_reading_type, so an original yomi value of kan maps to kanyoon and then to kan'yōon, which goes into the {{auto cat}} call. Benwing2 (talk) 02:43, 29 June 2024 (UTC)Reply
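To make the two-step mapping concrete, here is a small self-contained sketch. The two tables are abridged copies of the settings quoted above; the helper function `auto_cat_reading_type` is a hypothetical name, not something taken from the script.

```python
# Abridged copies of the two tables quoted above.
yomi_to_canonical_reading_type = {
    "kan": "kanyoon",
    "kanyo": "kanyoon",
    "kanyoon": "kanyoon",
    "kanon": "kanon",
    "toon": "toon",
    "soon": "soon",
    "k": "kun",
    "kun": "kun",
}

canonicalize_reading_types = {
    "kanon": "kan'on",
    "toon": "tōon",
    "soon": "sōon",
    "kanyoon": "kan'yōon",
}


def auto_cat_reading_type(yomi):
    """Hypothetical helper: apply the first table, then the second to its output."""
    reading_type = yomi_to_canonical_reading_type[yomi]
    # Types without an apostrophe or macron (e.g. "kun") pass through unchanged.
    return canonicalize_reading_types.get(reading_type, reading_type)
```

So `auto_cat_reading_type("kan")` goes kan, then kanyoon, then kan'yōon, whereas `auto_cat_reading_type("k")` stops at kun because the second table has no entry for it.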

@Benwing2 Go ahead. As of yesterday {{auto cat}} now accepts all the same inputs as {{ja-kanjitab}}, so you don't need to canonicalise from kanyoon to kan'yōon etc. It was just necessary to grandfather that in so that we didn't need to amend hundreds of pre-existing categories that already use the canonical name. You don't even need to bother with the first step either - it'd be easier to just use whatever codes have been put into kanjitab, since the codes for the two templates are directly synchronised via Module:kanjitab/data. Theknightwho (talk) 02:47, 29 June 2024 (UTC)Reply

OK, I'll disable the step that runs the canonicalize_reading_types mapping. Benwing2 (talk) 02:51, 29 June 2024 (UTC)Reply

OK just one more question, are the existing errors in CAT:E going to cause issues, e.g. 甲烏賊? I don't really understand what the issue is regarding these errors or why they're difficult to fix. Benwing2 (talk) 02:54, 29 June 2024 (UTC)Reply

@Benwing2 They shouldn't, because they're all caused by jukujikun being applied to single kanji, as the template now restricts it to indivisible multi-kanji readings, since that's (by definition) what they are. We don't generate categories for jukujikun other than the general "spelled with" and "read with" categories, since they can't apply to individual kanji.

On a separate note, I think we need to clean up the "terms spelled with Y" (contains at least one Y yomi) and "read with Y" (consists entirely of Y yomi) categories, since I honestly don't think the latter is particularly useful. Theknightwho (talk) 03:06, 29 June 2024 (UTC)Reply

What's an example of the latter category? Benwing2 (talk) 03:19, 29 June 2024 (UTC)Reply

@Benwing2 For single-kanji yomi, the "spelled with kanji with X" categories are one of the umbrella categories for the "KANJI read as YOMI" categories, while the "read with X" category is just a big list of everything purely read with that yomi type.

However, we've also got these two, which seems to be a mistake, since the only difference is that {{juku}} puts them in one, while {{ja-kanjitab}} puts them in the other.

Theknightwho (talk) 03:26, 29 June 2024 (UTC)Reply

OK, feel free to hack away and let me know if you need any bot category deletion work. My script to autopopulate things like Category:Japanese terms spelled with 儲 read as もう is currently running. Benwing2 (talk) 03:33, 29 June 2024 (UTC)Reply

Thanks. Theknightwho (talk) 03:34, 29 June 2024 (UTC)Reply

Script is done running. It processed 3,741 pages and saved {{auto cat}} onto 3,634 of them. 105 were skipped due to not being able to infer the reading type; presumably the remaining 2 were already done. The ones where the reading type couldn't be inferred seem to be cases where the source pages leave out the |yomi= param in the {{ja-kanjitab}} call. These should ideally be fixed. Benwing2 (talk) 06:19, 29 June 2024 (UTC)Reply

@Benwing2 Thanks - that's helpful. Theknightwho (talk) 06:50, 29 June 2024 (UTC)Reply

For reference, if you feel like tackling some of them, here's the full list (98 of them actually):

Page 97 Category:Japanese terms spelled with 匡 read as まさ
Page 98 Category:Japanese terms spelled with 様 read as さ
Page 196 Category:Japanese terms spelled with 侑 read as ゆ
Page 197 Category:Japanese terms spelled with 暢 read as のぶ
Page 198 Category:Japanese terms spelled with 杏 read as あ
Page 646 Category:Japanese terms spelled with 亦 read as も
Page 647 Category:Japanese terms spelled with 似 read as ねし
Page 648 Category:Japanese terms spelled with 侑 read as ゆき
Page 649 Category:Japanese terms spelled with 允 read as じゅ
Page 650 Category:Japanese terms spelled with 允 read as まさ
Page 651 Category:Japanese terms spelled with 允 read as み
Page 652 Category:Japanese terms spelled with 凵 read as うけばこ
Page 653 Category:Japanese terms spelled with 努 read as りき
Page 654 Category:Japanese terms spelled with 勉 read as りき
Page 655 Category:Japanese terms spelled with 勧 read as くゎん
Page 656 Category:Japanese terms spelled with 只 read as た
Page 657 Category:Japanese terms spelled with 叶 read as か
Page 658 Category:Japanese terms spelled with 叶 read as の
Page 659 Category:Japanese terms spelled with 吾 read as わが
Page 660 Category:Japanese terms spelled with 嘉 read as かず
Page 661 Category:Japanese terms spelled with 坐 read as くら
Page 662 Category:Japanese terms spelled with 埴 read as はい
Page 663 Category:Japanese terms spelled with 埴 read as はな
Page 664 Category:Japanese terms spelled with 壽 read as ひろ
Page 666 Category:Japanese terms spelled with 妃 read as みめ
Page 667 Category:Japanese terms spelled with 宥 read as ひろ
Page 668 Category:Japanese terms spelled with 尖 read as さき
Page 669 Category:Japanese terms spelled with 峻 read as たか
Page 670 Category:Japanese terms spelled with 嶺 read as れ
Page 671 Category:Japanese terms spelled with 庵 read as わん
Page 672 Category:Japanese terms spelled with 彗 read as す
Page 673 Category:Japanese terms spelled with 怜 read as れ
Page 674 Category:Japanese terms spelled with 恐 read as かしこ
Page 675 Category:Japanese terms spelled with 惣 read as そ
Page 676 Category:Japanese terms spelled with 應 read as のう
Page 677 Category:Japanese terms spelled with 捷 read as はしこ
Page 678 Category:Japanese terms spelled with 斐 read as のみ
Page 679 Category:Japanese terms spelled with 暢 read as よう
Page 680 Category:Japanese terms spelled with 曹 read as そー
Page 681 Category:Japanese terms spelled with 最 read as かなめ
Page 682 Category:Japanese terms spelled with 朋 read as お
Page 683 Category:Japanese terms spelled with 杏 read as あず
Page 685 Category:Japanese terms spelled with 椛 read as はな
Page 686 Category:Japanese terms spelled with 椽 read as えん
Page 687 Category:Japanese terms spelled with 楓 read as か
Page 688 Category:Japanese terms spelled with 様 read as さぁ
Page 689 Category:Japanese terms spelled with 欣 read as のぶ
Page 691 Category:Japanese terms spelled with 汝 read as なむ
Page 692 Category:Japanese terms spelled with 汝 read as なん
Page 693 Category:Japanese terms spelled with 泥 read as で
Page 694 Category:Japanese terms spelled with 洲 read as やす
Page 696 Category:Japanese terms spelled with 漲 read as みなぎ
Page 697 Category:Japanese terms spelled with 濃 read as こま
Page 698 Category:Japanese terms spelled with 玖 read as くに
Page 699 Category:Japanese terms spelled with 瑛 read as はな
Page 700 Category:Japanese terms spelled with 瑞 read as ず
Page 701 Category:Japanese terms spelled with 瑞 read as ひろ
Page 702 Category:Japanese terms spelled with 瑞 read as み
Page 703 Category:Japanese terms spelled with 皓 read as つぐ
Page 704 Category:Japanese terms spelled with 磅 read as ぽんど
Page 706 Category:Japanese terms spelled with 秀 read as ひ
Page 707 Category:Japanese terms spelled with 稀 read as の
Page 708 Category:Japanese terms spelled with 穆 read as よし
Page 709 Category:Japanese terms spelled with 竈 read as へ
Page 710 Category:Japanese terms spelled with 竭 read as つ
Page 711 Category:Japanese terms spelled with 笠 read as ささ
Page 712 Category:Japanese terms spelled with 糎 read as せんちめーとる
Page 714 Category:Japanese terms spelled with 紗 read as さえ
Page 715 Category:Japanese terms spelled with 結 read as すき
Page 716 Category:Japanese terms spelled with 縮 read as ちり
Page 717 Category:Japanese terms spelled with 翅 read as は
Page 718 Category:Japanese terms spelled with 而 read as の
Page 719 Category:Japanese terms spelled with 胡 read as あ
Page 720 Category:Japanese terms spelled with 舛 read as ます
Page 721 Category:Japanese terms spelled with 芥 read as あく
Page 722 Category:Japanese terms spelled with 芦 read as よし
Page 723 Category:Japanese terms spelled with 苺 read as いっご
Page 724 Category:Japanese terms spelled with 茄 read as な
Page 725 Category:Japanese terms spelled with 菫 read as すみ
Page 726 Category:Japanese terms spelled with 葵 read as あい
Page 727 Category:Japanese terms spelled with 蓍 read as めど
Page 728 Category:Japanese terms spelled with 蕈 read as たけ
Page 729 Category:Japanese terms spelled with 蚺 read as せん
Page 730 Category:Japanese terms spelled with 蚺 read as ぜん
Page 731 Category:Japanese terms spelled with 蝦 read as しゃ
Page 732 Category:Japanese terms spelled with 衛 read as まもる
Page 733 Category:Japanese terms spelled with 褌 read as みつ
Page 735 Category:Japanese terms spelled with 贖 read as あが
Page 736 Category:Japanese terms spelled with 邉 read as なべ
Page 737 Category:Japanese terms spelled with 郁 read as ゆ
Page 738 Category:Japanese terms spelled with 郡 read as こうり
Page 739 Category:Japanese terms spelled with 釧 read as くし
Page 740 Category:Japanese terms spelled with 釺 read as みる
Page 741 Category:Japanese terms spelled with 隼 read as とし
Page 742 Category:Japanese terms spelled with 青 read as おう
Page 743 Category:Japanese terms spelled with 頁 read as ぺーじ
Page 744 Category:Japanese terms spelled with 颯 read as はや
Page 745 Category:Japanese terms spelled with 髮 read as が

Benwing2 (talk) 07:28, 29 June 2024 (UTC)Reply

Edit request

Hi. Could you please add ‘Bangladeshi politics’ to the module using: labels["Bangladeshi politics"] = { display = "[[w:Politics of Bangladesh|Bangladeshi politics]]", topical_categories = true, } Thank you in advance! Inqilābī 19:54, 30 June 2024 (UTC)Reply

@Inqilābī Apologies - I missed this. Now done. Theknightwho (talk) 20:51, 7 July 2024 (UTC)Reply

Thanks! Inqilābī 20:53, 7 July 2024 (UTC)Reply

Could you please do the same for Burmese politics using: labels["Burmese politics"] = { display = "[[w:Politics of Myanmar|Burmese politics]]", topical_categories = true, } Inqilābī 21:46, 31 July 2024 (UTC)Reply

Japanese errors and categories again

Hi. I hope categories like Category:Japanese kanji with kun reading やす・める are correct because there are thousands of them being created by my bot currently due to their appearance in Special:WantedCategories. (What are they? Why is there a dot in the middle?)

Also, what is your plan for clearing the remaining 50 or so Japanese errors? You seem to be doing a few a day; at this rate it might take 10 days, which is too long IMO. Benwing2 (talk) 08:28, 1 July 2024 (UTC)Reply

@Benwing2 Yes, they’re correct - they’re the ones that had hyphens in, but I moved them to using the dot because it’s what Japanese dictionaries use, and it’s way more legible. It shows the distinction between furigana (the actual reading) and okurigana (any following kana which are required for that reading to apply). Those subcategories are then grouped by furigana only, which is new (e.g. see Category:Japanese kanji with kun reading あ for an extreme example), but that makes up a tiny proportion of the “new” categories since there aren’t that many readings which always need okurigana, so most of those group categories already had something in them.

They’ve been slower to clear than I thought, as they’re often edge-cases, but I’ll spend more time doing it. Theknightwho (talk) 11:30, 1 July 2024 (UTC)Reply
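A quick Python sketch of how the dot separates the two parts and how readings could then be grouped by furigana alone. The helper names and the sample readings are illustrative assumptions, not the category tree code.

```python
from collections import defaultdict

SEPARATOR = "・"  # the dot between furigana and okurigana


def split_reading(reading):
    """Split 'やす・める' into (furigana, okurigana); okurigana may be empty."""
    furigana, _, okurigana = reading.partition(SEPARATOR)
    return furigana, okurigana


def group_by_furigana(readings):
    """Group full readings under their furigana, ignoring the okurigana."""
    groups = defaultdict(list)
    for reading in readings:
        furigana, _ = split_reading(reading)
        groups[furigana].append(reading)
    return dict(groups)


GROUPS = group_by_furigana(["やす・める", "やす・む", "やす", "あ・う"])
```

Here `GROUPS["やす"]` holds all three やす readings, with or without okurigana, which is the grouping the umbrella categories are meant to provide.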

Great, thanks, and apologies for all the questions; some of the workings of Japanese are a bit mysterious. Benwing2 (talk) 21:28, 1 July 2024 (UTC)Reply

In the process of trying to run my script on Okinawan categories I found and fixed some bugs where it was giving up too soon in certain situations. But in the process I found that some pages have "disallowed" reading types listed such as jubakoyomi. "Disallowed" means it's not in the allowed_reading_types list that I compiled awhile ago based on something or other (I don't remember what). In particular there are four reading types that my script knows about but doesn't have listed in allowed_reading_types: jukujikun, jubakoyomi, yutoyomi and irregular. Should any of these be allowed (and what are they)? Benwing2 (talk) 02:47, 3 July 2024 (UTC)Reply

@Benwing2 They're disallowed in the sense that we shouldn't have "X terms spelled with Y read as Z" categories for those reading types, since jukujikun, jūbakoyomi and yutōyomi have to be compound readings by definition, and we don't categorise irregular readings since they should just be one-offs limited to one particular compound term (which can happen for tons of reasons: reading contractions/syncopations, kanji substitution, kanji have been swapped around etc etc). Personally, I think we should get rid of jūbakoyomi and yutōyomi in kanjitab, since they apply to two-kanji compounds that are on/kun and kun/on respectively, so they're trivial to determine automatically, and using them as inputs discourages people from giving the specific type of on'yomi + makes processing slightly trickier. Theknightwho (talk) 02:56, 3 July 2024 (UTC)Reply
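To show why the two compound types are trivial to derive, here is a hedged Python sketch. The set of on'yomi codes is taken from the allowed_reading_types list quoted earlier on this page; the function itself is hypothetical and is not how kanjitab currently works.

```python
# On'yomi subtypes, using the codes from the allowed_reading_types list.
ON_TYPES = {"on", "goon", "kanon", "toon", "soon", "kanyoon"}


def compound_type(first, second):
    """Classify a two-kanji term from the reading types of its two kanji.

    Returns "jubakoyomi" for on + kun, "yutoyomi" for kun + on, else None.
    """
    if first in ON_TYPES and second == "kun":
        return "jubakoyomi"
    if first == "kun" and second in ON_TYPES:
        return "yutoyomi"
    return None
```

With something like this, editors could always give the specific on'yomi type (say `goon,kun`) and the compound label would follow automatically, e.g. `compound_type("goon", "kun")` gives `"jubakoyomi"`.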

I see, thanks. I ran into this because some of the 'X terms spelled with Y read as Z' categories do seem to be based on jūbakoyomi readings. E.g. Category:Japanese terms spelled with 頁 read as ぺーじ under the term 欠頁 has {{ja-kanjitab|けつ|ぺーじ|yomi=j}}. The following should probably be looked at:

Page 3 Category:Japanese terms spelled with 頁 read as ぺーじ: 欠頁: WARNING: Disallowed reading type jubakoyomi: {{ja-kanjitab|けつ|ぺーじ|yomi=j}}
Page 3 Category:Japanese terms spelled with 頁 read as ぺーじ: 缺頁: WARNING: Disallowed reading type jubakoyomi: {{ja-kanjitab|けつ|ぺーじ|yomi=j}}
Page 15 Category:Japanese terms spelled with 剗 read as せ: 剗海: WARNING: Unrecognized reading type : {{ja-kanjitab|せ|o1=の|うみ|yomi=,k}}
Page 22 Category:Japanese terms spelled with 吾 read as わが: 吾輩: WARNING: Disallowed reading type yutoyomi: {{ja-kanjitab|わが|はい|yomi=y|alt=我輩,我が輩,吾が輩}}
Page 33 Category:Japanese terms spelled with 嶺 read as れ: 嶺臣: WARNING: Disallowed reading type jubakoyomi: {{ja-kanjitab|れ|おみ|k2=お|yomi=j}}
Page 36 Category:Japanese terms spelled with 怜 read as れ: 怜生: WARNING: Disallowed reading type jubakoyomi: {{ja-kanjitab|れ|お|yomi=ok}}
Page 55 Category:Japanese terms spelled with 漕 read as こぎ: 阿漕: WARNING: Disallowed reading type jubakoyomi: {{ja-kanjitab|あ|こぎ|yomi=j}}
Page 70 Category:Japanese terms spelled with 竈 read as へ: 黄泉竈食い: WARNING: Saw 4 chars in contents title but 3 readings よも2,へ,く, skipping: {{ja-kanjitab|よも2|o1=つ|へ|く|k3=ぐ|r=y|yomi=juku2,k,k}}
Page 127 Category:Japanese terms spelled with 銜 read as くわ: 銜え煙草: WARNING: Saw 3 chars in contents title but 2 readings くわ,たばこ2, skipping: {{ja-kanjitab|くわ|たばこ2|yomi=k,juku2}}

These are just the ones issued for the first 132 categories my script processed; I hit ^C after that. Benwing2 (talk) 03:05, 3 July 2024 (UTC)Reply

Hmm, possibly the last two issues are spurious? What does e.g. the 2 in よも2 in the first argument to {{ja-kanjitab}} mean? Benwing2 (talk) 03:07, 3 July 2024 (UTC)Reply

@Benwing2 よも2 means it applies to two kanji, so those are spurious, yeah. Theknightwho (talk) 03:09, 3 July 2024 (UTC)Reply
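For the script side, a minimal Python sketch of honouring that trailing number when comparing readings against the title. The function names are hypothetical, and the kanji test is a simplifying assumption that only covers the main CJK Unified Ideographs block and Extension A.

```python
def kanji_span(reading):
    """Split 'よも2' into ('よも', 2); a reading without a digit spans one kanji."""
    kana = reading.rstrip("0123456789")
    digits = reading[len(kana):]
    return kana, int(digits) if digits else 1


def count_kanji(title):
    """Count kanji in a title (assumption: basic CJK block plus Extension A only)."""
    return sum(
        1 for ch in title
        if "\u4e00" <= ch <= "\u9fff" or "\u3400" <= ch <= "\u4dbf"
    )


def readings_cover_title(title, readings):
    """True when the spans of all readings add up to the number of kanji."""
    return sum(kanji_span(r)[1] for r in readings) == count_kanji(title)
```

Under this reading of the syntax, both flagged pages check out: `readings_cover_title("黄泉竈食い", ["よも2", "へ", "く"])` and `readings_cover_title("銜え煙草", ["くわ", "たばこ2"])` are true, so the warnings only appear if the digit is ignored.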

@Benwing2 Thanks - I can't remember off the top of my head if kanjitab takes "jubakoyomi" and then categorises the first as "on" and the second as "kun", but it sounds like that's what it's doing.

To be honest, this is just another reason why we need to get rid of those two as inputs, as they're completely unnecessary. Theknightwho (talk) 03:08, 3 July 2024 (UTC)Reply

No objections from me :) ... Benwing2 (talk) 03:12, 3 July 2024 (UTC)Reply

All of the Japanese errors are gone, but I don't see you having edited all the pages in question. Did I miss this or did you change the code to not throw these errors? Benwing2 (talk) 05:42, 7 July 2024 (UTC)Reply

@Benwing2 I changed it to tracking temporarily since some of them were awkward and it was really slow going. I’ll restore it once they’re all fixed. Theknightwho (talk) 12:51, 7 July 2024 (UTC)Reply

Do not ever restore personal attacks, for any reason

I carefully redacted phrases on User talk:Koavf which I interpret as personal attacks against my person. You chose to restore those personal attacks on someone else's user talk page. Never do that again, or I will report you to administrators for personal attacks on your own behalf. These are not your statements to make, nor restore, nor support. Please stay out of it. Elizium23 (talk) 01:42, 7 July 2024 (UTC)Reply

@Elizium23 Firstly, "random user" or "rando" is not a personal attack by any metric. Secondly, even if it was, you do not have the right to edit another user's comments to remove it anyway. I will block you if you attempt to do this again. Theknightwho (talk) 01:44, 7 July 2024 (UTC)Reply

@Elizium23 Theknightwho is correct here. You're not supposed to edit other people's comments, as a general rule. There may be exceptions, e.g. if someone posted doxxing material or used a slur like the n-word; in such a case the whole comment should be revdel'ed by an admin. More innocuously, sometimes I edit other people's comments to fix Lua errors caused by changes in templates that the comments make use of. But other people's comments may not be edited to change their meaning, nor to fix typos, nor to fix errors of fact, nor to remove personal attacks that fall short of the standard for revdeling, etc. Benwing2 (talk) 05:40, 7 July 2024 (UTC)Reply

Emptying category

Latest comment: 2 months ago, 4 comments, 2 people in discussion

Is there a reason you are emptying Category:Okinawan Han characters? Is there a consensus to delete this? —Justin (koavf)❤T☮C☺M☯ 07:41, 13 July 2024 (UTC)Reply

@Koavf I’m not. Where are you getting that from? Theknightwho (talk) 07:47, 13 July 2024 (UTC)Reply

https://en.wiktionary.org/w/index.php?title=Category:Okinawan_kanji_with_kun_readings_missing_okurigana_designation&curid=8237915&diff=80702389&oldid=63780908 —Justin (koavf)❤T☮C☺M☯ 08:52, 13 July 2024 (UTC)Reply

@Koavf That’s a maintenance category, and I changed it to {{auto cat}}. It belongs in entry maintenance. Theknightwho (talk) 15:38, 13 July 2024 (UTC)Reply

CJKV transclusion fun

Latest comment: 2 months ago, 1 comment, 1 person in discussion

I've been going through Wiktionary:Todo/Lists/Entries using nonexistent templates, and I found 45 entries listed as using "zh-cp". Early on I found a fat-finger typo for {{zh-co}} in 頭 and fixed it. After that I kept finding false positives: it was in the transclusion list but searching for "zh-cp" in the source turned up nothing. Finally, I did an insource: search for it and found nothing anywhere in mainspace. So I did null edits on all 45 pages and Special:WhatLinksHere/頭 went from 45 pages to none. Apparently simply linking to a Han-character entry adds all of that page's templates to the linking page's transclusion list. I believe someone mentioned that on the Grease pit, as well as a similar effect with {{etymon}}, but this is a good illustration of just how much that can affect template-related problem solving. Chuck Entz (talk) 03:09, 15 July 2024 (UTC)Reply

Vandal

Latest comment: 2 months ago, 1 comment, 1 person in discussion

Hello, I would like to ask you (as I'm not able to edit WT:VIP) if you can block Xdwev vfre2wwd (talk • contribs • global account info • deleted contribs • nuke • abuse filter log • page moves • block • block log • active blocks). This user is a cross-wiki vandal who has also vandalised here. Regards Wüstenspringmaus (talk) 17:46, 15 July 2024 (UTC)Reply

Hindustani

Latest comment: 2 months ago, 4 comments, 2 people in discussion

I was not engaging in edit warring. I made one reversion only. The user provided no source to back up his claim of Urdu = Hindustani. He/she had changed definitions that had been there for a long time. See: https://en.wiktionary.org/w/index.php?title=Hindustani&oldid=78733566. It has been a long-established Wiki consensus to treat Hindi and Urdu as lects of Hindustani rather than to equate either one with Hindustani. See Talk: हिंदी for a discussion on this. Foreverknowledge (talk) 04:17, 18 July 2024 (UTC)Reply

@Foreverknowledge The question of how we as a dictionary use the term Hindustani is not the same as the question of whether a sense exists (or has existed historically) that deserves mention on the entry. For instance, the fact that we use the term Bengali instead of Bangla does not mean that we obliterate sense 1 (nonstandard) Synonym of Bengali (language), and neither do we delete the term Eirish because we choose Irish instead. You are being prescriptive, and that is not how things work. If a label like "historical" or "nonstandard" is appropriate, then that's fine, but you don't just delete it without going through the proper process. Theknightwho (talk) 04:34, 18 July 2024 (UTC)Reply

The quotation should belong to the first definition, as it initially was, before being moved to the Urdu definition by that user without any support. So please make that change. Foreverknowledge (talk) 04:40, 18 July 2024 (UTC)Reply

@Foreverknowledge Sure. Theknightwho (talk) 04:44, 18 July 2024 (UTC)Reply

"Terms spelled with" categories

Latest comment: 2 months ago, 2 comments, 2 people in discussion

There are still 3 of these in CAT:E that seem to be the result of changes you made a couple of days ago. Chuck Entz (talk) 14:50, 18 July 2024 (UTC)Reply

@Chuck Entz Thanks - fixed. Theknightwho (talk) 16:45, 18 July 2024 (UTC)Reply

Template internals showing in module error from Module:category tree/poscatboiler/data/lang-specific/jpx

Latest comment: 2 months ago, 2 comments, 2 people in discussion

Category:Japanese terms spelled with 兜 read as とう had a rather odd module error due to {{auto cat}} being used without a reading type. I fixed it by copying the reading type from the sole entry in the category, but you may want to rethink using inline formatting templates like {{code}} in error messages. The message was:

Lua error in Module category_tree/poscatboiler/data/lang-specific/jpx at line 626: For categories of the form "Japanese terms spelled with KANJI read as READING", at least one reading type (e.g. <syntaxhighlight inline="1" lang="text">kun</syntaxhighlight> or <syntaxhighlight inline="1" lang="text">on</syntaxhighlight>) must be specified using 1=, 2=, 3=, etc.

This obviously only comes up when there's a similar category without the necessary parameters and it only affects readability, so it's not urgent. Chuck Entz (talk) 21:19, 19 July 2024 (UTC)Reply

@Chuck Entz Thanks - good to know. I'll change it. Theknightwho (talk) 21:50, 19 July 2024 (UTC)Reply

redirects

Latest comment: 1 month ago, 1 comment, 1 person in discussion

I thought of this when I saw your recent edits to ↑ etc.

I've redirected superscript letters that are used in phonetics to the baseline letter, and describe their use there, under the phonetic definition. E.g. ᵄ. I did this after the example of the digits, e.g. ⁰. Should I split the phonetic symbols off as separate articles? (I made exceptions for superscript letters that have more complex usage, or that are specified by the IPA for certain uses. Those have their own articles.) kwami (talk) 23:13, 24 July 2024 (UTC)Reply

Reconstruction:Latin/ranceo

Latest comment: 1 month ago, 2 comments, 2 people in discussion

The problem with throwing an error for headwords in the Reconstruction namespace without "*" is that it doesn't account for normal languages where only the unattested forms are reconstructed. Unless I'm mistaken, that makes it pretty much impossible to use language-specific headword templates in reconstructed lemma forms. But then, I'm not sure there's a satisfactory way for headword and inflection templates in attested/normal languages to deal with forms that are in the reconstruction namespace. Since attestation patterns aren't the sort of thing that one can code for, there would have to be some kind of manual override. Chuck Entz (talk) 06:30, 26 July 2024 (UTC)Reply

Reconstruction:Latin/ranceo shouldn't exist: it's duplicated by ranceō, and given that non-participial forms, including "ranceo", appear in glosses or word lists later on, it makes more sense to have the entry just be in the main namespace with a usage note. However, given that Reconstruction:Latin/ranceo existed separately from 2019 up until now, I think the edit history ought to be merged into that of ranceo, but I'm not really familiar with that process, which seems to require an admin.--Urszag (talk) 06:59, 26 July 2024 (UTC)Reply

送り仮名

Latest comment: 1 month ago, 2 comments, 2 people in discussion

I'm guessing the transclusion module is having trouble with all of the entry-scraping done by the templates at English okurigana. So far, this is the only such entry in CAT:E, but there are others transcluding the same sense that just haven't updated the category links. Chuck Entz (talk) 20:58, 3 August 2024 (UTC)Reply

@Chuck Entz Thanks - I spotted this shortly before you commented here. The short-term solution is to not use {{tcl}}, as the proper fix requires a total rewrite of the transclusion module. The current one is a bit of a bodge that has special-purpose handling for a select few templates, as it pre-dates the template parser. Theknightwho (talk) 21:03, 3 August 2024 (UTC)Reply

Why've you removed the pronunciation /ɛː/ for the SQUARE vowel?

Latest comment: 1 month ago, 2 comments, 2 people in discussion

A few months ago you removed the pronunciation /ɛː/ for the vowel of SQUARE, also written /ɛə/ or /eə/. You gave the reason that /ɛ/ (a separate vowel from /ɛː/) is a lax vowel and lax vowels cannot be lengthened in RP. I really don't understand this reasoning. Yes, /ɛː/ happens to share the quality of /ɛ/, a "lax vowel". But that doesn't mean that the quality [ɛ] is fundamentally lax. /ɛː/ can appear at the end of syllables in English, whereas /ɛ/ cannot. Collins [14] and Cambridge [15] both give the transcription /eəʳ/ for the word "square", but the audio is clearly [skwɛː] in both cases. The OED [16] states that "For many RP speakers, SQUARE is not diphthongal as traditionally regarded but a tense monophthong in the region of the DRESS vowel." So it seems that the pronunciation /ɛː/ for the SQUARE vowel is not nonexistent as you claim, but is in fact accepted and widespread. I think we should at least list it as an alternative pronunciation. UmbrellaTheLeef (talk) 21:23, 4 August 2024 (UTC)Reply

@UmbrellaTheLeef Different dictionaries take different approaches to representing the same phoneme. There is nothing phonemically distinct between the phonetic realisations [ɛə] and [ɛː] in English, so we shouldn't be giving them as two separate phonemic pronunciations. We need to pick /ɛə/ or /ɛː/, and I chose /ɛə/ in accordance with Appendix:English pronunciation. All of the differences you refer to are phonetic, which aren't relevant in a phonemic transcription. Theknightwho (talk) 21:28, 4 August 2024 (UTC)Reply

rfap bugfix bugfix

Latest comment: 1 month ago, 2 comments, 2 people in discussion

Hi, please double check my edit to your bugfix to the rfap template. Ncfavier (talk) 11:49, 16 August 2024 (UTC)Reply

@Ncfavier Yes, that's correct - thanks. Theknightwho (talk) 11:58, 16 August 2024 (UTC)Reply

Error thrown but probably okay

Latest comment: 1 month ago, 4 comments, 2 people in discussion

Hello, Knight. I noticed a puzzling category, named Pages with 1 entry on a definition in mainspace (laud or maybe laudative). I clicked and was dismayed that it threw an error. So I dredged around a bit, and noticed you had created 5 new categories of the form Category:Pages with x entry/entries for x = 1 to 5 and entry = number of language(s). I also saw your reason for doing so, it was documented, to check number of Wiktionary definitions. I just wanted you to be aware re Category:Pages_with_1_entry, as it has over 1.042 million members. Since it is a hidden category, it is probably okay, so I'm just casually mentioning it. I promise not to click on it again! FeralOink (talk) 08:04, 17 August 2024 (UTC)Reply

@FeralOink Thanks for letting me know - I hadn't realised this limit existed. They haven't fully populated yet, but the intention is to use these categories to determine the total number of entries by summing the number of members in each category, so the only thing that really matters is that {{PAGESINCATEGORY:Pages with 1 entry|pages}} works, which it seems to for the time being. Theknightwho (talk) 13:42, 17 August 2024 (UTC)Reply

Not to be an annoyance, but Pages with 1 Entry doesn't work for me. Pages with 5 entries and 3 entries are just fine. I had tried them earlier. However, when I try to navigate to Pages with 1 entry now (I did it again to tell you, promise I won't repeat) I am still getting this error message:

“To avoid creating high database load, this query was aborted because the duration exceeded the limit. If you are reading many items at once, try doing multiple smaller operations instead.

[0e35f7a0-5b4a-4d7c-9816-2f19f82f4a02] 2024-08-17 15:14:24: Fatal exception of type "Wikimedia\Rdbms\DBQueryTimeoutError".”

I'll let you take over from here!--FeralOink (talk) 17:23, 17 August 2024 (UTC)Reply

@FeralOink Sorry, I understand what you mean - what I meant was that so long as {{PAGESINCATEGORY:Pages with 1 entry|pages}} works (e.g. 4,148,652) then the fact that the page itself won't load isn't a big deal, since what we need it for is that total number. Thanks again. Theknightwho (talk) 17:25, 17 August 2024 (UTC)Reply

Romanization of 炎（ほのお）

Latest comment: 2 hours ago, 37 comments, 5 people in discussion

Hello Theknightwho - The romanization of 炎（ほのお） is not correct. Honoo is correct, not honō. The 4th edition of Kenkyusha's New Japanese-English Dictionary (which is the authoritative dictionary for the Hepburn romanization used on Wiktionary) has honoo. The reason is that this word is not pronounced with a long vowel; instead, the last o is pronounced on its own (of course, in rapid speech the difference may not be clear). Several accent/pronunciation dictionaries, including the famous NHK accent dictionary, show ホノオ instead of *ホノー. Thank you for reconsidering this. Maidodo (talk) 01:10, 24 August 2024 (UTC)Reply

@Maidodo Alright - I’ve changed the romanisation. The main issue was the big warning you added, but I agree that the morpheme boundary means the romanisation should be “honoo”. I’ve also changed it in several other places to match, too. Theknightwho (talk) 07:22, 24 August 2024 (UTC)Reply

@Theknightwho - Thank you. I am not knowledgeable about wiki conventions, so I appreciate you putting everything in order. Maidodo (talk) 08:46, 24 August 2024 (UTC)Reply

@Theknightwho, @Maidodo, while this term derives from a compound, arguably there is no morpheme boundary anymore in modern Japanese: this word is lexicalized as a single thing, ほのお, not ほ + の + お. Much like あにい is historically an apparent result of "K" elision from あにき, leaving us with monomorphemic あにい, resultingly romanized as anī.

(Minor quibble on my part, I don't feel super strongly about this.) ‑‑ Eiríkr Útlendi │^{Tala við mig} 01:05, 4 September 2024 (UTC)Reply

@Eirikr: I think TKW has only mentioned morpheme boundaries on this talk page. Vance 2008:62 (The Sounds of Japanese) analyzes honoo as synchronically one morpheme but says it can be pronounced nevertheless with rearticulation between [no] and [o].--Urszag (talk) 03:43, 4 September 2024 (UTC)Reply

@Eirikr @Urszag @Theknightwho Hello. I agree that ほのお can be seen as monomorphemic for contemporary native speakers, but the NHK and Shinmeikai pronunciation dictionaries show that this is not a long vowel. My point was based on the assumption that ō should be only used for long vowels (長音). But I am aware that this is not as easy as it seems, especially when you take 遠い which is pronounced トーイ, while its past form 遠かった is pronounced トオカッタ due to the accented オ. Maidodo (talk) 09:03, 4 September 2024 (UTC)Reply

@Urszag, @Maidodo, point taken that ほのお is one morpheme, but with distinct "o" phonemes.

@Maidodo, interesting point about とおかった. That said, the two "o" morae there do not have a downstep separating them (as we see with the latter two "o"s in ほのお), no clear phonemic break, so I feel like there is less reason to romanize the "o"s separately. Does that make sense? ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:59, 4 September 2024 (UTC)Reply

It doesn't make much sense to describe the downstep as separating vowels, since the normal pattern for an accented syllable that ends in a long vowel is to show downstep before the second mora, such as [koː] in 校舎 kōsha (kóꜜòshà). That is why a downstep after the second mora is often taken as evidence for the vowels being in separate syllables.--Urszag (talk) 18:19, 4 September 2024 (UTC)Reply

"separate syllables"

??? I don't recall syllables being much of a consideration for standard Japanese?

If we ignore downsteps entirely, then surely ほのお should be transcribed as honō, no? What other justification would we have for treating the last two "o" morae as separate, for romanization purposes? ‑‑ Eiríkr Útlendi │^{Tala við mig} 23:54, 4 September 2024 (UTC)Reply

@Eirikr @Urszag

As you probably know, NHK and Shinmeikai are authoritative dictionaries in terms of Japanese standard pronunciation. Both draw a distinction between オ列＋長音記号 (e.g. コー in 氷) and オ列＋オ (e.g. ノオ in 炎), despite similar kana spelling (kana is not strictly phonetic, as you know). Long vowels are always rendered by a chōonpu (ー) in those dictionaries. 校舎 is コーシャ, with an accent on the first mora. The fall in pitch after コ does not cause the ー part to be pronounced as a separate vowel. But when the accent comes on the second mora, like in トオカッタ, then オ is clearly pronounced separately in standard / normal speed speech. In that regard, 炎 looks like a very specific case, as the final /o/ is not accentuated while being pronounced as a separate vowel.

With that in mind, regarding the romanization, I believe it is a matter of convention. I am now supposing that the rule on Wiktionary (contrary to the Kenkyusha's dictionary) implies that only morpheme boundaries (morphemes as understood by contemporary native speakers, regardless of etymological analysis) should prevent using a macron (like in 里親 satooya). Then, honō would be okay. That said, I believe it should be noted somewhere that ō is not always pronounced /oR/ (i.e., a long /o/) but sometimes as two distinct /o/ as in /honoo/ or /tookatta/. In other words, this convention on macron usage is not 100% based on genuine pronunciation: it is rather based on the kana spelling and the morphemic boundaries.

Finally, the 現代仮名遣い rules themselves have acknowledged this ambiguity: https://www.bunka.go.jp/kokugo_nihongo/sisaku/joho/joho/kijun/naikaku/gendaikana/honbun_dai2.html point #6

It says that オ列＋オ is not necessarily a long vowel, and may depend on the case. Please note that, in this remark provided by the 現代仮名遣い rules, the point is not the obvious morphemic boundaries like in the word 里親 satooya, which are clearly not long vowels.

Following this convention, ō would never be ambiguous when the macron represents the kana う, but ō may be ambiguous when it represents the kana お. I guess it is worth noting this nuance somewhere in the romanization conventions. Something similar happens with the romanization of ん as 'n', while the actual pronunciation of this mora depends on the case. The same goes for the nasalisation of ガ行 (鼻濁音). Maidodo (talk) 00:29, 5 September 2024 (UTC)Reply

@Eirikr This is a side issue, but how do Japanese dictionaries represent phonetic breaks like this? Is there a way to do it using kana?

The reason I ask is that 炎 is categorised in Category:Japanese kanji with kun reading ほのお whether we analyse it as ほのお (honō) or ほのお (honoo), but there are minimal pairs. e.g. 泳(えい) (ei) [e̞ː] and 鱏(えい) (ei) [e̞i], which both ultimately go into Category:Japanese kanji read as えい. Theknightwho (talk) 23:11, 5 September 2024 (UTC)Reply

The 泳(えい) (ei) [e̞ː] and 鱏(えい) (ei) [e̞i] examples are a bit different, since there we have contrasting vowel values, while in 炎 we have the same /o/ vowel in all morae.

(Side note 1: The cases I'm aware of where kana ⟨ [C]ei ⟩ is realized as [e̞i] are kun'yomi, and derive from older forms with an interstitial consonant that has since elided, such as 姪 (mei), /mei/; whereas the on'yomi instances of kana ⟨ [C]ei ⟩ that I can think of are usually pronounced as long [e̞ː].)

(Side note 2: The "fancy" editor will offer to turn copy-pasted categories from your post above into wikicode, but it doesn't handle copy-paste of {{ja-r}} at all, and just omits it entirely from the pasted content. Unsure if that's anything fixable...)

In terms of kana notation to indicate differences in pronunciation, @Maidodo's post above uses that to some extent: from what I've seen, Japanese resources that include this information are consistent in indicating long vowels with the 長音符 or lengthening mark ー, while separate vowels are spelled out using the relevant kana. For the 泳(えい) (ei) and 鱏(えい) (ei) examples above, the former would be annotated as エー, and the latter as エイ. That said, I think えい is still the "reading", inasmuch as the よみ (be it 音読み or 訓読み) is spelled that way. Reading = よみ = kana spelling, while what we're talking about in this thread has more to do with the finer points of pronunciation (発音). ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:28, 6 September 2024 (UTC)Reply

@Eirikr I suppose that's true to a point, but I think we do need to make a distinction, because there's a minimal triple:

江(ええ) (ē) [e̞ː]
泳(えい) (ei) [e̞ː]
鱏(えい) (ei) [e̞i]

Japanese seems to retain a pretty consistent distinction between the pronunciations of エー (ē) and エイ (ei) in katakana due to the influence of foreign loans (which I suspect is why 鱏, usually written エイ, has maintained [e̞i], even though it's written in katakana for a completely different reason), so I think there's a case for treating these three separately until such time that kana spelling gets updated again. Theknightwho (talk) 17:50, 6 September 2024 (UTC)Reply

I'm curious, where are you seeing ええ listed as a reading for 江? I'm only familiar with this kun'yomi as the single-mora え, and looking right now in my references, that's the only kun'yomi listed for this kanji. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:55, 6 September 2024 (UTC)Reply

@Eirikr We list it as a nanori reading. I made a slight mistake, though - we actually list it as 江(ええ) (ee) [e̞e̞], though I've managed to find it in Jim Breen's dictionary ([17]), albeit with no clarification as to whether it's ē or ee, since that dictionary doesn't use macrons. Regardless, it's still a minimal triple. Theknightwho (talk) 17:59, 6 September 2024 (UTC)Reply

Ah, thank you, nanori are weird ones. 😄

Re-reading this, I don't see a triple?

For ええ as pronunciation /eː/, we have nanori kana spelling ええ for 江 and on'yomi kana spelling えい for various kanji.
- In romanization, the yomi ええ should be rendered as ⟨ ē ⟩, and the yomi えい should be rendered as ⟨ ei ⟩.
For えい as pronunciation /ei/, we have kun'yomi kana spelling えい for 鱏.
- In romanization, this yomi should be rendered as ⟨ ei ⟩.

I think your underlying point is that we have mismatches between the yomi and the pronunciations, about which I agree. So long as we keep pronunciation and yomi templates separate, and have means for editors to manually specify overrides where appropriate, I think we're in the clear, no? ‑‑ Eiríkr Útlendi │^{Tala við mig} 23:35, 6 September 2024 (UTC)Reply

@Eirikr It becomes an issue with Category:Japanese terms spelled with kanji read as えい, because you end up with two readings mixed together, due to the differing pronunciations. Fundamentally, I think that the pronunciation and spelling are both core aspects of a reading, which means that I see these as three separate readings:

Spelled ええ, read [e̞ː].
Spelled えい, read [e̞ː].
Spelled えい, read [e̞i].

Perhaps I'm coming at this from a different angle because of my background in Chinese (where pronunciation is everything when it comes to differentiating readings), but I don't think it makes sense to only consider the kana spelling either. Theknightwho (talk) 00:14, 7 September 2024 (UTC)Reply

Hmm, hmm. As far as Japanese lexicographic resources go, yomi = kana spelling, and as far as English-language references of Japanese go, yomi = "reading". It also follows that yomi ≠ pronunciation (at least, not as a one-to-one match). This is why those Japanese references that include pronunciation details have developed different notation for this, and list this separately from the straightforward kana spelling of the word.

As for Category:Japanese terms spelled with kanji read as えい, that could never (and should not) include any terms with the kana spelling of ええ, since ええ ≠ えい. That said, the category should include both 永 and 鱏, as both of these are spelled in kana as (i.e. have the yomi of) えい.

I think the underlying issue is that yomi and pronunciation do not always match. We have so far categorized by yomi, not by pronunciation. HTH! ‑‑ Eiríkr Útlendi │^{Tala við mig} 00:30, 7 September 2024 (UTC)Reply

@Eirikr: Setting aside pitch accent, it's possible for two vowel sounds to be separated by a 'rearticulation' before the second vowel sound. This contrasts with a pronunciation of a single elongated vowel sound, which has no rearticulation. I put some information about this in the Wikipedia "Japanese phonology" article and you can see the relevant acoustic data in Vance's The Sounds of Japanese, page 59: rearticulation is not a complete pause or stop, but it is visible as a dip in intensity. So there is simply a phonetic difference that can be observed between a careful pronunciation of long [oː] and of double [o*o] (where "*" represents this kind of rearticulation; there seems to be no unambiguous IPA symbol for it), even if they have identical pitch contours. Per Vance, [o*o] can be interpreted phonologically as /o.o/, with a syllable boundary. While syllables aren't self-evident to non-linguistically trained Japanese speakers, they are recognized as a prosodic unit of Japanese by a number of linguists. Actually, even some analysts who argue against the existence of the syllable in Japanese, such as Laurence Labrune, don't dispute that there can be a phonetic distinction between double vowels and long vowels (The Phonology of Japanese, 2012, page 45). I hope that helps clarify why one might use the transcription "honoo" as opposed to "honō".--Urszag (talk) 23:16, 5 September 2024 (UTC)Reply

@Urszag, I appreciate the further detail. I am familiar with the basics of the phenomenon, as seen for instance in the nonce sentence, 鳳凰(ほうおう)を追(お)う王(おお)を覆(おお)おう (hōō o ou ō o ōō, “let's hide the king who is chasing the phoenix”). What I'm trying to tease out here (albeit without having articulated it well) is some kind of clear guideline or set of criteria for how we [WT JA editors] determine the romanization of doubled vowels in a given term — something we [those of us in this thread] could add to the WT:JA TR page. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:43, 6 September 2024 (UTC)Reply

@Eiríkr Since it’s a (minor) ambiguity in the spelling system, there isn’t guaranteed to be any simple rule for when words spelled with o-line kana + お are pronounced with "ō" vs. "oo" other than "look up the pronunciation in a pronunciation dictionary and see if it uses ー", as @Maidodo mentioned. (Compare the need to use references for pronunciation information in languages like English, French, etc., rather than just assuming the pronunciation based on the spelling.) But as a practical rule of thumb, the pronunciation with separate vowels "oo" almost never occurs except across a morpheme boundary: ほのお is a rare exception, not part of any well-defined larger category of systematic exceptions to my knowledge. So the guideline I'd like to see followed would be "following the pronunciations given in pronunciation dictionaries, romanize long おー as ō and double おお as oo; if that is unavailable, romanize as ō within a morpheme, oo across morpheme boundaries." I'm not sure which specific dictionaries are the best sources. Another example I can think of is 須佐之男 where the romanization Susanoo, reflecting the pronunciation with two short [o] sounds, is preferred over Susanō. Distinguishing "ō" and "oo" strictly on the basis of morphology, with zero reference to actual pronunciation, doesn't make as much sense to me as a prescribed norm of romanization: even if that criterion is slightly easier to apply, it isn't completely simple since morpheme boundaries aren't always obvious, and the information it conveys to the reader doesn't seem particularly useful or simple to interpret. That said, actual practice seems to be somewhat unsettled. 
I found these documents online that discuss the use of the macron: CEAL Response to “Clarification of LC practice concerning the use of diacritical marks in Japanese romanization” (January 22, 2012: "Appendix E: Another diacritic mark: the Macron") and CTP/CJM Joint Task Force’s Response to LC’s “Proposed Revision of the ALA-LC Japanese Romanization Table” (June 13, 2012). Some proposed formal criteria for use of ō vs. oo are discussed in these.--Urszag (talk) 19:06, 6 September 2024 (UTC)Reply

@Eirikr @Urszag It occurs to me that this is partially automatable without the need for special formatting like . so long as a module has access to both the kanji and kana inputs (as with {{ja-pos}} and {{ja-r}}), by simply blocking long vowels if they would occur across a kanji boundary. This wouldn't catch all cases, but it would catch quite a few of them. A quick-and-dirty solution would be for the module to simply add . in all the relevant places before feeding it through to the main transliteration module, since it doesn't matter if they're redundant. Theknightwho (talk) 19:30, 6 September 2024 (UTC)Reply
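A rough Python sketch of the idea above, purely for illustration. It assumes the kana have already been aligned to the kanji as (surface, kana) segments, which is the hard part and is not shown; the kana table is a toy subset, and none of the function names correspond to real module functions.

```python
# Toy kana table: only the kana needed for the examples below (an assumption).
KANA = {
    "さ": "sa", "と": "to", "お": "o", "や": "ya",
    "ほ": "ho", "の": "no", "う": "u", "り": "ri",
}


def mark_kanji_boundaries(segments: list[tuple[str, str]]) -> str:
    """Join per-segment kana with "." so no long vowel can span two segments.

    `segments` is a hypothetical pre-aligned list of (surface, kana) pairs,
    e.g. [("里", "さと"), ("親", "おや")].
    """
    return ".".join(kana for _, kana in segments)


def romanize_chunk(kana: str) -> str:
    """Romanize one chunk, merging oo/ou into ō inside the chunk only."""
    romaji = "".join(KANA[ch] for ch in kana)
    return romaji.replace("oo", "ō").replace("ou", "ō")


def romanize(marked_kana: str) -> str:
    """Romanize "."-separated kana; the "." itself is dropped from the output."""
    return "".join(romanize_chunk(chunk) for chunk in marked_kana.split("."))
```

Under these assumptions, 里親 segmented as さと + おや gives satooya and 追う segmented as お + う gives ou, while 炎 (a single kanji read ほのお) still gives honō, which is one of the cases a boundary-based approach would not catch.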

@Urszag @Eirikr @Theknightwho As you pointed out, the romanization ei is another example where romanization, as well as kana orthography, is ambiguous in terms of phonological interpretation within morphemes. Sino-Xenic words can always be pronounced エ列＋長音符号 (/eR/ = a long 'e'), while dictionaries like the Shinmeikai Nihongo-Akusento note that pronouncing it as /ei/ is somewhat common but is not standard except in low-speed speech (Ref: introductory commentary, p.25). The Shinmeikai dictionary has a specific notation using a subscripted ★: this mark indicates that the standard pronunciation is /eR/ (long vowel) but that it is okay to pronounce it /ei/ at low speed. For example, 栄 is エイ_★ while 'ray' (fish) is エイ. (I am ignoring the accents). As you know, ray is not a Sino-Xenic term (ei is a kun'yomi). As said by @Eirikr, within morphemes, the kun'yomi-based words have other cases like 'ray', but it doesn't mean that everything is pronounced /ei/ in the non-Sino-Xenic realm: sometimes it is identical to Sino-Xenic behaviors (e.g. アカエイ (fish) pronounced アカエイ_★ ; or 稼いで pronounced カセイ_★デ), sometimes it is a mere long vowel (e.g. 中背 pronounced チューゼー). In my view, it is often influenced by the pitch accent pattern. Regarding 'ei', to sum up I would say that the cases where a morpheme is exclusively pronounced /ei/ are limited to kun'yomi and are very rare.

Back to the お列＋お discussion, thank you for the phonetic details on 'rearticulation'. It's worth noting that, when we say double vowel, it is actually a phonological abstraction, since, in reality, the pronunciation is more subtle and may vary depending on situations, notably speed. @Urszag's proposal makes a lot of sense, in my view. I am just slightly amending it using katakana following the Japanese custom of using hiragana for orthography (仮名遣い) and katakana for pronunciation. "Following the pronunciations given in pronunciation dictionaries, romanize long オー as ō and double オオ as oo; if that is unavailable, romanize as ō within a morpheme, oo across morpheme boundaries." The only difficulty I see is when there are several accepted pronunciations based on distinct pitch accent usage. Those cases are probably not so numerous, but I can think of 覆う, which is pronounced either オオウ or オーウ: the former has an accent on the second mora, while the latter is not accentuated (平板型). What I would suggest is that, when the accepted pronunciations include one with a long vowel (オ列＋長音符号), then ō is chosen as the standard romanization. The reason is that ō is a kind of default romanization for お列＋お and we do not want to create numerous exceptions. Then, we would have ōu, tōi, but tookatta, because, indisputably, トオカッタ is the only standard pronunciation of the past form of 遠い.

Finally, you may be aware that the Japanese government is now working on a revised standard rōmaji system to replace the currently official kunrei-shiki rōmaji, which is not widely used in practice. The discussions are still ongoing, as you can see here https://www.bunka.go.jp/seisaku/bunkashingikai/kokugo/roman/roman_03/94105001.html (latest meeting was on August 29 this year). It seems almost certain that the long vowel will be shown using a diacritical mark on the vowel (長音は、これまでと同様、母音字に長音符号を付して表す), but I am not sure what will be decided regarding え列＋い and お列＋お. My guess is that え列＋い will never use a diacritical mark. Regarding お列＋お, I guess that ō will be chosen, but I am curious about the way it will treat the cases where お列＋お is actually a double vowel within a morpheme, as with ほのお. Those subtleties are explicitly mentioned in the 仮名遣い rules, so logically, they should at least be mentioned in the romanization system to be published – but we will see. A dictionary like Wiktionary can also be more accurate than the government, anyway. Maidodo (talk) 22:05, 6 September 2024 (UTC)Reply

@Urszag, @Maidodo, I support your proposals, with one caveat:

"if that is unavailable, romanize as..."

What does "that" refer to here? Do you mean, "if you do not have access to a pronunciation dictionary that gives pronunciations in kana notation"?

@Theknightwho, I confess I don't entirely follow, but from what little I think I understand, it sounds good to me. 😄 ‑‑ Eiríkr Útlendi │^{Tala við mig} 23:44, 6 September 2024 (UTC)Reply

Yes, that's what I had in mind: using morphological structure as a fallback criterion if there's no available entry for a term in a pronunciation dictionary.--Urszag (talk) 23:54, 6 September 2024 (UTC)Reply

@Urszag When there are several standard/accepted pronunciations including one with a long vowel like 覆う (オーウ / オオウ) should we choose the macron as I suggested? Maidodo (talk) 00:44, 7 September 2024 (UTC)Reply

@Maidodo I don't have a strong preference between just using a macron or showing both forms (e.g. "ōu or oou"), although maybe the latter would look too crowded.--Urszag (talk) 00:57, 7 September 2024 (UTC)Reply
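The rule discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and its inputs are hypothetical, the word is assumed to contain a single お列＋お sequence, and the tie-break for words with several accepted pronunciations follows Maidodo's suggestion (prefer the macron), which had not been settled in this thread.

```python
from typing import List

LONG_MARK = "ー"  # 長音符号, as used in pronunciation dictionaries


def choose_o_romanization(dictionary_readings: List[str],
                          crosses_morpheme_boundary: bool) -> str:
    """Pick 'ō' or 'oo' for the お列＋お sequence of one word.

    dictionary_readings: katakana pronunciations taken from a pronunciation
    dictionary (empty if the word has no entry). Hypothetical input format.
    """
    if dictionary_readings:
        # Assumed tie-break: if any accepted pronunciation has a long vowel,
        # use the macron; otherwise the vowel is doubled.
        if any(LONG_MARK in reading for reading in dictionary_readings):
            return "ō"
        return "oo"
    # Fallback when no dictionary entry is available: morphological structure.
    return "oo" if crosses_morpheme_boundary else "ō"


print(choose_o_romanization(["オオウ", "オーウ"], False))  # 覆う
print(choose_o_romanization(["トオカッタ"], False))        # 遠かった
```

With these inputs, 覆う comes out with a macron (ōu) and 遠かった with a doubled vowel (tookatta), matching the examples given earlier in the thread.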

Understood. By the way, I can't understand why おおやけ is shown as おおやけ in the pronunciation section, while 生命 is shown as せーめー. It would be more consistent, in my opinion, to have おーやけ. It would also align with Japanese pronunciation dictionaries. Maidodo (talk) 03:17, 7 September 2024 (UTC)Reply

@Eirikr @Theknightwho @Urszag

Since this is a different topic (not about rōmaji), I have started a new discussion here, if you are interested.

Template talk:ja-pron#Confirmation of Convention for Indicating Long Vowels in Pronunciation Sections Using This Template (in kana) Maidodo (talk) 08:21, 7 September 2024 (UTC)Reply

@Maidodo @Urszag My preference would be for us to use ー for long vowels, and to use a new kana for instances where the vowel is reiterated, like ほのお (honoo). This would only be for pronunciation sections and categories that give kana readings, but it would help keep things separate in cases where the orthography is ambiguous. The one exception is that while おう and えい can be normalised to おー and えー in pronunciation sections, that wouldn't be helpful (or correct) in kana reading categories so those should remain as-is. Theknightwho (talk) 14:32, 7 September 2024 (UTC)Reply

@Theknightwho ― Sorry if I am not correctly comprehending your point. It seems that the conversation is now about the kana-represented pronunciation (these notations remain designed /phonologically/ and not [phonetically]; for example ん will be always noted as ン).

Is there a structural difference between {お列＋お} and {え列＋い}? In my view, there isn't. The former is [オ列ー], the latter is [エ列ー], with exceptions (ほのお, えい (the fish)). I suppose that such exceptions can be handled by inserting a period in the code before the second kana.

I wonder if making a synthetic document summarizing all cases and covering both kana-represented pronunciation and rōmaji could be helpful. If yes, I can work on it. Maidodo (talk) 23:44, 7 September 2024 (UTC)Reply

@Maidodo: The conversation so far has been about the kana pronunciation section. In addition to this, Theknightwho wanted to discuss what spellings to use in the title of categories such as Category:Japanese kanji with kun reading ほのお, Category:Japanese kanji with on reading おう, etc.--Urszag (talk) 00:11, 8 September 2024 (UTC)Reply

@Eirikr @Urszag my thanks! Does it mean that ほのお would still be automatically transcribed as honō on this category page? I think I understand the technical difficulty. Maidodo (talk) 01:31, 8 September 2024 (UTC)Reply

@Theknightwho, I feel like something might be getting off track?

You'd stated (emphasis mine):

My preference would be for us to use ー for long vowels, and to use a new kana for instances where the vowel is reiterated, like ほのお (honoo). This would only be for pronunciation sections and categories that give kana readings, [...]

"Reading" in Japanese lexicography has always and only (to my knowledge) been the kana rendering of the term. This is relevant for collation, for kana spelling, and for knowing how to pronounce kanji, with certain caveats as mentioned to some extent in this thread. Our categories for Japanese terms that use the term "read" or "reading", such as Category:Japanese terms spelled with 紗 read as さ or Category:Japanese terms spelled with 儲 read as もう, have been organized based on this same principle, using the standard kana orthography for the term.

If I understand you correctly, you appear to be proposing that we change this so that our categories would instead use a pronunciation-based kana rendering with the nobashi ("lengthener" or the ー bar) for long vowels.

This would represent a major change from how we have operated for years, and it would also be a significant departure from how other resources organize their information. Consequently, I cannot support using the nobashi for the "reading" categories. That said, I have zero objection to using the nobashi for pronunciation sections. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:02, 9 September 2024 (UTC)Reply

@Maidodo, @Eirikr, @Urszag, @Theknightwho: I didn't read the whole thread, but the reason why 炎 must be transliterated as hónoo and not ~~hónō~~ is very simple: the syllabic breakdown of that word is /ho.no.o/ and not /ho.noo/. This is clear from the different phonetic behaviour when compared with other words ending in -ō (i.e. /-oo/):

炎 (hónoo) + 会 (-kai) = 炎会 (honoókai) > /ho.no.ó.kai/
運動 (undō) + 会 (-kai) = 運動会 (undôkai) > /uɴ.dóo.kai/

The suffix 会 (-́kai) forces an accent in the preceding syllable. If 炎 (hónoo) were to be analysed as /hó.noo/, and therefore transliterated hónō, we would expect it to behave like 運動 (undō), becoming 炎会 (honôkai) (/ho.nóo.kai/), but that's not the case. Instead we have 炎会 (honoókai) > /ho.no.ó.kai/, from which we clearly see that the final /o/ is a syllable on its own. — Sartma ^{【𒁾𒁉 ● 𒊭 𒌑𒊑𒀉𒁲】} 16:17, 21 September 2024 (UTC)Reply

@Sartma: That's good evidence, although in principle, the syllabification of a suffixed form isn't necessarily guaranteed to be identical to that of the unsuffixed form. Even if the suffixed form indicates the root contains something underlyingly different from a long vowel, it would be possible for this distinction to be phonetically neutralized in the base word: whether it is or not is an empirical matter. (I notice that the 5th edition of Hepburn's dictionary transcribes the word as honō, although whether this reflects a change over time in the pronunciation, a change in arbitrary transcription conventions, or just an error is something I am not sure of.)--Urszag (talk) 02:58, 22 September 2024 (UTC)Reply

Hi @Urszag. Sorry, I'm not sure I follow your argument. Can you give some examples of what you mean? As for Hepburn's dictionary: it was printed 130 years ago, so I wouldn't quote it on anything related to phonetics... — Sartma ^{【𒁾𒁉 ● 𒊭 𒌑𒊑𒀉𒁲】} 11:22, 22 September 2024 (UTC)Reply

ζώω

Latest comment: 23 days ago1 comment1 person in discussion

Hi -- I undid your revert, started a discussion on the talk page, and added a reference. -Ben

If you're disputing the existence of Ancient Greek ζάω (záō), the way to do that is not by claiming it's fake at the page for ζώω (zṓō), because all you've done then is make the two entries contradict each other. Theknightwho (talk) 18:36, 29 August 2024 (UTC)Reply

"not comparable"

Latest comment: 18 days ago8 comments2 people in discussion

For rare adjectives, shouldn't we remain agnostic rather than positively claiming they're not comparable? There's no reason something like "more Iapetian" couldn't exist, and if it could, then it's comparable. kwami (talk) 00:13, 4 September 2024 (UTC)Reply

@Kwamikagami Even theoretically, how could Iapetian be used comparatively given the current meaning? Theknightwho (talk) 00:15, 4 September 2024 (UTC)Reply

A more Iapetian albedo contrast, for the moon; for the titan, a greater degree of any quality associated with the titan. Not that anyone has necessarily used a comparative form in the entire history of English, but there's no reason they couldn't in a context they thought made sense. Certainly in poetry it wouldn't even need to make literal sense. "Not comparable" suggests the comparative form is ungrammatical, not unattested. I wouldn't want to list a comparative form either, if we can't attest to it, but we could simply leave it out. kwami (talk) 00:32, 4 September 2024 (UTC)Reply

I found "Japetian" as an old synonym for Indo-European, with "Japetian type" of grammatical structure. That might also plausibly be used in the comparative. Though it's a different etymology, from Japhet rather than Iapetus. kwami (talk) 00:35, 4 September 2024 (UTC)Reply

That took me to Japhetic, which we also claimed was not comparable. But GBooks has several instances of "more Japhetic" and possibly one of "most" (there's no preview on that one). kwami (talk) 00:43, 4 September 2024 (UTC)Reply

@Kwamikagami The issue is that you can do that for (almost) all incomparable adjectives. To be honest, this is something we should probably raise at the WT:BP, because I'm not fully convinced that comparability is something we should be showing in the headword line unless there are inflected forms. Theknightwho (talk) 00:43, 4 September 2024 (UTC)Reply

I agree with you there. There might be exceptions where the expected inflected forms are not possible, but we could mention that in usage notes. E.g. in my dialect, *funner and *funnest are ungrammatical. I don't know if there might be words that are like that across most English varieties, but if so there can't be very many. kwami (talk) 00:47, 4 September 2024 (UTC)Reply

@Kwamikagami That's a good point. Whatever we decide, we definitely need to do something about incomparable adjectives. There are situations where we decide to treat certain things as quirks of the language rather than a true quality of the term (e.g. any uncountable noun can be used countably when comparing types of that thing, but that doesn't stop them being uncountable), so it might be that we just need to hash out exactly what we mean by "not comparable". Theknightwho (talk) 01:02, 4 September 2024 (UTC)Reply

CAT:E again

Latest comment: 14 days ago4 comments3 people in discussion

There are 2 entries (kölsch and natt) due to Module:li-headword-eup passing a parameter in a way that Module:parameters no longer accepts, and a number of Devanagari entries due to Module:new-IPA using a character that Module:IPA no longer accepts. Then there are the Wiktionary-space instances of wikilinks that Module:IPA no longer accepts.

This has been obscured by what was apparently a widely transcluded error corrected after about .0053 ohnoseconds. That flooded the category, which was easily cleared using the API Sandbox purge method only to be flooded again.

There are also a couple (Biełaruś and śviatło) that seem to be the results of @Benwing2 completely reworking inflection modules just before going on vacation. Chuck Entz (talk) 19:32, 6 September 2024 (UTC)Reply

Hi Chuck. I know about those two Belarusian entries; my apologies that I haven't fixed them sooner. Benwing2 (talk) 19:49, 6 September 2024 (UTC)Reply

@Chuck Entz TKW and I have resolved most of these issues. The ones remaining are the two Belarusian terms and a couple of Wiktionary-space pages. Benwing2 (talk) 03:24, 7 September 2024 (UTC)Reply

@Benwing2 @Chuck Entz I'll deal with the Wiktionary-space ones. It's because I disallowed links in IPA inputs, as they were overcomplicating the code and nothing was using them in mainspace. Theknightwho (talk) 13:55, 7 September 2024 (UTC)Reply

warnings

Latest comment: 8 days ago26 comments3 people in discussion

There are so many warnings, most of them trivial and inconsistent and often contradicted by Wiktionary:Entry layout, that they just become a pink blur. Like leaving or not leaving a blank line before the first header - how is anyone supposed to know when to and when not to? If I add a blank line, I get a warning not to add it, and if I remove it, I get a warning to add it. If I copy and paste the formatting from an article, I'll get masses of warnings even though there are none when I edit the original. As for manually adding a generic boilerplate, shouldn't that be automated? Certainly the "cleanup" can be automated. kwami (talk) 05:02, 13 September 2024 (UTC)Reply

@Kwamikagami You are literally the only long-term user who struggles with this. Enough with the excuses. Adding {{auto cat}} is something newbies learn within a month (tops), yet somehow these basic skills manage to still evade you. The fact the layout confuses you is honestly baffling - you trigger those edit filters 10 times more than any other user, when the layout is not complicated, so you obviously just cannot be bothered to learn it. It says a lot about how sloppy your editing is, quite frankly. Theknightwho (talk) 05:07, 13 September 2024 (UTC)Reply

It's new to me. It wasn't used on the categories I learned on. It's easier than choosing them manually, actually, and I'll try to remember to use it. I don't mind that. What's annoying are your constant passive-aggressive complaints when simple corrections would be easier all around, e.g. "use {auto cat} rather than manual cats" or some such. kwami (talk) 05:15, 13 September 2024 (UTC)Reply

@Kwamikagami I'm not being passive-aggressive; I'm actively telling you I am annoyed. I'm not expecting perfection, but I'd be a lot less annoyed if you didn't argue about it every time. Theknightwho (talk) 05:20, 13 September 2024 (UTC)Reply

Whinging, then. If I'm doing something wrong, just tell me what is wrong. I'm perfectly happy to follow a simple, consistent instruction like "use {auto cat} rather than adding categories manually."

As for the layout confusing me, yes, if I get an auto-warning for leaving a blank line, so I remove it and then get a warning for not leaving a blank line, I give up. kwami (talk) 05:44, 13 September 2024 (UTC)Reply

@Kwamikagami I tried that several times, and it made absolutely no difference. This is precisely the kind of attitude I was referring to: there's always an excuse, it's always someone else's fault, and the only common theme is that you're never to blame. It's predictable and boring. Theknightwho (talk) 06:06, 13 September 2024 (UTC)Reply

I don't recall you ever telling me about this before. If you have, my bad. A reminder is now on my user page.

I'm often to blame. I'm often wrong. I just respond better if people correct me with a simple correction rather than pinging me with (what you say is not) passive-aggressive complaints of how I'm forcing them to do something, without ever telling me how I can do it myself -- e.g. you still haven't replied below on where I can find this list I've generated, so I can take responsibility for it. Complain that I'm making you do something, but don't provide me with the details so I can do it myself -- that's why I called it "passive-aggressive". kwami (talk) 06:19, 13 September 2024 (UTC)Reply

@Kwamikagami I never said there was a list of issues you've caused (something you made up out of nowhere) - I said that you need to stop forcing the rest of us to clear up after you, because I have (repeatedly) found myself having to clear up large numbers of issues caused by you; something other people have pointed out as well, which is why you received a long block not that long ago. Theknightwho (talk) 06:25, 13 September 2024 (UTC)Reply

An assumption, based on the last time that you complained I was forcing you to do things but wouldn't tell me how to fix it myself. That time I found the list on my own and went through and cleaned up my mess. This time you said I "trigger those edit filters 10 times more than any other user". I assume that means there's a list somewhere to tell you that; otherwise I don't know how you could possibly tell. kwami (talk) 06:36, 13 September 2024 (UTC)Reply

@Kwamikagami All edit filters log hits. The publicly available logs are here: [18]. Theknightwho (talk) 15:35, 13 September 2024 (UTC)Reply

Thank you. I'll see what I can clean up. kwami (talk) 18:19, 13 September 2024 (UTC)Reply

Weird. Found a series of errors to correct, but they do not exist in the edit history of the article. So, if I get a warning that e.g. I misspelled a header, it generates an error even if I fix the error before I save? The way I was adding refs, I was using the warning to provide me with the text to post on the page as the quickest way to do that. kwami (talk) 18:37, 13 September 2024 (UTC)Reply

@Kwamikagami Oversimplifying things, there are three "levels" of abuse filter: level 1 just tags the edit in the edit summary (e.g. "new-L2"), and level 3 simply disallows the edit altogether (e.g. trying to add a raw replacement character �). Level 2 prevents the edit and issues a warning the first time the user presses save, but will allow the edit to go through if they press save again. That's the setting we use for the WT:NORM filters. The first and second save attempts both get recorded in the logs, but the first will say "Actions taken: Warn", and the second "Actions taken: None". The ones that need fixing will be the second kind. Theknightwho (talk) 19:21, 13 September 2024 (UTC)Reply
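As a rough illustration of the warn-then-save behaviour described above, the following Python sketch picks out the log entries that correspond to edits which actually went through. The function name and the sample log lines are hypothetical; the only detail taken from the description is that the first attempt is logged with "Actions taken: Warn" and the second with "Actions taken: None".

```python
from typing import List


def entries_needing_cleanup(log_lines: List[str]) -> List[str]:
    """Return the log lines for edits that were saved despite the warning.

    Assumes each log line contains the literal text 'Actions taken: <action>'.
    A 'Warn' entry is only the first, blocked save attempt; a 'None' entry is
    the second attempt, where the edit was allowed through.
    """
    return [line for line in log_lines if "Actions taken: None" in line]


# Hypothetical sample lines, loosely modelled on the wording quoted above.
sample_log = [
    "Edit on page A. Actions taken: Warn; Filter description: blank line",
    "Edit on page A. Actions taken: None; Filter description: blank line",
    "Edit on page B. Actions taken: Warn; Filter description: header typo",
]

for line in entries_needing_cleanup(sample_log):
    print(line)
```

In this sample, only the page A edit would need checking, because the page B edit was warned about and never saved in that form.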

Thanks. I assume there's a way to filter the results for quicker review, but I'll leave that for another day.

I'm going down a rabbit hole with the errors caused by adding {auto cat}. I would think that phonetic symbols (IPA, UPA, NAPA etc.) would be under the 'communication' topic, but I think I should leave that to someone who knows what they're doing. kwami (talk) 19:26, 13 September 2024 (UTC)Reply

It's hard for me to be sure, since I see all the options that ordinary users can't, but I believe you should have a link from your contributions page for just your abuse-log hits. Chuck Entz (talk) 19:45, 13 September 2024 (UTC)Reply

Yes, that works, thanks. From there it's easy to filter out e.g. 'new/fewer L2s' by eye, since that's 90% of them. kwami (talk) 19:53, 13 September 2024 (UTC)Reply

@Kwamikagami So the category tree is divided into two parts, which are (for reasons of convention) distinguished by their name formats: "NAME X" categories are sets of terms which are strictly described by the name of the category (e.g. Category:English adjectives contains all adjectives in English), whereas "CODE:X" categories are any terms that relate to that particular topic (e.g. Category:en:Adjectives contains English terms relating to adjectives, which don't have to be adjectives themselves - in fact, they're mostly types of adjective, i.e. nouns). This means Category:IPA symbols, Category:UPA symbols etc. would best fit under Category:Translingual symbols, in my view, since their members are literal symbols. Theknightwho (talk) 19:56, 13 September 2024 (UTC)Reply
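The naming convention described above can be illustrated with a short Python sketch. It is an assumption-laden simplification: the function name is made up, and the pattern used to recognise a language code prefix (lowercase letters and hyphens followed by a colon) is a guess for illustration, not how Module:category tree actually decides.

```python
import re

# Assumed shape of a language-code prefix such as "en:"; illustrative only.
CODE_PREFIX = re.compile(r"^[a-z]+(?:-[a-z]+)*:")


def category_kind(title: str) -> str:
    """Classify a category title as a 'topic' (CODE:X) or a 'set' (NAME X)."""
    name = title[len("Category:"):] if title.startswith("Category:") else title
    return "topic" if CODE_PREFIX.match(name) else "set"


for title in ("Category:English adjectives",
              "Category:en:Adjectives",
              "Category:Translingual symbols"):
    print(title, "->", category_kind(title))
```

Under this reading, Category:Translingual symbols is a set category, so its members are expected to be literal symbols.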

That's the category I added to them manually, but in order for {auto cat} not to generate an error, I need to add them, with that category, under one of the few dozen general topics in the tree. I assume that would be 'communication'? I didn't want to do that without being sure. kwami (talk) 20:02, 13 September 2024 (UTC)Reply

@Kwamikagami What I mean is that Category:IPA symbols, Category:UPA symbols etc. should be categorised in Category:Translingual symbols. I can change the relevant submodule of Module:category tree to facilitate that. Theknightwho (talk) 20:17, 13 September 2024 (UTC)Reply

Okay. I'll remove the manual category. kwami (talk) 20:21, 13 September 2024 (UTC)Reply

It's weird that some pages, such as ᴫ, don't care if there's a blank line before the first header. kwami (talk) 20:38, 13 September 2024 (UTC)Reply

@Kwamikagami What do you mean? Theknightwho (talk) 20:59, 13 September 2024 (UTC)Reply

It's had a blank line since this edit in May, until I removed it today, and AFAICT has never generated a warning. I've noticed that elsewhere, e.g. when copying and pasting the structure/outline, one article will be fine, then the next will generate an error. kwami (talk) 21:40, 13 September 2024 (UTC)Reply

@Kwamikagami I significantly revamped that edit filter towards the end of 2023 - the old version would miss things like this. Checking that edit against the filter's regex using https://regex101.com/, the current version would have caught it. The WT:NORM filters also only trigger if it's a new issue (i.e. once the blank line is already there, it won't issue a new warning every time the page is edited). Theknightwho (talk) 22:06, 13 September 2024 (UTC)Reply
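The "only trigger if it's a new issue" behaviour can be sketched as follows. This is a hedged illustration, not the filter itself: the regular expression is a hypothetical stand-in that looks for a blank line directly before the first header, and the function name is invented. The real filter's regex is not shown in this discussion.

```python
import re

# Hypothetical stand-in pattern: from the start of the page, skip any
# non-header lines, then require an empty line right before the first "==".
BLANK_BEFORE_FIRST_HEADER = re.compile(r"\A(?:(?!==)[^\n]*\n)*?[ \t]*\n==")


def triggers_warning(old_text: str, new_text: str) -> bool:
    """Warn only when the edit introduces the issue, not when it was already there."""
    had_issue = bool(BLANK_BEFORE_FIRST_HEADER.search(old_text))
    has_issue = bool(BLANK_BEFORE_FIRST_HEADER.search(new_text))
    return has_issue and not had_issue


clean = "{{also|x}}\n==English==\n"
blank = "{{also|x}}\n\n==English==\n"

print(triggers_warning(clean, blank))            # edit adds the blank line
print(triggers_warning(blank, blank + "text\n")) # blank line was already present
```

Comparing the old and new revisions in this way is what would let a page keep a long-standing blank line without producing a warning on every later edit.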

Okay. It'll take me some time to get used to which errors get generated every time and which don't, though I can always check my abuse filter. kwami (talk) 22:19, 13 September 2024 (UTC)Reply

And if you point me to a cleanup list that I generated, I can clean it up. kwami (talk) 05:18, 13 September 2024 (UTC)Reply

Proposal for Module:Unicode data

Latest comment: 8 days ago2 comments2 people in discussion

Hi, I see you're updating Appendix:Unicode, would you mind if I ask you to add new patterns for Appendix:Unicode/Variation_Selectors and Appendix:Unicode/Egyptian_Hieroglyphs_Extended-A at Module:Unicode data? Specifically something like this:

The purpose is so the names will be automatically generated and don't need to be manually defined at Module:Unicode_data/names. I have tested and added the patterns for Indonesian Wiktionary's version of the appendix and it's proven to work.

Thank you! Ekirahardian (talk) 16:10, 13 September 2024 (UTC)Reply

@Ekirahardian Sure - that's fine, since they're procedurally generated. Theknightwho (talk) 16:22, 13 September 2024 (UTC)Reply

"丟那媽" is a euphemistic misspelling?

Latest comment: 3 days ago2 comments2 people in discussion

You added "丟那媽" to the compounds section on the page for 媽. I wanted to ask if you knew where this phrase came from so that I could create a page for it calling it a euphemistic misspelling. But what I want to know is whether it's a common misspelling or a common euphemism that needs its own entry. Dengzeren08 (talk) 23:12, 18 September 2024 (UTC)Reply

@Dengzeren08 It comes from here. Theknightwho (talk) 23:23, 18 September 2024 (UTC)Reply
